Driving Narrative and Emotion Through Music Systems: The Interactive Score of Spirit of the North 2

Spirit of the North 2 is a wondrous, family-friendly adventure where players take the role of a fox, joined by a raven companion, on a journey to heal the wounds of the past and restore the mythical, Nordic-inspired world. Told entirely without dialogue, its narrative unfolds through environmental storytelling, music, and the rediscovered histories of ancient tribes. Many players have compared the experience to Shadow of the Colossus or Journey, to give you a sense of its meditative genre. Players regularly share deeply personal stories about how the game has helped them through difficult times. Some have spoken of finding hope in dark moments in their lives, crediting the game’s tone and music for helping them pause and reflect.

In Spirit of the North 2, the atmosphere, ethereal score and calm exploration form the backbone of the entire experience. The soundtrack carries enormous narrative weight, guiding the player through themes of loss, discovery, and healing. Inspired by Icelandic minimalism, it balances melancholy, wonder, and transcendence — giving players the space to reflect and feel.

Because of this specific type of gameplay, my challenge as composer and audio director was to design a system in Wwise and Unreal Engine 5 — and write music — that could stay deeply connected to the player’s experience throughout the game. It had to scale across more than 40 hours of gameplay, provide variety without fatigue, and preserve the fragile, ethereal tone of the score — all within the constraints of a small indie team. The right tools, combined with a strong spirit of cross-disciplinary collaboration, made it possible.

This article explores how that vision was achieved.

One System to Rule Them All

In Spirit of the North 2, biomes flow seamlessly into one another without loading screens, and players can cross biome boundaries at any time. For this reason, we chose to build a single music system in Wwise, structured around carefully defined states.

One major advantage of building a state-based system was that it enabled collaboration with non-audio team members. Using simple Unreal Blueprints, designers could tag areas of the world with labels describing the intended atmosphere. These tags translated directly into Wwise states, allowing level designers to shape the emotional tone of their spaces and add musical context to environmental storytelling — all while building the world itself. The music system would then respond automatically, reinforcing their creative intent in real time.

This integration created a strong link between world-building and music, which was crucial for such a small development team.

The State Architecture

For the system to work, we needed a universal set of states in Wwise. Once these were defined and the Wwise structure was built, my creative role became one of writing and curating music within that framework. Here are the key state groups:

  • STA_GameState - At the highest level, determined whether the player was in gameplay or menus.
  • STA_GameplayType - Divided gameplay into biomes, scripted sequences, or boss encounters: the three core pillars of the system.
  • STA_MUS_WhatBiome - Within gameplay, identified which biome subsystem was active (Fox Island, Misthaven, Mosswood, Stormvale, Frigid Peaks, or Ashlands).
  • STA_MUS_IsWispChase - Switched the score into dynamic chase music during wisp encounters.
  • STA_MUS_IsRacoonGrotto or STA_MUS_IsLightWeaver - Brought in character playlists during those events.
  • STA_MUS_WhatSubBiome - Offered curated overrides for unique locations, such as Vista or Palace of Peace, and for some recurring motifs (e.g. ‘relic room’). These allowed the default tone of a biome to be shifted deliberately, perhaps introducing mystery, tension, or awe.
  • STA_MUS_IsCorruption - Determined whether the music should reflect the corrupted environment.
  • STA_MUS_IsDungeon - Either triggered unique music for dungeon areas or altered the existing biome score by changing stems or mixes.
  • STA_MUS_TOD - Adapted the score to the time of day, subtracting stems to create a more ambient night mix.
  • STA_MUS_WhatEnvironment (Tribe, Ancient, Divine) - Reorchestrated the music with instruments representing the ancient tribes of different areas of the world, or added stems related to the environment: tagelharpas in Stormvale, didgeridoo in Ashlands, or mythical drones near obelisks.
  • STA_MUS_WhatEmotion (Sympathy, Friendship) - Reinforced environmental storytelling, layering in cello or choir when the world asked for intimacy or tenderness.
  • STA_BossType, STA_BossVulnerable, and STA_BossPhase - Part of the boss subsystem, which allowed the music to escalate in intensity across phases, soften during vulnerability windows, and introduce thematic material, including the main theme, during climactic final stages.
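
As a rough illustration, the state groups above can be modelled as a plain lookup table. The sketch below is Python, not Wwise or Unreal code. The group names come from the list above, but the individual values (such as "Menu" or "None") and the set_state helper are assumptions made for the example.

```python
# Hypothetical model of a few state groups; this is not the Wwise API.
STATE_GROUPS = {
    "STA_GameState": {"Gameplay", "Menu"},
    "STA_GameplayType": {"Biome", "Scripted", "Boss"},
    "STA_MUS_WhatBiome": {
        "FoxIsland", "Misthaven", "Mosswood",
        "Stormvale", "FrigidPeaks", "Ashlands",
    },
    "STA_MUS_WhatEnvironment": {"None", "Tribe", "Ancient", "Divine"},
    "STA_MUS_WhatEmotion": {"None", "Sympathy", "Friendship"},
}


def set_state(current, group, value):
    """Return a new snapshot with one group changed, rejecting unknown names."""
    if group not in STATE_GROUPS:
        raise KeyError(f"unknown state group: {group}")
    if value not in STATE_GROUPS[group]:
        raise ValueError(f"{value!r} is not a state of {group}")
    updated = dict(current)
    updated[group] = value
    return updated
```

Validating names in one place is the point of a universal set of states: a mistyped tag fails loudly instead of silently leaving the music unchanged.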

Wwise Always Listens: Music System on Autopilot

Once these states were set in Unreal, Wwise could react automatically. Some states were systemic, triggered by the game code, while others were manual, placed by designers to curate the atmosphere.

The responsiveness happened across several layers (see the diagram).

  • Parent music switch containers defined whether the player was in menus, gameplay, scripted events, or boss fights. 
  • Nested music switch containers inside gameplay handled biomes, subbiomes, scripted cues, or boss-specific tracks, each with their own logic. 
  • Switch tracks, within individual segments, re-orchestrated the score dynamically — stems could be layered, removed, or swapped based on states like time of day, environmental tags, or emotional markers.
  • Mix rules within the States tab fine-tuned how these layers interacted. For example, when a “Sympathy” state was active, a cello stem might need to rise above tribal percussion. Here, a low-pass filter could be applied to the tribal flutes only in that state, ensuring the cello’s line cut through clearly.
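
The first two layers amount to a tiered lookup, which can be sketched as follows. This is a Python model for illustration only, not Wwise code. The function name, the "None" default and the returned path format are assumptions.

```python
def resolve_container(states):
    """Walk the tiers top-down and return the path of containers that would play."""
    if states.get("STA_GameState") != "Gameplay":
        return ["Menu"]
    gameplay_type = states.get("STA_GameplayType", "Biome")
    if gameplay_type != "Biome":
        # Scripted cues and boss fights have their own nested containers.
        return ["Gameplay", gameplay_type]
    path = ["Gameplay", "Biome", states["STA_MUS_WhatBiome"]]
    sub_biome = states.get("STA_MUS_WhatSubBiome", "None")
    if sub_biome != "None":
        path.append(sub_biome)  # curated override for a specific location
    return path
```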

This tiered responsiveness gave the soundtrack a fluid, evolving character.

Tagging the World: Help from Designers

One of the unique aspects of our workflow was involving the level design team directly in the music spotting process.

With a world as large as that of Spirit of the North 2, it would have been impossible for the audio team alone to play through 40+ hours of content and tag every cave, ruin, or vista for musical cues. The solution was to create simple Unreal Blueprints with a box trigger, such as BP_ChangeMusic_Sympathy, BP_ChangeMusic_Tribe, or BP_ChangeMusic_Vista.

Designers could drop these triggers straight into their levels while building, essentially “tagging” the world with musical meaning. They didn’t need to understand Wwise at all; they only had to think in terms of emotion: should this place feel awe-inspiring, sorrowful, intimate, or dangerous? This meant that music became a part of level design thinking, and emotional storytelling in the environment was instantly mirrored musically.
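
To make the idea concrete, here is a minimal Python model of such a trigger volume. It is not Blueprint or Wwise code. The class name, the enter/exit callbacks and the choice to restore the previous state on exit are all assumptions; the article does not say what happens when the player leaves a tagged area.

```python
class MusicTagVolume:
    """Hypothetical stand-in for a BP_ChangeMusic_* box trigger."""

    def __init__(self, group, value):
        self.group = group
        self.value = value
        self._previous = None

    def on_enter(self, states):
        # Remember what was active so it can be restored on exit (an assumption).
        self._previous = states.get(self.group, "None")
        states[self.group] = self.value

    def on_exit(self, states):
        states[self.group] = self._previous


states = {"STA_MUS_WhatEmotion": "None"}
sympathy = MusicTagVolume("STA_MUS_WhatEmotion", "Sympathy")

sympathy.on_enter(states)
inside = states["STA_MUS_WhatEmotion"]   # value while the player is in the box

sympathy.on_exit(states)
outside = states["STA_MUS_WhatEmotion"]  # value after leaving the box
```

The designer only chooses which volume to place. The state group and value are baked into the volume, which is why no Wwise knowledge is needed at placement time.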

The Music System in Practice

To give you a sense of how the music system worked in practice, let’s walk through the journey as the player might experience it — moving from biomes and sub-biomes, to tribal echoes, emotional states, scripted moments, and finally cutscenes — with Wwise continuously reshaping the score in real time.

Biomes

Imagine you begin in Misthaven. The system first determines that you are in gameplay (not in a cutscene or boss fight), then checks the biome you’re standing in. The STA_MUS_WhatBiome state selects the matching nested switch container, Misthaven, which in turn plays the Misthaven playlist. Segments are randomised within that playlist, and tracks are randomised within each segment, so different melodies surface on each pass.

Each biome acts as an emotional chapter reflected in the music: Fox Island is pastoral, Misthaven carries grief, Mosswood offers hope, Ashlands embodies desolation, Stormvale speaks of struggle, and Frigid Peaks isolates the player in icy endurance and delivers a ‘being on the road’ vibe.

Then, if you come across a wisp, a wisp chase override is triggered (STA_MUS_IsWispChase: True). Whimsical action music transitions in at the next bar, and the score returns to the biome music once you catch the wisp.
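
A next-bar transition means the wait depends on where the playhead sits within the current bar. The arithmetic can be sketched in Python as below. The 4/4 time signature and the 64 bpm example tempo are assumptions, and in practice Wwise performs this scheduling itself.

```python
import math


def seconds_until_next_bar(position_s, bpm, beats_per_bar=4):
    """Time from the current playhead position to the start of the next bar."""
    bar_length = beats_per_bar * 60.0 / bpm
    bars_elapsed = position_s / bar_length
    # A playhead sitting exactly on a bar line waits one full bar.
    next_bar = math.floor(bars_elapsed) + 1
    return next_bar * bar_length - position_s


# At 64 bpm in 4/4 a bar lasts 3.75 s, so 5.0 s into the cue
# the next bar line falls at 7.5 s.
wait = seconds_until_next_bar(5.0, 64)
```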

 

Sub-Biomes and Overrides

Let’s say the ‘Raven Temple’ sub-biome is picked, adding a grand tone to the player’s experience. When the player moves to the edge of the cliff, the ‘Vista’ sub-biome is picked, and the new music playlist helps to convey the grand view.

Once a nested music switch container is picked for a specific biome, further adjustments can be made within that nested structure. While default biome music conveys the general mood of a region, sub-biome states trigger bespoke “area music” crafted specifically for key locations. This allows us to shift the tone for dramatic effect: adding mystery in ruins, creating tension in dimly lit chambers, or emphasizing awe with swelling strings when the player reaches a vista.

When the player moves into catacombs, this nested switch container picks the dungeon playlist curated for that area, and it picks the corruption playlist when plague spores become visible. 

Because the game reports whether the player is in a dungeon (IsDungeon) or within a corrupted area (IsCorruption), the system can switch to the corresponding playlist, overriding the one previously picked. The priority hierarchy is enforced by setting a [wildcard] for STA_MUS_WhatSubBiome when IsCorruption is true.
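
The effect of the wildcard can be illustrated with a small rule table. The Python sketch below uses a simplified first-match-wins model, which is an assumption and not a description of Wwise's own path-matching algorithm. The playlist names and the exact ordering of dungeon versus sub-biome rules are also assumptions.

```python
WILDCARD = "*"

# (IsCorruption, IsDungeon, WhatSubBiome) -> playlist, checked top to bottom.
RULES = [
    (("True", WILDCARD, WILDCARD), "CorruptionPlaylist"),
    (("False", "True", WILDCARD), "DungeonPlaylist"),
    (("False", "False", "Vista"), "VistaPlaylist"),
    ((WILDCARD, WILDCARD, WILDCARD), "BiomeDefaultPlaylist"),
]


def pick_playlist(is_corruption, is_dungeon, sub_biome):
    """Return the playlist of the first rule whose pattern matches the states."""
    actual = (is_corruption, is_dungeon, sub_biome)
    for pattern, playlist in RULES:
        if all(p == WILDCARD or p == a for p, a in zip(pattern, actual)):
            return playlist
    raise LookupError("no rule matched")
```

With the sub-biome slot set to a wildcard in the corruption rule, corruption wins whichever sub-biome the player happens to be standing in.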

Reorchestration within segments: Tribal Instrumentation

Now you come across an ancient statue crafted by the raven tribe, and you start hearing a distinct vocal melody: a recording of ‘kulning’ (a Nordic herding call) that follows the harmony of the existing music, while some of the instrumentation gives way to create space for it.

Each level contains traces of lost tribes, whose cultural imprint shapes the sound. Entering ruins or sacred places triggers state changes that cause reorchestration of the biome music with tribal instruments: tagelharpas and nyckelharpas in Stormvale, didgeridoo in Ashlands, kulning vocals in Misthaven, or ancient flutes and wooden percussion in Mosswood, to name a few.

They appear only when relevant, giving players a sense of discovery and reinforcing the presence and history of those tribes. Technically, this was driven by manually placed tags (via BP_ChangeMusic_Tribe), with the STA_MUS_WhatEnvironment state group triggering a change in switch tracks and allowing curated moments of cultural identity to emerge. The same could apply to ‘Ancient’ or ‘Divine’ environmental storytelling, for example if a player came across an obelisk or a portal. Some event music (e.g. wisp chase) and scripted music could also receive these additional stems via switch tracks based on the environment.

Reorchestration within segments: Emotional States

Now imagine you stumble upon the ruins of two lovers who died embracing. A manually placed “Sympathy” state (with BP_ChangeMusic_Sympathy) softens the score, weaving in a solo cello, muted choir, or fragile piano. This is a re-orchestration of the current biome piece, fitting harmonically so it feels natural: new stems are added and some existing ones are removed.
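
Reorchestration of this kind, whether tribal or emotional, boils down to adding and removing stems from a base set. A minimal Python sketch follows. The stem names and the specific add/remove choices are invented for illustration, and in the project this behaviour lives in Wwise switch tracks, not in code.

```python
BASE_STEMS = {"drone", "strings", "piano", "percussion"}

# Hypothetical per-state stem changes: (stems to add, stems to remove).
REORCHESTRATION = {
    ("STA_MUS_WhatEnvironment", "Tribe"): ({"kulning"}, {"piano"}),
    ("STA_MUS_WhatEmotion", "Sympathy"): ({"solo_cello"}, {"percussion"}),
}


def active_stems(states):
    """Start from the biome's base stems and apply every matching state change."""
    stems = set(BASE_STEMS)
    for (group, value), (added, removed) in REORCHESTRATION.items():
        if states.get(group) == value:
            stems |= added
            stems -= removed
    return stems
```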

State-Driven Mix Changes

Because states from different groups can be active at the same time, their combinations create nuanced mixes. To keep textures balanced, I used the States tab for filtering and volume adjustments, for example low-pass filtering competing parts so that melody lines cut through. Wwise’s state-based mix rules allowed me to subtly rebalance stems depending on context. For example, when the Sympathy state was active, a solo cello stem could be pushed forward through a volume boost, with a gentle low-pass filter applied to competing instruments. In a different context, if the Tribe state was active in Mosswood, ancient flutes might step forward, but if Sympathy triggered at the same time, those flutes could be filtered to leave space for the cello to carry the emotional line. This gave the score a kind of “emotional mixing desk” that responded in real time to the world and the player’s journey.
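
The flute-and-cello example can be modelled as per-state offsets that accumulate on each stem. The Python sketch below is a simplification: the numeric values are invented, and summing the offsets with the filter amount capped at 100 is an assumption about how the contributions combine.

```python
# Hypothetical per-stem offsets: state -> {stem: (volume_db, low_pass_amount)}.
MIX_RULES = {
    "Sympathy": {"solo_cello": (3.0, 0), "ancient_flutes": (0.0, 30)},
    "Tribe": {"ancient_flutes": (2.0, 0)},
}


def stem_mix(stem, active_states):
    """Accumulate volume and low-pass offsets from every active state."""
    volume_db, lpf = 0.0, 0
    for state in active_states:
        d_vol, d_lpf = MIX_RULES.get(state, {}).get(stem, (0.0, 0))
        volume_db += d_vol
        lpf += d_lpf
    return volume_db, min(lpf, 100)
```

With only Tribe active the flutes are boosted and unfiltered. Once Sympathy joins, they keep the boost but pick up the filter, leaving room for the cello.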

Scripted Music

Alongside exploration music, certain sequences required scripted music that could not be interrupted. These were moments where the narrative demanded absolute focus: puzzle completion fanfares, key story cutscenes, or celebratory cues marking transformation. In these cases, music was triggered through the Scripted gameplay type and set to play from start to finish, overriding any active Biome parent switch container. Technically, this was managed with Music Event Cues in Wwise and a transition rule, ensuring the system would only transition back to the previous state after the scripted cue had fully resolved. To reinforce cohesion, many of these short pieces were built around the game’s main theme or its harmonic progression, subtly reminding the player of the overarching musical identity. This approach preserved dramatic impact—whether it was the quiet reverence of a fox’s transformation or the celebration of solving a puzzle—while still tying the moment back into the score’s central motif.
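
The "play to the end, then return" behaviour can be pictured as a gate that holds back state changes while a scripted cue is running. This Python sketch is only a model of that idea. The class and method names are hypothetical, and the real mechanism is the Wwise cue and transition rule described above.

```python
class ScriptedCueGate:
    """Holds back music changes until the current scripted cue has finished."""

    def __init__(self, current="Biome"):
        self.current = current
        self._return_to = None  # set only while a scripted cue is playing

    def start_cue(self):
        self._return_to = self.current
        self.current = "Scripted"

    def request(self, target):
        if self._return_to is not None:
            self._return_to = target  # cue is playing: defer the change
        else:
            self.current = target

    def cue_finished(self):
        if self._return_to is not None:
            self.current, self._return_to = self._return_to, None
```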

Cutscenes

Technically, cutscenes were handled differently from open-world gameplay. Instead of relying on the state-driven parent system, I opted to score them as dedicated Wwise Events triggered directly from the Unreal Sequencer. This gave me frame-accurate control. 

Transitions and Cohesion

With over eight hours of in-game music, seamless transitions were vital. Creating endless custom transition rules would have been unsustainable, so I solved this at the composition level. Most cues were written in related keys — A major, C# minor, F# minor — which allowed Wwise to crossfade or use pedal-tone drones without harmonic clashes. Tempos were similarly grouped (60–68 bpm or 120–136 bpm), making resequencing easier.
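
Why these three keys blend so easily can be checked by comparing their scales. The short Python sketch below assumes natural minor scales and sharp-only note spelling.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]
NATURAL_MINOR_STEPS = [0, 2, 3, 5, 7, 8, 10]


def scale(root, steps):
    """Return the set of note names in a scale built on the given root."""
    start = NOTE_NAMES.index(root)
    return {NOTE_NAMES[(start + s) % 12] for s in steps}


a_major = scale("A", MAJOR_STEPS)
fs_minor = scale("F#", NATURAL_MINOR_STEPS)
cs_minor = scale("C#", NATURAL_MINOR_STEPS)

shared_with_cs = a_major & cs_minor   # notes common to A major and C# minor
only_in_cs = cs_minor - a_major       # notes that could clash
```

F# minor is the relative minor of A major, so the two share all seven notes. C# minor differs by a single note (D# in place of D), which leaves six notes in common. The tempo bands are related in a similar way: 120 to 136 bpm is exactly double 60 to 68 bpm.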

Instrumentation also supported blending. Nearly every track began with an ambient drone layer designed as a “gel” for transitions. These drones could carry across cues, smoothing shifts from one texture to another. The instrumentation itself drew from Icelandic minimalism: ebows on guitars, bowed resonators, prepared pianos, and fragile string textures evoking history and intimacy, combined with shimmering glockenspiel and bowed glockenspiel to suggest frost and northern lights.

Randomisation

Fighting ear fatigue was critical. To keep the experience fresh, Wwise introduced variation at multiple layers:

  • Randomized segments within playlists ensured cues never looped identically.
  • Random switch tracks introduced subtle variations in melodies or stingers (e.g. puzzle completion cues sounded slightly different each time, and the main melody of the Mosswood biome music was played by a different instrument on each pass).
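
One common way to keep a randomised playlist from repeating itself back-to-back is to exclude the segment that has just played. The Python sketch below illustrates that idea only. The article does not state which randomisation settings were used, and the segment names in the usage lines are hypothetical.

```python
import random


def next_segment(segments, last_played, rng=random):
    """Pick a random segment, avoiding an immediate repeat when possible."""
    candidates = [s for s in segments if s != last_played] or list(segments)
    return rng.choice(candidates)


# Usage: draw a short run of segments from a hypothetical Misthaven playlist.
playlist = ["Misthaven_A", "Misthaven_B", "Misthaven_C"]
last = None
order = []
for _ in range(6):
    last = next_segment(playlist, last)
    order.append(last)
```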

Boss Battles: Music as Feedback

Guardians weren’t traditional enemies — they were corrupted beings caught in cycles of suffering. The music had to reflect not just intensity, but also empathy.

Bosses were the only combat-like encounters, and their music systems reflected their unique narrative weight. Each boss fight included multiple phases, moments of vulnerability, and a final transformation. States like STA_BossPhase and STA_BossVulnerable drove these changes.

As intensity increased from phase to phase, the score modulated upwards and expanded in orchestration. Vulnerability phases softened the orchestration (via switch tracks), with subtle stingers or cymbal swells signaling opportunities to strike.

A memorable example was the Ram Guardian fight. Here, tremolo swells in the strings were synced via custom cues to the timing of lightning strikes. Players learned to anticipate danger by listening to the music.

The music shifted from bitterness to sorrow as the fight progressed (reported via STA_BossPhase), embodying the idea of letting go. As anger drained away, the score softened, guiding the player to see the encounter not just as a fight, but as an act of healing. The main theme would play in the last phase.

The final boss brought these systems together, revisiting themes from earlier Guardians and blending them dynamically depending on which story elements were on-screen: whichever Guardian appeared, the music could play its theme.
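
The way phase and vulnerability states shape a boss cue can be sketched as a simple layer selection. This Python model is hypothetical throughout: the layer names, the three-phase default and the rule that vulnerability strips the intensity layers are assumptions used to illustrate the state logic, not the game's actual arrangement.

```python
def boss_layers(phase, vulnerable, final_phase=3):
    """Return the layers that would sound for a given phase and vulnerability."""
    layers = ["boss_base"]
    if not vulnerable:
        # One additional intensity layer for each phase reached so far.
        layers += [f"intensity_{n}" for n in range(1, phase + 1)]
    if phase == final_phase:
        layers.append("main_theme")  # thematic material for the last phase
    return layers
```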

Debugging

To aid in system testing, we built a mid-level data reporting system. Our states were represented as database entries and reported to our music debug tool, which told us whether each state was being reported correctly in Unreal Engine. If it was, the fault had to lie on the Wwise side. Conversely, if the debug tool showed that a specific state wasn’t being reported in UE5, we could go straight back to debugging the Unreal Blueprints. Although similar information is available via the Wwise Profiler, this custom debug tool allowed me to iterate more quickly. It also allowed me to change states in database entries without touching Blueprints at all.
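
The triage logic described here can be expressed as a comparison between the states a location should produce and the states the game actually reported. The Python sketch below is a hypothetical reconstruction of that check, not the team's tool. The function name and data shapes are assumptions.

```python
def diagnose(expected, reported_by_game):
    """Return the state groups whose reported value differs from the expected one."""
    return {
        group: (want, reported_by_game.get(group))
        for group, want in expected.items()
        if reported_by_game.get(group) != want
    }


expected = {"STA_MUS_WhatBiome": "Misthaven", "STA_MUS_WhatEmotion": "Sympathy"}
reported = {"STA_MUS_WhatBiome": "Misthaven"}

mismatches = diagnose(expected, reported)
# An empty result means the game side is reporting correctly, so the Wwise
# project is the place to look. Any entry points back at the Blueprints.
```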

Reflection

From the very beginning, the Spirit of the North 2 team set out to create something truly special — an experience that could move players emotionally, help them heal, and encourage reflection. The score’s Icelandic-minimalism-inspired orchestration, combined with a deeply responsive technical system — from designer-placed tags to state-driven interactivity — made that vision possible.

Pav Gekko

Composer | Audio Director | Music Implementer | Spirit of the North 2

Freelance

Pav Gekko is a British-Polish composer and audio director, and a 2025 Music+Sound Awards finalist recognized for his emotionally immersive work in games and media. A BAFTA member and Wwise-certified professional, he is known for bridging orchestral tradition with interactivity. His recent orchestral recording with a Midlands-based symphony, championing live musicianship in game music, was featured by IGN, PlayStation, and Xbox channels. His credits include Smalland: Survive the Wilds, praised for its neo-classical score, and Spirit of the North 2, acclaimed for its emotional, interactive soundtrack. Beyond games, his music has appeared in TV series such as Mythic Quest (Apple TV), Kung-Fu (HBO), and Rings of Power: Behind The Scenes (Amazon). He is also a winner of the Midlands Movies Best Score Award and a finalist for Best Music Artist at the 2025 We Are Creative Awards.

 @pavgekko
