Wwise SDK 2022.1.1
Tips to Reduce Memory Usage

It can be challenging to trim down memory usage to fit your required limit. Here are a few tips to help reduce memory usage:

Object Memory

Object memory usage is directly impacted by the number of sounds and Events loaded in memory and by the number of game objects. It holds all the properties of the objects in your project that are necessary to implement the behaviors of the sound design. It also holds all the game objects and their related information (Game Sync values, position, orientation, and so on). The more SoundBanks are loaded, the more memory needs to be allocated. The size needed depends only on the number of sounds that can possibly be played in one scenario, level, map, game area, or the like. The following practices can help reduce these allocations:

  • Split big SoundBanks with lots of sound structures and Events into smaller SoundBanks. For guidance, you can check the Advanced Profiler's SoundBanks tab to see in the Object Memory column how much memory each SoundBank is taking. Then you can load and unload SoundBanks dynamically as needed. Think strategically in dividing them. For example, in dividing dialog SoundBanks, avoid creating ones with all lines for a single character. Instead, group the dialog contextually.
  • Use the AK::SoundEngine::ExecuteActionOnEvent API to reduce the number of Events. A Play/Stop pair of Events can be replaced by a single Play Event plus a call to ExecuteActionOnEvent, passing that same Play Event, to stop, pause, or resume it.
  • Tightly manage your game objects. Unregister them as soon as their role is finished. Avoid keeping a pool of unused game objects alive; there is absolutely no gain in doing this, but it can cost in memory. For example, imagine an NPC dies. Unregister its game object; do not reuse it for something else. Register a brand new one when needed. As a general rule, if you have thousands of game objects alive, it is too many.
  • Do not use Actor-Mixers only to organize sounds. Folders and Work Units do not take memory; Actor-Mixers do. The exception is when the child objects share similar properties that are not the default: setting such a property on the Actor-Mixer saves memory because it is then stored only once. This also depends on whether the Actor-Mixer is referenced by Events (such as SetVolume or SetPitch).
  • Try to reduce the size and complexity of large hierarchies. A common example of a large hierarchy would be an "Impact" hierarchy or a "Footstep" hierarchy. With many variables, such a hierarchy can grow large and take a lot of memory for its structure alone. Here are a few ways to reduce such a hierarchy:
    • Use RTPCs if the only thing changing in the Switch is a simple property (same samples but different volume/pitch/randomizer etc.).
    • Split your Switch Container hierarchy into multiple banks. In the SoundBank Manager, when you include a Switch container, all its sub-branches are also included. However, you can exclude some of the branches manually in the SoundBank Editor view, in the Game Sync tab or in the Edit tab. For example, in a "Footstep" hierarchy, the first Switch variable could be the Surface Type. You could then split the Switch across different SoundBanks and load them depending on the context. You could have a main "Footstep" SoundBank containing the surfaces that are encountered throughout the game, such as concrete and metal stairs in a city environment, and other contextual SoundBanks with specific surfaces, such as mud being used only in one scene/section of the game.
  • Use "external sources" to reduce the overhead of sounds that do not need as much control as offered by the Wwise Actor-Mixer hierarchy. This is usually appropriate for voice-overs.
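
The advice to unregister game objects as soon as their role is finished can be sketched with a scope-bound handle. This is only an illustration: FakeSoundEngine and ScopedGameObject are hypothetical names that are not part of the Wwise SDK, and the fake engine exists purely so that the sketch compiles on its own. In a real integration, the two engine methods would presumably forward to AK::SoundEngine::RegisterGameObj and AK::SoundEngine::UnregisterGameObj.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Hypothetical stand-in for the sound engine so the sketch builds without the
// Wwise SDK. A real integration would call AK::SoundEngine::RegisterGameObj
// and AK::SoundEngine::UnregisterGameObj instead.
class FakeSoundEngine {
public:
    void registerGameObj(std::uint64_t id) { live_.insert(id); }
    void unregisterGameObj(std::uint64_t id) { live_.erase(id); }
    std::size_t liveCount() const { return live_.size(); }

private:
    std::unordered_set<std::uint64_t> live_;
};

// Hypothetical RAII handle: registers the game object on construction and
// unregisters it on destruction, so no unused object is kept alive.
class ScopedGameObject {
public:
    ScopedGameObject(FakeSoundEngine& engine, std::uint64_t id)
        : engine_(engine), id_(id) {
        engine_.registerGameObj(id_);
    }
    ~ScopedGameObject() { engine_.unregisterGameObj(id_); }

    // Non-copyable: exactly one owner per registered game object.
    ScopedGameObject(const ScopedGameObject&) = delete;
    ScopedGameObject& operator=(const ScopedGameObject&) = delete;

    std::uint64_t id() const { return id_; }

private:
    FakeSoundEngine& engine_;
    std::uint64_t id_;
};

// Spawns `count` short-lived NPCs one after another and returns how many
// game objects are still registered once they are all gone.
std::size_t spawnAndKillNpcs(FakeSoundEngine& engine, std::uint64_t firstId, int count) {
    for (int i = 0; i < count; ++i) {
        ScopedGameObject npc(engine, firstId + static_cast<std::uint64_t>(i));
        assert(engine.liveCount() >= 1);  // registered only while the NPC exists
    }  // the NPC goes out of scope here and its game object is unregistered
    return engine.liveCount();
}
```

Tying registration to the lifetime of the gameplay entity means a fresh game object is registered for each new NPC and none linger in a pool after the NPC dies.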

Processing Memory

The memory in the Processing category is used to play sounds. It contains buffers to decompress, apply effects, and mix the audio sources. It is directly influenced by the number of sounds playing at the same time. It is also influenced by the number and type of effects that are used at the same time. To trim this down, you need to ask yourself how many sounds you want to hear at the same time. Some games will rarely have a scenario where more than 10 sounds are heard, others will have hundreds. You need to consider your worst case.

As a guideline, profiling of several games on Xbox One produced the following numbers:

  • 1 MB will let you play approximately 42 voices
  • 2 MB will let you play approximately 96 voices

Although it scales mostly linearly, it really depends on which codec is used, how many effects, and other such factors. For example, using the Vorbis codec will use around 50% more memory per voice depending on the quality settings. Imagine 170 sounds at the same time: it is probably unintelligible and, therefore, useless. However, it takes some experimentation to find an ideal real number of voices for your game. Use the Memory tab of the profiler, profile multiple scenarios in your game, and note how much is used in resources.
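
As a rough worked example, the two profiled figures above can be turned into a straight-line estimate of Processing memory for a target voice count. The function name estimateProcessingMB is hypothetical, and the linear model is an assumption based only on the "scales mostly linearly" remark; codec choice and effects will shift real numbers, so treat the result as a starting point for profiling, not a budget.

```cpp
#include <cassert>

// Straight line through the two profiled points quoted above:
// (1 MB, 42 voices) and (2 MB, 96 voices).
// Assumption: the trend stays linear outside that range.
double estimateProcessingMB(int voices) {
    assert(voices >= 0);
    const double mb1 = 1.0, voices1 = 42.0;
    const double mb2 = 2.0, voices2 = 96.0;
    const double mbPerVoice = (mb2 - mb1) / (voices2 - voices1);  // 1/54 MB per extra voice
    return mb1 + (voices - voices1) * mbPerVoice;
}
```

Under this model, 170 simultaneous voices would need roughly 3.4 MB of Processing memory before any codec or effect adjustment.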

To reduce the memory used for processing, you need to reduce the number of simultaneous voices. This can be done using:

  • Playback Limits (Advanced Settings). For example, do you really need to hear 50 bullet ricochets? If not, consider limiting those sounds to, say, 15.
    Note: You can set limits on busses as well.
  • Priority (Advanced Settings). For example, bullets could be less important than dialog. This means that bullets would get kicked first if there are too many sounds. Use in conjunction with Playback Limits.
  • Distance-based priority offset (Advanced Settings). Objects that are far away are usually less important than closer ones. For example, again with bullets, we don't need to hear bullets that are 10 meters away if there are 15 other bullet sounds closer than that.
  • Below Threshold Behavior (Advanced Settings). The least expensive option (CPU and memory) is "Kill Voice", which is useful for non-looping sounds. The second preferred option is "Send To Virtual" "Play from beginning", then "Send To Virtual" "Resume" and then "Send To Virtual" "Play from elapsed time". "Continue to play" and "Play from elapsed time" are the most costly options, and Wwise's default value is "Continue To Play".
  • Volume Threshold (Project Settings). This will help kill the sounds too faint to be heard. This goes hand in hand with the Below Threshold Behavior and also the Attenuation settings (farther usually means fainter).
    Note: You can change the volume threshold programmatically at run-time. You may use this in game locations that are more process-heavy in order to send more voices to their "under volume threshold" state.
  • A change to the used codec settings (Conversion Settings). Vorbis needs extra memory to decompress audio. Different parameters can increase or reduce the amount needed. However, think carefully about the tradeoff: using a different codec or a weaker compression ratio will relieve the Processing memory load but at the cost of larger files in memory, if they're in loaded banks. In some cases it's better to see an extra 500 KB in Processing memory usage to save a few MBs on Media memory and, therefore, on the overall audio budget.
  • Fewer or lower-quality effects. Some effects need a lot of memory to be processed. One very common memory consumer is the Reverb effect, in any of its available flavors. Realistically, your game should have very few reverbs running concurrently. As a rule, we suggest fewer than four. Also, reducing the quality or length of the reverb will help.
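
The interaction between Playback Limits, Priority, and the distance-based priority offset can be illustrated with a small conceptual simulation. This is not the sound engine's actual voice-management algorithm: the Voice struct, both function names, and the linear offset model (no offset at the listener, full offset at the maximum distance) are assumptions made for the sake of the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one playing voice.
struct Voice {
    std::string name;
    double priority;  // base priority, 1 (lowest) to 100 (highest)
    double distance;  // distance from the listener, in game units
};

// Assumed model: the offset grows linearly with distance and reaches its
// full value at maxDistance. The result is kept within the 1..100 range.
double effectivePriority(const Voice& v, double offsetAtMaxDistance, double maxDistance) {
    assert(maxDistance > 0.0);
    const double t = std::min(1.0, std::max(0.0, v.distance / maxDistance));
    const double p = v.priority + offsetAtMaxDistance * t;
    return std::min(100.0, std::max(1.0, p));
}

// Returns the voices that survive a playback limit: the `limit` voices with
// the highest effective priority are kept, the rest are dropped.
std::vector<Voice> applyPlaybackLimit(std::vector<Voice> voices, std::size_t limit,
                                      double offsetAtMaxDistance, double maxDistance) {
    std::stable_sort(voices.begin(), voices.end(),
                     [&](const Voice& a, const Voice& b) {
                         return effectivePriority(a, offsetAtMaxDistance, maxDistance) >
                                effectivePriority(b, offsetAtMaxDistance, maxDistance);
                     });
    if (voices.size() > limit) {
        voices.erase(voices.begin() + static_cast<std::ptrdiff_t>(limit), voices.end());
    }
    return voices;
}
```

With a limit of two voices and an offset of -20 at a maximum distance of 10 units, a distant dialog line at priority 80 still outranks bullets at priority 50, and a bullet 2 units away is kept in preference to one 10 units away.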

Media Memory (SoundBanks)

The amount of memory taken by SoundBanks is mostly dictated by the sound data they contain. You can control the amount of memory used by your media by:

  • Splitting big SoundBanks with lots of sound structures and Events into smaller SoundBanks. Load and unload dynamically as needed.
  • Streaming more sounds from disk (Sounds' properties). Sounds that are latency-sensitive can use prefetch media, which can be pre-loaded, or be streamed into cache on-demand using the PinEventInStreamCache API.
  • Using the PrepareEvent() API.
  • Compressing the audio more (Conversion Settings, codec, and so on).
  • Using a lower sampling rate. Also look at the Automatic Sample Rate Detection feature of the Conversion Settings.
  • Replacing wind-type sounds with a SoundSeed Wind/Woosh plug-in equivalent. Wind ambiances tend to be long loops, which can take a lot of media space. Blade wooshes, propellers, wind rushing in a car with open windows, ventilation noises, and so on can be modeled with this plug-in. Also consider non-windy applications: any noisy sound could be modeled, for example ocean waves or the sound of a highway in the distance.
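
To get a feel for how sample rate and compression affect media size, the following sketch computes the size of uncompressed 16-bit PCM and applies a compression ratio. Both function names are hypothetical, and the compression ratio is a parameter you must measure for your own codec and quality settings; the value used in the note below is a placeholder, not a figure from Wwise.

```cpp
#include <cassert>
#include <cstdint>

// Size in bytes of uncompressed 16-bit PCM audio:
// sample rate x duration x channels x 2 bytes per sample.
std::uint64_t pcm16Bytes(std::uint32_t sampleRateHz, std::uint32_t channels, double seconds) {
    assert(seconds >= 0.0);
    const std::uint64_t bytesPerSample = 2;
    const std::uint64_t frames = static_cast<std::uint64_t>(sampleRateHz * seconds);
    return frames * channels * bytesPerSample;
}

// Approximate size after encoding, given an assumed compression ratio
// (for example, 8.0 means the encoded file is one eighth of the PCM size).
std::uint64_t compressedBytes(std::uint64_t pcmBytes, double compressionRatio) {
    assert(compressionRatio >= 1.0);
    return static_cast<std::uint64_t>(pcmBytes / compressionRatio);
}
```

A 60-second stereo loop at 48 kHz takes 11,520,000 bytes as 16-bit PCM. Halving the sample rate to 24 kHz halves that to 5,760,000 bytes, and a placeholder ratio of 8:1 applied to the 48 kHz version would bring it to 1,440,000 bytes.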