
Wwise HDR: Overview and Best Practices for Game Mixing

Wwise Tips & Tools

Introduction

HDR (High Dynamic Range) is a Wwise feature and a very powerful tool for mixing and managing the output dynamic range of your project. I have been fortunate enough to use Wwise for most of my game audio career, and have recently been delving deeper and deeper into the world of HDR. While this feature has been around since at least Wwise 2013.1, I believe there are still new ways to use it and to customize how it’s applied to projects. Hopefully, after reading this article, you will understand how HDR works a little better and walk away with a few ideas for managing and implementing it in a way that benefits the games you work on.

Why Wwise HDR?

You might be wondering: why use Wwise HDR for your project when you already have other means of mixing at your disposal? Once the HDR system is set up and its parameters are defined, the game can essentially mix itself! It can also sound far more transparent than traditional sidechaining and compression systems, which rely on signal detection, attack, and release to achieve dynamic range compression. HDR also lets you easily establish and manage relative mix relationships between sounds instead of doing absolute, and sometimes unintentionally additive, mixing.

HDR Threshold and Ratio

In order to determine what participates in the HDR system and how it gets affected, there are two extremely important parameters in Wwise - HDR Threshold and HDR Ratio. These are set in the Master-Mixer hierarchy wherever the HDR system is enabled.

img1

I would recommend enabling HDR and setting these parameters in as few locations as possible. Every project I have worked on has had only one instance of HDR active in the Master-Mixer hierarchy. This makes the HDR behavior far easier to manage and understand - as long as a playing sound is routed to this bus or a child bus under this HDR parent, it adheres to this single set of HDR rules with no complicated exceptions. Of course, every project has its own needs, but I’ve found it best to make the behavior as predictable and simple as possible, which is extremely important as game scope (and audio complexity) continues to grow.

HDR Threshold

This sets the threshold at which the active sound will start to compress the dynamic range of your game’s mix, or rather duck other sounds below it. Wwise uses the sound’s final Voice Volume to determine whether it’s over the threshold or not. This includes downstream adjustments to Voice Volume and Bus Volume, but does not include Make-Up Gain. I’ll talk more about the best ways to manage Voice Volume and Make-Up Gain later in the article.

HDR Ratio

This determines how much the final bus output of the sound is reduced once it goes above the HDR Threshold. In this way, HDR functions like a traditional compressor! For example, if you have a sound with a final Voice Volume value of 20 and your HDR Threshold is 0, the sound is +20dB over the HDR Threshold. If the HDR Ratio is set to 2:1, the sound will only actually be 10dB louder. So in this case, what happens to the other 10dB? This is where HDR Envelopes come into play, as well as the relative relationship between other sounds in your game. I will go into this more later.
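As a sanity check, the ratio math described above can be sketched in a few lines. This is an illustrative model of the behavior described in this article, not Wwise's internal implementation:

```python
def hdr_output_gain(voice_volume, threshold=0.0, ratio=2.0):
    """Post-ratio output gain (dB) for a sound in the HDR system.

    Gain above the threshold is divided by the ratio, like a
    traditional compressor; gain at or below it passes unchanged.
    """
    if voice_volume <= threshold:
        return voice_volume
    return threshold + (voice_volume - threshold) / ratio

# A sound at 20 Voice Volume over a 0 threshold at 2:1
# only comes out +10dB louder; the "missing" 10dB becomes ducking.
print(hdr_output_gain(20))        # 10.0
print(20 - hdr_output_gain(20))   # 10.0 dB available to duck other sounds
```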

Voice Volume as Absolute Loudness

Wwise uses a sound’s Voice Volume to determine how it behaves within the HDR system. This is inherently tied to the gain staging of the sound, so it’s very easy to dive into the HDR system with the mindset that “loudness = importance”. However, depending on the normalization of your assets (or lack thereof), two different sounds both at 20 Voice Volume may not have the same actual loudness. If you want HDR to represent a sound’s actual loudness, it’s best to use Wwise’s Loudness Normalization feature to ensure that Voice Volume is an absolute representation of loudness rather than an arbitrary value. Wwise’s default is -23 LUFS, and you can enable this normalization across the project so that all of your content plays at -23 LUFS when Voice Volume is set to 0.

img2

Using Momentary Max normalization might be a good idea for many sounds, especially those that are short or more transient in nature (like explosions, weapon fire, magic spells, etc.). Using Integrated normalization might be better for longer or more static content (like projectile flight loops, ambient beds and emitters, etc.). From this point forward, choosing your HDR Threshold becomes an absolute decibel decision instead of an arbitrary number. For example, if your HDR Threshold is set to +5 Voice Volume, any sound above -18 LUFS will have its gain reduced by the HDR Ratio and will also duck other sounds that are quieter than it. This closely approximates the way our ears hear things in real life and will help create an intuitive mixing environment.

Rethinking Voice Volume

While the previous method of utilizing Wwise’s HDR system might be the fastest approach, it might not be what your game needs. What if there was a way to rethink the way Voice Volume is used in your project to push the boundaries of how HDR can be used to achieve your game’s mix?

When using Wwise HDR, you might want to consider using Voice Volume to represent sound importance, which doesn’t always equal loudness. For example, a sound might need to be ducked out of the way of a more important sound, but no ducking will occur if their absolute loudness levels are equal. This happens quite often in multiplayer games: teammates can fire weapons and use abilities that should have high loudness to match the impact of the action and visuals, but an enemy using the same ability or firing the same weapon needs to take priority in the mix. If you were using the Voice Volume as Absolute Loudness method of HDR, you would either have to turn down the teammate abilities and weapons quite a bit or turn up the enemy versions even more in order to achieve the relative mixing relationship. This creates large volume disparities between sounds and becomes increasingly difficult the larger the hierarchy of sounds grows. Also, depending on the player’s listening environment, they might not be able to take full advantage of game mixes with higher dynamic range.

Using Make-Up Gain

In order to work in this way, whatever Voice Volume you assign to a sound now becomes an arbitrary “importance” value instead of a means to dictate the loudness or gain of a sound. If you choose to go with this method, from here on out it is best to use Make-Up Gain as your primary method of getting a sound’s output loudness to where you want it. Make-Up Gain (like Loudness Normalization) has no bearing on the Voice Volume of a sound, and therefore has no effect on a sound’s behavior in the HDR system.

When you use Voice Volume in this way, you will often be setting Voice Volume values above the HDR Threshold. To keep the final output loudness the same as it was before, it is recommended to apply negative Make-Up Gain equal to the sound’s post-HDR-Ratio gain increase. For example, if your HDR Threshold is 0, your HDR Ratio is 2:1, and you have a sound set to 20 Voice Volume, the final output gain increase of the sound will be +10dB. To ensure that the gain is the same as before setting the Voice Volume value, you would then apply -10dB of Make-Up Gain.
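Continuing the same illustrative model from earlier (again, a sketch of the calculation described in this article, not Wwise's internals), the compensating Make-Up Gain is simply the negative of the post-ratio output gain:

```python
def compensating_makeup_gain(voice_volume, threshold=0.0, ratio=2.0):
    """Make-Up Gain (dB) that cancels the audible gain added by
    setting a Voice Volume above the HDR Threshold."""
    if voice_volume <= threshold:
        return 0.0
    post_ratio = threshold + (voice_volume - threshold) / ratio
    return -post_ratio

# 20 Voice Volume, threshold 0, ratio 2:1 -> +10dB audible increase,
# so apply -10dB of Make-Up Gain to keep the output loudness unchanged.
print(compensating_makeup_gain(20))  # -10.0
```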

img3

 

img4

One important tip in this regard is remembering that relative value adjustments can be made in Wwise by putting “+” or “-” after the number. So for example, if the sound already has a Make-Up Gain value of -8, instead of having to do the math in your head for the new number, you can just type in “10-” and the new value will be 10dB lower than the previous value.

gif1

Relative Mixing

One of the most important things to keep in mind while determining Voice Volume values is how the HDR Ratio comes into play. The HDR Ratio determines the amount of gain reduction applied to a sound whose Voice Volume is over the HDR Threshold, but any gain reduction applied has a direct effect on how much that sound ducks other sounds out of the way. One way to think about the balance between the increase in loudness and the ducking applied to other sounds is as an iceberg floating on the surface of the water. Whatever is above the water is perceivably louder, while whatever is below the water serves to displace other sounds.

img5

In the above example, the sound in question has a Voice Volume of 20. With an HDR Ratio of 1:1, no gain reduction is applied to the sound, so it is +20dB louder. With a ratio of 2:1, the sound is only +10dB louder, but ducks other sounds out of the way by -10dB. With a ratio of 4:1, the sound is only +5dB louder and ducks other sounds by -15dB. With a ratio of 100:1 (the maximum that Wwise allows), the sound is essentially 0dB louder and will duck other sounds by -20dB. When determining the HDR Ratio settings on the bus, it is advised to choose a ratio that makes the mixing process easier. For this reason, 2:1 is a ratio that still takes advantage of the ducking behavior while keeping Make-Up Gain calculations very easy to do on the fly.
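The iceberg split at each ratio can be computed with the same simplified model used earlier (threshold assumed to be 0, as in the example above):

```python
def iceberg_split(voice_volume, threshold=0.0, ratio=2.0):
    """Return (audible_increase_db, ducking_db) for a sound over the threshold."""
    over = max(voice_volume - threshold, 0.0)
    audible = over / ratio          # above the waterline: perceivably louder
    ducking = over - audible        # below the waterline: displaces other sounds
    return audible, ducking

for ratio in (1, 2, 4, 100):
    audible, duck = iceberg_split(20, ratio=ratio)
    print(f"{ratio}:1 -> +{audible:g} dB louder, ducks others by -{duck:g} dB")
# 1:1   -> +20 dB louder, ducks others by -0 dB
# 2:1   -> +10 dB louder, ducks others by -10 dB
# 4:1   -> +5 dB louder, ducks others by -15 dB
# 100:1 -> +0.2 dB louder, ducks others by -19.8 dB
```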

So what happens when multiple sounds are playing above the HDR Threshold? When the HDR system is determining how to duck other sounds out of the way, it looks at the relative difference between sounds’ Voice Volumes instead of using a sound’s absolute Voice Volume. Let’s say that there are three sounds playing at the same time - a player lightning spell (40 Voice Volume), an enemy explosion (30 Voice Volume) and an ambient bed (0 Voice Volume).

img6

In the above illustration, the HDR Ratio is 2:1 and the HDR Threshold is 0. Since the player lightning spell has a Voice Volume of 40, it will only duck the enemy explosion by -5dB, since the difference between their Voice Volumes is 10. However, the ambience will be ducked by -20dB, since the difference between it and the most important sound is 40. If the enemy explosion were playing by itself, it would duck the ambience by -15dB, since the difference in Voice Volume is 30.
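The relative ducking in this three-sound example can be sketched as follows. This is an illustrative model of the behavior described in this article (including the rule, covered in the HDR Window section, that sounds at or below the threshold are treated as if they sat at the threshold), not the engine's exact algorithm:

```python
def ducking_applied(voice_volumes, threshold=0.0, ratio=2.0):
    """For concurrent sounds, return the dB of ducking each one receives.

    The loudest Voice Volume sets the top of the window; each other
    sound is ducked by the post-ratio share of its distance from that
    top. Sounds at or below the threshold are treated as if they sat
    exactly at the threshold.
    """
    top = max(voice_volumes)
    result = []
    for vv in voice_volumes:
        effective = max(vv, threshold)   # clamp quiet sounds to the threshold
        diff = top - effective
        result.append(diff * (1 - 1 / ratio))
    return result

# Lightning spell (40), enemy explosion (30), ambient bed (0):
print(ducking_applied([40, 30, 0]))  # [0.0, 5.0, 20.0]
```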

HDR Window

It’s important to note here that the presence of multiple sounds with higher Voice Volumes will not create situations of additive ducking. Sounds above the HDR Threshold will move the HDR Window, which is described in more detail in the Understanding HDR documentation. The amount of ducking happening to a sound can only go as far as is dictated by the top of the HDR Window, which is essentially the highest active Voice Volume at any given time. This makes HDR an ideal option compared to more traditional Wwise Meter/RTPC based ducking, which can be additive in nature depending on the complexity of the systems at play. Keep in mind that moving the HDR Window also causes the Volume Threshold to move upwards in proportion, which can momentarily change the point at which a quieter sound can be sent to Virtual Voice or killed. 

img7

 

img8

Keep in mind that as long as a sound is at or below the HDR Threshold, sounds above the threshold will calculate the difference as if that sound were at the HDR Threshold. So if the 40 Voice Volume player lightning spell is playing at the same time as a -15 Voice Volume sound, it will duck that sound by -20dB, treating the relative Voice Volume difference as 40 (as if the quieter sound sat at the threshold). For this reason, and because of the Volume Threshold/HDR Window behavior detailed above, it is advisable to use Voice Volume for mixing if the sound is going to remain below the HDR Threshold. This ensures it will still be ducked in the way that’s intended while taking advantage of Voice Volume-based voice optimization.

Keep the Pipeline Clean

Because Wwise uses a sound’s final Voice Volume to determine its behavior within the HDR system, you must be aware of any cumulative changes that could be applied to your sound as it travels through the voice pipeline. This includes things such as parent Actor-Mixers, RTPCs, States, Randomizers, and other methods of changing Voice Volume. Great care must be taken to make sure there aren’t any unintentional Voice Volume changes affecting the HDR behavior of your sound. The Voice Inspector is a great way to debug and see everything that’s affecting your sound’s Voice Volume. This can give you insight into all of the ways Voice Volume is being affected in your project, which you can either clean up or even use to your advantage!

Another thing to note is that relative Bus Volume and Output Bus Volume changes also have an effect on a sound’s final values as far as HDR is concerned, if these changes are taking place before the signal reaches the HDR bus. For example, if your sound is at 20 Voice Volume but its Output Bus (or a bus downstream from it) has a Bus Volume of -4, the final value of the sound will be 16 instead. This also applies to things like Auxiliary Buses! So if you end up sending your HDR sound to an Auxiliary Bus that has a positive or negative Bus Volume associated with it or downstream from it, the aux send for that sound might be scaled differently depending on its value, the HDR Ratio, and your HDR Threshold.

For the above reasons, it’s best to decide on using HDR as early as possible, so you have more control over your Voice Volume gain staging and can set a precedent for the rest of the team. If you are implementing HDR in the middle of production, you might have a long battle ahead in making sure your routing is clean and consistent in the way your pipeline affects Voice Volume. Making the most of Queries and writing your own custom scripts can go a long way toward converting unwanted Voice Volume changes into Make-Up Gain, so you can start from square one with all of your Voice Volumes reset to where you need them to be.

If you want to keep the sonic integrity of your HDR system intact but still alter the output loudness of these sounds, Make-Up Gain can be very useful! For example, if you have an RTPC tied to the game’s SFX Volume slider in the audio settings, you might want to use Make-Up Gain instead of Voice Volume so the relative mixing relationships stay intact even when the overall amplitude is lowered. You can also use the Wwise Gain plugin to achieve a similar effect, but keep in mind that if it is placed in the Master-Mixer hierarchy, it will only process the bus output of the sound, which has no effect on Voice-related routing such as Game-Defined Auxiliary Sends. We will go over the voice pipeline more in the next section. Keep in mind that Wwise’s Loudness Normalization feature also has no effect on HDR.

As far as HDR is concerned, if you choose to make any additional Voice Volume changes, it is best to make them subtractive rather than additive. It’s one thing to take a sound above the HDR Threshold and dynamically lower its value, as you might want to purposefully change a sound’s position in the relative mix hierarchy. However, additive Voice Volume changes can push a sound above the HDR Threshold, and if that sound was never intended to play above the HDR Threshold in the first place, there can be unintended side effects.

Speaking of your pipeline’s gain staging, you might opt to purposefully alter the Voice Volume in order to dynamically change how a sound, or even a category of sounds, behaves in the mix depending on certain variables. One such example is the concept of enemy “threat”. If you have a static Voice Volume assigned to an enemy sound, but you can be fighting many of these enemies at once, not all instances of this sound need to be equally important within the mix. Tying various gameplay variables to RTPCs gives you the flexibility to change Voice Volume and Make-Up Gain in ways that give mix priority to sounds that need it and lower the “sound importance” of sounds that don’t.

img9

In this picture, you can see that the “enemy threat” level turns down the Voice Volume as the enemy becomes less threatening, while the Make-Up Gain turns it up. Based on the HDR Ratio, this ensures that the sound doesn’t actually get audibly quieter in the mix, but it ducks other sounds less and, in turn, gets ducked more.
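A sketch of that compensation, using the same simplified model as earlier: as a hypothetical 0-1 "threat" value falls, Voice Volume falls with it while Make-Up Gain rises by the post-ratio difference, keeping the audible level constant. The linear threat-to-volume mapping here is purely illustrative:

```python
def threat_to_mix(threat, vv_max=30.0, threshold=0.0, ratio=2.0):
    """Map a 0..1 threat value to (voice_volume, make_up_gain).

    Make-Up Gain compensates the post-ratio gain change so the sound's
    audible loudness stays the same at every threat level; only its
    ducking behavior ("importance") changes.
    """
    voice_volume = vv_max * threat
    post_ratio = threshold + max(voice_volume - threshold, 0.0) / ratio
    target = threshold + max(vv_max - threshold, 0.0) / ratio
    make_up = target - post_ratio
    return voice_volume, make_up

print(threat_to_mix(1.0))  # (30.0, 0.0)  full threat: no compensation needed
print(threat_to_mix(0.5))  # (15.0, 7.5)  half threat: VV down, Make-Up Gain up
```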

HDR and Auxiliary Sends

One thing that’s important to keep in mind is how HDR can affect your auxiliary sends. Recall that the HDR settings are set at the bus level within the Master-Mixer hierarchy. This means that the HDR Threshold and HDR Ratio only apply to signal being routed through the bus that has HDR enabled. Game-Defined and User-Defined Auxiliary Sends set within the Actor-Mixer hierarchy will end up routed to signal paths and buses that exist outside of your main HDR-enabled bus. Because of this, you can run into the unintended effect of your dry path having its gain reduced by the HDR Ratio while the wet path does not. For example, if you have a 40 Voice Volume sound routed through an HDR-enabled bus with an HDR Ratio of 2:1, the actual output gain of the sound will be +20dB. But if that same sound is being sent to an auxiliary bus that does not have HDR enabled on it or on a parent bus, the full 40 Voice Volume signal will be sent there, meaning that the wet signal of the sound will be +20dB louder than the dry signal.

img10

To solve this issue, it is not practical to override and offset the auxiliary send amounts the way you would with Make-Up Gain. Instead, you can simply enable HDR with the same HDR Threshold and HDR Ratio settings on the auxiliary bus’ parent, and any sound sent to a child of that parent bus will obey the HDR rules and have its gain reduced in exactly the same way as the dry path. You can also move your auxiliary buses to live under the same parent bus as your other HDR-related buses, so that they share the same HDR settings by virtue of having the same parent. One thing to note: if you happen to be sending non-HDR content to these HDR-enabled auxiliary buses, Wwise will recognize that the dry path does not have HDR enabled and will not apply HDR behavior to those sounds’ auxiliary routing.

Sensitivity and Active Range

Once a sound is above the HDR Threshold, how it ducks other sounds over its duration is dictated by its HDR Envelope. Whereas the relative Voice Volume relationship dictates the amount that another sound is ducked, the HDR Envelope determines how this unfolds over time. If a sound does not have Envelope Tracking enabled but has a Voice Volume over the HDR Threshold, it will duck other sounds at a flat level for its entire duration.

img11

To achieve ducking behavior that more closely matches the properties of the sound, you will want to enable Envelope Tracking and customize it. Navigate to the sound or Actor-Mixer where you want this property set, go to the HDR section of the Property Editor, and click the Enable checkbox.

img12

Sensitivity

This determines how many points are automatically drawn on the sound’s HDR Envelope. Lower values will be easier to edit and control but have less granular detail, while higher values will have more granular detail but be harder to edit after the fact, and might capture parts of the sound that you don’t want in the envelope. You should find a default Sensitivity value that comes closest to the envelope you want to hear, then edit it manually from there. I’ve found that Sensitivity values from 4-12 tend to offer the best results, but what works best for your project may vary.

img13

 

img14

In order to successfully edit a sound’s HDR Envelope in the most representative way, be sure to switch the Source view from Peak to RMS, as the sound’s RMS is what the system uses to automatically generate the points of the envelope. This also makes it easier to see the overall energy of the sound over time if you choose to edit the envelope in a way that more accurately captures what you need.

gif2

Active Range

An HDR Envelope’s Active Range is a threshold that dictates how much of the HDR Envelope to use. You can think of it like chopping off the top of a mountain, and the Active Range dictates how much you want to cut off, starting from the peak of the mountain.

Let's start with this automatically generated HDR Envelope. You can see that the peak of the envelope sits at about -3.5dB. The envelope’s position is relative, not absolute; it does not dictate the amount of ducking the sound does - that is defined only by the sound's final Voice Volume. Instead, you can think of the gap between the lowest and highest points of the envelope as 0-100% of the ducking that is possible. So if your Voice Volume is set to 20, the bottom of the envelope will apply 0dB of ducking, and the top of the envelope will apply 10dB of ducking (given an HDR Ratio of 2:1).

img15

 

img16

When adjusting the sound’s Active Range, you might want to set a value that allows you to use the entire envelope. For example, if the Active Range is set to 30 or higher, essentially all of the envelope will be used. If you start with the peak of the envelope (-3.5) and subtract 30, you get -33.5. The RMS view in the Source Editor only goes down to about -35dB, so we are using virtually all of the envelope that was drawn. If you set a lower Active Range value, the HDR system uses far less of the envelope to achieve the ducking effect. You can see this in the below example - the white line is the Voice Volume and generated envelope of the sound, while the blue shape is the HDR Window as dictated by the HDR Threshold and HDR Ratio. The sound with an Active Range of 40 uses the entire duration of the generated HDR Envelope, whereas the same sound with an Active Range of 4 uses far less of it. This can be useful if your automatically generated envelopes tend to capture sound tails that you don’t necessarily want ducking other sounds.
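One plausible way to model the Active Range behavior described above - a simplified reading of the documented behavior, not Wwise's exact algorithm - is that an envelope point contributes ducking in proportion to how close it is to the peak, with anything more than Active Range dB below the peak contributing nothing:

```python
def ducking_at_point(env_db, peak_db, active_range, max_duck_db):
    """Ducking (dB) contributed by one envelope point in this model.

    Points at the peak apply full ducking; points active_range dB
    (or more) below the peak apply none; an Active Range of 0
    disables envelope-driven ducking entirely ("HDR bypass").
    """
    if active_range <= 0:
        return 0.0
    depth = peak_db - env_db                       # dB below the envelope peak
    fraction = max(0.0, 1.0 - depth / active_range)
    return fraction * max_duck_db

# Peak at -3.5dB RMS; 20 Voice Volume at 2:1 gives 10dB of max ducking:
print(ducking_at_point(-3.5, -3.5, 30, 10))   # 10.0  full duck at the peak
print(ducking_at_point(-33.5, -3.5, 30, 10))  # 0.0   bottom of the range
print(ducking_at_point(-20.0, -3.5, 4, 10))   # 0.0   outside a small range
```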

img17

Another useful tool in the toolbox is assigning an Active Range of 0 to sounds. This is very useful if you determine that a sound is important enough to have a Voice Volume above the threshold, but you don’t want it to actively duck other sounds. You can sort of think of it as an “HDR bypass”. This is especially useful for high priority sounds like ambient loops that are important for narrative or gameplay moments, or sounds like enemy footsteps where you don’t necessarily want them ducking other sounds, but you might assign them a Voice Volume value that makes them ducked less than other sounds. Keep in mind that simply disabling Envelope Tracking does not disable the sound’s ability to duck other sounds - it just makes it use the flat duration of the sound to duck instead. Setting the Active Range to 0 makes it so none of the envelope is used. In this case, it doesn’t necessarily matter what the envelope looks like, but it might be a good idea to author a good envelope anyway just in case you want the sound to have Active Range again later.

Envelope Authoring

Wwise does a great job of automatically generating HDR Envelopes for sounds, but you might want to take it one step further and manually author these envelopes for ultimate control over the mixing experience. In order to create a ducking system that is as transparent as possible, being able to subjectively determine what the most important parts of the sound are and how that changes over time can be an indispensable part of your toolkit. 

In the Source editor, you should try starting with an HDR Envelope that is closest to what matches the overall energy of the sound. As mentioned above, a Sensitivity value of somewhere between 4 and 12 might be a good starting point. Here are some pointers to think about when manually adjusting your HDR Envelopes:

Example #1

img18

For the same explosion sound, this envelope starts immediately, with no attack to ease into the ducking. This might make the gain reduction of other sounds audible before the energy of the sound takes over. Even if it’s just for a few milliseconds, it can be enough to make the ducking effect noticeable. This envelope also ends very early, so it might not spend enough time ducking other sounds to convey the gravity of the explosion. Everything is context specific, but there is no easy way to get more ducking out of this sound without editing the envelope. In the reverse case, you can at least lower the Active Range to use less of a longer envelope.

Example #2

img19

In this example, the attack of the envelope is too slow, so room isn’t made in the mix quickly enough to carve out space for the explosion’s transient. Even though you might not notice the ducking occurring, the transient might get masked by other sounds here. For the rest of the sound’s duration, the amount of ducking might be too much relative to the RMS of the sound. I find it’s best to “hide” the ducking behind the RMS of the sound in order to mask the gain reduction that’s being done. Lastly, the tail of the sound is still ducking other sounds well after the important part of the sound is over, which might result in other sounds being reduced for longer than intended.

Example #3

img20

For this explosion sound, this envelope is ideal. As you can see, the attack of the envelope isn’t immediate. Having even just a slight attack on the envelope can make the ducking effect more transparent, which is especially nice if you are attenuating other sounds by -20dB or more. Although it’s not as noticeable in just a screenshot, most of the full-frequency content of this sound is weighted towards the beginning, so the majority of the envelope’s shape is focused here. Ducking is usually most transparent when there’s a lot of frequency content to mask the gain reduction, so focusing the envelope where the energy is most prominent helps hide the gain reduction being done to other sounds. Towards the end of the sound, even though the tail is relatively audible, having the HDR Envelope purposefully end early helps bring the other sounds’ gain back up under the cover of the more important sound, so you never notice them go away or come back.

Switching it Up

HDR can be a very transparent system, but sometimes you might want to use it in ways that are decidedly not transparent. Take for example this charge up and attack sound. Normally, you would want to try to follow the shape of the RMS as closely as possible.

img21

However, you can use the envelope to your advantage and create an obvious ducking moment for dramatic effect. By extending the peak of the envelope to where there’s no signal to hide it, you create a vacuum before the attack sound occurs - a more dynamic moment than if you let the ducking stop just before the attack, which would lessen the impact of the sound.

img22

As they say, you have to know the rules to break the rules. As long as it sounds good, you can use HDR for more expressive mix decisions rather than purely focusing on transparency.

A Layered Approach

There will be many scenarios where a single sound event is composed of many layers. Maybe it’s a single Blend Container with many child sounds, or maybe it’s an event that calls multiple sounds from across the Actor-Mixer hierarchy. In any case, things can get messy if all of the layers have their own HDR Envelopes, with the combined effect being hard to predict and control. In this case, it is best to designate one layer as the “main” one, which receives an Active Range value. The other layers, while meant to play in parallel, do not need to duck anything, so you can designate them as “support” layers and set their Active Range to 0.

img23

Different contexts will require different approaches here, but feel free to experiment with not only what makes the most sense in your workflow, but also with what sounds the best! Maybe you want to intentionally duck one of the layers (such as “gun_fire_tail”), so intentionally giving it a relatively lower Voice Volume than the other parallel layers could achieve better results.

The Pink Noise Test

To home in on your ducking behavior as well as your HDR Envelopes, it can be helpful to test how a sound will duck other sounds in context. To test against the worst-case scenario, set up a bed of looping pink noise playing at or under your HDR Threshold. If you play this sound in the Soundcaster or even implement it in-game, you can listen to how other sounds get attenuated when sounds above the HDR Threshold are playing. This can also help you identify issues with your HDR Envelopes and make sure that even in the least ideal situations, your HDR-based ducking will sound as intended.

You can hear an example of this test method by watching the following segment of the presentation I gave at Airwiggles’ AirCon24.

HDR and Attenuation

Because Wwise’s HDR system uses Voice Volume, distance Attenuation has a direct effect on HDR. This can be beneficial, as an explosion sound that is far away will naturally be lower in the HDR hierarchy due to its Voice Volume being reduced over distance, but you must always be mindful of how your HDR Ratio affects gain reduction when the Voice Volume value is still over the threshold. When designing attenuation based on the audible gain reduction, you might design a curve that has a specific Voice Volume reduction over a set distance. But as soon as you assign that Attenuation shareset to a sound that is above the HDR Threshold, the audible gain reduction over distance will be lessened by the HDR Ratio up until the point where the sound finally reaches the HDR Threshold. You can read more about this phenomenon in the official documentation.

img24

For example, if you have a sound with a Voice Volume of 20 and your project has an HDR Ratio of 10:1, the first 40 units of distance will only result in a -2dB reduction of the apparent gain. As soon as the sound reaches the 40 unit mark (where its Voice Volume reaches the HDR Threshold), it will continue to attenuate over distance as expected. This might be another deciding factor in which HDR Ratio you choose. Lower ratios yield much more predictable behavior when dealing with distance attenuation, so be sure to consider this. Sounds routed through buses with higher HDR Ratios might need more specific Attenuation sharesets to deal with this phenomenon, so be aware of this potential impact on your workflow.
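The distance example can be checked with the same simplified model used throughout this article (threshold assumed to be 0, and an attenuation curve assumed to reach -20dB at 40 units): while the sound is above the threshold, only 1/ratio of the curve's reduction is actually audible.

```python
def apparent_gain(voice_volume, attenuation_db, threshold=0.0, ratio=10.0):
    """Audible output gain after distance attenuation under HDR.

    Attenuation reduces Voice Volume first; the ratio then compresses
    whatever remains above the threshold, shrinking the *apparent*
    drop until the sound falls to the threshold.
    """
    vv = voice_volume - attenuation_db
    if vv <= threshold:
        return vv
    return threshold + (vv - threshold) / ratio

# 20 Voice Volume at 10:1: the full -20dB of attenuation that brings
# the sound down to the threshold is only heard as a -2dB drop.
print(apparent_gain(20, 0))    # 2.0
print(apparent_gain(20, 20))   # 0.0   -> only -2dB apparent so far
print(apparent_gain(20, 30))   # -10.0 -> below threshold, attenuates normally
```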

Managing the Hierarchy

To manage Voice Volume in this way, it’s best to devise a way to easily track various sound categories and where they stand relative to each other with regard to sound importance. One of the best ways to do this is to create a spreadsheet that tracks all of your sounds’ Voice Volume values.

img25_v2

To easily determine how your game’s sounds will react relative to each other, it’s good to come up with a finite number of broader sound categories rather than working on a sound-by-sound basis. Broader categories are much easier to manage, especially if you wish to modify their Voice Volume/”sound importance” later on. It may take some time to determine which categories you need to represent on this spreadsheet, and you will constantly find edge cases that require new categories. However, simply being able to represent this hierarchy visually is extremely helpful for planning the big picture. It’s easy to get lost in the weeds when mixing games, so having a bird’s-eye view of the mix relationships between sounds is an extremely helpful tool.

Using Notes for HDR Categories

To make the best use of this, you should find a way to easily search for sounds that belong to a specific category. One way to do this is with the Notes field. For example, if you write something like “hdrTag_player_spell_cast” in the Notes field, you can then search for that exact string in the Wwise Search bar to show all of the objects where this Note is written. To get the most out of this workflow, use the Notes field at the object level where you intend to set the Voice Volume for the sound. Depending on the case, this could be a parent Actor-Mixer, a Random Container, or an individual sound.

img26

img27

You can also use the Query Editor to search for these “hdrTag” Notes.

img28

Once you have a list of objects that match this Notes field entry, you can easily select them all and view their Voice Volume and Make-Up Gain values in either the List View or Query Results, or open the Multi-Editor window. In the case of the List View, it is best to configure the columns to only show what is relevant for the purposes of HDR. In this case, having just Voice Volume and Make-Up Gain visible is most convenient. Below, you can see how to use the List View to your advantage to set the HDR “importance” very quickly for a large batch of sounds.

img29

Due to the nature of how projects are organized, many sounds can share an HDR sound category but not be located in the same section of the Actor-Mixer hierarchy, so having a way to globally manage shared properties in this fashion can save a lot of time and make the game’s mix more consistent. This also makes iteration a lot easier, as a Voice Volume that you thought would work early on might change later on during production. Having a reliable and fast way to quickly change where a sound category sits in your mix hierarchy will save you lots of trouble when you are iterating on the game’s mix.
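The Notes-tag workflow above can also be scripted over WAAPI. The sketch below only builds the query string and call arguments; the tag name and dB value are hypothetical, and you should verify the property names (“Volume” for Voice Volume, “MakeUpGain” for Make-Up Gain) and the WAQL contains operator against your Wwise version’s documentation:

```python
def waql_for_tag(tag):
    # WAQL query matching every object whose Notes contain the tag.
    return f'$ from project where notes : "{tag}"'

def set_property_args(object_id, prop, value):
    # Argument payload for an ak.wwise.core.object.setProperty call.
    return {"object": object_id, "property": prop, "value": value}

# Usage with the waapi-client package (requires a running Wwise
# authoring session, so it is left commented out here):
#
# from waapi import WaapiClient
# with WaapiClient() as client:
#     found = client.call("ak.wwise.core.object.get",
#                         {"waql": waql_for_tag("hdrTag_player_spell_cast")},
#                         options={"return": ["id", "name"]})
#     for obj in found["return"]:
#         client.call("ak.wwise.core.object.setProperty",
#                     set_property_args(obj["id"], "Volume", 12))
```

A script like this pairs well with the spreadsheet approach: the spreadsheet holds the intended values, and the script pushes them to every tagged object in one pass.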

Using Buses for HDR Categories

An alternative method for managing your different HDR categories is to use Wwise’s Master-Mixer Hierarchy to manage the Voice Volume and Make-Up Gain settings, as opposed to the Notes-based method above. With this approach, each HDR category you create has a bus associated with it. Instead of using the List View to manage the Voice Volume and Make-Up Gain for each sound carrying an hdrTag, you can simply assign a sound to the correct HDR category bus and set the Voice Volume and Make-Up Gain at the bus level.

img30

You can still organize your Master-Mixer hierarchy in whatever way makes the most sense for mixing your project; simply add new child buses to act as the point where each sound’s Voice Volume and Make-Up Gain is set. Keep in mind that you still have to account for any Voice Volume and Bus Volume changes applied above these HDR sub-buses, but this may be easier to manage with everything in one place. To define an HDR category’s priority or importance, set the Voice Volume and Make-Up Gain as desired in the bus’s properties. It’s also a good idea to name and/or color the buses in a way that communicates their function to other sound designers, because assigning a sound to one of these buses sets its Voice Volume above the threshold. For that reason, make sure any sounds assigned to these HDR sub-buses have their HDR envelopes enabled. It’s also wise to check a sound’s final Voice Volume before assigning it to one of these buses, to make sure no unintentional cumulative Voice Volume values conflict with the bus’s values.
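The bus-based workflow can be scripted in a similar way. This sketch only builds WAAPI call payloads; all IDs and values are placeholders, and the reference name “OutputBus” and the bus property names should be confirmed against your Wwise version’s WAAPI documentation before use:

```python
def route_to_hdr_bus_args(sound_id, bus_id):
    # Argument payload for ak.wwise.core.object.setReference: point the
    # sound's Output Bus at the HDR category bus.
    return {"object": sound_id, "reference": "OutputBus", "value": bus_id}

def bus_importance_args(bus_id, voice_volume_db, makeup_gain_db):
    # One setProperty payload per property, defining the category's
    # "importance" in a single place on the bus itself.
    return [
        {"object": bus_id, "property": "Volume", "value": voice_volume_db},
        {"object": bus_id, "property": "MakeUpGain", "value": makeup_gain_db},
    ]
```

Because the category values live on the bus, re-balancing a whole category is a single call against the bus object rather than a batch edit across many sounds.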

img31

Because the Voice Volume and Make-Up Gain are now set in one single location rather than many individual ones, iterating on these values after the fact is much faster and easier, with less room for error. It also becomes easier to dynamically adjust the Voice Volume and Make-Up Gain at runtime, since you are 100% certain of the absolute values you will be altering with RTPCs, States, etc. With the other methods, if you change Voice Volume (and thus HDR “importance”) at the bus level, sounds being output to that bus may have differing Voice Volume values, meaning the mix and HDR behavior changes could be inconsistent across those sounds.

Finishing Touches

As you begin to implement Wwise HDR on your project, there are a few things to consider in order to make the process as smooth as possible.

Since HDR can replicate how our ears perceive sound, or even be used for something more subjective, it’s usually best to focus the HDR system on SFX only. Because HDR uses volume to mix sounds dynamically, it may not sound as transparent on other content like dialog and music, so it’s often best to leave those out of HDR. Furthermore, certain non-diegetic SFX such as UI/HUD sounds and cinematics might not work well with HDR, so try to define this as early as possible so you know which content is included in the system and which is excluded.

It’s also not uncommon to combine several mixing methods within a project. While most of a game’s SFX can participate in the HDR system, it’s not unusual to still use Wwise Meter-based sidechain systems, especially for how SFX interacts with dialog and music. There’s no one way to do things, so feel free to experiment with any combination of methods that works for your project!

Because the usage of Wwise’s HDR system can get complex, it’s best to not only keep things as organized as possible, but also constantly work with and communicate with your team about the system, what to expect, and how to use it to your advantage. Once things are set up correctly, the game will be able to handle a multitude of mixing situations with ease! However, getting to this point requires a lot of planning and organization, so make sure to constantly check the integrity of your HDR setup to ensure it will serve your game’s audio mix.

Continue Learning

If you found this helpful, be sure to check out the Helldivers 2 deep-dive video on dynamic mixing for supplemental info about HDR and more!

 

Alex Previty

Senior Sound Designer II

PlayStation Studios, Creative Arts


Alex has been working in game audio since 2013, and has worked on such titles as Marvel's Spider-Man 1 & 2, Demon's Souls, God of War Ragnarök, and Death Stranding 2. He has been working with Wwise for almost ten years and has spent most of his career leveraging its capabilities to help create cohesive and expressive audio experiences in the AAA space. Alex and his family are currently located in Tokyo, where he works on the Japan Sound team at PlayStation Studios - Creative Arts.

Bluesky

LinkedIn

 @alexprevity
