Android audio latency has always been an issue for all applications. Some devices exhibit up to 150 ms of latency, although the usual latency is below 100 ms.
In recent OS versions, Google introduced the "fast audio path." This bypasses some of the internal processing if some conditions are met, drastically reducing the latency of the OS and hardware processing. Overall latency can be reduced to a few dozen milliseconds. It is important to note that it is not mandatory for the hardware manufacturers to implement the fast audio path; so, many do not. Furthermore, some devices report a fast path without it actually being implemented. Therefore, a game cannot rely on the existence of the fast audio path on the end-user's devices. If your game is for a wide target market, you should design your game audio without using this feature.
Generally, a lower latency will incur a higher CPU cost. The processing of RTPC updates, game object positions, and other game inputs is done on a per-frame basis. Therefore, with a smaller frame, this is processed more often. The balance between latency and CPU usage can only be found through experimentation. There is no hard rule.
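To make the trade-off concrete, the following minimal sketch computes how long one audio frame lasts and how often the per-frame work runs for a given frame size. The function names are illustrative only and are not part of the Wwise SDK.

```cpp
#include <cassert>

// Duration of one audio frame, in milliseconds.
double frameDurationMs(unsigned samplesPerFrame, unsigned sampleRate) {
    assert(sampleRate > 0);
    return 1000.0 * samplesPerFrame / sampleRate;
}

// How many times per second the per-frame work (RTPC updates,
// game object positions, and so on) has to run.
double framesPerSecond(unsigned samplesPerFrame, unsigned sampleRate) {
    assert(samplesPerFrame > 0);
    return static_cast<double>(sampleRate) / samplesPerFrame;
}
```

At 48 kHz, a 240-sample frame lasts 5 ms and is processed 200 times per second, while a 1,024-sample frame is processed fewer than 47 times per second. The smaller frame repeats the fixed per-frame work roughly four times as often.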
A few settings will control the latency of audio in the Wwise SDK:
AkPlatformInitSettings::uSampleRate: Sample rate to use. If set to the hardware preferred rate (and samples per frame) the audio fast path can be selected.
AkPlatformInitSettings::uNumRefillsInVoice: Number of buffers to pre-process. This is protection against CPU availability issues or interruption.
AkInitSettings::uNumSamplesPerFrame: Number of samples per buffer. If set to the hardware preferred size (and sample rate), the audio fast path can be selected.
The default settings returned by AK::SoundEngine::GetDefaultPlatformInitSettings() are "safe" settings for most devices in most conditions. Using those will set:
uSampleRate to the hardware preferred rate. This is usually 48 kHz or 44.1 kHz.
uNumSamplesPerFrame to the hardware preferred size. This is usually in the order of 128, 192, or 240 samples, but can be higher depending on the device.
Advantages: This setup will select the fast path on devices that support it. There are also many fewer constraints on the CPU usage of the audio design.
Disadvantages: Four frames of audio are pre-processed to allow some room in CPU variation, which can cause higher latency on devices that may not need the headroom.
To initialize Wwise with the lowest latency, call AK::SoundEngine::GetDefaultPlatformInitSettings(), then manually change uNumRefillsInVoice to 2.
Advantages: Lowest latency possible.
Disadvantages: Available CPU is limited. The time allotted to rendering audio is very short and can be easily disturbed by other events on the system. CPU overhead of audio designs (such as RTPCs, Switches, Positioning, and containers) is a concern and needs to be monitored carefully.
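The two configurations can be compared with a rough estimate of the latency added by buffering alone. The sketch below assumes that this latency is the frame size multiplied by the number of pre-processed frames. The LatencySettings struct is a hypothetical stand-in that only borrows the field names discussed above. It is not the SDK's AkInitSettings or AkPlatformInitSettings, and the estimate ignores any latency added by the OS and hardware.

```cpp
#include <cassert>

// Hypothetical stand-in holding only the three values discussed above.
// In the SDK, these fields live in AkInitSettings and AkPlatformInitSettings.
struct LatencySettings {
    unsigned uSampleRate;          // samples per second
    unsigned uNumSamplesPerFrame;  // samples per buffer
    unsigned uNumRefillsInVoice;   // buffers pre-processed ahead of playback
};

// Estimated latency from buffering alone, in milliseconds.
// Assumption: latency = frame size * number of refills / sample rate.
double bufferedLatencyMs(const LatencySettings& s) {
    assert(s.uSampleRate > 0);
    return 1000.0 * s.uNumSamplesPerFrame * s.uNumRefillsInVoice / s.uSampleRate;
}
```

Under this assumption, a device that prefers 240-sample frames at 48 kHz buffers about 20 ms with four refills and about 10 ms with two.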
When the time required to process an audio frame is longer than the time available before the hardware needs it, audio starvation occurs. At this point, users may hear pops and glitches unless the game is outputting silence.
On Android, when this occurs, Wwise will automatically increase the size of its internal buffer by a small amount in order to avoid further starvation. It will do so for every frame in which audio starvation occurs, until the buffer is just large enough to no longer cause starvation. At that point, Wwise has found the optimal balance between latency and CPU usage.
This process is relatively fast; audio glitches are usually heard only for a few milliseconds after the game first starts, or after the app is resumed from a background state.
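The adaptive behavior described above can be illustrated with a toy model. This is not Wwise's implementation: the starvation test, the requiredSamples threshold, and the stepSamples increment are all hypothetical, since the actual increment size is not documented here. The model only shows why the number of glitching frames is bounded.

```cpp
#include <cassert>

struct GrowthResult {
    unsigned finalBufferSamples;  // buffer size once starvation stops
    unsigned starvedFrames;       // frames that glitched along the way
};

// Toy model: a frame starves while the buffer is smaller than
// requiredSamples (hypothetical threshold). Each starved frame
// grows the buffer by stepSamples (hypothetical increment).
GrowthResult growUntilStable(unsigned initialBufferSamples,
                             unsigned requiredSamples,
                             unsigned stepSamples) {
    assert(stepSamples > 0);
    GrowthResult r{initialBufferSamples, 0};
    while (r.finalBufferSamples < requiredSamples) {
        r.finalBufferSamples += stepSamples;
        ++r.starvedFrames;
    }
    return r;
}
```

In this model, a 480-sample buffer that needs 700 samples and grows by 64 samples per starved frame settles at 736 samples after four starved frames. A buffer that starts large enough never starves, which corresponds to the opt-out case.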
If this glitching is unacceptable, you can opt out of this behavior by setting uNumRefillsInVoice to an initial value large enough to never cause starvation.
uNumRefillsInVoice needs to be at least 3, and is usually 4 in most games, to leave some headroom for CPU usage variations.
As stated in Android's programmer's notes, Bluetooth devices cannot be low latency or use the fast path. In fact, glitch-free audio on a Bluetooth device requires much more buffering and larger frames. Wwise detects the use of a Bluetooth headset and automatically resets the audio hardware to 8,192 samples of latency, which is roughly 180 ms.
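The Bluetooth figure can be checked by converting the sample count to time at the two common hardware sample rates. The helper below is an illustrative function, not an SDK call.

```cpp
#include <cassert>

// Converts a buffer length in samples to milliseconds.
double samplesToMs(unsigned samples, unsigned sampleRate) {
    assert(sampleRate > 0);
    return 1000.0 * samples / sampleRate;
}
```

For 8,192 samples, this gives about 171 ms at 48 kHz and about 186 ms at 44.1 kHz, so the quoted figure sits between the two.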