Understanding Audio Sessions

The iOS SDK relies on the Apple audio session API to comply with the operating system's audio handling mechanism at the app and inter-app level. The audio session's main goal is to provide a set of known behaviors for an audio app when different system events or user actions occur. In other words, the audio session is an intermediary layer between the audio app and the iOS operating system that provides predictable audio behavior.

An app's audio behavior relative to the system and to other apps is controlled through a set of audio session categories and subcategories called category options. Every iOS app that uses or produces audio is expected to register itself with the system under a given audio session category, and optionally a set of category options, in order to state its expected audio behavior.

The audio session is also used to handle and respond to iOS system interruptions and audio route changes.

Finally, the audio session is the mechanism used by iOS to share the low-level hardware audio components between the different audio apps and system sounds. It is, therefore, mandatory to comply with audio session requirements in order to prevent app audio from being lost or the app from crashing due to improper system event handling.

Each time an audio app needs to capture or produce audio, it must request and activate its audio session on the system. For example, when a Music app user presses the 'play' button, the app should request that its audio session be activated after the button is pressed, and should deactivate it when the user presses the 'pause' button. Many iOS built-in apps use an audio session to either output or capture audio, which may affect your app's audio behavior:

  • Calendar app (through audio alert)
  • Camera (video)
  • Clock app (alarm, stopwatch, timer)
  • FaceTime
  • GarageBand
  • iTunes Store
  • Maps
  • Music app
  • Phone
  • Podcasts
  • Siri
  • Voice Memos

At any moment, iOS can deactivate the app's audio session to respond to various audio events, such as an incoming phone call, a scheduled audio alarm from a timer, or a calendar event. These events are known as interruptions, and audio apps must be ready to react to such audio session priority changes and properly resume playing or recording after the interruption has ended. Application switching, backgrounding the app by pressing the Home button, and locking and unlocking the device are also situations where audio session activation and deactivation occur. The following sections detail how these events are managed by Wwise and what the client app has to do to ensure proper audio.

Audio Session Categories and Options

All audio apps running on iOS must use an audio session to process, acquire, or output sounds. Since all audio apps might not use audio for the same goal or purposes (an audio recording app might need audio to be acquired in background execution while a game is entirely paused when another audio app is started), iOS defines a certain number of audio session categories that group expected audio behaviors when user actions or system events, such as the ones presented above, occur. Each audio app must register itself to use a given category before being able to process audio on the device.

The main audio behaviors that differ between the categories are:

  • Whether or not the app audio is silenced by the Ring/Silent switch (often referred to as the 'Lock' button) on the device;
  • Whether or not an audio app will stop the audio of another app that is configured as a non-mixable audio app; or
  • If the app allows audio input (recording), audio output (playback), or both.

Each audio app can belong to only one category at a time. If no category is explicitly chosen, the default AVAudioSessionCategorySoloAmbient category is used. Three main categories are available in the iOS Wwise SDK:

  • AkAudioSessionCategoryAmbient: Equivalent to the iOS audio session AVAudioSessionCategoryAmbient category
    • Used for audio playback only (no recording available).
    • App audio is silenced when the Ring/Silent switch is activated (when 'Lock' is pressed).
    • App audio will be interrupted if another audio app activates its audio session in the foreground.
    • App audio will be mixed with other audio if another app activates its audio session in the background, provided that app is configured with play and record using the AkAudioSessionCategoryOptionMixWithOthers option (see below).
    • App audio can include many types of sounds such as SFX, dialogue, or music. Unless the app audio is specifically tagged as BGM that can be muted in Wwise (see Handling User Music (BGM) and DVR for details), all app audio in the Ambient category will be mixed with other apps' audio. The most typical case is when the user starts the Music app and starts playing music before switching back to your app.
    • This category is well suited for games where audio adds an interesting experience to the app while allowing another app's audio to be included in the experience. Using this category, a user can, for example, choose to listen to music using the Music app while your app outputs SFX or dialogues.
    • Ambient does not allow audio background playing from your app. If a user switches from your app to another audio app, your app audio will be silenced, even if the other app uses a mixable category.
  • AkAudioSessionCategorySoloAmbient: Equivalent to the iOS audio session AVAudioSessionCategorySoloAmbient category
    • Used for audio playback only (no recording available).
    • App audio is silenced when the Ring/Silent switch is activated (when 'Lock' is pressed).
    • The main difference with the Ambient category is that when the audio session is activated, audio from other apps is interrupted.
    • App audio will be interrupted if another audio app activates its audio session in the foreground.
    • App audio will be interrupted if another audio app activates its audio session in the background under a mixable or non-mixable category.
    • This category is also recommended for game apps which use audio that should not be mixed with other audio apps running in the background. For example, users that listen to music with the Music app and switch back to your app will have their music silenced when your app resumes.
  • AkAudioSessionCategoryPlayAndRecord: Equivalent to the iOS audio session AVAudioSessionCategoryPlayAndRecord category
    • Used for audio playback and recording.
    • App audio is not silenced when the Ring/Silent switch is activated (when 'Lock' is pressed).
    • App audio will be interrupted if another audio app activates its audio session in the foreground.
    • By default, this category does not use the AkAudioSessionCategoryOptionMixWithOthers override option (see below), so other apps' audio, such as music from the Music app or another audio streaming app (such as Spotify), will not be mixed with your app audio when it plays.
    • If using the category option AkAudioSessionCategoryOptionMixWithOthers (see below), app audio will be mixed with other sources if another audio app activates its audio session in the background. For example, users listening to music with the Music app who switch to your app will have their music mixed with your app's audio. However, like AkAudioSessionCategoryAmbient, BGM from your app that is set to be muted in Wwise (see Handling User Music (BGM) and DVR for details) would still not be output with this option.
    • This category is recommended for apps that put emphasis on music playback or recording where audio is of primary importance (such as a Karaoke app, an audio streaming app, or an audio recording app).
    • This category, used in conjunction with a properly configured app plist, allows audio to be played or recorded while the app is switched to the background. If your app needs to support background audio processing, such as in a voice recording app or a music player app, this mode is the only Wwise-supported one that allows it. To do so, however, the app option plist must be modified to add the required background mode key set to 1 for the 'App plays audio or streams audio/video using AirPlay' field.

In addition to the audio session categories listed above, some categories' default behavior can be overridden using category options.

Audio session category options can be configured with enum AkAudioSessionCategoryOptions:

  • AkAudioSessionCategoryOptionMixWithOthers: Equivalent to the iOS audio session AVAudioSessionCategoryOptionMixWithOthers category option
    • Use only with the AkAudioSessionCategoryPlayAndRecord category.
    • Overrides the default category behavior to allow the audio to be mixed with other mixable audio apps.
  • AkAudioSessionCategoryOptionDuckOthers: Equivalent to the iOS audio session AVAudioSessionCategoryOptionDuckOthers category option
    • Use only with the AkAudioSessionCategoryPlayAndRecord category.
    • Allows audio from other apps to be 'ducked' when the running app audio session is active.
  • AkAudioSessionCategoryOptionAllowBluetooth: Equivalent to the iOS audio session AVAudioSessionCategoryOptionAllowBluetooth category option
    • Use only with the AkAudioSessionCategoryPlayAndRecord category.
    • Allows Bluetooth devices to be available input routes when recording.
  • AkAudioSessionCategoryOptionDefaultToSpeaker: Equivalent to the iOS audio session AVAudioSessionCategoryOptionDefaultToSpeaker category option
    • Use only with the AkAudioSessionCategoryPlayAndRecord category.
    • If this option is used and no other audio route is available, the app audio will play through the device's built-in speakers. If not used and no other audio route is available, the app audio will play through the receiver.
    • This option is only relevant for iPhones where a receiver is present. It has no effect on iPads and iPods.
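
The rules in the two lists above can be summarized in a small sketch. The names used here (DemoCategory, DemoCategoryOption, IsValidCombination, IsMixable, MakeSessionConfig, and so on) are hypothetical stand-ins defined only for illustration; they are not the Wwise SDK enums, and their numeric values are assumptions. The sketch only encodes what is stated above: every option is meant for the Play and Record category, and Play and Record becomes mixable only with the MixWithOthers option.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only. The real Wwise enums are
// declared in the SDK headers and their numeric values may differ.
enum class DemoCategory { Ambient, SoloAmbient, PlayAndRecord };

enum DemoCategoryOption : std::uint32_t {
    kOptionNone             = 0,
    kOptionMixWithOthers    = 1u << 0,
    kOptionDuckOthers       = 1u << 1,
    kOptionAllowBluetooth   = 1u << 2,
    kOptionDefaultToSpeaker = 1u << 3
};

// Per the list above, every option is meant to be used only with PlayAndRecord.
inline bool IsValidCombination(DemoCategory category, std::uint32_t options) {
    return options == kOptionNone || category == DemoCategory::PlayAndRecord;
}

// Whether the app's audio mixes with other apps' audio. Only the
// MixWithOthers option described above is considered in this sketch.
inline bool IsMixable(DemoCategory category, std::uint32_t options) {
    switch (category) {
        case DemoCategory::Ambient:       return true;
        case DemoCategory::SoloAmbient:   return false;
        case DemoCategory::PlayAndRecord: return (options & kOptionMixWithOthers) != 0;
    }
    return false;
}

// Ambient and SoloAmbient are silenced by the Ring/Silent switch.
inline bool SilencedByRingSilentSwitch(DemoCategory category) {
    return category != DemoCategory::PlayAndRecord;
}

// Only PlayAndRecord allows audio input.
inline bool AllowsRecording(DemoCategory category) {
    return category == DemoCategory::PlayAndRecord;
}

struct DemoSessionConfig {
    DemoCategory  category;
    std::uint32_t options;
};

// Builds a configuration and rejects invalid combinations in debug builds.
inline DemoSessionConfig MakeSessionConfig(DemoCategory category, std::uint32_t options) {
    assert(IsValidCombination(category, options) && "options require PlayAndRecord");
    return DemoSessionConfig{category, options};
}
```

For example, MakeSessionConfig(DemoCategory::PlayAndRecord, kOptionMixWithOthers | kOptionDefaultToSpeaker) passes the check, while combining DemoCategory::Ambient with kOptionDuckOthers would trip the assertion.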

iOS Audio Event Mechanisms with Wwise

On iOS devices, a certain number of user actions or system events can occur while your app is using the device's audio hardware. For example, an incoming phone call, a user switching from your app to another app, an audible timer alert going off, a user connecting or disconnecting headphones, or a user toggling the Ring/Silent switch are all events that can affect your app's audio behavior. This section presents the different mechanisms found in iOS for such events and how they are communicated to your app through Wwise.

Three main families of audio events exist in iOS: interruptions, source changes, and route changes.

- Interruption

Different system events or user actions can lead to the deactivation of the app's audio session. This is also referred to as an audio session interruption. Such interruptions occur, for example, when receiving an incoming phone call or when another audio app activates its audio session under a category that is not mixable with other sessions. The behavior of the sound engine when the application is sent to the background by the OS depends on the audio session category.

In the Solo Ambient category, the audio is not mixable, so the sound engine will suspend itself. This means no audio will be processed and no API calls will be serviced. This is done to save battery, as per Apple's guidelines. Therefore, your game should suspend processing as well and avoid making any AK::SoundEngine calls. When the application returns to the foreground, the sound engine will be resumed and any queued API calls will be processed. All these operations happen automatically.

In the Ambient or Play and Record category, or any other mode where the audio is mixable with other applications, the sound engine will continue processing audio normally.
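
One possible way to follow this advice on the game side is sketched below. AudioGate is a hypothetical helper, not part of the Wwise SDK: it assumes the game knows whether its category is mixable and is told when the app enters the background or foreground. While suspended, it holds back audio work so that no sound engine calls are issued, then replays the deferred work on return to the foreground.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical game-side helper, not part of the Wwise SDK.
class AudioGate {
public:
    explicit AudioGate(bool mixableCategory) : mixable_(mixableCategory) {}

    // Under a non-mixable category the sound engine suspends itself,
    // so the game stops issuing audio work as well.
    void OnEnterBackground() {
        if (!mixable_) suspended_ = true;
    }

    // Back in the foreground: run everything that was held back, in order.
    void OnEnterForeground() {
        suspended_ = false;
        for (auto& call : deferred_) call();
        deferred_.clear();
    }

    // Runs the call immediately, or defers it while suspended.
    void Submit(std::function<void()> call) {
        assert(call && "Submit expects a callable");
        if (suspended_) deferred_.push_back(std::move(call));
        else call();
    }

    bool IsSuspended() const { return suspended_; }

private:
    bool mixable_;
    bool suspended_ = false;
    std::vector<std::function<void()>> deferred_;
};
```

A game could wrap each of its audio requests in a call to Submit. Whether deferring or simply dropping work is the better choice depends on the game; this sketch defers so that nothing is lost across a short trip to the background.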

If your application needs to be aware of those system events, and your game engine does not expose them, you can set a callback function in AkPlatformInitSettings.audioCallbacks.interruptionCallback. Otherwise, there is nothing to do to handle these events from the sound engine perspective.

When the audio session interruption occurs, the app that sees its audio session deactivated can be notified through a message which is translated into an application callback in the Wwise SDK. An audio session interruption is split into two events: a notification when the interruption begins and a notification when the interruption ends. In the Wwise SDK callback, the beginning/ending of the interruption can be obtained through the callback parameter.

Wwise takes care of suspending/resuming the sound engine components when receiving these interruptions and no further action is required by the user to ensure proper audio session activation/deactivation. If specific processing is required in the SDK client app when the audio session is either interrupted or resumed (for example, to update UI elements or give a visual feedback for the interruption), the code must be added to the application callback described above. When a 'begin' interruption is received in the Wwise SDK, the user callback is called before the sound engine is interrupted. On the other hand, when an 'end' interruption is received in the Wwise SDK, the sound engine is resumed before calling the user callback method.

Caution: Since the 'begin' audio interruption callback is called just before internally suspending the sound engine, it must not be used for playing any kind of audio, even short sounds, because the expected result can't be guaranteed.
Caution: User audio interruption callback is only meant to do simple tasks such as UI or object property updates. It must never be used to do CPU intensive computation or time consuming operations. Expected results in such cases are not guaranteed.
Caution: Apple iOS documentation warns that a 'begin' interruption notification might not always be matched with an end of interruption notification and that the only way of detecting when the audio session can be resumed is to rely on the delegate methods called when the app will switch back to the foreground. Audio interruption callbacks should, therefore, not be the only mechanism for detecting audio session recovery in an audio app.
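
The call ordering and the "keep the callback lightweight" guidance described above can be illustrated with a small simulation. Everything here is a stand-in written for this example: DemoResult, DemoInterruptionCallback, DemoEngine, UiState, and OnInterruption are hypothetical names, and the callback type is only modeled on the (bool in_bEnterInterruption, void* in_pCookie) signature quoted for the Integration Demo. It is not the sound engine's actual implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the SDK result type, for illustration only.
enum DemoResult { Demo_Success, Demo_Fail };

// Modeled on the callback signature quoted for the Integration Demo.
typedef DemoResult (*DemoInterruptionCallback)(bool in_bEnterInterruption, void* in_pCookie);

// Simulates the ordering described above, nothing more.
struct DemoEngine {
    DemoInterruptionCallback callback = nullptr;
    void* cookie = nullptr;
    bool suspended = false;
    std::vector<std::string> log;

    void HandleInterruption(bool begin) {
        if (begin) {
            // 'begin': user callback first, then the engine is suspended.
            if (callback) { log.push_back("callback(begin)"); callback(true, cookie); }
            suspended = true;
            log.push_back("suspend");
        } else {
            // 'end': the engine is resumed first, then the user callback runs.
            suspended = false;
            log.push_back("resume");
            if (callback) { log.push_back("callback(end)"); callback(false, cookie); }
        }
    }
};

// Hypothetical UI state owned by the game and passed as the cookie.
struct UiState { bool showPausedBanner = false; };

// A lightweight callback: it only flips a flag. No audio is posted and no
// heavy computation is done here, in line with the cautions above.
DemoResult OnInterruption(bool in_bEnterInterruption, void* in_pCookie) {
    assert(in_pCookie != nullptr);
    static_cast<UiState*>(in_pCookie)->showPausedBanner = in_bEnterInterruption;
    return Demo_Success;
}
```

Because an 'end' notification is not guaranteed to arrive, a real app would also clear such a flag from its foreground delegate method rather than relying on this callback alone.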

- Source change

When an audio app starts playing audio, iOS can notify other apps that are currently in the foreground with an active audio session about this new session activation. Under session categories other than ambient or play and record with the mixable option activated for your app, this would result in an audio session deactivation by iOS. So, this notification is only really useful when a mixable category is used. The main purpose is to let an app decide whether or not some or all of its audio should be muted/unmuted when audio is played from another app. This iOS notification is further translated into a Wwise SDK callback that can be registered into your app like the audio interruption described above. The source change callback also carries the 'begin' or 'end' event to let the application know if another audio app has activated or deactivated its audio session.

The member sourceChangeCallback in struct AkInitSettings is the function pointer that holds the user callback for the audio source change described earlier. The Integration Demo illustrates this with the static AKRESULT DemoInterruptCallback(bool in_bEnterInterruption, void* in_pCookie) method. The selected method is registered as the source change callback by assigning its address to the function pointer: initSettings.sourceChangeCallback = DemoAudioSourceChangeCallback. The bool in_bEnterInterruption parameter reflects the 'begin' or 'end' notification state of the other app's audio session activation or deactivation. An optional user-defined callback cookie can also be set in the BGMCallbackCookie member of struct AkInitSettings.

Once registered, every source change notification or explicit verification of the audio source in the Wwise SDK will trigger the callback.

If a mixable category is used for your app (ambient or play and record with a mixable option override), Wwise will internally mute or unmute the background music bus when receiving respectively a 'begin' or 'end' source change notification. When a 'begin' source change notification is received, the internal background music bus is muted just before calling the user callback method. Similarly, when an 'end' source change notification is received, the background music bus is unmuted before calling the user callback.

Under a non-mixable category (solo ambient, or play and record without the mixable option override), when another app activates its audio session in the background, iOS sends an audio interruption 'begin' instead of a source change notification, so the user should not expect the source change callback to be called with a 'begin' parameter in such a case. If the audio interruption callback is also registered, however, the user will receive that callback with a 'begin' parameter instead of the source change callback.

Note: Since, as previously described, the audio interruption 'begin' callback is followed by an internal sound engine suspension, the matching source change 'end' callback received when another app deactivates its audio session will take care of resuming the sound engine internally and no further user action is required.
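
The difference between the mixable and non-mixable cases can be summarized with the following simulation. DemoAudioState and its members are hypothetical names made up for this example; the sketch only mirrors the behavior described in this section (which notification arrives, and what Wwise does internally before the user callback is invoked) and is not taken from the SDK.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the behavior described in this section.
struct DemoAudioState {
    bool mixable;
    bool bgmBusMuted = false;
    bool engineSuspended = false;
    std::vector<std::string> events;  // notifications the user callbacks would see

    explicit DemoAudioState(bool isMixable) : mixable(isMixable) {}

    // Another app activates its audio session.
    void OtherAppStartedAudio() {
        if (mixable) {
            // Mixable category: the BGM bus is muted just before the callback.
            bgmBusMuted = true;
            events.push_back("sourceChange(begin)");
        } else {
            // Non-mixable category: an interruption 'begin' arrives instead,
            // and the user callback runs before the engine is suspended.
            events.push_back("interruption(begin)");
            engineSuspended = true;
        }
    }

    // The other app deactivates its audio session.
    void OtherAppStoppedAudio() {
        assert(!events.empty() && "stop notification without a matching start");
        if (mixable) {
            bgmBusMuted = false;      // unmuted before the callback
        } else {
            engineSuspended = false;  // the source change 'end' resumes the engine
        }
        events.push_back("sourceChange(end)");
    }
};
```

In both cases the closing notification is a source change 'end'; only the opening notification differs, which is why an app using a non-mixable category may want to register both callbacks.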

- Route change

An audio route is a given pathway of audio through the different audio hardware components in the device. When a user plugs in or unplugs a headset, for example, the audio route changes, since the sound output is now expected to be either sent to the headset or suspended instead of coming out of the device speaker. iOS can notify apps of route changes if they register for the AVAudioSessionRouteChangeNotification.

The Apple audio session automatically takes care of activating and deactivating a running app's audio session when basic route changes, such as headphone plug or unplug events, occur. Unlike the interruption and audio source change cases described above, the Wwise SDK does not offer a callback mechanism for route changes. If your app needs to detect such route changes in order to update specific elements, refer to the Integration Demo for an example of how this can be achieved.