- News Center|2025/11/07
Immersive/object-based audio recording techniques 1
Audio formats have developed over time, starting with narrow-bandwidth mono, moving on to various versions of two-channel stereo and finally to full-band, multi-channel immersive audio. The sound is reproduced in many ways, ranging from personal headphones to multi-channel systems in cinemas or other large venues. Immersive audio can be described as a group of recording and reproduction formats that involve more than basic two-channel stereo.
Immersive audio encompasses all surround formats:
- Channel-based formats reproduced in 5.0/5.1*), 7.1*), 9.1*), etc.
- Formats that include height information, either channel-based or object-based
*) The .1 indicates a separate channel (the low-frequency effects channel, LFE) that carries only a fraction of the full frequency range, namely 20 Hz to 120 Hz.
There are many ways to record immersive audio. This article describes microphone setups for the majority of immersive audio formats. It is important to define the listening setup before selecting the recording setup. In broadcast and in music production, the starting point is the ITU-R BS.775 (ITU-775) standard listening configuration.
Coincident arrays vs. spaced arrays
A microphone array is just a physical arrangement of microphones. The array may consist of individual microphones mounted on one single microphone stand or perhaps on several stands or holders. In some cases, the microphones are built into one single unit (like the 5100 Surround Microphone).
In a coincident array, the microphones are mounted extremely close to each other. In principle, all microphones in this type of array receive sound simultaneously.
In the coincident technique, localization cues are based only on level differences between the signals. This technique can give good localization accuracy but will, to some degree, lack envelopment and have a small sweet spot (in two dimensions: left/right and front/rear). The advantage of a coincident array, however, is that it is compact, portable and mono compatible. The channels can easily be downmixed to a single mono channel without coloration from comb filtering and other artefacts.
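The level-difference cue can be illustrated with a small calculation. The sketch below is not taken from the article: it assumes an idealized coincident pair of first-order cardioids angled ±45° from the center line (a hypothetical XY configuration chosen only for illustration) and computes the level difference between the two signals for a source at a given angle. The function names are made up for this example.

```python
import math


def cardioid_gain(angle_deg: float) -> float:
    """Sensitivity of an ideal cardioid for a source angle_deg off its axis."""
    return 0.5 * (1.0 + math.cos(math.radians(angle_deg)))


def xy_level_difference_db(source_deg: float, half_angle_deg: float = 45.0) -> float:
    """Left-minus-right level difference in dB for a coincident cardioid pair.

    Positive source angles are to the left. The left capsule is assumed to
    point at +half_angle_deg and the right capsule at -half_angle_deg.
    """
    left = cardioid_gain(source_deg - half_angle_deg)
    right = cardioid_gain(source_deg + half_angle_deg)
    return 20.0 * math.log10(left / right)


if __name__ == "__main__":
    for angle in (0, 15, 30, 45, 60, 90):
        print(f"source at {angle:3d} deg: {xy_level_difference_db(angle):6.2f} dB")
```

Because both capsules sit at (nearly) the same point, there is no arrival-time term in this calculation; the only cue is the level difference, which grows as the source moves off center.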
A spaced array creates a three-dimensional, enveloping audio sensation by providing an adequate amount of decorrelation between the signals (localization cues are based on time-of-arrival differences). When the microphone placement (distance and angle) is adapted to the sound field, spaced arrays still provide good localization accuracy.
The spaced techniques, in general, give a large sweet spot and give listeners the sense of an enlarged and enveloping sound stage over a larger listening area. The disadvantages are their size and, in some situations, setup time. In addition, it is not advisable to sum the signals to mono; instead, a single channel can be used as the mono signal.
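Why summing spaced microphones to mono is problematic can be shown with a short calculation. The sketch below is illustrative only: it assumes a single source, two identical microphones summed at equal level, and a speed of sound of 343 m/s. Under those assumptions the summed signal has comb-filter notches at odd multiples of 1/(2τ), where τ is the arrival-time difference. The function names are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def arrival_delay_s(path_difference_m: float) -> float:
    """Arrival-time difference for a given difference in path length."""
    return path_difference_m / SPEED_OF_SOUND_M_S


def comb_notches_hz(path_difference_m: float, max_hz: float = 20000.0) -> list[float]:
    """Notch frequencies when two equal-level copies of a signal are summed.

    Cancellation occurs where the delay equals an odd number of half
    periods: f = (2k + 1) / (2 * delay) for k = 0, 1, 2, ...
    """
    delay = arrival_delay_s(path_difference_m)
    if delay <= 0.0:
        return []  # coincident case: no time difference, no comb filter
    notches = []
    k = 0
    while True:
        frequency = (2 * k + 1) / (2.0 * delay)
        if frequency > max_hz:
            break
        notches.append(frequency)
        k += 1
    return notches


if __name__ == "__main__":
    for difference_m in (0.0, 0.1, 0.343, 1.0):
        first = comb_notches_hz(difference_m)[:3]
        print(difference_m, [round(f) for f in first])
```

With a path difference of 0.343 m (a 1 ms delay) the first notches fall at about 500 Hz and 1500 Hz, well inside the audible range, whereas a zero path difference produces no notches at all.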
| | Envelopment | Size of listening area | Size and portability | Localization accuracy |
| Coincident arrays | - | - | + | + |
| Spaced arrays | + | + | - | - |
5.x
The basic setup for channel-based 5.x (5.0/5.1/5.2) surround sound uses five microphones in a spaced array. There are different ways to select and arrange the microphones; the choice depends on many factors, such as the acoustic qualities of the recording room (e.g. a concert hall, jazz club or church), the layout of the sound sources, the directivity of the microphones used, or simply taste. The setups range from strictly mathematically calculated, psychoacoustically verified configurations to more intuitive ones.
One way of thinking about the coverage of a 360° circle around the listening position is to consider each pair of neighboring microphones as a stereo pair. Each pair covers a specific segment of the circle. Sometimes the segments overlap, sometimes they "underlap". Another way to look at it is to consider the frontal microphones as providing the main soundstage and the rear microphones as establishing a sense of surround and atmosphere.
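The idea of neighboring pairs dividing the circle into segments can be sketched numerically. The example below is an illustration, not part of the original article: it uses nominal loudspeaker azimuths of the five-channel reproduction layout (center at 0°, left/right at ±30°, surrounds assumed at ±110°) as a stand-in for the segments, and lists the width of the segment between each pair of neighbors. The segments a microphone pair actually covers depend on its spacing and angling, which this sketch does not model.

```python
# Assumed nominal azimuths in degrees; positive angles are to the left.
LAYOUT_AZIMUTHS_DEG = {"C": 0, "L": 30, "LS": 110, "RS": -110, "R": -30}


def neighbour_segments(azimuths: dict[str, int]) -> list[tuple[str, str, int]]:
    """Return (channel, next channel, segment width) going around the circle."""
    # Order the channels counterclockwise by their azimuth in 0..359 degrees.
    ordered = sorted(azimuths.items(), key=lambda item: item[1] % 360)
    segments = []
    for index, (name, azimuth) in enumerate(ordered):
        next_name, next_azimuth = ordered[(index + 1) % len(ordered)]
        width = (next_azimuth - azimuth) % 360
        segments.append((name, next_name, width))
    return segments


if __name__ == "__main__":
    for first, second, width in neighbour_segments(LAYOUT_AZIMUTHS_DEG):
        print(f"{first}-{second}: {width} degrees")
```

Under these assumptions the frontal segments are narrow (30° each), the side segments are 80° each, and the rear segment between the surrounds is the widest at 140°, which is one reason the front and rear of an array are often treated differently.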
The following setups are not exhaustive but can be seen as inspiration and are examples of best practice:
The omni-based surround array
Five omnidirectional microphones arranged in a spaced array provide a good tonal balance. The low-frequency content is reproduced very convincingly. This setup also provides excellent envelopment: when the recording is reproduced, the listener is surrounded by sound. The drawback of this setup can be the lack of isolation between channels.
The three frontal microphones – often called the frontal triplet – are arranged as a Decca-Tree. The positions are chosen in accordance with the optimum recording angle of the given sound source.
The position of the rear microphones is chosen independently of the surrounding soundfield. Normally, the rear microphones should not be placed too far from the front microphones. If the distance is too large, the delay may become audible. Furthermore, some directivity might be preferred for the surround pickup. This can be provided by acoustic pressure equalizers (APEs), which add directivity at higher frequencies while keeping the good low-frequency response of the omnis.
A starting point for this setup could look like this:
- L-R 60-120 cm (24-47 in)
- L-C 30-60 cm (12-24 in)
- R-C 30-60 cm (12-24 in)
- C-LR 15-45 cm (6-18 in)
- Front-Rear: 200-500 cm (80-200 in)
- LS-RS: 200-300 cm (80-118 in)
The distance between the outer frontal microphones is 60-120 cm (24-47 in). The wider the source, the narrower the microphone spacing should be. The center microphone is approximately 15-45 cm (6-18 in) in front of the L/R pair.
The two rear microphones are placed 2-5 m (80-200 in) behind the frontal triplet. The distance between the rear microphones should be in the range of 2-3 m (80-118 in). As mentioned, APEs can be used to avoid frontal impulsive sounds being reproduced by the rear channels.
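The starting-point distances can be collected in a small helper that checks a planned setup and estimates how much later frontal sound reaches the rear microphones. This is a minimal sketch under stated assumptions: the ranges are the ones listed above, the speed of sound is taken as 343 m/s, and the function names and the example setup are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C

# Starting-point ranges in centimeters, as listed in the text above.
STARTING_RANGES_CM = {
    "L-R": (60, 120),
    "L-C": (30, 60),
    "R-C": (30, 60),
    "C-LR": (15, 45),
    "Front-Rear": (200, 500),
    "LS-RS": (200, 300),
}


def out_of_range(setup_cm: dict[str, float]) -> dict[str, float]:
    """Return the distances in setup_cm that fall outside the starting ranges."""
    flagged = {}
    for name, value in setup_cm.items():
        low, high = STARTING_RANGES_CM[name]
        if not low <= value <= high:
            flagged[name] = value
    return flagged


def front_rear_delay_ms(distance_cm: float) -> float:
    """Time for sound to travel from the frontal triplet to the rear pair."""
    return distance_cm / 100.0 / SPEED_OF_SOUND_M_S * 1000.0


if __name__ == "__main__":
    planned = {"L-R": 100, "L-C": 55, "R-C": 55, "C-LR": 25,
               "Front-Rear": 350, "LS-RS": 250}
    print("outside starting ranges:", out_of_range(planned))
    print(f"front-rear delay: {front_rear_delay_ms(planned['Front-Rear']):.1f} ms")
```

With these assumptions, the 2-5 m front-rear range corresponds to a delay of roughly 6 to 15 ms. A distance outside the listed range is not necessarily wrong; the ranges are only a starting point.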
The Scottish sound engineer, recording specialist and lecturer Michael Williams has carried out intensive studies on Multichannel Microphone Array Design (MMAD). Consult his publications to find a precise setup for any given situation. Two publications are listed below, and further references can be found there.
Literature
[1] Williams, Michael; Le Dû, Guillaume: Multichannel Sound Recording, Multichannel Microphone Array Design (MMAD). 2010.
[2] Williams, Michael: Microphone Arrays for Stereo and Multichannel Sound Recording Vol II. ISBN 978-88-7365-104-8. Milano 2013.









