What Is The Future For Immersive Audio?

Peter Poers, Managing Director at Jünger Audio, looks at production efforts versus consumer experience.

Along with the evolution of higher resolution in video images, a new way of creating and delivering audio content is required and is already on the way. The changes planned for future audio systems are grouped under the general title "Next Generation Audio" (NGA). Beyond the channel-based, fixed-mix audio formats in common use today, more audio channels will need to be added to make an audible difference.

Clearly, creating and delivering a single layer of surround sound (horizontal only, with the listener surrounded by audio elements) will no longer meet future expectations. Following the successful introduction of 3D audio in cinemas with Dolby® Atmos and in VR applications, we can expect immersive audio programs to become part of the delivery for future TV formats. Of course, the viewer at home can't expect a listening experience similar to that of a large-scale multi-speaker cinema. However, spatial audio effects can be delivered by additional height channels reproduced by separate speakers or special 3D sound bars (better called sound projectors, as opposed to ordinary external amplifier/loudspeaker combinations), or by headphones driven by 3D virtualization software. That will give immersive audio a realistic chance of becoming a standard feature of home entertainment systems in the near future.

The future NGA-based surround sound formats adopted by TV broadcast and OTT will typically have a maximum of 7.1 + 4H channels, or 11.1 in total (as referenced in the DVB NGA survey of May 2015). These are arranged as a mid-layer surround array of up to seven speaker positions and a top layer of up to four height speakers, plus the sub-woofer for low frequency effects.
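The channel arithmetic behind a label such as 7.1 + 4H can be made explicit with a few lines of Python. This is only an illustrative sketch: the label syntax it accepts and the names `parse_layout` and `short_name` are assumptions made for this example and are not taken from any codec or DVB specification.

```python
import re

def parse_layout(label: str) -> dict:
    """Split a label such as '7.1 + 4H' into mid-layer, LFE and height counts.

    The accepted syntax (mid.lfe, optionally followed by '+ <n>H') is an
    assumption for this sketch, modelled on the labels used in the article.
    """
    m = re.fullmatch(r"\s*(\d+)\.(\d+)\s*(?:\+\s*(\d+)H)?\s*", label)
    if m is None:
        raise ValueError(f"unrecognised layout label: {label!r}")
    mid = int(m.group(1))
    lfe = int(m.group(2))
    height = int(m.group(3) or 0)
    return {
        "mid": mid,                       # speakers in the horizontal layer
        "lfe": lfe,                       # low frequency effects channels
        "height": height,                 # speakers in the top layer
        "full_range": mid + height,       # all speakers except the sub-woofer
        "total": mid + lfe + height,      # every channel that must be carried
    }

def short_name(label: str) -> str:
    """Collapse 'mid.lfe + heightH' into the 'full_range.lfe' shorthand."""
    info = parse_layout(label)
    return f"{info['full_range']}.{info['lfe']}"

if __name__ == "__main__":
    for label in ("5.1", "5.1 + 2H", "7.1 + 4H"):
        print(label, "->", short_name(label), parse_layout(label))
```

For 7.1 + 4H this gives seven mid-layer and four height speakers, which is where the 11.1 shorthand comes from, and twelve channels in total once the sub-woofer is counted.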

A higher channel count is not the only element that will define next generation audio formats, though: the presence of audio objects is another. At the moment, audio programs are typically produced and mixed in their final surround sound reproduction format, for example 5.1, 5.1 + 2H or even 7.1 + 4H. The mix is created and finalised and is then ready for delivery. We can call this type of program mix a channel-based immersive audio format. The NGA formats additionally introduce audio objects. Audio objects are typically discrete mono or stereo audio channels that are rendered into the reproduction mix by the audio decoder in the final receiver. With this method, these audio elements remain separate objects, and individual changes can be applied right at the end of the audio delivery path.
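The idea of rendering an object into the mix at the receiver can be sketched as follows. This is a deliberately simplified illustration, not how any particular NGA decoder works: it assumes a mono object, a position value between -1.0 (left) and +1.0 (right) carried as metadata, and a constant-power pan between the front left and right channels. The function names are hypothetical.

```python
import math

def pan_gains(position: float) -> dict:
    """Constant-power pan gains for a position from -1.0 (left) to +1.0 (right)."""
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must lie between -1.0 and +1.0")
    angle = (position + 1.0) * math.pi / 4.0   # maps to 0 .. pi/2
    return {"L": math.cos(angle), "R": math.sin(angle)}

def render_object(bed: dict, samples: list, gains: dict) -> dict:
    """Add a mono object to a channel bed and return a new bed.

    bed     -- channel name -> list of samples (the channel-based mix)
    samples -- the object's mono samples
    gains   -- channel name -> linear gain for this object
    The input bed is left untouched, so the object stays separate until
    this final step.
    """
    out = {ch: list(sig) for ch, sig in bed.items()}
    for ch, gain in gains.items():
        out[ch] = [b + gain * s for b, s in zip(out[ch], samples)]
    return out

if __name__ == "__main__":
    bed = {"L": [0.0, 0.0], "R": [0.0, 0.0], "C": [0.5, 0.5]}
    commentary = [1.0, 1.0]
    print(render_object(bed, commentary, pan_gains(0.0)))
```

Because the gains are only applied in `render_object`, the same delivered object could be placed differently, or left out altogether, on each receiver.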

Another defining element of the NGA formats is the use of metadata. All of the new audio codec systems (e.g. MPEG-H, Dolby AC-4, DTS:X) use an extensive set of metadata to describe audio program details, to optimize the production workflow, to control audio encoding and to allow optimum audio performance in the decoder of the final receiver device. Alongside controlling and monitoring audio content during program production, generating metadata is one of the most important steps in introducing and launching next generation audio formats. Working with metadata will be essential to "authoring" audio programs in the new formats.

Workflow considerations – introduction of a "side car" device
The next generation of immersive and personalized audio formats will require changes in the audio production workflow. New procedures need to be defined for managing object-based encoded content and for personalizing services through the selection of alternative audio objects (such as commentator languages). Loudness control during production and the loudness definition for the final output formats are further aspects to consider. The NGA formats will offer a new surround sound experience, and upmix, format rendering and downmix algorithms will be essential for creating and monitoring the audio programs.

Some additional tools and changes to existing production environments will be required to be able to create audio content for these new audio formats. One of the important aspects to give the new formats a good chance to succeed will be to minimize the cost of transition. Production costs on the professional side cannot be raised significantly without running the risk of the industry rejecting the new formats. The use of existing digital production infrastructures will be essential to begin content creation for new formats in the near future.

One new supporting tool in particular will become important across different workflow areas: the Multichannel Monitoring & Authoring Unit (MMA). This tool must combine audio interfacing, audio computing and metadata authoring in a single unit and will be the key to starting production for immersive audio encoding systems or technologies. It will host intellectual property elements from the codec vendor of choice to perform codec-specific features and processing. Depending on the workflow situation, further sophisticated audio processing features such as surround upmix and loudness control could also be integrated as options.

Monitoring the immersive audio content will require rendering and downmix, especially if the local speaker setup is not capable of reproducing the higher order audio formats. It is strongly recommended also to monitor (or emulate) lower order speaker setups to verify the result of rendering and downmix for home reproduction in environments with different speaker installations. The metadata controlling these processes must also be verified so that the settings are correct for optimum performance.
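As an illustration of what a downmix stage does, the sketch below folds one 5.1 sample frame down to stereo. It assumes the commonly used -3 dB (1/√2) coefficients for the center and surround channels in the style of ITU-R BS.775 and leaves the LFE channel out of the stereo signal. In a real NGA workflow the coefficients would come from the program metadata, so the defaults here are assumptions, and the function name is hypothetical.

```python
import math

# About -3 dB; an assumed default, normally supplied by metadata.
ATT = 1.0 / math.sqrt(2.0)

def downmix_51_to_stereo(frame: dict,
                         center_gain: float = ATT,
                         surround_gain: float = ATT) -> tuple:
    """Fold one 5.1 sample frame down to a (left, right) stereo pair.

    frame maps the channel names L, R, C, LFE, Ls, Rs to one sample each.
    The LFE channel is deliberately not included in the stereo output.
    """
    left = frame["L"] + center_gain * frame["C"] + surround_gain * frame["Ls"]
    right = frame["R"] + center_gain * frame["C"] + surround_gain * frame["Rs"]
    return left, right

if __name__ == "__main__":
    dialogue_only = {"L": 0.0, "R": 0.0, "C": 1.0, "LFE": 0.0, "Ls": 0.0, "Rs": 0.0}
    print(downmix_51_to_stereo(dialogue_only))
```

Running the same program material through such a function with different coefficient settings is one simple way to emulate what listeners with a stereo-only setup will actually hear.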

Immersive – by the introduction of audio objects
The addition of audio objects is the key to delivering a personalized audio experience. With personalized audio, certain separate audio tracks are mixed into the final receiver audio format based on decisions made by the end user. The user might select which objects to use and might also define the mixing ratio between the audio bed and the objects. One example of this application is dialogue enhancement. Object-based technology also brings advantages for multi-language programs. Several commentary tracks (not just different languages but also different presenters and perspectives) can be delivered within the same audio mix. Additional descriptive audio tracks can be mixed to any possible output format. Of course, only one mix can be monitored at any one time. The limits for the gain changes available to the viewer must be set and will be part of the metadata structure. The final audio format will be determined by the channel count of the audio bed. Depending on the channel order of the audio bed, a rendering procedure and downmix will be required for lower order audio formats. If the final format isn't 3D immersive, the personalized objects will typically be mixed to the center channel and/or to the front left and right channels.
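How gain limits carried in metadata might constrain a viewer's choice can be sketched like this. The field names, the limit values and the helper functions are all hypothetical; the real codec systems define their own metadata structures, which are not reproduced here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ObjectGainLimits:
    """Hypothetical per-object metadata set during authoring."""
    name: str
    default_db: float   # gain used when the viewer changes nothing
    min_db: float       # lowest gain the viewer may select
    max_db: float       # highest gain the viewer may select

def clamp_user_gain(limits: ObjectGainLimits, requested_db: float) -> float:
    """Restrict the viewer's requested gain to the authored range."""
    return max(limits.min_db, min(limits.max_db, requested_db))

def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)

def mix_object_to_center(center: list, obj: list, gain_db: float) -> list:
    """Mix an object into the center channel at the given gain."""
    g = db_to_linear(gain_db)
    return [c + g * s for c, s in zip(center, obj)]

if __name__ == "__main__":
    dialogue = ObjectGainLimits("dialogue", default_db=0.0, min_db=-6.0, max_db=9.0)
    wanted = 20.0                                  # viewer asks for +20 dB
    applied = clamp_user_gain(dialogue, wanted)    # authored limit wins
    print(applied, mix_object_to_center([0.1, 0.1], [0.2, 0.2], applied))
```

In this sketch a request for +20 dB of dialogue enhancement is reduced to the authored maximum of +9 dB before the object is mixed to the center channel.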

Conclusion 1
It will take more time for the market for NGA formats at home to become established. The time frame will be set by codec releases from known vendors, by the technical preparation of professional production and distribution networks (content creation and delivery), and finally by support from the consumer industry in implementing codecs and offering sophisticated reproduction systems (home theater systems, 3D sound projectors, 3D binaural headphone virtualization).

But never forget: many people are still quite happy with the "easy listening" experience, with no interaction needed on their part to select from a list of available audio tracks. Another limiting factor is the practical reality of receiving immersive audio for viewers globally. Only a fraction of consumers will have the chance to use the higher order audio formats in their home or when out and about using mobile devices. For the majority of countries, we should expect that just 5 to 10% of households will be prepared for and capable of using real 5.1 or higher order audio formats (17% of German households had 5.1 AV home systems in 2014, according to Verband Deutscher Musikindustrie). All the others will receive immersive and surround sound content only as a (rendered) stereo downmix.

Conclusion 2
One question remains. What is the real definition of immersive with the new audio formats? And who will get the most out of it? Immersive can mean very different things to different people, and it is not necessarily just a case of hearing sounds from above. Simple, well-made audio recordings in a simple format that delivers a meaningful audio experience can be really immersive. In recent years, the quality of audio productions has not improved in terms of natural and good sounding audio content. We are living in a world where many audio programs no longer represent the dynamic range and the structure that such content should typically offer. In previous decades, audio professionals did their best to overcome the technical limitations; now that everything is digital, we seem unable to maintain the audio quality of the content. Loudness is largely solved but, as we see in many cases now, speech intelligibility is often worse than ever.

Yes, there is some audio from above and it is surrounding us, but by nature we do not focus on listening from above. So I would argue that the third dimension in audio cannot, on its own, be the motivation to move to new codec systems. Many codecs in common use today are from an old generation. New codecs can bring technical improvements and higher audio quality at lower bit rates. Object based audio (OBA) and the option to personalize the delivered audio content may be more attractive to many consumers, even if it does not really improve the delivery and performance of audio programs. Both three dimensional audio and object based audio will require changes to production and delivery. Now is the time to discuss and explore how to move forward in the direction of creating a new audio experience.

This article is also available to read at BFV online as part of this issue's Audio feature here, page 33.
