rdmg1 at dmu.ac.uk


Spatial audio


The synthesis of spatial audio environments can be approached in several ways. In a purely physical approach we aim to recreate, as accurately as possible, the sound in the ear canals (the binaural signals) that would result from real sources. This can be done either by recreating the sound field in a region around the head, allowing freedom of head movement, or just at the ear canals, in which case movement is restricted or requires active tracking. In the first case delivery is via loudspeakers (e.g. High Order Ambisonics or Wavefield Synthesis), whereas in the second case delivery can be via loudspeakers (e.g. transaural stereo) or via headphones (binaural synthesis). Psychoacoustic methods allow the binaural signals to be simplified by focusing on the aspects that are most relevant for spatial perception.

The research summarized here has investigated different aspects of these approaches, including source synthesis and effects in low resolution sound field synthesis, complex source synthesis in high resolution systems and studio environments, and binaural synthesis of near sources using virtual sound field methods. The sections below link to more information:




Quasi Wavefield Synthesis (QWFS)

Horizontal reproduction with point sources is not a well-conditioned problem. Distributed constraints can be used to find optimal compromise solutions that are much improved when compared with standard 2.5D WFS, particularly for plane waves. These can then be used to modify WFS driving functions, without any increase in computational cost. The resulting solutions are surprisingly close to the optimal ones, and therefore worthy of attention.

Menzies, D. 'Quasi Wave Field Synthesis: Efficient Driving Functions for Improved 2.5D Sound Field Reproduction', AES 52nd International Conference, University of Surrey, September 2013. More in slides.



Distributed Modal Constraints : Sound field control with general boundaries

The problem of controlling the interior field of a given continuous boundary of monopole drivers is solved formally using the simple source method, which involves the exterior Helmholtz problem, equivalent to a scattering problem. A more practical and general method was sought that solves directly for a discrete boundary and can specify a sub-region of the interior as the target.

A solution has been developed by extending the Ambisonic decoding process from controlling a single region to multiple overlapping regions simultaneously. This allows the interior of arbitrary boundaries, or arbitrary sweet spots, to be controlled completely. This provides a much more flexible tool than standard HOA or wavefield approaches.
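One way to read "controlling multiple overlapping regions simultaneously" is as a stacked, regularised least-squares problem. The following is only a sketch under that assumption, not the formulation used in the published work: the symbols M_j, a_j, w_j and lambda are introduced here purely for illustration.

```latex
% g       : vector of L complex driver gains, one per speaker (assumed notation)
% M_j     : matrix mapping the gains to modal coefficients about the centre of region j
% a_j     : target modal coefficients for region j
% w_j     : assumed non-negative weight expressing how much accuracy region j deserves
% \lambda : assumed positive penalty on total driver energy
\min_{g \in \mathbb{C}^{L}} \;
  \sum_{j=1}^{J} w_j \,\lVert M_j g - a_j \rVert^{2} + \lambda \,\lVert g \rVert^{2},
\qquad
g = \Bigl( \sum_{j=1}^{J} w_j M_j^{H} M_j + \lambda I \Bigr)^{-1}
    \sum_{j=1}^{J} w_j M_j^{H} a_j .
```

With a single region and no energy penalty this reduces to an ordinary least-squares Ambisonic-style decode about one centre. Adding regions adds rows to the system rather than changing its form.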

Menzies, D. 'Sound field Control with Distributed Modal Constraints', Acta Acustica united with Acustica, to appear.
Menzies, D. 'Sound Synthesis for General Enclosures', Ambisonic Symposium, IRCAM Paris, May 5-7 2010.


Some examples:

Speakers on an L-shaped boundary, filling out the interior with a plane wave. The bottom plot shows the absolute relative error.

 

Speakers on a dome with a squashed, raised sweet area. Waves from two directions shown. The energy is minimised and accuracy is focused on the desired region:

dome.jpg

dome.jpg


Control of four regions independently. This becomes increasingly difficult as the regions become closer and more opposed:

indep.jpg


opposed

Interior point sources can be represented in exotic ways, over continuous regions that can extend even over 180 degrees from the source, and at surrounding islands:


Near-field binaural synthesis

An area where binaural systems are still developing is the provision for source distance variation up to the near-field. If this can be achieved accurately and practically, it would enable a variety of applications involving objects that are within manual interaction distance (arm's length).

One approach to this is to derive near HRTFs from distant HRTFs using physical principles. Although promising, this approach has a number of complications. By varying the construction of the virtual sound field used, it is possible to improve the HRTFs calculated. This line of research also opens up more generally the question of representing point sources with free fields, and has implications for real sound field construction, for example using Wavefield methods.

Menzies, D. 'Near-field HRTFs from Point Source Representations', Ambisonic Symposium, IEM Graz, June 24-28 2009.
Menzies, D. and Al-Akaidi, M. 'Nearfield Binaural Synthesis and Ambisonics', J.Acous.Soc.Am. March 2007.


The following diagrams show the approximation error of a source located at the origin represented by a focused source that has been further re-expanded using a Fourier Bessel expansion with maximum order N. This allows far-field accuracy to be traded for near-field accuracy.
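The effect of a maximum order N can be illustrated with a much simpler case than the focused source in the diagrams: a unit plane wave in 2D, expanded as a Fourier-Bessel series about the origin and truncated at order N. This is a minimal sketch for intuition only; the function names are hypothetical, and the plane wave is a stand-in chosen here, not the source representation studied in the papers.

```python
import cmath
import math

def bessel_j(n, x, steps=400):
    """Integer-order Bessel function J_n(x) from Bessel's integral,
    J_n(x) = (1/pi) * int_0^pi cos(n*t - x*sin(t)) dt, via the trapezoid rule."""
    f = lambda t: math.cos(n * t - x * math.sin(t))
    total = 0.5 * (f(0.0) + f(math.pi))
    for m in range(1, steps):
        total += f(m * math.pi / steps)
    return total / steps  # (pi/steps) * total / pi

def truncated_plane_wave(kr, theta, order):
    """Fourier-Bessel series of exp(i*kr*cos(theta)), keeping terms with |n| <= order."""
    total = complex(bessel_j(0, kr))
    for n in range(1, order + 1):
        total += 2 * (1j ** n) * bessel_j(n, kr) * math.cos(n * theta)
    return total

def rms_error(kr, order, angles=64):
    """RMS difference between exact and truncated fields on a circle of radius kr."""
    acc = 0.0
    for a in range(angles):
        theta = 2 * math.pi * a / angles
        exact = cmath.exp(1j * kr * math.cos(theta))
        acc += abs(exact - truncated_plane_wave(kr, theta, order)) ** 2
    return math.sqrt(acc / angles)

if __name__ == "__main__":
    for kr in (1.0, 5.0, 10.0):
        print(kr, rms_error(kr, order=5))
```

In this sketch the error stays small while kr is well below N and grows once kr exceeds it, which is the basic reason the choice of N controls where in space the representation is accurate.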

3DFSEerr.jpg


Production with complex sources in the studio

In the studio environment sources are frequently recorded with a single microphone, thus losing directional source information. Even when multiple microphones are used, the information is treated in a way that does not fully reproduce the directive qualities of the object. The direct signal from a source is usually narrowly spatially confined. The reverberant signal, on the other hand, comes from all directions. The pattern of reverberation depends on the direction in which the sound leaves the source. Since many sources have complex patterns of directivity that change rapidly with time, the resulting reverberant field changes too. A simple method is found to approximate features of this field using multiple microphone recordings processed with reverberators, resulting in a stereo image that appears natural, like a stereo crossed-pair recording of the source in a real room. A more compact parametric representation of the complex source is also considered.

Menzies, D. ‘Parametric Representation of Complex Sources in Reflective Environments’, Proc. AES 128th International Convention, Paris, May 2010.

Studio test materials are available here.

The following diagram illustrates how different reflections originate from different parts of the source, in a simple room shape.

room.jpg


Sound field synthesis of complex sources

A general source with directivity can be represented by a spherical harmonic encoding. In order to render the source with high order sound field reconstruction, the free-field expansion of the source field is required about any point, even in the near-field of the source. A solution is found, providing the natural extension to 'O-format' considered in earlier work.
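The two expansions involved can be written in standard textbook notation as follows. The coefficient names C_nm and A_nm and the primed listener coordinates are assumptions made here for illustration, and the kind of spherical Hankel function depends on the time convention adopted.

```latex
% Outgoing field of a directive source at O, valid outside the source,
% encoded by spherical harmonic coefficients C_{nm} up to order N:
p(r,\theta,\phi) = \sum_{n=0}^{N} \sum_{m=-n}^{n}
    C_{nm}\, h_n(kr)\, Y_n^m(\theta,\phi)

% Free-field expansion of the same field about a listener point B,
% valid in a source-free ball around B, with (r',\theta',\phi') measured from B:
p(r',\theta',\phi') = \sum_{n=0}^{\infty} \sum_{m=-n}^{n}
    A_{nm}\, j_n(kr')\, Y_n^m(\theta',\phi')
```

Here h_n is a spherical Hankel function and j_n a spherical Bessel function. Rendering the source amounts to obtaining the A_nm, which depend linearly on the C_nm, for the chosen listener point.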

Menzies, D. and Al-Akaidi, M. 'Ambisonic Synthesis of Complex Sources', J. Audio Eng. Soc, October 2007

The images below show the direct field of a complex source at O, and part of this field accurately resynthesized at B around the listener, using a high-order Ambisonic encoding derived from the source.


Source location and effects in 1st order Ambisonics

1st order Ambisonics is an early sound field encoding and reconstruction system that works efficiently on low numbers of speakers, and was originally designed to address deficiencies in quad systems. A suite of source rendering and processing methods was developed to address the lack of digital tools. These include through-centre panning, using a technique called W-panning to simulate extended object width with independently controlled gain. A frequency spreading option was also added. A sound field feedback network was designed for creating spatially accurate echoes and reflections, using rotation in the feedback loop to spread reflections in different directions.
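As background, the following minimal sketch shows conventional 1st order B-format encoding of a point source, together with a rotation about the vertical axis applied repeatedly in a loop. It assumes the traditional weighting in which W carries the signal scaled by 1/sqrt(2), with azimuth measured anticlockwise from the front. The function names are hypothetical, delays are omitted, and it does not implement W-panning or the actual feedback network described above.

```python
import math

def encode_b_format(sample, azimuth, elevation=0.0):
    """Encode one mono sample as a (W, X, Y, Z) tuple; angles in radians.
    Assumes the traditional convention with W scaled by 1/sqrt(2)."""
    w = sample / math.sqrt(2.0)
    x = sample * math.cos(azimuth) * math.cos(elevation)
    y = sample * math.sin(azimuth) * math.cos(elevation)
    z = sample * math.sin(elevation)
    return (w, x, y, z)

def rotate_z(b, angle):
    """Rotate a B-format sample about the vertical axis; W and Z are unchanged."""
    w, x, y, z = b
    c, s = math.cos(angle), math.sin(angle)
    return (w, c * x - s * y, s * x + c * y, z)

def feedback_echoes(b, angle, gain, count):
    """Illustrative loop: each pass attenuates the sound field and rotates it,
    so successive echoes arrive from different directions."""
    echoes = []
    for _ in range(count):
        b = tuple(gain * channel for channel in rotate_z(b, angle))
        echoes.append(b)
    return echoes

if __name__ == "__main__":
    front = encode_b_format(1.0, azimuth=0.0)
    for echo in feedback_echoes(front, angle=math.pi / 2, gain=0.5, count=3):
        print(echo)
```

Because rotation acts on the encoded sound field rather than on individual sources, one rotation in the loop redirects every source in the mix at once.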

objectModel.jpg

O-format is a 1st order encoding of source directionality, in a sense the reverse of the 1st order Ambisonics encoding B-format. A simple approximation was found for rendering O-format sources in terms of the dominance operation.

oformat.jpg


Menzies, D. 'W-panning and O-format, Tools for object spatialization', AES 22nd International Conference on Virtual, Synthetic and Entertainment Audio, June 2002.
Menzies, D. 'New Performance Instruments for Electroacoustic Music', PhD thesis, University of York Electronics Dept, 1998. (British Library 1999)

A real-time application, LAmb, was built for Silicon Graphics machines, incorporating these sound processes with graphical controls and additional features for object control using an external MIDI keyboard, and a recorder. It was designed for live diffusion as well as studio mixing.

lamb.tar.gz

LAmb tutorial (pdf)

Menzies, D. 'LAmb, an Introduction and Tutorial', Bourges Synthese, June 1997.


lamb.jpg