The synthesis of spatial
audio environments can be approached in several ways. In a purely
physical approach we aim to recreate as accurately as possible the
sound in the ear canals, the binaural signals, that would result from
real sources. This can be done either by recreating the sound field in
a region around the head, allowing freedom of head movement, or just at
the ear canals, in which case movement is restricted or requires active
tracking. In the first case delivery is via loudspeakers (e.g. High
Order Ambisonics or Wavefield Synthesis), whereas in the second case
delivery can be via loudspeakers (e.g. transaural
stereo) or via headphones (binaural synthesis). Psychoacoustic methods
allow the binaural signals to be simplified by focusing on
aspects that are most relevant for spatial perception.
The research summarized here has investigated different aspects of these approaches,
including source synthesis and effects in low resolution sound field
synthesis, complex source synthesis in high resolution systems and
studio environments, and binaural synthesis of near sources using
virtual sound field methods. The following sections link to more information.
Quasi Wavefield Synthesis (QWFS)
Horizontal reproduction with point sources is not a well conditioned problem. Distributed constraints can be used to find optimal compromise solutions that are much improved when compared with standard 2.5D WFS, particularly for plane waves. These can then be used to modify WFS driving functions, without any increase in computational cost. The resulting solutions are surprisingly close to the optimal ones, and therefore worthy of attention.
Menzies, D. 'Quasi Wave Field Synthesis : Efficient Driving Functions for Improved 2.5D Sound Field Reproduction', AES 52nd International Conference, University of Surrey September 2013. More in slides
Distributed Modal Constraints : Sound field control with general boundaries
The problem of controlling the interior field of a given continuous
boundary of monopole drivers is solved formally using the simple source method,
which involves the exterior Helmholtz problem, equivalent to a
scattering problem. A more practical and general method was sought,
that solves directly for a discrete boundary and can specify a
sub-region of the interior as target.
Such a method has been developed by extending the Ambisonic decoding process
from controlling a single region to multiple overlapping regions
simultaneously. This allows the interior of arbitrary boundaries, or
arbitrary sweet spots, to be
controlled completely. This provides a much more flexible tool than
standard HOA or wavefield approaches.
field Control with Distributed Modal Constraints', Acta Acustica
united with Acustica, to appear.
Speakers on an L-shaped boundary, filling out the interior with a plane wave. The bottom plot shows the absolute relative error.
on a dome with a squashed, raised sweet area. Waves from two directions
shown. The energy is minimised and accuracy is focused on the desired
four regions independently. This becomes increasingly difficult as the
regions become closer and more opposed:
Interior point sources can be represented in exotic ways, over continuous regions that can extend even over 180 degrees from the source, and at surrounding islands:
Near-field binaural synthesis
An area where binaural systems are still developing is the provision for source distance variation up to the near-field. If this can be achieved accurately and practically, it would enable a variety of applications involving objects that are within manual interaction distance, i.e. arm's length.
One approach to this is to derive near HRTFs from distant HRTFs using
physical principles. Although promising, there are a number of
complications. By varying the construction of the virtual sound field
used it is possible to improve the HRTFs calculated. This line of
research also opens up more generally the question of representing
point sources with freefields, and has implications for real sound
construction, for example using Wavefield methods.
Point Source Representations', Ambisonic Symposium, IEM Graz, June
The following diagrams show the approximation error of a source located at the origin represented by a focused source that has been further re-expanded using a Fourier Bessel expansion with maximum order N. This allows far-field accuracy to be traded for near-field accuracy.
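For orientation, the general shape of an order-limited Fourier-Bessel expansion about the origin is written out below. This is the textbook interior form only; the coefficients $A_{nm}$ are placeholders and are not the specific focused-source coefficients used in the work described above.

```latex
p(r,\theta,\phi) \;\approx\; \sum_{n=0}^{N} \sum_{m=-n}^{n} A_{nm}\, j_n(kr)\, Y_n^m(\theta,\phi)
```

Here $j_n$ is the spherical Bessel function, $Y_n^m$ a spherical harmonic and $k$ the wavenumber. As a commonly quoted rule of thumb, a truncation at order $N$ stays accurate roughly where $kr \lesssim N$, which is why the choice of $N$ and of expansion centre shifts accuracy between regions.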
Production with complex sources in the studio
In the studio environment, sources are frequently recorded with a single
microphone, thus losing directional source information. Even when
multiple microphones are used, the information is treated in a way that
does not reproduce the directive qualities of the object fully. The
direct signal from a source is usually narrowly spatially confined. On
the other hand the reverberant signal comes from all directions. The
pattern of reverberation is dependent on the direction the sound leaves
the source. Since many sources have complex patterns of directivity
that change rapidly with time, the resulting reverberant field changes
too. A simple method is found to approximate features of this
field using multiple microphone recordings processed with
reverberators, resulting in a stereo image that appears natural like a
stereo crossed-pair recording of the source in a real room. A more
compact parametric representation of the complex source is also
Representation of Complex Sources in Reflective Environments’,
Proc. AES 128th International Convention, Paris, May 2010.
Studio test materials are available here.
The following diagram illustrates how different reflections originate from
different parts of the source, in a simple room shape.
Sound field synthesis of complex sources
A general source with directivity can be represented by a spherical harmonic encoding. In order to render the source with high order sound field reconstruction, the free-field expansion of the source field is required about any point, even in the near-field of the source. A solution is found, providing the natural extension to 'O-format' considered in earlier work.
Menzies, D. and Al-Akaidi, M. 'Ambisonic Synthesis of Complex Sources', J. Audio Eng. Soc, October 2007
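As a reminder of what a spherical harmonic encoding of a directive source looks like, the standard exterior (radiating) expansion is sketched below. The coefficients $B_{nm}$ are placeholder symbols for the source encoding and are not taken from the cited paper.

```latex
p(r,\theta,\phi) \;=\; \sum_{n=0}^{\infty} \sum_{m=-n}^{n} B_{nm}\, h_n(kr)\, Y_n^m(\theta,\phi)
```

Here $h_n$ is the spherical Hankel function of the outgoing kind and $(r,\theta,\phi)$ are measured from the source centre. Rendering the source then amounts to re-expanding this field in regular (free-field) form about the listener position, which is the step the paragraph above refers to.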
The images below show the direct field of a complex source at O, and part of this field accurately resynthesized at B around the listener, using a high-order Ambisonic encoding derived from the source.
Source location and effects in 1st order Ambisonics
1st order Ambisonics is an early sound field encoding and reconstruction system that works efficiently on low numbers of speakers, and was originally designed to address deficiencies in quad systems. A suite of source rendering and processing methods was developed to address the lack of digital tools. These include through-centre panning, using a technique called W-panning to simulate extended object width with independently controlled gain. A frequency spreading option was also added. A sound field feedback network was designed for creating spatially accurate echoes and reflections, using rotation in the feedback loop to spread reflections in different directions.
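The idea of rotating the sound field inside a feedback loop can be illustrated with a minimal sketch. It assumes the traditional B-format convention (W scaled by 1/sqrt(2), X forward, Y left, Z up) and ignores the time delays a real feedback network would have, keeping only the spatial part. The function names are hypothetical and are not taken from LAmb or the cited papers; W-panning itself is not reproduced here.

```python
import math

def encode_b_format(sample, azimuth, elevation=0.0):
    """Encode a mono sample as 1st order B-format (W, X, Y, Z).

    Assumes the traditional convention with W attenuated by 1/sqrt(2);
    angles are in radians, azimuth measured anticlockwise from the front.
    """
    w = sample / math.sqrt(2.0)
    x = sample * math.cos(azimuth) * math.cos(elevation)
    y = sample * math.sin(azimuth) * math.cos(elevation)
    z = sample * math.sin(elevation)
    return (w, x, y, z)

def rotate_z(b, angle):
    """Rotate a B-format frame about the vertical axis by `angle` radians."""
    w, x, y, z = b
    c, s = math.cos(angle), math.sin(angle)
    return (w, c * x - s * y, s * x + c * y, z)

def rotated_echoes(b, angle, gain, count):
    """Spatial part of a feedback loop: each echo is the previous frame
    rotated by `angle` and attenuated by `gain` (delays omitted)."""
    echoes = []
    current = b
    for _ in range(count):
        current = tuple(gain * v for v in rotate_z(current, angle))
        echoes.append(current)
    return echoes

if __name__ == "__main__":
    front = encode_b_format(1.0, 0.0)
    for i, echo in enumerate(rotated_echoes(front, math.pi / 2, 0.5, 3), 1):
        print(i, [round(v, 4) for v in echo])
```

With a 90 degree rotation per pass, a source encoded at the front produces successive echoes from the left, the back and the right, each quieter than the last, which is the spreading behaviour described above.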
O-format is a 1st order encoding of source directionality, in a sense the reverse of the 1st order Ambisonics encoding B-format. A simple approximation was found for rendering O-format sources in terms of the dominance operation.
Menzies, D. 'W-panning and O-format, Tools for object spatialization', AES 22nd International Conference on Virtual, Synthetic and Entertainment Audio, June 2002.
Menzies, D. 'New Performance Instruments for Electroacoustic Music', PhD thesis, University of York Electronics Dept, 1998. (British Library 1999)
A real-time application, LAmb, was built for Silicon Graphics machines, incorporating these sound processes with graphical controls and additional features for object control using an external MIDI keyboard, and a recorder. It was designed for live diffusion as well as studio mixing.
Menzies, D. 'LAmb, an Introduction and Tutorial', Bourges Synthese, June 1997.