Headphone listening and binaural recording

You only need to look around a train carriage or a busy street to see lots of people listening to music on headphones. With the ever-increasing availability of portable music, from MP3 players to smartphones, many of us now have a frequent musical accompaniment to our daily travels, delivered to us via headphones. And yet, despite so many people listening using headphones, music is still almost exclusively mixed for listening using loudspeakers.

If you listen to music over headphones, you get a very different sonic image from the one you get with loudspeakers: with loudspeakers you usually perceive the sounds to be in front of or around you; with headphones they seem to be inside your head. That’s very different to how we normally hear sound, and can become annoying or distracting.

If we know that a listener will be using headphones, surely we can do something to make it sound better?

How we hear

If you close your eyes, you can still determine where noisy things are, as well as something about the room you’re in. You’re able to do this because you have two ears, one on each side of your head. A sound coming from one side of your head will reach the closer ear before the further one, and having your head in the way of the further ear will cause an acoustic ‘shadow’. The shape of your ears also helps to differentiate where sounds come from due to the way the sound reflects within them. Your brain makes use of the way the room and your head and ears affect the sound to create this mental image of your surroundings.
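To put a rough number on the arrival-time difference between the two ears (often called the interaural time difference), here is a minimal sketch using Woodworth's spherical-head approximation. The formula, the head radius of 8.75 cm and the speed of sound of 343 m/s are assumptions brought in for illustration; they are not taken from the recording described in this article, and a real head is not a sphere.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate interaural time difference, in seconds, for a distant source.

    azimuth_deg: 0 is straight ahead, 90 is directly to one side.
    The head is modelled as a rigid sphere (an assumption), so the result
    is only a rough guide to what a real listener experiences.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

# Print the estimated delay for a few source directions.
for azimuth in (0, 30, 60, 90):
    delay_us = woodworth_itd(azimuth) * 1e6
    print(f"{azimuth:2d} degrees: {delay_us:.0f} microseconds")
```

With these assumed values, a source directly to one side arrives at the far ear well under a millisecond after the near ear, which gives a sense of how fine the timing cues are.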

[Figure: interaural time and level differences (ITD and ILD)]

When we mix music for loudspeakers, we make use of how our hearing works to position the sounds between the loudspeakers. We know what changes happen as the sound travels from the loudspeakers to your ears, so we know what to do to correctly position the sounds. However, if we then feed these loudspeaker signals directly to your ears via headphones, the effect of the room and your head is removed, and the sound at the ears is no longer correct. Hence, the sound tends to collapse into your head.
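One common way of positioning a sound between two loudspeakers is amplitude panning, where the same signal is sent to both loudspeakers at different levels. The sketch below uses a sine/cosine (constant-power) pan law. This is an illustrative assumption rather than a description of how any particular mix was made, and the function name and the -1 to +1 position scale are hypothetical.

```python
import math

def constant_power_pan(sample, position):
    """Split one mono sample into (left, right) loudspeaker feeds.

    position: -1.0 is hard left, 0.0 is centre, +1.0 is hard right.
    The gains follow a sine/cosine law, so the squared gains always sum to 1.
    """
    angle = (position + 1.0) * math.pi / 4.0  # maps -1..+1 onto 0..pi/2
    left_gain = math.cos(angle)
    right_gain = math.sin(angle)
    return sample * left_gain, sample * right_gain

# Pan the same sample to three positions.
for position in (-1.0, 0.0, 1.0):
    left, right = constant_power_pan(1.0, position)
    print(f"position {position:+.1f}: left={left:.3f}, right={right:.3f}")
```

Over loudspeakers, both ears hear both feeds, each coloured by the room and the head. Over headphones, each ear hears only its own feed, which is why the image collapses.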

We call a sound that includes all these subtle and complex cues caused by having an ear on each side of a head “binaural” (i.e. two-ear). Wouldn’t it be good if we could modify our existing recordings to create suitable binaural signals for headphone reproduction to allow us to perceive the sounds to be in their natural positions?

Turning loudspeaker sound into binaural sound

The good news is that we can! We can process the signals after they have been mixed for loudspeakers: this is what Focusrite’s VRM Box does. By measuring what happens to the sound between the loudspeakers and your ears, we can design processing algorithms that attempt to recreate it. If we apply this processing before the signals reach your headphones, we can simulate listening to loudspeakers in a room.
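As a rough sketch of the general idea, the path from each loudspeaker to each ear can be described by a measured impulse response, and each loudspeaker feed can be convolved with the matching responses and summed per ear. This is not a description of the VRM Box's actual algorithm. The function names and the tiny impulse responses below are invented for illustration; real measured responses are thousands of samples long.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def mix(a, b):
    """Add two signals sample by sample, zero-padding the shorter one."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def virtual_loudspeakers(left_feed, right_feed, irs):
    """Turn two loudspeaker feeds into two headphone (ear) signals."""
    left_ear = mix(convolve(left_feed, irs["L_to_left_ear"]),
                   convolve(right_feed, irs["R_to_left_ear"]))
    right_ear = mix(convolve(left_feed, irs["L_to_right_ear"]),
                    convolve(right_feed, irs["R_to_right_ear"]))
    return left_ear, right_ear

# Made-up toy responses: the far ear hears each loudspeaker
# two samples later and at half the level.
TOY_IRS = {
    "L_to_left_ear": [1.0, 0.0, 0.0],
    "L_to_right_ear": [0.0, 0.0, 0.5],
    "R_to_left_ear": [0.0, 0.0, 0.5],
    "R_to_right_ear": [1.0, 0.0, 0.0],
}

# A single click sent to the left loudspeaker only.
left_ear, right_ear = virtual_loudspeakers([1.0, 0.0], [0.0, 0.0], TOY_IRS)
print(left_ear)
print(right_ear)
```

In this toy case the click reaches the left ear immediately and the right ear later and quieter, which is the kind of cue that is missing when loudspeaker feeds go straight to headphones.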

[Figure: Focusrite VRM Box]

However, with this method you’re still limited to the kind of sonic images you can create using loudspeakers: for 2-channel stereo, the sounds will only be in front of you. Compare this to real life: if you walk into a concert hall and close your eyes, you’ll be able to hear sounds from all around. The performers will usually be on the stage in front. For some pieces there may be extra performers at the sides or behind you. The reverberation of the hall comes from all around, above and below. And if you’re unlucky, you can hear the person in the row behind eating crisps.

True 3-dimensional sound

So how can we make things even better, and have a truly three-dimensional sonic image? To do this, we need to forget about how we usually record things for listening on loudspeakers, at least for now. If we record the sound using microphones positioned at the ears of a real or artificial head, we inherently capture a binaural recording with all the appropriate cues, which should give a full three-dimensional spatial image. If you recorded it with microphones in your own ears and later listened to this over headphones, you should have a similar audio experience to when you were in the room.

[Figure: Cortex dummy head]

This would work very well if it was recorded with your head, though admittedly it’s more difficult to do this for a wider audience. The main problem is that your head and ears are different from mine, which are different from hers, and so on. These differences mean that something recorded with my head isn’t quite correct for anyone else. By listening all your life, you have effectively learnt the response of your own ears (and in fact research has shown that over the course of a few weeks you can adjust to someone else’s!). However, it means that any binaural reproduction is unlikely to be perfectly accurate for your ears, unless you’ve used your ears for the recording in the first place. But, despite us not having standardised heads and ears, there are plenty of similarities, so you should get some spatial effect.

We made the following recording to show this off. We used an expensive dummy head for it, but you don’t need to – a small microphone in each ear would be a good start, or even a microphone on each side of a football may work. When you’re listening to this recording, it helps to try to imagine the sounds being around you rather than in your head, and it also helps if you have a large enough screen for the picture to match the sonic image.

The video is shot from the listener’s perspective, so with the binaural recording you should be able to hear the positions of the musicians ‘move’ around you as the camera’s viewpoint changes.

To emphasise the difference, the recording starts off in mono – the same sound in each ear. This is the same as a conventional recording in which all the sounds are positioned in the middle, and it should sound as if it is in the centre of your head. At about 1 minute in, the sound switches to binaural – you should hear the sound stage widen, and things should begin to sound outside your head.

We think that binaural reproduction is a massive untapped opportunity. We hope that you enjoy this demo, that it makes you believe we could make better recordings for headphones, and that it inspires you to give it a try.

Video created in collaboration with Focusrite and Tape Club Records.

Credits

  • Featuring Peter and Kerry: www.facebook.com/peterandkerry
  • Additional musicians: Mark Rainbow (Piano), Sarah Bateson (Vibraphone), Maddie Cutter (Cello), Joel Grainger (Violin).
  • Produced by Jon Alexander at the Institute of Sound Recording (University of Surrey): www.tonmeister.co.uk

Recorded using:

  • Cortex Manikin Mk2 Binaural Head and Torso Simulator
  • Focusrite VRM Box
  • Reaper (tracking)
  • Pro Tools 8 (edit and post)
  • Canon 6D

Technical Credits:

  • Jon Alexander: Audio and Video Recording
  • Michael Cousins: Recording Assistant
  • Stephen Savory: Video Post-Production