What is a “virtual” loudspeaker? Part 3

#91.3 in a series of articles about the technology behind Bang & Olufsen

In Part 1 of this series, I talked about how a binaural audio signal can (hypothetically, with HRTFs that match your personal ones) be used to simulate the sound of a source (like a loudspeaker, for example) in space. However, for this to work, you have to make sure that the left and right ears get completely isolated signals (using earphones, for example).

In Part 2, I showed how, with enough processing power, a large amount of luck (using HRTFs that match your personal ones PLUS the promise that you’re in exactly the correct location), and a room that has no walls, floor or ceiling, you can get a pair of loudspeakers to behave like a pair of headphones using crosstalk cancellation.

There’s not much left to do to create a virtual loudspeaker. All we need to do is to:

  • Take the signal that should be sent to a right surround loudspeaker (for example) and filter it using the HRTFs that correspond to a sound source in the location that this loudspeaker would be in. REMEMBER that this signal has to get to your two ears since you would have used your two ears to hear an actual loudspeaker in that location.
  • Send those two signals through a crosstalk cancellation processing system that causes your two loudspeakers to behave more like a pair of headphones.
Figure 1: A block diagram of the system described above.

One nice thing about this system is that the crosstalk cancellation is only there to ensure that the actual loudspeakers behave more like headphones. So, if you want to create more virtual channels, you don’t need to duplicate the crosstalk cancellation processor. You only need to create the binaurally-processed versions of each input signal and mix those together before sending the total result to the crosstalk cancellation processor, as shown below.

Figure 2: You only need one crosstalk cancellation system for any number of virtual channels.

This is good because it saves on processing power.
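To make the structure in Figures 1 and 2 a bit more concrete, here's a minimal sketch of that signal flow in Python. Everything in it is hypothetical: the HRIRs are just random placeholders, and the crosstalk_cancel() stage is only a stand-in for the processing described in Part 2.

```python
import numpy as np
from scipy.signal import fftconvolve

FS = 48000  # sampling rate in Hz (assumed)

def binauralise(signal, hrir_to_left_ear, hrir_to_right_ear):
    """Filter one virtual-loudspeaker signal with the HRIR pair for that
    loudspeaker's position, giving a 2-channel (left ear, right ear) result."""
    return np.stack([fftconvolve(signal, hrir_to_left_ear),
                     fftconvolve(signal, hrir_to_right_ear)])

def crosstalk_cancel(binaural_mix):
    """Placeholder for the crosstalk cancellation described in Part 2: it would
    turn the 2-channel binaural mix into the two signals that are actually sent
    to the real Left and Right loudspeakers."""
    return binaural_mix  # not implemented in this sketch

# Hypothetical inputs: one signal and one HRIR pair per virtual loudspeaker.
virtual_channels = {"Left Surround": np.random.randn(FS),
                    "Right Surround": np.random.randn(FS)}
hrirs = {name: (np.random.randn(256), np.random.randn(256))
         for name in virtual_channels}

# Binaurally process each virtual channel and mix the results together...
binaural_mix = sum(binauralise(sig, *hrirs[name])
                   for name, sig in virtual_channels.items())

# ...then send the total through ONE crosstalk cancellation stage.
speaker_feeds = crosstalk_cancel(binaural_mix)
```

The point is only the shape of the processing: the binaural filtering happens once per virtual channel, but the crosstalk cancellation happens once in total, on the mixed result.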

So, there are some important things to realise after having read this series:

  • All “virtual” loudspeakers’ signals are actually produced by the left and right loudspeakers in the system. In the case of the Beosound Theatre, these are the Left and Right Front-firing outputs.
  • Any single virtual loudspeaker (for example, the Left Surround) requires BOTH output channels to produce sound.
  • If the delays (aka Speaker Distance) and gains (aka Speaker Levels) of the REAL outputs are incorrect at the listening position, then the crosstalk cancellation will not work and the virtual loudspeaker simulation system won’t work. How badly it fails depends on how wrong the delays and gains are.
  • The virtual loudspeaker effect will be experienced differently by different persons because it depends on how closely your actual personal HRTFs match those assumed by the processor. So, don’t get into fights with your friends on the sofa about where you hear the helicopter…
  • The listening room’s acoustical behaviour will also have an effect on the crosstalk cancellation. For example, strong early reflections will “infect” the signals at the listening position and may/will cause the cancellation to not work as well. So, the results will vary not only with changes in rooms but also speaker locations.

Finally, it’s worth noting that, in the specific case of the Beosound Theatre, by setting the Speaker Distances and Speaker Levels for the Left and Right Front-firing outputs for your listening position, you have automatically calibrated the virtual outputs. This is because the Speaker Distances and Speaker Levels are compensations for the ACTUAL outputs of the system, which are the ones producing the signals that simulate the virtual loudspeakers. This is the reason why the four virtual loudspeakers do not have individual Speaker Distances and Speaker Levels. If they did, they would have to be identical to the Left and Right Front-firing outputs’ values.

What is a “virtual” loudspeaker? Part 2

#91.2 in a series of articles about the technology behind Bang & Olufsen

In Part 1, I talked about how a binaural recording is made, and I also mentioned that the spatial effects may or may not work well for you for a number of different reasons.

Let’s go back to the free field with a single “perfect” microphone to measure what’s happening, but this time, we’ll send sound out of two identical “perfect” loudspeakers. The distances from the loudspeakers to the microphone are identical. The only difference in this hypothetical world is that the two loudspeakers are in different positions (measured as a rotational angle) as shown in Figure 1.

Figure 1: Two identical, “perfect” loudspeakers in a free field with a single “perfect” microphone.

In this example, because everything is perfect, and the space is a free field, the output of the microphone will be the sum of the outputs of the two loudspeakers. (In the same way that if your dog and your cat are both asking for dinner simultaneously, you’ll hear dog+cat and have to decide which is more annoying and therefore gets fed first…)

Figure 2: The output from the microphone is the sum of the outputs from the two loudspeakers. At any moment in time, the value of the top plot + the value of the middle plot = the value of the bottom plot.

IF the system is perfect as I described above, then we can play some tricks that could be useful. For example, since the output of the microphone is the sum of the outputs of the two loudspeakers, what happens if the output of one loudspeaker is identical to the other loudspeaker, but reversed in polarity?

Figure 3: If the output of Loudspeaker 1 is exactly the same as the output of Loudspeaker 2 except for polarity, then the sum (the output of the microphone) is always 0.

In this example, we’re manipulating the signals so that, when they add together, you get nothing at the output. This is because, at any moment in time, the value of Loudspeaker 2’s output is the value of Loudspeaker 1’s output * -1. So, in other words, we’re just subtracting the signal from itself at the microphone and we get something called “perfect cancellation” because the two signals cancel each other at all times.

Of course, if anything changes, then this perfect cancellation won’t work. For example, if one of the loudspeakers moves a little farther away than the other, then the system is broken, as shown below.

Figure 4: A small shift in time in the output of Loudspeaker 2 causes the cancellation to stop working so well.

Again, everything that I’ve said above only works when everything is perfect, and the loudspeakers and the microphone are in a free field; so there are no reflections coming in and ruining everything.
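If you'd like to see the arithmetic behind Figures 2 to 4 for yourself, here's a tiny numerical example (the 1 kHz tone and the one-sample shift are just things I picked to keep the numbers simple):

```python
import numpy as np

fs = 48000                                    # sampling rate in Hz
t = np.arange(fs) / fs
loudspeaker_1 = np.sin(2 * np.pi * 1000 * t)  # a 1 kHz tone

# Figure 3: Loudspeaker 2 is the same signal with the polarity reversed,
# so the sum at the microphone is exactly zero at every moment in time.
loudspeaker_2 = -loudspeaker_1
print(np.max(np.abs(loudspeaker_1 + loudspeaker_2)))   # 0.0

# Figure 4: shift Loudspeaker 2 by one sample (about 7 mm of extra distance
# at the speed of sound) and the cancellation falls apart.
loudspeaker_2_late = -np.roll(loudspeaker_1, 1)
residual = loudspeaker_1 + loudspeaker_2_late
print(20 * np.log10(np.max(np.abs(residual))))          # roughly -18 dB
```

In other words, in this example a time error of roughly 20 microseconds already leaves a residual only about 18 dB below the original tone, which is a long way from silence.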

We can now combine these two concepts:

  1. using binaural signals to simulate a sound source in a location (although this would normally be done using playback over earphones to keep it simple) and
  2. using signals from loudspeakers to cancel each other at some location in space

to create a system for making virtual loudspeakers.

Let’s suspend our adherence to reality and continue with this hypothetical world where everything works as we want… We’ll replace the microphone with a person and consider what happens. To start, let’s just think about the output of the left loudspeaker.

Figure 5: The output of the left loudspeaker reaches both ears with different time/frequency characteristics caused by the HRTF associated with that sound source location.

If we plot the impulse responses at the two ears (the “click” sound from the loudspeaker after it’s been modified by the HRTFs for that loudspeaker location), they’ll look like this:

Figure 6: The impulse responses of the HRTFs for a sound source at 30º left of centre.

What if we were able to send a signal out of the right loudspeaker so that it cancels the signal from the left loudspeaker at the location of the right eardrum?

Figure 7: What if we could cancel the signal from the left loudspeaker at the right ear using the right loudspeaker?

Unfortunately, this is not quite as easy as it sounds, since the HRTF of the right loudspeaker at the right ear is also in the picture, so we have to be a bit clever about this.

So, in order for this to work we:

  • Send a signal out of the left loudspeaker.
    We know that this will get to the right eardrum after it’s been messed up by the HRTF. This is what we want to cancel…
  • …so we take that same signal, and
    • filter it with the inverse of the HRTF of the right loudspeaker
      (to undo the effects of the HRTF of the right loudspeaker’s signal at the right ear)
    • filter that with the HRTF of the left loudspeaker at the right ear
      (to match the filtering that’s done by your head and pinna)
    • multiply by -1
      (so that it will cancel when everything comes together at your right eardrum)
    • and send it out the right loudspeaker.

Hypothetically, that signal (from the right loudspeaker) will reach your right eardrum at the same time as the unprocessed signal from the left loudspeaker and the two will cancel each other, just like the simple example shown in Figure 3. This effect is called crosstalk cancellation, because we use the signal from one loudspeaker to cancel the sound from the other loudspeaker that crosses to the wrong side of your head.

This then means that we have started to build a system where the output of the left loudspeaker is heard ONLY in your left ear. Of course, it’s not perfect because that cancellation signal that I sent out of the right loudspeaker gets to the left ear a little later, so we have to cancel the cancellation signal using the left loudspeaker, and back and forth forever.

If, at the same time, we’re doing the same thing for the other channel, then we’ve built a system where you have the left loudspeaker’s signal in the left ear and the right loudspeaker’s signal in the right ear; just like a pair of headphones!
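Here's a minimal frequency-domain sketch of that system. The four impulse responses are random placeholders standing in for measured HRIRs, and the regularisation constant is just a number I picked; the point is the structure, not the values. Doing the job as a 2×2 matrix inversion per frequency bin deals with the "cancel the cancellation, forever" loop in a single step.

```python
import numpy as np

N = 1024                      # FFT length (assumed)
rng = np.random.default_rng(0)

# Placeholder HRIRs: h_XY = impulse response from loudspeaker X to ear Y.
h_LL, h_LR = rng.standard_normal(256), rng.standard_normal(256)
h_RL, h_RR = rng.standard_normal(256), rng.standard_normal(256)

# The HRTFs arranged as a 2x2 matrix per frequency bin
# (rows = ears, columns = loudspeakers).
H = np.array([[np.fft.rfft(h_LL, N), np.fft.rfft(h_RL, N)],
              [np.fft.rfft(h_LR, N), np.fft.rfft(h_RR, N)]])

# The crosstalk canceller is (a regularised version of) H inverted, so that
# H @ C is approximately the identity: the left input ends up only at the
# left eardrum, and the right input only at the right eardrum.
beta = 1e-3                   # regularisation, to avoid dividing by ~0 (assumed)
C = np.empty_like(H)
for k in range(H.shape[-1]):
    Hk = H[:, :, k]
    C[:, :, k] = np.linalg.inv(Hk.conj().T @ Hk + beta * np.eye(2)) @ Hk.conj().T

# To use it: FFT the two binaural (left ear, right ear) signals, multiply by
# C in every bin, inverse-FFT, and send the two results to the two loudspeakers.
```

The regularisation is also a hint at why this is fragile in practice: at frequencies where the HRTFs have very little energy, a naive inversion asks the loudspeakers for enormous amounts of output.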

However, if you get any of these elements wrong, the system will start to under-perform. For example, if the HRTFs that I use to predict your HRTFs are incorrect, then it won’t work as well. Or, if things aren’t time-aligned correctly (because you moved) then the cancellation won’t work.

on to Part 3

What is a “virtual” loudspeaker? Part 1

#91.1 in a series of articles about the technology behind Bang & Olufsen

Without connecting external loudspeakers, Bang & Olufsen’s Beosound Theatre has a total of 11 independent outputs, each of which can be assigned any Speaker Role (or input channel). Four of these are called “virtual” loudspeakers – but what does this mean? There’s a brief explanation of this concept in the Technical Sound Guide for the Theatre (you’ll find the link at the bottom of this page), which I’ve duplicated in a previous posting. However, let’s dig into this concept a little more deeply.

To begin, let’s put a “perfect” loudspeaker in a free field. This means that it’s in a space that has no surfaces to reflect the sound – so it’s an acoustic field where the sound wave is free to travel outwards forever without hitting anything (or at least it appears that this is the case). We’ll also put a “perfect” microphone in the same space.

Figure 1: A loudspeaker and a microphone (the circle) in a free field: an infinite space completely free of reflective surfaces.

We then send an impulse; a very short, very loud “click” to the loudspeaker. (Actually a perfect impulse is infinitely short and infinitely loud, but this is not only inadvisable but impossible, and probably illegal.)

Figure 2: The “click” signal that’s sent to the input of the loudspeaker.

That sound radiates outwards through the free field and reaches the microphone which converts the acoustic signal back to an electrical one so we can look at it.

Figure 3: The “click” signal that is received at the microphone’s location and sent out as an electrical signal.

There are three things to notice when you compare Figure 3 to Figure 2:

  • The signal’s level is lower. This is because the microphone is some distance from the loudspeaker.
  • The signal is later. This is because the microphone is some distance from the loudspeaker and sound waves travel pretty slowly.
  • The general shapes of the signals are identical. This is because I said that the loudspeaker and the microphone were both “perfect” and we’re in a space that is completely free of reflections.

What happens if we take away the microphone and put you in the same place instead?

Figure 4: The microphone has been replaced by something more familiar.

If we now send the same click to the loudspeaker and look at the “outputs” of your two eardrums (the signals that are sent to your brain), these will look something like this:

Figure 5: The outputs of your two eardrums with the same “click” signal from the loudspeaker.

These two signals are obviously very different from the one that the microphone “hears” which should not be a surprise: ears aren’t microphones. However, there are some specific things of which we should take note:

  • The output of the left eardrum is lower than that of the right eardrum. This is largely because of an effect called “head shadowing” which is exactly what it sounds like. The sound is quieter in your left ear because your head is in the way.
  • The signal at the right eardrum is earlier than at the left eardrum. This is because the left eardrum is not only farther away, but the sound has to go around your head to get there.
  • The signal at the right eardrum is earlier than the microphone output (in Figure 3) because it’s closer to the loudspeaker. (I put the microphone at the location of the centre of the simulated head.) Similarly, the left ear output is later because it’s farther away.
  • The signal at the right eardrum is full of spikes. This is mostly caused by reflections off the pinna (the flappy thing on the side of your head that you call your “ear”) that arrive at slightly different times, and all add together to make a mess.
  • The signal at the left eardrum is “smoother”. This is because the head itself acts as a filter reducing the levels of the high frequency content, which tends to make things less “spiky”.
  • Both signals last longer in time. This is the effect of the ear canal (the “hole” in the side of your head that you should NOT stick a pencil in) resonating like a little organ pipe.

The difference between the signals in Figures 3 and 5 is a measurement of the effect that your head (including your shoulders, ears/pinnae) has on the transfer of the sound from the loudspeaker to your eardrums. Consequently, we geeks call it a “head-related transfer function” or HRTF. I’ve plotted this HRTF as a measurement of an impulse in time – but I could have converted it to a frequency response instead (which would include the changes in magnitude and phase for different frequencies).
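That conversion, by the way, is just a Fourier transform. Here's a short sketch (the impulse response is a random placeholder standing in for a measured HRIR):

```python
import numpy as np

fs = 48000
hrir = np.random.randn(256)              # placeholder for a measured HRIR

H = np.fft.rfft(hrir)                    # the same HRTF as a frequency response
freqs = np.fft.rfftfreq(len(hrir), d=1/fs)
magnitude_db = 20 * np.log10(np.abs(H))  # change in level at each frequency
phase_rad = np.unwrap(np.angle(H))       # change in phase at each frequency
```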

Here’s the cool thing: If I put a pair of headphones on you and played those two signals in Figure 5 to your two ears, you might be able to convince yourself that you hear the click coming from the same place as where that loudspeaker is located.

Although this sounds magical, don’t get too excited right away. Unfortunately, as with most things in life, reality tends to get in the way for a number of reasons:

  • Your head and ears aren’t the same shape as anyone else’s. Your brain has lived with your head and your ears for a long time, and it’s learned to correlate your HRTFs with the locations of sound sources. If I suddenly feed you a signal that uses my HRTFs, then this trick may or may not work, depending on how similar we are. This is just like borrowing someone else’s glasses. If you have roughly the same prescription, then you can see. However, if the prescriptions are very different, you’ll get a headache very quickly.
  • In reality, you’re always moving. So, even if the sound source is not moving, the specific details of the HRTFs are always changing (because the relative positions and angles to your ears are changing) but my system doesn’t know about this – so I’m simulating a system where the loudspeaker moves around you as you rotate your head. Since this never happens in real life, it tends to break the simulation.
  • The stuff I showed above doesn’t include reflections, which is how you determine distance to sources. If I wanted to include reflections, each reflection would have to have its own HRTF processing, depending on its angle relative to your head.

However, hypothetically, this can work, and lots of people have tried. The easiest way to do this is to not bother measuring anything. You just take a “dummy head” -a thing that is the same size as an average human head (maybe with an average torso) and average pinnae* – but with microphones where the eardrums are – and you plunk it down in a seat in a concert hall and record the outputs of the two “ears”. You then listen to this over earphones (we don’t use headphones because we want to remove your pinnae from the equation) and you get a “you are there” experience (assuming that the dummy head’s dimensions and shape are about the same as yours). This is what’s known as a binaural recording because it’s a recording that’s done with two ears (instead of two or more “simple” microphones).

If you want to experience this for yourself, plug a pair of headphones into your computer and do a search for the “Virtual Barber Shop” video. However, if you find that it doesn’t work for you, don’t be upset. It just means that you’re different: just like everyone else.* Typically, recordings like this have a strange effect of things sounding very close in the front, and farther away as sources go to the sides. (Personally, I typically don’t hear anything in the front. All of the sources sound like they’re sitting on the back of my neck and shoulders. This might be because I have a fat head (yes, yes… I know…) and small pinnae (yes, yes…. I know…) – or it might indicate some inherent paranoia of which I am not conscious.)

* Of course, depressingly typically, it goes without saying that the sizes and shapes of commercially-available dummy heads are based on averages of measurements of men only. Neither women nor children are interested in binaural recordings or have any relevance to such things, apparently…

on to Part 2

Beosound Theatre: Virtual loudspeakers

#90 in a series of articles about the technology behind Bang & Olufsen

Devices such as the ‘stereoscope’ for representing photographs (and films) in three dimensions have been around since the 1850s. These work by presenting two different photographs with slightly different perspectives to the two eyes. If the differences in the photographs are the same as the differences your eyes would have seen had you ‘been there’, then your brain interprets them as a 3D image.

A similar trick can be done with sound sources. If two different sounds that exactly match the signals that you would have heard had you ‘been there’ are presented at your two ears (using a binaural recording), then your brain will interpret the signals and give you the auditory impression of a sound source in some position in space. The easiest way to do this is to ensure that the signals arriving at your ears are completely independent using headphones.

The problem with attempting this with loudspeaker reproduction is that there is ‘crosstalk’ or ‘bleeding of the signals to the opposite ears’. For example, the sound from a correctly-positioned Left Front loudspeaker can be heard by your left ear and your right ear (slightly later, and with a different response). This interference destroys the spatial illusion that is encoded in the two audio channels of a binaural recording.

However, it might be possible to overcome this issue with some careful processing and assumptions. For example, if the exact locations of the left and right loudspeakers and your left and right ears are known by the system, then it’s (hypothetically) possible to produce a signal from the right loudspeaker that cancels the sound of the left loudspeaker in the right ear, and therefore you only hear the left channel in the left ear. (Of course, the cancelling signal of the right loudspeaker also bleeds to the left ear, so the left loudspeaker has to be used to cancel the cancellation signal of the right loudspeaker in the left ear, and so on…)

Using this ‘crosstalk cancellation’ processing, it becomes (hypothetically) possible to make a pair of loudspeakers behave more like a pair of headphones, with only the left channel in the left ear and the right in the right. Therefore, if this system is combined with the binaural recording / reproduction system, then it becomes (hypothetically) possible to give a listener the impression of a sound source placed at any location in space, regardless of the actual location of the loudspeakers.

Theory vs. Reality

It’s been said that the difference between theory and practice is that, in theory, there is no difference between theory and practice, whereas in practice, there is. This is certainly true both of binaural recordings (or processing) and crosstalk cancellation.

In the case of binaural processing, in order to produce a convincing simulation of a sound source in a position around the listener, the simulation of the acoustical characteristics of a particular listener’s head, torso, and (most importantly) pinnae (a.k.a. ‘ears’) must be both accurate and precise. (For the same reason that someone else should not try to wear my glasses.)

Similarly, a crosstalk cancellation system must also have accurate and precise ‘knowledge’ of the listener’s physical characteristics in order to cancel the signals correctly; but this information also crucially includes the exact locations of the loudspeakers and the listener (we’ll conveniently pretend that the room you’re sitting in does not exist).

In the end, this means that a system with adequate processing power can use two loudspeakers to simulate a ‘virtual’ loudspeaker in another location. However, the details of that spatial effect will be slightly different from person to person (because we’re all shaped differently). Also, more importantly, the effect will only be experienced by a listener who is positioned correctly in front of the loudspeakers. Slight movements (especially from side-to-side, which destroys the symmetrical time-of-arrival matching of the two incoming signals) will cause the illusion to collapse.

Beosound Theatre gives you the option to choose Virtual Loudspeakers that appear to be located in four different positions: Left and Right Wide, and Left and Right Elevated. These signals are actually produced using the Left and Right front-firing outputs of the device using this combination of binaural processing and crosstalk cancellation in the Dolby Atmos processing system. If you are a single listener in the correct position (with the Speaker Distances and Speaker Levels adjusted correctly) then the Virtual outputs come very close to producing the illusion of correctly-located Surround and Front Height loudspeakers.

However, in cases where there is more than one listener, or where a single listener may be incorrectly located, it may be preferable to use the ‘side-firing’ and ‘up-firing’ outputs instead.

Keep your needle clean

One of my jobs at Bang & Olufsen is to do the final measurements on each bespoke Beogram 4000c turntable before it’s sent to the customer. Those measurements include checking the end-to-end magnitude response, playing from a vinyl record with a sine sweep on it (one per channel), recording that from the turntable’s line-level output, and analysing it to make sure that it’s as expected. Part of that analysis is to verify that the magnitude responses of the left and right channel outputs are the same (or, same enough… it’s analogue, a world where nothing is perfect…)

Today, I was surprised to see this result on a turntable that was being inspected part-way through its restoration process:

Taken at face value, this should have resulted in a rejection – or at least some very serious questions. This is a terrible result, with unacceptable differences in output level between the two channels. When I looked at the raw measurements, I could easily see that the left channel was behaving – it was the right channel that was all over the place.

The black curve looks very much like what I would expect to see. This is the result of playing a track that is a sine sweep from 20 Hz to 20 kHz, where the signal below 1 kHz follows the RIAA curve, whereas the signal above 1 kHz does not. This is why, after it’s been filtered using a RIAA preamp, the low frequency portion has a flat response, but the upper frequency band rolls off (following the RIAA curve).
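For reference, that roll-off is the standard RIAA playback (de-emphasis) curve, which is defined by three time constants: 3180 µs, 318 µs, and 75 µs. Here's a small sketch that computes it, normalised to 0 dB at 1 kHz; this is just the textbook curve, not the analysis script we actually use:

```python
import numpy as np

def riaa_playback_db(f):
    """Magnitude of the standard RIAA playback (de-emphasis) curve in dB,
    normalised to 0 dB at 1 kHz."""
    t1, t2, t3 = 3180e-6, 318e-6, 75e-6          # RIAA time constants in seconds
    def mag(freq):
        s = 2j * np.pi * freq
        return np.abs((1 + s * t2) / ((1 + s * t1) * (1 + s * t3)))
    return 20 * np.log10(mag(f) / mag(1000.0))

for f in (20.0, 100.0, 1000.0, 10000.0, 20000.0):
    print(f"{f:7.0f} Hz: {riaa_playback_db(f):+6.1f} dB")
```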

Notice that the right channel (the red curve) is a mess…

A quick inspection revealed what might have been the problem: a small ball of fluff collected around the stylus. (This was a pickup that was being used to verify that the turntable was behaving through the restoration – not the one intended for the final customer – and so had been used multiple times on multiple turntables.)

So, we used a stylus brush to clean off the fluff and ran the measurement again. The result immediately afterwards looked like this:

which is more like it! A left-right channel difference of something like ± 0.5 dB is perfectly acceptable.

The moral of the story: keep your pickup clean. But do it carefully! That cantilever is not difficult to snap.

WUSC / EUMC

I was listening to an episode of People Fixing the World on the BBC today and learned for the first time about The Student Refugee Program. Turns out that, since 1978, it has provided placements for refugees to come to Canada and study in universities across the country. It’s primarily funded by students as a small surcharge built into the tuition fees, so it’s not contingent on governmental budgets and, therefore, not on passing trends in attitudes about immigration.

Listening to the interviews with participants and the organisers made me proud to be Canadian…

More information at wusc.ca.

How clever are you?

N.B. I updated this page on 2023 04 05 based on new information from our suppliers…

We have two cars. One is a fully-electric car, and the other is a diesel.

Originally, the plan we had with our electricity supplier for the electric car was a flat fee per month, and an “all you can eat” plan. This made the choice of which car to drive a no-brainer: take the electric car whenever possible.

However, due to the rising price of energy, our supplier is changing their plan to a new pricing structure. The new price will be

799 DKK per month flat fee + kWh * (average electrical price – 0.89)

The reasoning behind this pricing is explained on their website – I won’t bother getting into that.

Note that they define the “average electrical price” as the average monthly price for both DK1 and DK2 (Denmark is split into two regions for electricity prices). The calculation is done on a charge-by-charge basis, where the month that’s chosen for the calculation is the month when you unplug the cable at the end of charging your car.

Our problem is that it made the decision of which car to drive (looking at it from a purely economic point of view) complicated. If we park the electric car, it still costs us 799 DKK / month + the price of diesel in the other car. On the other hand, if we drive the electric car, it costs us something that’s difficult to calculate when you’re heading out to the car in the morning with only one cup of coffee in you…

One thing that makes it even more complicated is the fact that, if we charge the electric car at home, we first pay our normal electricity supplier for the power we used, and we then get reimbursed by the electricity supplier for the electric car by some amount per kWh.

The way the electricity supplier for the electric car calculates this reimbursement is also complicated: They use the average monthly electricity price between 11:00 p.m. and 6:00 a.m. including charges. That number changes but it’s currently defaulting to 1.33 DKK / kWh on this page – look for the “Tilbagebetalingssats” amount in the sidebar on the right called “Tilbagebetaling”. (Note that this value is difficult if not impossible to determine using the NordPool information. The webpage linked above calculates it from the “forventet indkøbspris” that you can change yourself on their calculator.)

It turned out that figuring out this problem was the most interesting math that I did this week. I ran the calculations first in Matlab, and then duplicated them in Excel (for compatibility’s sake) to find out how to deal with this.

The variables are:

  • Electrical supplier for the electric car:
    • Flat monthly rate for our subscription
    • The amount that they subtract from the average Danish price, per kWh for charging the car (currently 0.89 DKK)
    • The amount that they pay us back to cover a portion of the electrical costs when we charge the car at home
  • The price we pay for electricity for the house
  • Average electricity price in DKK / MWh
    (available from this page. Select the DK1 and DK2 prices for the month of interest. The Excel spreadsheet finds the average of those two values, and adds 25% tax; the result is shown at the bottom in cell B17 in DKK/kWh)
  • Fossil fuel Price in DKK/litre (in my case, that’s diesel)
  • Consumption of the two cars
    • Average consumption of the electric car in kWh/100 km
    • Average consumption of the fossil-fueled car in litres/100 km
  • Total number of km driven per month

The result is two plots:

  • The one on the left shows the price of driving each car individually, based on the total number of km driven in the month, as a function of how many of those km are driven in the electric car.
    • The green line shows the cost of driving the electric car if we charge it at a station away from home
    • The red line shows the cost of driving the electric car if we charge it at home
    • The black line shows the cost of driving the fossil-fuel car
  • The one on the right shows our total price, as a function of how many of the total number of km driven are driven in the electric car.

So, as you can see in the plots above, at the current prices, and using the average consumption values for our two cars, the more we drive the electric car, the more money we save, and we’ll save a lot more money if we don’t charge at home.

Looking at the plot on the right, if we park the electric car (0 km on the X-axis) we’ll spend about 2700 DKK per month. If we only drive the electric car (2000 km on the X-axis) and charge away from home at charging stations, then we’ll spend less than 1000 DKK (green line on the right-hand plot). Quite a savings! If we charge at home, we’ll spend about 2200 DKK (red line on the right-hand plot) – still cheaper than the diesel, but more than double the price of NOT charging at home.

In case you are in the same position as we are, and the little Excel calculator I made might be useful, you can download it here. However, I make no promises about its reliability. Don’t send me an email because I screwed up the math – fix it yourself. :-)

2023 05 19 update: We switched to “spot pricing” for the house electricity. So, this calculation has become dependent on the time of day when we charge the car. As a result, I’ve given up trying to understand it…