Excruciating minutiae: Part 3

In Part 2 of this series, I wrote the following sentence:

The easiest (and possibly best) way to do this is to create white noise with a triangular probability density function and a peak amplitude of ±1 quantisation level.

That’s a very busy sentence, so let’s unpack it a little.

Rolling the dice

If you roll one die, you have an equal probability of rolling any number between 1 and 6 (inclusive). Let’s roll one die 100 times counting the number of times we get a 1, or a 2, or a 3, and so on up to 6.

Number rolled	Number of times the number was rolled	Percentage of times the number was rolled
1	17	17
2	14	14
3	15	15
4	15	15
5	21	21
6	18	18

(Note that the percentage of times each number was rolled is the same as the number of times each number was rolled only because I rolled the die 100 times.)

If I plot those results, it looks like Figure 1.

Figure 1. The results of rolling 1 die 100 times.

It may be weird, but I’ve plotted the number of times I rolled -5 or 13 (for example). These are 0 times because it’s impossible to get those numbers by rolling one die. But the reason I put those results in there will make more sense later.

Let’s keep rolling the die. If I do it 1,000,000 times instead of 100, I get these results:

Number rolled	Number of times the number was rolled	Percentage of times the number was rolled
1	166225	16.6225
2	166400	16.6400
3	166930	16.6930
4	167055	16.7055
5	166501	16.6501
6	166889	16.6889

Now, since I rolled many, many more times, it’s more obvious that the six results have an equal probability. The more I roll the die, the closer those percentages get to each other.

Take a look at the shape of the plot above. The area under the line from 1 to 6 (inclusive) is almost a rectangle because the six numbers are all almost the same.

The shape of that plot shows us the probability of rolling the six numbers on the die, so we call it a probability density function or PDF. In this case, we see a rectangular PDF.
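If you'd like to try the die-rolling experiment without a die, here's a minimal sketch in Python. It assumes that the standard library's pseudo-random number generator is a good enough stand-in for a real die, and the function name roll_one_die and the seed are just choices made for this illustration.

```python
import random
from collections import Counter

def roll_one_die(num_rolls, seed=0):
    """Roll a fair six-sided die num_rolls times and count each face."""
    rng = random.Random(seed)  # seeded so that the run is repeatable
    return Counter(rng.randint(1, 6) for _ in range(num_rolls))

num_rolls = 100_000
counts = roll_one_die(num_rolls)
for face in range(1, 7):
    percentage = 100 * counts[face] / num_rolls
    print(face, counts[face], f"{percentage:.2f}%")
```

Increasing num_rolls should pull the six percentages closer to each other, which is the flattening of the rectangle described above.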

But what happens if we roll two dice instead? Now things get a little more complicated, since there is more than one way to get a total result, as shown in the table below.

Total
2	1+1
3	1+2	2+1
4	1+3	2+2	3+1
5	1+4	2+3	3+2	4+1
6	1+5	2+4	3+3	4+2	5+1
7	1+6	2+5	3+4	4+3	5+2	6+1
8		2+6	3+5	4+4	5+3	6+2
9			3+6	4+5	5+4	6+3
10				4+6	5+5	6+4
11					5+6	6+5
12						6+6

As can be (hopefully) seen in the table, there is only one way to roll a 2, and there’s only one way to roll a 12. But there are 6 different ways to roll a 7. Therefore, if you’re rolling two dice, it’s 6 times more likely that you’ll roll a 7 than a 12, for example.
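The table can also be built by brute force. This short Python sketch lists all 36 equally likely pairs and counts how many of them land on each total; the variable name ways is just a label for this example.

```python
from collections import Counter
from itertools import product

# Every (first die, second die) pair is equally likely, so counting the
# pairs that produce each total gives the relative probabilities.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in range(2, 13):
    print(total, ways[total], f"{100 * ways[total] / 36:.1f}%")
```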

If I were to roll two dice 1,000,000 times, I would get a PDF like the one shown in Figure 3.

I won’t explain why this would be considered to be a triangular PDF.

Whether you roll one die or two dice, the number you get is random. In other words, you can’t use the past results to predict what the next number will be. However, if you are rolling one die, and you bet that you’ll roll a 6 every time, you’ll be right about 16.7% of the time. If you’re rolling two dice and you bet that you’ll roll a 12 every time, you’ll only be right about 2.8% of the time.

Let’s take two dice of different colours, say, one red die and one blue die. We’ll roll both dice again, but instead of adding the two values, we’ll subtract the blue value from the red one. If we do this 1,000,000 times, we’ll get something like the results shown below in Figure 4.

Notice that the probability density function keeps the same shape. It has just moved down to a range of ±5 instead of 2 to 12.
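A quick way to check that claim is to count the combinations for both cases and compare them. This is a small Python sketch using exact counts rather than 1,000,000 rolls; the names are chosen only for this illustration.

```python
from collections import Counter
from itertools import product

pairs = list(product(range(1, 7), repeat=2))  # (red, blue) for all 36 outcomes
sums = Counter(red + blue for red, blue in pairs)
differences = Counter(red - blue for red, blue in pairs)

# Sliding every total down by 7 lines it up with the matching difference.
shifted_sums = {total - 7: count for total, count in sums.items()}
print(shifted_sums == dict(differences))
```

If the two shapes are the same, the comparison prints True.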

Generating noise

In audio, noise is a sound that is completely random. In other words, just like the example with the dice, in a digital audio signal, you can’t predict what the next sample value will be based on the past sample values. However, there are many different ways of generating that random number and manipulating its characteristics.

Let’s start with a computer algorithm that can generate a random number between 0 and 1 (inclusive) with a rectangular PDF. We’ll then ask the algorithm to spit out 1,000,000 values. If the numbers really are random, and the computer has infinite precision, then we’ll probably get 1,000,000 different numbers. However, we’re not really interested in the numbers themselves – we’re interested in how they’re distributed between 0.00 and 1.00. Let’s say we divide up that range into 100 steps (or “buckets”) that are 0.01 wide and count how many of our random numbers fall into each group. So, we’ll count how many are between 0.0 and 0.01, between 0.01 and 0.02, and so on up to 0.99 to 1.00. We’ll get something like Figure 5.
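The bucket-counting step might look something like this in Python. Two assumptions to flag: the standard library's random() returns values from 0 up to, but not including, 1, and I've used 100,000 values instead of 1,000,000 to keep it quick. The function name bucket_counts is made up for this example.

```python
import random

def bucket_counts(values, num_buckets=100):
    """Count how many values in the range 0 to 1 fall into each equal-width bucket."""
    counts = [0] * num_buckets
    for value in values:
        # min() keeps a value of exactly 1.0 in the last bucket
        index = min(int(value * num_buckets), num_buckets - 1)
        counts[index] += 1
    return counts

rng = random.Random(1)  # seeded so that the run is repeatable
uniform_counts = bucket_counts([rng.random() for _ in range(100_000)])
print(min(uniform_counts), max(uniform_counts))
```

With a rectangular PDF, the smallest and largest bucket counts should both sit near 1,000.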

I’ve only plotted the probabilities of the possible values: 0 to 1, which winds up showing only the top of the rectangle in the rectangular PDF.

If I generate 1,000,000 random numbers with that algorithm, and then subtract 1,000,000 other random numbers, one by one, and find the probabilities of the result, the answer will be familiar.
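Here's a rough Python sketch of that subtraction, with a crude text plot of the result. The bucket width of 0.1 and the number of values are arbitrary choices for this illustration.

```python
import random

rng = random.Random(2)  # seeded so that the run is repeatable
num_values = 200_000
differences = [rng.random() - rng.random() for _ in range(num_values)]

# Sort the differences into 20 buckets, each 0.1 wide, covering -1 to +1.
num_buckets = 20
tpdf_counts = [0] * num_buckets
for value in differences:
    index = min(int((value + 1.0) / 2.0 * num_buckets), num_buckets - 1)
    tpdf_counts[index] += 1

for index, count in enumerate(tpdf_counts):
    low_edge = -1.0 + index * 0.1
    print(f"{low_edge:+.1f} {'#' * (count // 500)}")
```

The rows of # characters should be longest near 0 and taper off towards ±1.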

So, this is how we make the noise that’s added to the signal. If, for each sample, you generate two random numbers (making sure that your algorithm has a rectangular PDF) and subtract one from the other, you have the dither signal that will have a maximum level of ±1 quantisation level.

The signal (with a maximum range of ±1) is scaled up by multiplying it by 2^{(NumberOfBits-1)}-2
then you add the result of the dither generator
then the total is rounded to the nearest integer value
and then the result is scaled back down by a factor of 2^{(NumberOfBits-1)} to bring it back down to a range of ±1 to get it ready for exporting to a standard audio file format like .wav or .flac.

In other words, assuming that you have an audio signal called “Signal” that has a range of ±1 and consists of floating point values:

ScaleUp = 2^(Bitdepth-1)-2
ScaleDown = 2^(Bitdepth-1)

TpdfDither = rand(size(Signal)) - rand(size(Signal)) % one dither value per sample, each between -1 and +1

QuantisedDitheredSignal = round(Signal * ScaleUp + TpdfDither) / ScaleDown;
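For anyone who would rather try this in Python, here is a minimal sketch of the same four steps. The function name quantise_with_tpdf_dither is hypothetical, and one detail is an assumption: ties are rounded with floor(x + 0.5), which may not match what your own round function does with values that land exactly halfway between two levels.

```python
import math
import random

def quantise_with_tpdf_dither(signal, bit_depth, seed=0):
    """Scale, dither, round and rescale a list of floats in the range -1 to +1."""
    rng = random.Random(seed)
    scale_up = 2 ** (bit_depth - 1) - 2
    scale_down = 2 ** (bit_depth - 1)
    output = []
    for sample in signal:
        dither = rng.random() - rng.random()  # triangular PDF, between -1 and +1
        # floor(x + 0.5) rounds to the nearest integer, with halves rounding up
        level = math.floor(sample * scale_up + dither + 0.5)
        output.append(level / scale_down)
    return output

# One second of a 1 Hz sine wave at 100 samples per second, in a 3-bit world
sine = [math.sin(2 * math.pi * n / 100) for n in range(101)]
quantised = quantise_with_tpdf_dither(sine, bit_depth=3)
print(sorted(set(quantised)))
```

In the 3-bit case, the printed values are all multiples of 0.25 and stay within ±0.75.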

Excruciating minutiae: Part 2

In Part 1, I talked about how an audio signal is quantised, and how the world that the quantised signal lives in is slightly asymmetrical.

Let’s stay in a 3-bit world (to keep things comprehensible on a human scale) and do some recreational quantisation. We’ll start by making a sine wave with a peak amplitude of 1. This means that the total range will be ±1.

Figure 1. A sine wave with an amplitude of 1.

Notice that I put two scales on the plot in Figure 1. On the left, we have the “floating point” amplitude scale. On the right, we have the 8 quantisation levels.

If we are a bit dumb, and we just quantise that sine wave directly, making sure that we’ve aligned the scaling to use ALL possible quantisation values, we get the result in Figure 2.

Figure 2. The original signal in blue and the quantised representation in red.

Notice that, because the original signal is symmetrical (with respect to positive and negative amplitudes) but the quantisation steps are not, we wind up getting a different result for the positive values than the negative values. In other words, after quantisation, I’ve clipped the positive peaks of the original signal.

Okay, so this is a dumb way to do this. A slightly less dumb way is to adjust the scaling so that the original wave does not use all possible quantisation values, as shown in Figure 3.

Notice that I’ve set the sine wave to a slightly lower level, so that it rounds to the top-most positive quantisation level, but this means that it doesn’t use the lowest negative quantisation level. If we’re being really picky, I could have made the sine wave just a little higher in amplitude: by 1/2 of a quantisation step, and the quantised result would still not have clipped asymmetrically.

Dither

As you can see in Figures 2 and 3 above, just taking a signal and quantising it generates an error. The more bits you have in the word length, the more quantisation levels you have, and the smaller the error. However, that error will always be correlated with the signal somehow, and as a result, it’s distortion, which is easy to learn to hear.

If, however, we add a little noise to the signal before we quantise it, then we can randomise the error, which changes the error from producing distortion to a constant signal-independent noise floor. Since the noise makes the quantiser appear to be indecisive, we call it dither.
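One way to see the difference is to quantise the same in-between value many times, with and without dither, and look at the average. This Python sketch assumes a hypothetical signal value of 0.3 quantisation levels, and it uses noise made by subtracting one uniform random number from another (the triangular PDF noise used in this series).

```python
import math
import random

def round_half_up(x):
    """Round to the nearest integer, with halves rounding up."""
    return math.floor(x + 0.5)

rng = random.Random(3)   # seeded so that the run is repeatable
true_value = 0.3         # in quantisation levels (hypothetical test value)
num_trials = 100_000

undithered = [round_half_up(true_value) for _ in range(num_trials)]
dithered = [round_half_up(true_value + rng.random() - rng.random())
            for _ in range(num_trials)]

undithered_mean = sum(undithered) / num_trials
dithered_mean = sum(dithered) / num_trials
print(undithered_mean, dithered_mean)
```

Without dither, every result is 0, so the error is always the same and depends entirely on the signal. With dither, the individual results jump between -1, 0 and 1, but their average lands close to 0.3, so the error behaves like noise around the correct value.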

The easiest (and possibly best) way to do this is to create white noise with a triangular probability density function and a peak amplitude of ±1 quantisation level. I’ll explain what that last sentence means in Part 3 of this series.

If we do this, then we

take the signal
add a little noise to it
quantise it

and the result might look like Figure 4.

It should be easy to see that we still have quantisation, and also that I’ve added some random element to the signal.

However, let’s look at the mistake I made in Figure 4. The noise that was added to the signal has an amplitude of ±1 quantisation level. So, we should see cases where the signal looks like it should be rounding to the closest level, but it might be either 1 above or 1 below. (Take a look at Time = 70, 71, and 72 for an example of this.)

However, take a look around Time = 20 to 30. Notice that the original signal is close to the top quantisation level. This means that, although a negative value in the dither in those samples can bring the quantisation level down, a positive value cannot bring it up because we don’t have any room for it. This will, again, result in a small amount of asymmetrical clipping. This is a VERY small amount. (Remember that, in the real world we’re probably using 2¹⁶ (= 65,536) or 2²⁴ (= 16,777,216) quantisation values, not 2³ (= 8).)

So, if we’re going to avoid this clipping, we need to adjust the scaling of the signal once more, as shown in Figure 5.

This shows a signal that is scaled so that, without dither, it would round to one level away from the top-most quantisation level. When you add the dither, it can go up to that top quantisation level. (In fact, I happened to use the same dither signal for Figures 4 and 5. The only difference is the scaling of the signal.)
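The difference between the scaling in Figures 4 and 5 can be checked with a little worst-case arithmetic. This Python sketch assumes a full-scale signal peak of exactly ±1 and a dither value that gets arbitrarily close to ±1 without reaching it; the function name worst_case_levels is made up for this example.

```python
import math

def worst_case_levels(scale_up):
    """Return (lowest, highest) integer level reachable by a full-scale signal plus dither."""
    almost_one = 1.0 - 1e-9  # the dither never quite reaches +1 or -1
    highest = math.floor(scale_up + almost_one + 0.5)
    lowest = math.floor(-scale_up - almost_one + 0.5)
    return lowest, highest

bit_depth = 3
top_level = 2 ** (bit_depth - 1) - 1      # +3 in a 3-bit world
bottom_level = -(2 ** (bit_depth - 1))    # -4 in a 3-bit world

for scale_up in (2 ** (bit_depth - 1) - 1, 2 ** (bit_depth - 1) - 2):
    lowest, highest = worst_case_levels(scale_up)
    fits = bottom_level <= lowest and highest <= top_level
    print(scale_up, lowest, highest, "fits" if fits else "clips")
```

With a scaling of 3, the dithered signal can ask for level +4, which doesn't exist in a 3-bit world. With a scaling of 2, it stays within -3 to +3.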

Now, I know that if you’re not used to looking at 3-bit signals, and/or if dither is a new concept, the red signal in Figure 5 might make you a little upset. However (and you have to believe me on this…) this is the correct way to encode digital audio. Just because it looks crazy doesn’t mean that it is.

NB: The math

If you want to make the plots above, here’s a simplified version of the math to try it out. Note: I live in a world where a % symbol precedes a comment.

Some Constants

Bitdepth = 3
Fs = 100 % sampling rate in Hz
Fc = 1 % frequency of the sine wave in Hz
TimeInSamples = [0:Fs] % This will make the TimeInSamples all of the integer values from 0 to Fs (therefore, 1 second of audio)

Figure 1

Signal = sin(2 * pi * Fc/Fs * TimeInSamples)

Figure 2

ScaleUp = 2^(Bitdepth-1)
ScaleDown = 2^(Bitdepth-1)

QuantisedSignal = round(Signal * ScaleUp) / ScaleDown;

% Then apply a clipper to remove the top quantisation level.
% You can do this yourself.

Figure 3

ScaleUp = 2^(Bitdepth-1)-1
ScaleDown = 2^(Bitdepth-1)

QuantisedSignal = round(Signal * ScaleUp) / ScaleDown;

Figure 4

ScaleUp = 2^(Bitdepth-1)-1
ScaleDown = 2^(Bitdepth-1)
TpdfDither = rand(size(Signal)) - rand(size(Signal)) % one dither value per sample

QuantisedDitheredSignal = round(Signal * ScaleUp + TpdfDither) / ScaleDown;

% Then apply a clipper to remove the top quantisation level.

Figure 5

ScaleUp = 2^(Bitdepth-1)-2
ScaleDown = 2^(Bitdepth-1)
TpdfDither = rand(size(Signal)) - rand(size(Signal)) % one dither value per sample

QuantisedDitheredSignal = round(Signal * ScaleUp + TpdfDither) / ScaleDown;

Excruciating minutiae: Part 1

This past week I found a very small oddity in the behaviour of one of the functions in Matlab. This led me down a rabbit hole that I’m still following, but the stuff I’ve learned along the way has proven to be interesting.

The summary

The short version of the story is that I made a test tone which consisted of a sine wave that had a frequency that matched an FFT bin centre so that I could test a thing. In order to get the sine wave through the thing, I had to export the audio signal as something the thing could play. So, I exported it as both a .wav and a .flac file, both with 24-bit word lengths and matching sampling rates.

Once the two signals came back from the thing, they looked different on an FFT analysis. Not very different, but different enough to raise questions. So, I ran the FFT on the .wav and .flac files that I created to do the test and found out that THEY were different, which I didn’t expect, because I know that FLAC is lossless.

The question that came up first was “why are they different?”, and that was just the entrance to the rabbit hole.

The long version

In order to explain what happened, we have to follow some advice given by Carl Sagan who said

‘If you wish to make an apple pie from scratch, you must first invent the universe.’

We won’t invent the universe, but we’re going to dig down into the basics of LPCM digital audio in order to come back up to talk about where I wound up last Thursday.

Quantisation

Linear Pulse Code Modulation (LPCM) is a way of encoding signals (like an audio signal) by saving the waveform as a series of measurements of the instantaneous amplitude. However, when you do this, you can’t have a measurement with an infinite resolution, so you have to round off the value to the nearest one you can encode. This is just like measuring something with a ruler that has millimetres marked on it. You can’t really measure something with a precision of less than the nearest millimetre, so you round off the value to something you know. Whether or not that’s good enough depends on what the measurement is for.

In LPCM digital audio, we call the steps that you can round the values to ‘quantisation levels’ because you’re dividing up the amplitude into discrete quanta. Since the values of those quantisation levels are stored or transmitted using a binary number (containing only 0s and 1s), the number of quantisation levels is a power of 2. For example, if you have a 16-bit (bit = Binary digIT) value, then you can count from

0000 0000 0000 0000 = 0
to
1111 1111 1111 1111 = 2¹⁶ − 1 = 65,535

However, since audio signals go above and below 0 (we need to represent positive and negative values) we need a way to split up those options above (a range of 0 to 65,535, which is 65,536 different values) to do this.

Let’s take a simple example with a 3-bit long word. Since there are 3 bits, we have 2³ = 8 quantisation levels. It would be nice if 000 in the binary representation referred to a signal value of 0, like this:

All we need to do now is to figure out what binary values to put on the other quantisation levels. To do this, we use a system like the one shown in Figure 2.

If you start at the top, and follow the blue circular arrow going clockwise, you count from 000 ( = 0) all the way to 111 (= 7). However, if you look at the red arrows, you can see that we can assign the binary values to the positive and negative quantisation levels by looking at the circle clockwise for positive values and counter-clockwise for negative ones. This means that we wind up with the assignments shown in Figure 3.

This way of ‘wrapping’ the values around the circle into number assignments on a one-dimensional (in this case, vertical) scale is called a ‘two’s complement’ method.

There are two nice things about this system:

the middle value of 0 is assigned an actual value of 0, which makes sense to us humans
the first bit (digit) in the binary value tells you whether the level is positive (if it’s a 0) or negative (if it’s a 1).

There is at least one slightly annoying thing about this system: it’s asymmetrical. Notice in Figure 3 that there are 3 available positive quantisation levels, but 4 negative ones. This is because we have an even number of values to use (because it’s a power of 2) but one of the values is 0, leaving an odd, and therefore asymmetrical number of remaining values for the non-0 quantisation levels.
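The assignments can be listed with a few lines of Python. This is only a sketch of the standard two's complement rule (if the first bit is 1, subtract 2 to the power of the word length); the function name twos_complement_value is made up for this example.

```python
def twos_complement_value(bits):
    """Convert a string of 0s and 1s to its two's complement signed value."""
    unsigned = int(bits, 2)
    if bits[0] == "1":          # a leading 1 marks a negative level
        return unsigned - 2 ** len(bits)
    return unsigned

levels = {format(n, "03b"): twos_complement_value(format(n, "03b")) for n in range(8)}
for bits, value in sorted(levels.items(), key=lambda item: item[1], reverse=True):
    print(bits, value)
```

The list runs from 011 at the top down to 100 at the bottom, with three levels above 0 and four below it.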

This will come back to be a pain in the arse later…

This week’s weird FFT

This week, I was testing a device that required that I look WAY down into the floor caused by the noise+distortion artefacts in the presence of a signal.

One trick to do this is to play a sinusoidal wave through the system and do an FFT of the output. However, as I described in this posting a long time ago, there is an interaction between the frequency you choose and the behaviour of an FFT on a digital signal (yes… I know it’s really a DFT – but let’s not be pedantic…)

For example, if I do a 65536-point FFT on a 997 Hz sine tone in a 48 kHz sampling rate (with all the floating point precision I have available…) I get a magnitude response that looks like this:

Figure 1. The magnitude response of a 997 Hz sine tone, but is it really?

Obviously, this is NOT the magnitude response of a sinusoidal wave. The “skirts” on either side of 997 Hz are artefacts caused by the fact that I’m using a rectangular window, and the sine wave’s last sample does not line up perfectly with its first when the FFT “wraps” it around to meet itself (read this leading up to Figure 10 for an explanation). That sharp discontinuity causes the extra energy in the other frequency bins as shown above.

If, however, I find out the frequency of the closest FFT bin, and make my sine wave THAT frequency instead, THEN I do an FFT and look at the magnitude response, it looks like Figure 2.

Figure 2. The magnitude response of a 996.8261718750000 Hz sine tone.

Notice that this is not a 997 Hz tone, but a 996.8261718750000 Hz tone instead.
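The frequency of the closest bin comes from the spacing between FFT bins, which is the sampling rate divided by the FFT length. Here's that arithmetic as a small Python sketch; the variable names are just labels for this example.

```python
sample_rate = 48000      # Hz
fft_length = 65536       # points in the FFT
target_frequency = 997   # Hz

bin_spacing = sample_rate / fft_length
nearest_bin = round(target_frequency / bin_spacing)
bin_centre_frequency = nearest_bin * bin_spacing
print(nearest_bin, bin_centre_frequency)
```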

Now the “noise floor” that you see there is the error in my sine wave caused by the precision of my calculator (Matlab). -300 dB is VERY low, and gives me plenty of room to see the errors in the thing that I might be testing (assuming that I can actually get that signal out to my Device Under Test or “DUT” and back in again from it).

Let’s say I were to represent the same sine wave using a 24-bit LPCM signal that has been correctly dithered with TPDF dither, and THEN I do the FFT and calculate the magnitude response. That would look like Figure 3.

Figure 3. The magnitude response of a 996.8261718750000 Hz sine tone that has been dithered and quantised with 24 bit precision.

Now, the energy at all the frequencies other than 996.8-ish Hz is the energy in the noise floor generated by the dither. (If you’re wondering why it’s almost 200 dB down, and not 141 dB down (6*24-3), it’s because the total energy in all those FFT bins adds up to a noise floor that’s 141 dB below the sine tone.)
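As a rough sanity check on those numbers, here's a back-of-the-envelope estimate in Python. It rests on some assumptions that are not stated above: a full-scale sine wave, a rounding error with a power of 1/12 of a squared quantisation level, TPDF dither with a power of 1/6, and white noise shared evenly across half of the FFT bins. The exact level in a plot also depends on how the FFT is normalised, so treat this as an estimate only.

```python
import math

bit_depth = 24
fft_length = 65536

# Full-scale sine: peak of 2^(bit_depth-1) quantisation levels, so power = peak^2 / 2.
sine_power = (2 ** (bit_depth - 1)) ** 2 / 2
# Rounding error (1/12) plus TPDF dither (1/6), in quantisation levels squared.
noise_power = 1 / 12 + 1 / 6

total_snr_db = 10 * math.log10(sine_power / noise_power)
# The sine lands in one bin, while the noise is shared across fft_length / 2 bins.
spread_db = 10 * math.log10(fft_length / 2)
print(round(total_snr_db, 1), round(total_snr_db + spread_db, 1))
```

The first number is the total noise level relative to the sine tone, and the second is an estimate of the level in each individual bin.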

Okay. All of those plots show things that I’ve seen before – and are things that I would expect to see when measuring a device.

But then, this week, I did a measurement that produced the magnitude response shown in Figure 4.

Figure 4. A 996.8261718750000 Hz sine tone and something else…

This is NOT something I’ve seen before, so it raised one of my two eyebrows. In retrospect, I should have known what would cause this, but at the time, I was very confused. It’s not a noise floor because it’s too flat. It’s not distortion because it doesn’t have harmonics. So what is it?

The answer is actually really simple.

The sine tone is visible as the spike in the magnitude plot, just like in all the others.
The flat horizontal line is the result of a single-sample click that happened sometime in the 65536 samples that I used to do the FFT.

The sum (or mix) of the sine + click results in the magnitude response plot you see above. If you’re looking at the signal itself, it just means that one of the 65536 samples has an error, and isn’t sitting on the sine curve. I’ve shown an example of this in Figure 5.

Figure 5. The sample with the error is shown in red. All other 65535 samples are behaving as they should

The greater the error of that one sample value, the higher the floor in Figure 4.
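The flat line can be reproduced with a very small experiment. This Python sketch uses a slow, direct DFT on only 64 samples (instead of a 65536-point FFT) and looks at the click on its own, without the sine tone; the error value of 0.001 and its position are hypothetical.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitude of a direct (slow) DFT, one value per frequency bin."""
    n = len(samples)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples)))
            for k in range(n)]

length = 64
click = [0.0] * length
click[17] = 0.001   # hypothetical error of 0.001 in one sample

magnitudes = dft_magnitudes(click)
print(min(magnitudes), max(magnitudes))
```

The smallest and largest magnitudes come out essentially equal to the size of the error, which is the flat floor. Making the error bigger raises every bin by the same amount.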

Of course, for these plots, I simulated everything in Matlab. However, the actual result was even more interesting / confusing, since the DUT didn’t have a flat magnitude response. So, instead of a nice, horizontal line like the one I’ve shown in Figure 4, I could see something like the response of the system as well, but I’ll stay away from the details of that to keep things simple here.

Mixing closed and ported cabinets: Part 6

As I showed in Part 5, the phase response of a loudspeaker driver in a closed cabinet is different from one in a ported cabinet in the low frequency region because the low frequency output of the ported system is actually coming from the port, not the driver.

If we take the phase response plots from the two systems shown in Part 5 and put them on the same graph, the result is Figure 1.

If we calculate the difference in these two plots by subtracting the blue curve from the red curve at each frequency then we can see that a ported cabinet is increasingly out of phase relative to a sealed cabinet as you go lower and lower in frequency. This difference is shown in Figure 2.

Now, don’t look at that graph and say “but you never get to 180º so what’s the problem?” All of the plots I’ve shown in this series are for one specific driver in one specific enclosure, with and without a port of one specific diameter and length. I could have been more careful and designed two different enclosures (with and without a port) that do get to 180º (or something else up to 180º).

In other words: “results may vary”. Every loudspeaker in every cabinet has some magnitude response and some phase response (these are directly related to each other), and they’ll all be different by different amounts. (This is also the reason why I’m neglecting to talk about the fact that, as you go lower in frequency, the ported loudspeaker also drops faster in output level, so even if it were a full 180º out of phase, it would cancel less and less when combined with the sealed cabinet loudspeaker.)

The point of all of this was to show that, if you take two different loudspeakers with two different enclosure types, you get two different phase responses, particularly in the low frequency region.

This means that if you take those two loudspeaker types (the original question that inspired this series was specifically about mixing Beolab 9, Beolab 20, and Beolab 2 in a system where all of those loudspeakers are “helping” to produce the bass) and play identical signals from them in the same room, it’s not only possible, but highly likely that they will wind up cancelling each other. This results in LESS bass instead of MORE, ignoring all other effects like loudspeaker placement, room modes, and so on.

But Beolab 2 has slave drivers, not ports…

Take a look at Figure 3. I’ve shown a conceptual drawing of a ported loudspeaker (showing the mass of the air in the port as a red rectangle) on the left and a loudspeaker with a slave driver (on the bottom – notice it’s missing a former and voice coil, and the diaphragm is thicker to make it heavy) on the right.

This should make it intuitively obvious that a ported loudspeaker and an enclosure with a slave driver are effectively identical. This raises the question of why you would do one rather than the other.

The advantage of using a port instead of a slave driver is that a port will be more “stable” on a production line (since all of the ports on all the loudspeakers you make will be identical in size) and they’re very cheap to make. The disadvantage of a port is that if the velocity of the air moving in and out of it is too high, then you hear it “chuffing”, which is a noise caused by turbulence around the edges of the port. (If you blow across the top of a wine bottle, you don’t hear a perfect sine wave, you hear a very noisy “breathy” one. The noise is the chuffing.)

The advantage of a slave driver is that you don’t get any turbulence, and therefore no chuffing. A slave driver can also be heavier than the air in a port in a smaller space, so you can get the response of a large port in a smaller loudspeaker. There is a small disadvantage in the fact that there will be production line tolerance variations (but this is not really a big worry), and then there’s the price, which is much higher than a hole in a box.

This means that if you take anything I’ve said above about ported loudspeakers, and replace the word “port” with “slave driver” then it’s still true.

P.S.

If you do have a surround system that not only has a bass management system, but is also capable of re-directing the bass to more loudspeakers than just your subwoofer (as is the case with all current Bang & Olufsen surround processors in the televisions), then all of this is important to remember. You can’t just send the bass to more loudspeakers and expect to get more output. You might get less.

This is true unless you have a Beosound Theatre. This is because the Theatre has an extra bit of processing in the signal path called “Phase Compensation” which applies an allpass filter to the outputs, compensating for the phase differences between loudspeakers in the low frequency region. So, in this one particular case, you should expect to get more output from more loudspeakers.

Mixing closed and ported cabinets: Part 5

Let’s build a ported box and put a woofer in it. If we measure the magnitude responses of the individual outputs of the driver and the port as well as the total output of the entire loudspeaker, they might look like the three curves shown in Figure 1.

If you take a look at the curves at 1 kHz, you can see that the total output (the blue curve) is the same as the woofer’s output (the red curve) because the port’s output (the yellow curve) is so low that it’s not contributing anything.

As we come down in frequency, we see the output of the port coming up and the output of the driver coming down. At around 20 Hz, the port reaches its maximum output and the woofer reaches its minimum as a result. In fact, the woofer’s output is about 15 dB lower than the port’s at that frequency.

As we go farther down in frequency, we can see that the woofer comes up and then starts to drop again, but the port just drops in level the lower we go.

Now look at the total output (the blue curve) from 20 Hz and down. Notice that the total output of the system from 20 Hz down to about 15 Hz is LOWER than the output of the port alone. As you go below about 15 Hz, you can see that the total output is lower than either the woofer or the port.

This means that the port and the woofer are cancelling each other, just like I described in the previous part in this series. This can be seen when we look at their respective phase responses, shown in the middle plot in Figure 2. I’ve also plotted the difference in the woofer and the port phase responses in the bottom plot.

Notice that, below 20 Hz, the woofer and the port are about 180º apart. So, as the woofer moves out of the enclosure, the air in the port moves inwards, and the total sum is less than either of the two individual outputs.
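To see why two outputs that are 180º apart add up to less than either one, you can treat each output at a single frequency as a complex number. In this Python sketch, the magnitudes of 1.0 and 0.8 are hypothetical values chosen for illustration, not numbers read from the plots.

```python
import cmath
import math

def output(magnitude, phase_degrees):
    """Represent one source's output at a single frequency as a complex number."""
    return cmath.rect(magnitude, math.radians(phase_degrees))

woofer = output(1.0, 0.0)     # hypothetical magnitudes, chosen for illustration
port = output(0.8, 180.0)
total = woofer + port
print(abs(woofer), abs(port), round(abs(total), 3))
```

The magnitude of the total is the difference between the two individual magnitudes, so it is smaller than either of them.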

What happens when you put a woofer in a sealed enclosure instead of one with a port? The responses from this kind of system are shown below in Figure 3.

The first thing that you’ll notice in the plots in Figure 3 is that there is only one curve in each graph. This is because the total output is the driver output.

You’ll also notice in the top plot that a woofer in a sealed cabinet acts as a second-order high-pass filter. There’s no peak in the response because the cabinet is not too small for the driver. If the cabinet were smaller, then you’d see a peak in the response, but let’s say that I’m not that dumb…

Because it’s a second-order high-pass filter, it has a phase response that approaches 180º as you go down in frequency.
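Here's a small Python sketch of that behaviour, using the textbook second-order high-pass response H(s) = s² / (s² + s·ω0/Q + ω0²). The cutoff frequency of 40 Hz and the Q of 0.707 are hypothetical values, not the parameters of the loudspeaker in the plots.

```python
import cmath
import math

def highpass_phase_degrees(frequency, cutoff=40.0, q=0.707):
    """Phase of H(s) = s^2 / (s^2 + s*w0/q + w0^2) at the given frequency in Hz."""
    s = 2j * math.pi * frequency
    w0 = 2 * math.pi * cutoff
    response = s ** 2 / (s ** 2 + s * w0 / q + w0 ** 2)
    return math.degrees(cmath.phase(response))

for frequency in (1000.0, 40.0, 4.0, 0.4):
    print(frequency, round(highpass_phase_degrees(frequency), 1))
```

The phase is close to 0º well above the cutoff, 90º at the cutoff, and it creeps towards 180º as the frequency drops.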

Now, compare that phase response in the low end of Figure 3 to the phase response of the low end in Figure 2. This is where we’re headed, since the purpose of all of this discussion is to talk about what happens when you have a system that combines sealed enclosures with ported ones. That brings us to Part 6.

Mixing closed and ported cabinets: Part 4

In Part 1, I showed how a wine bottle behaves exactly like a mass on a spring where the mass is the cylinder of air in the bottle’s neck and the spring is the air inside the bottle itself.

I also showed how a loudspeaker driver (like a woofer) in a closed box is the same thing, where the spring is the combination of the surround, the spider and the air in the box.

But what happens if the speaker enclosure is not sealed, but instead is open to the outside world through a “port”, which is another way of saying “a tube”? Then, conceptually, you are combining the loudspeaker driver with the wine bottle like I’ve shown in Figure 3.

If I were to show this with all the masses in red and all the springs in blue, it would look like Figure 4.

Now things are getting a little complicated, so let’s take things slowly… literally.

If the loudspeaker driver in Figure 4 moves into the cabinet very slowly (say, you push it with your fingers or you play a very low frequency with an electrical signal), then the air that it displaces inside the bottle (the enclosure) will just push the plug of air out the bottle’s neck (the port). The opposite will happen if you pull the driver out of the enclosure: you’ll suck air into the port.

If, instead you move the driver back and forth very quickly (by playing a very high frequency) then the inertia of the air inside the cabinet (shown as the big blue spring in the middle) prevents it from moving down near the port. In fact, if the frequency is high enough, then the air at the entrance of the port doesn’t move at all. This means that, for very high frequencies, the system will behave exactly the same as if the enclosure were sealed.

But somewhere between the very low frequencies and the very high frequencies, there is a “magic” frequency where the air in the port resonates, and there, things don’t behave intuitively. At that frequency, whenever the driver is trying to move into the enclosure, the air in the port is also moving into the enclosure. And, although the air has less mass than the driver, it’s free to move more. The end result is that, at the port’s resonant frequency, the driver (in theory) doesn’t move at all*, and the air in the port is moving a lot.**

In other words, you can think of a single driver in a ported cabinet as being basically the same as a two-way loudspeaker, where the woofer (for example) is one driver and the port is the other “driver”.

At high frequencies, the sound is only coming out of the woofer (for example).
As you come down in frequency and get closer to the port’s resonance, you get less and less from the woofer and more and more from the port.
At the port’s resonant frequency, all* of the sound is coming from the movement of the air in and out of the port
As you go lower than the port’s resonant frequency, the woofer starts working again, but now as the woofer moves out of the enclosure (making a positive pressure) it sucks air into the port (making a negative pressure). So, at very low frequencies, the woofer is working very hard, but you get very little sound output because the port cancels it out.

If you look at this as a magnitude response (the correct term for “frequency response” for this discussion), you can think of the woofer having one response, the port having a different response, and the two adding together somehow to produce a total response for the entire loudspeaker.

However, as you can see from the short 4-point list above, something happens with the phase of the signal at different frequencies. This is most obvious in the “very low frequency” part, where the woofer’s and the port’s outputs are 180º out of phase with each other.

In Part 5 we’ll look at these different components of the total output separately, both in terms of magnitude and phase responses (which, combined, are the frequency response).

* Okay okay…. I say “the driver (in theory) doesn’t move at all” and “all of the sound is coming from the movement of the air in and out of the port” which is a bit of an exaggeration. But it’s not MUCH of an exaggeration…

** This is an oversimplified explanation. The slightly less simplified version is that the air inside the cabinet is acting like a spring that’s getting squeezed from two sides: the driver and the air in the port. The driver “sees” the “spring” (the air in the box) as pushing and pulling on it just as much as it’s pulling and pushing, so it can’t move (very much…).

Mixing closed and ported cabinets: Part 3

Before starting on this portion of the series, I’ll ask you to think about how little energy (or movement) it takes to get a resonant system oscillating. For example, if you have a child on a swing, a series of very gentle pushes at the right times can result in them swinging very high. Also, once the child is swinging back and forth, it takes a lot of effort to stop them quickly.

Moving onwards…

So far, we’ve seen that a loudspeaker driver in a closed cabinet can be thought of as just a mass on a spring, and, as a result, it has some natural resonance where it will oscillate at some frequency.

The driver is normally moved by sending an electrical signal into its voice coil. This causes the coil to produce a magnetic field and, since it’s already sitting in the magnetic field of a permanent magnet, it moves. The surround and spider prevent it from moving sideways, so it can only move outwards (if we send electrical current in one direction) or inwards (if we send current in the other direction).

When you try to move the driver, you’re working against a number of things:

the inertia of the mass of the moving parts
Pick up a heavy book, for example, and try to push and pull it back and forth. It’s hard work!
the inertia of the air directly in front of and behind the driver
Pick up a big sheet of stiff plastic (like the thing you put on the floor under an office chair) and try to push it back and forth. It’s also hard work!
the compliance (springiness) of the surround, spider, and air trapped in the cabinet behind the driver
Blow up a balloon, and use your two hands to squeeze it repeatedly. It’s also hard work!

These three things can be considered separately from each other as a static effect. In other words:

It’s hard work to pick up a book or push a car that’s broken down (forget about pushing-and-pulling – just push OR pull)
It’s hard work to run into a headwind with that big piece of stiff plastic
It’s hard work to squeeze a balloon and keep it compressed

But, if you’re pushing AND pulling the loudspeaker driver, there is another effect that’s dynamic.

When you’re moving the driver at a VERY low frequency, you’re mostly working against the “spring” which is probably quite easy to do. So, at a low frequency, the driver is pretty easy to move, and it’s moving so slowly that it doesn’t push back electrically. So, it does not impede the flow of current through the voice coil.

When you’re moving the driver at a VERY high frequency, you’re mostly working against the inertia of the moving parts and the adjacent air molecules. The higher the frequency, the harder it is to move the driver.

However, when you’re trying to move the driver at exactly the resonant frequency of the driver, you don’t need much energy at all because it “wants” to move at that same rate. But, at that frequency, the voice coil is moving in the magnetic field of the permanent magnet, and it generates electricity that is trying to move current in the opposite direction of what your amp is doing. In other words, at the driver’s resonant frequency, when you’re trying to push current into the voice coil, it generates a current that pushes back. When you try to pull current out of the voice coil, it generates a current that pulls back.

In other words, at the driver’s resonant frequency, your amplifier “sees” the driver as a thing that is trying to impede the flow of electrical current. This means that you get a lot of movement with only a little electrical current; just like the child on the swing gets to go high with only a little effort – but only at one frequency.
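If you’d like to see that “pushing back” as numbers, here is a rough Python sketch of the electrical impedance the amplifier sees. It uses the common textbook model of a moving-coil driver (voice coil resistance plus the “motional” impedance caused by the coil moving in the magnet’s field) and ignores the voice coil’s inductance. All of the parameter values are made up for illustration; they don’t come from any particular woofer.

```python
import math

def driver_impedance(freq_hz, coil_resistance=6.0, force_factor=7.0,
                     moving_mass=0.03, stiffness=3000.0, mech_resistance=2.0):
    """Electrical impedance (complex, in ohms) of a simplified driver.

    Assumed, hypothetical values: coil resistance in ohms, force factor
    (Bl) in tesla-metres, moving mass in kg, total spring stiffness in
    N/m, and mechanical friction in kg/s. Voice coil inductance is ignored.
    """
    omega = 2 * math.pi * freq_hz
    # Mechanical impedance: friction, plus the mass and the spring pulling
    # in opposite directions (they cancel each other at resonance)
    z_mech = complex(mech_resistance, omega * moving_mass - stiffness / omega)
    return coil_resistance + force_factor ** 2 / z_mech

resonance_hz = math.sqrt(3000.0 / 0.03) / (2 * math.pi)
for f in (5.0, resonance_hz, 500.0):
    print(round(f, 1), "Hz:", round(abs(driver_impedance(f)), 2), "ohms")
```

Well below and well above the resonance, the impedance is barely more than the resistance of the voice coil. At the resonance it jumps up by several times, which is the driver impeding the flow of current.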

This is a nice, simple case where you have a moving mass (the moving parts of the driver) and a spring (the surround, spider, and air in the sealed box). But what happens when the speaker has a port?

On to Part 4…

Mixing closed and ported cabinets: Part 2

Let’s look at a typical moving coil loudspeaker driver like the woofer shown in Figure 1.

If I were to draw a cross-section of this and display it upside-down, it would look like Figure 2.

Typically, if we send a positive voltage/current signal to a driver (say, the attack of a kick drum to a woofer) then it moves “forwards” or “outwards” (from the cabinet, for example). It then returns to the rest position. If we send it a negative signal, then it moves “backwards” or “inwards”. This movement is shown in Figure 3.

Notice in Figure 3 that I left out all of the parts that don’t move: the basket, the magnet and the pole piece. That’s because those aren’t important for this discussion.

Also notice that I used only two colours: red for the moving parts that don’t move relative to each other (because they’re all glued together) and blue for the stretchy parts that act as a spring. These colours relate directly to the colours I used in Part 1, because they’re doing exactly the same thing. In other words, if you hold a woofer by the basket or magnet, and tap it, it will “bounce” up and down because it’s just a mass suspended by a spring. And, just like I talked about in Part 1, this means that it will oscillate at some frequency that’s determined by the relationship of the mass to the spring’s compliance (a fancy word for the “springiness” of a spring; it’s the opposite of stiffness, so the more compliant the spring is, the less stiff it is). In other words, I’m trying to make it obvious that Figure 3, above, is exactly the same as Figures 3 and 5 in Part 1.

However, it’s very rare to see a loudspeaker where the driver is suspended without an enclosure. Yes, there are some companies that do this, but that’s outside the limits of this discussion. So, what happens when we put a loudspeaker driver in a sealed cabinet? For the purposes of this discussion, all it means is that we add an extra spring attached to the moving parts.

I’ve shown the “spring” that the air provides as a blue coil attached to the back of the dust cap. Of course, this is not true; the air is pushing against all surfaces inside the loudspeaker. However, from the outside, if you were actually pushing on the front of the driver with your fingers, you would not be able to tell the difference.

This means that the spring that pushes or pulls the loudspeaker diaphragm back into position is some combination of the surround (typically made of rubber nowadays), the spider (which might be made of different things…) and the air in the sealed cabinet. Those three springs are in parallel, so if you make one REALLY stiff (or lower its compliance) then it becomes the important spring, and the other two make less of a difference.

So, if you make the cabinet too small, then you have less air inside it, and it becomes the predominant spring, making the surround and spider irrelevant. The bigger the cabinet, the more significant a role the surround and spider play in the oscillation of the system.

Sidebar: If you are planning on making a lot of loudspeakers on a production line, then you can use this to your advantage. Since there is some variation in the compliance of the surround and spider from driver to driver, your loudspeakers will behave differently from each other. However, if you make the cabinet small, then it becomes the most important spring in the system, and you get loudspeakers that are more like each other because their volumes are all the same.
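Here is a small Python sketch of that production-line argument. Springs in parallel simply add their stiffnesses, so I can compare how much the total stiffness varies across a batch of drivers for a "soft" air spring (big cabinet) and a "stiff" one (small cabinet). The stiffness values and the ±20% tolerance are invented for illustration only.

```python
def total_stiffness(surround, spider, cabinet_air):
    """Springs in parallel: their stiffnesses add (all values in N/m)."""
    return surround + spider + cabinet_air

def spread_percent(cabinet_air, surround=1000.0, spider=1500.0, tolerance=0.2):
    """Percent difference in total stiffness between the softest and the
    stiffest suspension in a batch, for a given cabinet air stiffness.

    The nominal surround and spider values and the tolerance are
    hypothetical numbers, not measurements.
    """
    soft = total_stiffness(surround * (1 - tolerance),
                           spider * (1 - tolerance), cabinet_air)
    stiff = total_stiffness(surround * (1 + tolerance),
                            spider * (1 + tolerance), cabinet_air)
    return 100.0 * (stiff - soft) / soft

print("big cabinet:  ", round(spread_percent(500.0), 1), "% spread")
print("small cabinet:", round(spread_percent(20000.0), 1), "% spread")
```

With these made-up numbers, the same driver-to-driver variation that changes the total spring by tens of percent in the big cabinet changes it by only a few percent in the small one.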

Remember from part 1 that if you increase the stiffness of the spring, then the resonant frequency of the oscillation will increase. It will also ring for longer in time. In practical terms, if you put a woofer in a big sealed cabinet and tap it, it will sound like a short “thump”. But if the cabinet is too small, then it will sound like a higher-pitched and longer-ringing “bonnnnnnnggggg”.
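To get a feeling for how much the cabinet size can shift the resonance, here is a hedged Python sketch. It assumes the usual textbook approximation for the stiffness of the air in a sealed box as seen by a piston (air density × speed of sound squared × cone area squared ÷ box volume). The driver’s mass, suspension stiffness and cone area are hypothetical values that I picked to be vaguely woofer-like.

```python
import math

AIR_DENSITY = 1.2        # kg/m^3, approximate value at room temperature
SPEED_OF_SOUND = 343.0   # m/s, approximate value at room temperature

def box_air_stiffness(cone_area_m2, box_volume_m3):
    """Stiffness (N/m) of the air in a sealed box, felt by the cone."""
    return AIR_DENSITY * SPEED_OF_SOUND ** 2 * cone_area_m2 ** 2 / box_volume_m3

def resonance_in_box_hz(box_volume_litres, moving_mass_kg=0.03,
                        suspension_stiffness=1500.0, cone_area_m2=0.02):
    """Resonant frequency of a (hypothetical) driver in a sealed box.

    The suspension and the air are springs in parallel, so their
    stiffnesses add before going into the mass-on-a-spring formula.
    """
    stiffness = suspension_stiffness + box_air_stiffness(
        cone_area_m2, box_volume_litres / 1000.0)
    return math.sqrt(stiffness / moving_mass_kg) / (2 * math.pi)

for litres in (50.0, 20.0, 5.0):
    print(litres, "litres:", round(resonance_in_box_hz(litres), 1), "Hz")
```

The smaller the box, the stiffer the air spring, and the higher the pitch of the “bonnnnnnnggggg”.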

So far, we’ve only been talking about physical things: masses and springs. In the next part, we’ll connect the loudspeaker driver to an amplifier and try to push and pull it with electrical signals.

Mixing closed and ported cabinets: Part 1

I made a comment on a forum this week, saying that, if you mix closed-cabinet loudspeakers with ported ones (or ones with slave drivers), the end result can be a reduction in total output: less sound from more loudspeakers. Of course, the question is “why?” and the short answer is “due to the phase mismatching of the loudspeakers”.

This is the long answer.

Before we begin, we have to get an intuitive understanding of what a ported loudspeaker is. (Note that I’ll keep saying “ported loudspeaker”, but the principle also applies to loudspeakers with slave drivers, as I’ll explain later.) Before we get to a ported loudspeaker, we need to talk about Helmholtz resonators.

Take a block that’s reasonably heavy and hang it using a spring so that it looks like this:

The spring is a little stretched because the weight of the block (which is the result of its mass and the Earth’s gravity) is pulling downwards. (We’ll ignore the fact that the spring is also holding up its own weight. Let’s keep this simple…) However, it doesn’t fall to the floor because the spring is pulling upwards.

Now pull downwards on the block, so it will look like the example on the right in the figure below.

The spring is stretched because we’re pulling down on the block. The spring is also pulling upwards more, since it’s pulling against the weight of the block PLUS the force that you’re adding in a downwards direction.

Now you let go of the block. What happens?

The spring is pulling “too hard” on the block, so the block starts rising back to where it started (we’ll call that the “resting position”). However, when it gets there, it has inertia (a body in motion tends to stay in motion… until it hits something big…) so it doesn’t stop. As a result, it moves upwards, higher than the resting position. This squeezes the spring until it gets to some point, at which time the block stops, and then starts going back downwards. When it returns to the resting position, it still has inertia, so it passes that point and goes too far down again. I’ve shown this as a series of positions from left to right in the figure below.

If there were no friction, no air around the block, and no friction within the metal molecules of the spring, then this would bounce up and down forever.

However, there is friction, so some of the movement (“kinetic energy”) is turned into heat and lost. So, each bounce gets smaller and smaller and the maximum velocity of the block (as it passes the resting position) gets lower and lower, until, eventually, it stops moving (at the resting position, where it started).

Notice that I changed the colour of the spring to show when it’s more stretched (lighter blue) and when it’s more compressed (darker blue).

If everything were behaving perfectly, then the RATE at which the bounce repeats wouldn’t change. Only its amplitude (or the excursion of the block, or the height of the bounce) would reduce over time. That bounce rate (let’s say 1 bounce per second, and by “bounce” I mean a full cycle of movement down, up, and back down to where it started again) is the frequency of the repetition (or oscillation).
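If you want to check this for yourself, here is a rough Python simulation of a block on a spring with some friction. It steps forward in tiny increments of time and keeps track of how long each bounce takes and how high each one gets. The mass, stiffness and friction values are arbitrary assumptions (chosen so the block bounces about once per second), and the friction is modelled as being proportional to the block’s velocity, which is a simplification.

```python
def simulate_bounce(mass=1.0, stiffness=39.48, friction=0.5,
                    start_offset=1.0, dt=1e-4, duration=5.0):
    """Simulate a block on a spring with friction (semi-implicit Euler).

    Returns (periods, peaks): the time between successive passes through
    the resting position in the positive direction, and the largest
    positive displacement reached in each of those cycles.
    """
    x, v = start_offset, 0.0
    crossings, peaks = [], []
    peak = 0.0
    for step in range(round(duration / dt)):
        accel = (-stiffness * x - friction * v) / mass
        v += accel * dt
        x_new = x + v * dt
        if x < 0.0 <= x_new:        # passing the resting position
            if crossings:
                peaks.append(peak)  # highest point of the cycle just ended
            crossings.append((step + 1) * dt)
            peak = 0.0
        peak = max(peak, x_new)
        x = x_new
    periods = [b - a for a, b in zip(crossings, crossings[1:])]
    return periods, peaks

periods, peaks = simulate_bounce()
print("time per bounce:", [round(p, 3) for p in periods])
print("height of bounce:", [round(p, 3) for p in peaks])
```

The time per bounce stays the same from one cycle to the next, but each bounce is smaller than the one before it.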

If you make the weight lighter, then it will bounce faster (because the spring can pull the weight more “easily”). If you make the spring stiffer, then it will also bounce faster (because the spring pulls back harder for the same amount of stretch). So, we can change the frequency of the oscillation by changing the weight of the block or the stiffness of the spring.
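For an ideal mass on a spring (no friction), the standard formula for this is that the frequency equals the square root of the stiffness divided by the mass, all divided by 2π. Here is that formula as a minimal Python sketch; the function name and the example values are just my own picks for illustration.

```python
import math

def resonant_frequency_hz(mass_kg, stiffness_n_per_m):
    """Natural frequency of an ideal (frictionless) mass on a spring."""
    return math.sqrt(stiffness_n_per_m / mass_kg) / (2 * math.pi)

baseline = resonant_frequency_hz(1.0, 40.0)   # about 1 bounce per second
lighter = resonant_frequency_hz(0.5, 40.0)    # half the mass
stiffer = resonant_frequency_hz(1.0, 80.0)    # twice the stiffness

print(round(baseline, 3), round(lighter, 3), round(stiffer, 3))
```

Notice that halving the mass and doubling the stiffness give the same result: it’s only the ratio of the two that sets the frequency.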

Now take a look at the same weight on a spring next to an upside-down wine bottle that (sadly) has been emptied of wine.

Notice that I’ve added some colours to the air inside the bottle. The air in the bottle itself is blue, just like the colour of the spring. This is because, if we pull air out of the bottle (downwards), the air inside it will pull back (upwards; just like the metal spring pulling back upwards on the block). I’ve made the small cylinder of air in the neck of the bottle red, just like the block. This is because that air has some mass, and it’s free to move upwards (into the bottle) and downwards (out of the bottle) just like the block.

If I were somehow able to pull the “plug” of air out of the neck of the bottle, the air inside would try to pull it back in. If I then “let go”, the plug would move inwards, go too far (because it also has inertia), squeezing (or compressing) the air inside the bottle, which would then push the plug back out. This is shown in the figure below.

At the level we’re dealing with, this behaviour is practically identical to the behaviour of the block on the spring. In other words, although the block and the plug are made of different materials, and although the metal spring and the air inside the bottle are different materials, Figures 3 and 5 show the same behaviour of the same kind of system.

How do you pull the plug of air out of the bottle? It’s probably easier to start by pushing it inwards instead, by blowing across the top.

When you do this, a little air leaks into the opening, pushing the plug inwards. The “spring” in the bottle then pushes the plug outwards, and your cycle has started. If you wanted to do the same thing with the block, you’d lift it and let go to start the oscillation.

However, you don’t need to blow across the bottle to make it oscillate. You can just tap it with the palm of your hand, for example. Or, if you put the bottle next to your ear and listen carefully, you’ll hear a note “singing along” with the sound in the room. This is because the air in the bottle resonates; it moves back and forth very easily at the frequency that’s determined by the mass of the air in the neck and the volume of air in the bottle (the spring).
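The usual textbook formula for that frequency (the Helmholtz resonance) depends on the speed of sound, the cross-sectional area and length of the neck, and the volume of the bottle. Here it is as a hedged Python sketch. The bottle dimensions are my own rough guesses for a wine bottle, and I’m ignoring the “end correction” that makes the plug of air behave as if it were a bit longer than the neck, so treat the result as a ballpark figure only.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature

def helmholtz_frequency_hz(cavity_volume_m3, neck_radius_m, neck_length_m):
    """Resonant frequency of an idealised Helmholtz resonator.

    The air in the neck is the "mass" and the air in the cavity is the
    "spring". End corrections for the neck are ignored.
    """
    neck_area = math.pi * neck_radius_m ** 2
    return (SPEED_OF_SOUND / (2 * math.pi)) * math.sqrt(
        neck_area / (cavity_volume_m3 * neck_length_m))

# Guessed dimensions: 0.75 litres, 9 mm neck radius, 8 cm neck length
bottle = helmholtz_frequency_hz(0.00075, 0.009, 0.08)
print(round(bottle), "Hz")
```

With these guessed dimensions, the answer comes out at a little over 100 Hz. A bigger bottle (a softer spring) or a longer neck (a heavier plug of air) will both lower the note.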

However, remember that friction can make the oscillation decay (or die away) faster, by turning the movement into heat.

One last thing…

There’s another way to get either the block or the wine bottle oscillating:

You can move the TOP of the spring (for example, if you pull it up, then the spring will pull the block upwards, and it’ll start bouncing). Or, you could tap the bottom of the wine bottle (which is on the top in my drawings).

This method of starting the oscillation will come in handy in part 2.

earfluff and eyecandy

mostly audio, but with some other stuff occasionally

Author: geoff