Loudspeaker Crossovers: Part 6

Up to now, we’ve been looking at two-way crossovers with different implementation types, analysing the responses of the two individual outputs and the total summed output as if we just mixed the two frequency bands electrically. This analysis shows us what the crossover does in isolation, but this is just a small portion of what’s happening in real life.

Let’s now start by including some real-world implications into the mix to see what happens.

For this posting, I won’t be just adding the two outputs of the high- and low-pass filter paths. We’re now going to pretend that the outputs of those two paths are connected to two point-source loudspeakers floating in infinite space. Since they’re both point sources, each one has a flat magnitude response and a flat phase response relative to its input, and these two characteristics are true in all directions. They also have no frequency limits. So, although I’m calling one a “tweeter” and the other a “woofer”, they don’t behave like real loudspeakers.

Using this kind of model allows me to analyse the implications of the differences in the distances to the microphone (or listening) position, in combination with the characteristics of the crossover.

Figure 6.1.

Figure 6.1 shows a schematic diagram of the system that we’re analysing in this posting. As you can see, the “tweeter” and “woofer” are separated by some vertical distance. The microphone position is at some distance from the centre of the two loudspeakers (the radius of the big semi-circle in the drawing), and at some angle above or below the on-axis angle to the loudspeaker pair. In my analyses, negative angles are below the horizon, and positive angles are above.

If the angle is 0º, then the distances to the tweeter and the woofer are identical, and the result is the same as the plots I’ve shown in Parts 2, 3, 4, and 5. However, if the angle to the microphone goes positive, then this means that the woofer’s signal will be delayed relative to the tweeter’s, and this will have some effect on the way the two signals interfere with each other when they are summed.

This change in interference results in a change in the magnitude response of the summed signals at the microphone as a function of the angle. So, another way to consider this is that we’re changing the directivity of the loudspeaker pair.
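To make that geometry concrete, here is a minimal Python sketch (the function name and the 343 m/s speed of sound are my assumptions) that finds how late the woofer’s signal arrives relative to the tweeter’s:

```python
import math

C = 343.0  # assumed speed of sound in air, m/s

def woofer_delay(sep, radius, angle_deg):
    """Extra time of flight of the woofer's signal relative to the
    tweeter's. The mic sits `radius` metres from the midpoint of the
    pair, `angle_deg` degrees above the horizon; the tweeter is sep/2
    above the midpoint and the woofer sep/2 below it."""
    a = math.radians(angle_deg)
    x, y = radius * math.cos(a), radius * math.sin(a)
    d_tweeter = math.hypot(x, y - sep / 2)
    d_woofer = math.hypot(x, y + sep / 2)
    return (d_woofer - d_tweeter) / C  # seconds; positive = woofer late

# 86.25 cm separation (0.25 wavelength at 100 Hz), mic at 3 m, +30 degrees:
dt = woofer_delay(0.8625, 3.0, 30.0)
phase_at_xover = 360.0 * 100.0 * dt  # resulting phase offset at 100 Hz
```

At 0º the delay is zero; positive angles delay the woofer and negative angles delay the tweeter, matching the sign convention above.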

For all of the plots below, I’ve shown the responses at angles in 30º increments from -90º to 90º. As I’ve said above, the 0º plot should be identical to the plot for the same crossover type shown in one of the previous postings.

Of course, a change in the separation between the two drivers will change the amount of effect on the magnitude response when the angle to the microphone is not 0º. For these plots, I’ve decided to keep the crossover frequency at 100 Hz, to maintain consistency, and to plot the responses for 3 example loudspeaker separations: 43.125 cm (0.125 * wavelength at 100 Hz), 86.25 cm (0.25 * wavelength at 100 Hz), and 1.725 m (0.5 * wavelength at 100 Hz).

The point of these is not really to give “real world” suggestions, but to show tendencies…
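If you’d like to reproduce the general shape of these plots, here’s a rough Python model of the chain for the first case below: two point sources, the analytic response of a 2nd-order Butterworth crossover with the tweeter’s polarity inverted, and a microphone at some angle. The names and the 343 m/s speed of sound are my own assumptions; this is a sketch of the tendency, not my actual plotting code.

```python
import cmath, math

C = 343.0   # assumed speed of sound, m/s
FC = 100.0  # crossover frequency, Hz

def butter2(f, fc):
    """Analytic low-pass and high-pass responses of a 2nd-order
    Butterworth pair with the same cutoff frequency."""
    s = 1j * f / fc
    den = s * s + math.sqrt(2) * s + 1
    return 1 / den, s * s / den

def summed_db(f, sep, radius, angle_deg):
    """Magnitude (dB) of woofer + tweeter at the mic for the 2nd-order
    Butterworth crossover with the tweeter's polarity inverted."""
    a = math.radians(angle_deg)
    x, y = radius * math.cos(a), radius * math.sin(a)
    d_tweeter = math.hypot(x, y - sep / 2)
    d_woofer = math.hypot(x, y + sep / 2)
    lp, hp = butter2(f, FC)
    w = 2 * math.pi * f
    total = (lp * cmath.exp(-1j * w * d_woofer / C)
             - hp * cmath.exp(-1j * w * d_tweeter / C))
    return 20 * math.log10(abs(total))

# 0.25 wavelength separation at 100 Hz, mic at 3 m:
on_axis = summed_db(100.0, 0.8625, 3.0, 0.0)    # the familiar +3 dB bump
off_axis = summed_db(100.0, 0.8625, 3.0, 60.0)  # a different level off-axis
```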

2nd-order Butterworth

Figure 6.2. 100 Hz, 2nd-Order Butterworth
Separation = 0.125 * wavelength at 100 Hz
Figure 6.3. 100 Hz, 2nd-Order Butterworth
Separation = 0.25 * wavelength at 100 Hz
Figure 6.4. 100 Hz, 2nd-Order Butterworth
Separation = 0.5 * wavelength at 100 Hz

Not surprisingly, the greater the separation between the loudspeaker drivers, the bigger the effect on the magnitude response off-axis. Notice, however, that the effect is not symmetrical. In other words, the magnitude responses at -90º and 90º are not the same. This is because the relative phase responses of the two filter paths (also remembering that the tweeter’s signal is flipped in polarity) have an effect on the sum of the two signals at different points in space.

Hopefully, it’s clear that if the crossover had been at a different frequency, the characteristics of the magnitude responses would have been the same – they would have just moved in frequency. This is because I’m expressing the separation between the two loudspeaker drivers as a fraction of the wavelength of the crossover frequency.

And, of course, you don’t need to email me to remind me that a loudspeaker separation of 1.725 m is silly. As I said, the point of this is NOT to help you design a loudspeaker, it’s to show the characteristics and the tendencies. (On the other hand, 1.725 m between a subwoofer and a main loudspeaker is not silly… so there…)

4th-order Linkwitz-Riley

Figure 6.5. 100 Hz, 4th-order Linkwitz-Riley
Separation = 0.125 * wavelength at 100 Hz
Figure 6.6. 100 Hz, 4th-order Linkwitz-Riley
Separation = 0.25 * wavelength at 100 Hz
Figure 6.7. 100 Hz, 4th-order Linkwitz-Riley
Separation = 0.5 * wavelength at 100 Hz

Notice here that the magnitude responses never go above 0 dB at any angle. It’s also interesting that at smaller separations, the difference in the magnitude response as a function of angle is smaller than that for the 2nd-order Butterworth crossover.

2nd-order Linkwitz-Riley

Figure 6.8. 100 Hz, 2nd-order Linkwitz-Riley
Separation = 0.125 * wavelength at 100 Hz
Figure 6.9. 100 Hz, 2nd-order Linkwitz-Riley
Separation = 0.25 * wavelength at 100 Hz
Figure 6.10. 100 Hz, 2nd-order Linkwitz-Riley
Separation = 0.5 * wavelength at 100 Hz

Constant Voltage

As I mentioned in the previous posting, there are many ways to implement a constant voltage crossover. The plots below show analyses of the same crossover as the one I showed in Part 5 – using a 2nd-order Butterworth for the high-pass section, and subtracting that from the input to create the low-pass section.

Figure 6.11. 100 Hz, Constant Voltage
Separation = 0.125 * wavelength at 100 Hz
Figure 6.12. 100 Hz, Constant Voltage
Separation = 0.25 * wavelength at 100 Hz
Figure 6.13. 100 Hz, Constant Voltage
Separation = 0.5 * wavelength at 100 Hz

One thing to notice here is that, although we saw in Part 5 that a constant voltage crossover’s output is identical to its input, that’s only true for the hypothetical case where we simply summed the outputs. You’ll notice that, at a microphone angle of 0º in this still-hypothetical example, the total magnitude response is still flat. However, at other angles between -60º and 60º, the change in magnitude response is much larger than it is for the other crossover types. Therefore, if you jumped to the conclusion in the previous posting that a constant voltage design is the winner, you might want to reconsider, unless you live in a room that extends to infinite space without any walls (or a perfect anechoic chamber) and you only listen on-axis to the loudspeaker.

Just sayin’…

P.S.

In case you’re wondering, it’s also possible to look at the effects of summing the outputs of the two loudspeakers without including a crossover in the signal path. The result of this is that you have two full-range drivers, whose only difference at the microphone position is the time of arrival as a function of the angle of the microphone relative to the “horizon”. This results in two big differences in what you see above:

  • The total output when the interference is constructive is +6 dB relative to the input. This happens because the two signals are identical and, at some frequencies and some microphone positions, they simply add together.
  • The interference extends to a much wider frequency band, since both loudspeakers’ signals are interfering with each other at all frequencies.
Figure 6.14. No crossover
Separation = 0.125 * wavelength at 100 Hz

Figure 6.16. No crossover
Separation = 0.5 * wavelength at 100 Hz
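A hedged sketch of that comb-filter behaviour, for two identical full-range sources (the function name and the 343 m/s speed of sound are my assumptions):

```python
import cmath, math

C = 343.0  # assumed speed of sound, m/s

def two_source_db(f, path_diff):
    """Level (dB re. one source) of two identical full-range point
    sources whose path lengths to the mic differ by `path_diff` metres,
    assuming equal levels at the mic."""
    phi = 2 * math.pi * f * path_diff / C
    return 20 * math.log10(abs(1 + cmath.exp(-1j * phi)))

d = C / 200.0                   # half a wavelength at 100 Hz
peak = two_source_db(200.0, d)  # arrivals one full period apart: +6 dB
dip = two_source_db(100.0, d)   # arrivals half a period apart: deep null
```

Because the sources are full-range, the +6 dB peaks and the nulls repeat all the way up the frequency axis rather than being confined to the crossover region.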

Loudspeaker Crossovers: Part 5

Constant Voltage design

The four crossover types we’ve looked at so far all use the same basic concept: take the input signal and divide it into different frequency bands using some kind of filters that are implemented in parallel. You send the input to a high pass filter to create the high-frequency output, and you send the same input to a low-pass filter to create the low-frequency output.

All of the examples we’ve seen so far, because they are based on Butterworth sections, incur some kind of phase shift with frequency. We’ll talk about this more later. However, the fact that this phase shift exists bothers some people.

There are various ways to make a crossover that, when you sum its outputs, result in a total that is NOT phase shifted relative to the input signal. The general term for this kind of design is a “Constant Voltage” crossover (see this AES paper by Richard Small for a good discussion about constant voltage crossover design).

Let’s look at just one example of a constant voltage crossover to see how it might be different from the ones we’ve looked at so far. To create this particular example, I take the input signal and filter it using a 2nd-order Butterworth high pass. This is the high-frequency output of the crossover. To create the low-frequency output of the crossover, I subtract the high-frequency output from the input signal. This is shown in the block diagram below in Figure 5.1.

Figure 5.1. One example of a constant voltage crossover.

As with the previous four crossovers, I’ve added the two outputs of the crossover back together to look at the total result.

Figure 5.2: the magnitude and phase responses of the two sections of the crossover.

Figure 5.2 shows the magnitude and phase responses of the high- and low-frequency portions of the crossover. One thing that’s immediately noticeable there is that the two portions are not symmetrical as they have been in the previous crossover types. The slopes of the filters don’t match, the low-pass component has a bump that goes above 0 dB before it starts dropping, and their phase responses do not have a constant difference independent of frequency. They’re about 180º apart in the low end, and only about 90º in the high end.

However, because the low-frequency output was created by subtracting the high-frequency component from the input, when we add them back together, we just get back what we put in, as can be seen in Figure 5.3.

Figure 5.3. The magnitude and phase responses of the summed output of the crossover shown in Figure 5.1.

Essentially, this shows us that Output = Input, which is, hopefully, not surprising.
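That identity is easy to check numerically. Here’s a small Python sketch using the analytic response of a 2nd-order Butterworth high-pass (the function name is mine); it also shows the low-pass section’s bump above 0 dB mentioned earlier:

```python
import math

def cv_sections(f, fc):
    """Constant voltage crossover built from a 2nd-order Butterworth
    high-pass; the low-frequency section is (input - high-pass)."""
    s = 1j * f / fc
    hp = s * s / (s * s + math.sqrt(2) * s + 1)
    lp = 1 - hp  # derived by subtraction, per the block diagram
    return lp, hp

# The two sections always sum back to the input (0 dB, 0 degrees)...
lp, hp = cv_sections(250.0, 100.0)
total = lp + hp  # == 1 at any frequency, by construction
# ...but the derived low-pass rises above 0 dB just below the crossover:
bump_db = 20 * math.log10(abs(cv_sections(70.0, 100.0)[0]))
```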

If we then run our three sinusoidal signals through this crossover and look at the summed output, the results will look like Figures 5.4 to 5.6.

Figure 5.4: Row 1: the input (10 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output
Figure 5.5: Row 1: the input (100 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output
Figure 5.6: Row 1: the input (1 kHz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output

Notice in all three of those figures that the outputs and the inputs are identical, even though the individual behaviours of the two frequency-limited outputs might be temporarily weird (look at the start of the signals of the high-frequency output in Figures 5.4 and 5.6 for example…)

Now, don’t go jumping to conclusions… Just because the sum of the output is identical to the input of a constant voltage crossover does NOT make this the winner. We’re just getting started, and so far, we have only considered a very simple aspect of crossovers that, although necessary to understand them, is just the beginning of considering what they do in the real world.

Up to now, we have really only been thinking about crossovers in three dimensions: Frequency, Magnitude, and Phase. Starting in the next posting, we’ll add three more dimensions (X,Y, and Z of physical space) to see how, even a simple version of the real world makes things a lot more complicated.

Loudspeaker Crossovers: Part 4

2nd-order Linkwitz-Riley

A 2nd-order Linkwitz-Riley crossover is something like a hybrid of the previous two crossover types that I’ve described. If you’re building one, then the “helicopter view” block diagram looks just like the one for the 4th-order Linkwitz-Riley, but I’ve shown it here again anyway.

Figure 4.1

The difference between a 2nd-order and a 4th-order Linkwitz-Riley is in the details of exactly what’s inside those blocks called “HPF” and “LPF”. In the case of a 2nd-order crossover, each block contains a 1st-order Butterworth filter, and they all have the same cutoff frequency. (For a 4th-order Linkwitz-Riley, the filters are all 2nd-order Butterworths.)

Since each of those filters will attenuate the signal by 3 dB at the cutoff frequency, the total combined response for each section will be -6 dB at the crossover. This can be seen below in Figure 4.2. Also, the series combination of the two 1st-order Butterworths means that the high and low sections of the crossover will have a phase difference of 180º at all frequencies.

Figure 4.2

Since the two filter sections have a phase separation of 180º, we need to invert the polarity of the high-pass section. This means that, when the two outputs are summed as shown in Figure 4.1, the total magnitude response is flat, but the phase response is the same as a 2nd-order minimum phase allpass filter, as can be seen in Figure 4.3, below.

Figure 4.3
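These claims (the -6 dB crossover point, the 180º difference between the sections, and the flat magnitude of the polarity-inverted sum) can be checked from the analytic responses; a small Python sketch, with names of my own choosing:

```python
import cmath, math

def lr2_sections(f, fc):
    """2nd-order Linkwitz-Riley: each path is two 1st-order Butterworth
    filters in series, all with the same cutoff frequency."""
    s = 1j * f / fc
    lp1 = 1 / (s + 1)  # 1st-order Butterworth low-pass
    hp1 = s / (s + 1)  # 1st-order Butterworth high-pass
    return lp1 * lp1, hp1 * hp1

lp, hp = lr2_sections(100.0, 100.0)
lp_db = 20 * math.log10(abs(lp))  # -6 dB at the crossover frequency
gap_deg = math.degrees(cmath.phase(hp) - cmath.phase(lp))  # 180 degrees
flat = abs(lp - hp)  # with the high-pass inverted, the summed magnitude is 0 dB
```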

If we then look at the low- mid- and high-frequency sinusoidal signals that have been passed through the crossover, the results look like those shown below in Figures 4.4, 4.5, and 4.6.

Figure 4.4: Row 1: the input (10 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output

As can be seen in Figure 4.4, for a very low frequency, the output is the same as the input: the magnitude is identical (as we would expect based on the Magnitude Response plot shown in Figure 4.3), and the phase difference of the output relative to the input is 0º.

Figure 4.5: Row 1: the input (100 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output

At the crossover frequency, shown in Figure 4.5, the output has shifted in phase relative to the input by 90º, but their magnitudes still match.

Figure 4.6: Row 1: the input (1 kHz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output

At a high frequency, the phase has shifted by 180º relative to the input.

One last thing. The dotted plots in Figures 4.4 to 4.6 are the signals magnified by a factor of 10 to make them easier to see when they’re low in level. There are two interesting ones to look at:

  • the very beginning of the black plot on the right of Figure 4.4. Notice that this one starts with a positive spike before it settles down into a sinusoid.
  • the red plot on the left in Figure 4.6. Notice that the signal goes positive, and stays positive for the full 5 ms.

We will come back later to talk about both of these points. The truth is that they’re not really important for now, so we’ll pretend that they didn’t look too weird.

Loudspeaker Crossovers: Part 3

Fourth-order Linkwitz-Riley

A fourth-order Linkwitz-Riley crossover is made using the same filters as in the 2nd-order Butterworth crossover described in the previous posting. The difference in implementation is that each path uses two 2nd-order filters in series. Again, all filters have the same cutoff frequency and, if you’re implementing them with biquads, the Q of all of them is 1/sqrt(2).

Figure 3.1

Since we have two high pass filters in series, then the total result is -6 dB at the cutoff frequency (since each of the two filters attenuates by 3 dB) and the slope of the filter is 24 dB per octave. This results in the magnitude and phase responses shown below in Figure 3.2.

Figure 3.2

One important thing to notice now is that the phase responses of the two filters are 360º apart at all frequencies. This is different from the second-order Butterworth crossover, in which the two outputs are 180º apart. So we won’t need to flip the polarity of anything to compensate for the phase difference.
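A quick numerical check of the -6 dB crossover point and the flat sum (which is discussed further below), using the analytic responses in Python, with names of my own choosing:

```python
import math

def lr4_sections(f, fc):
    """4th-order Linkwitz-Riley: each path is two 2nd-order Butterworth
    filters (Q = 1/sqrt(2)) in series."""
    s = 1j * f / fc
    den = s * s + math.sqrt(2) * s + 1
    return (1 / den) ** 2, (s * s / den) ** 2

lp, hp = lr4_sections(100.0, 100.0)
lp_db = 20 * math.log10(abs(lp))          # -6 dB at the crossover
total_db = 20 * math.log10(abs(lp + hp))  # 0 dB: flat, with no inversion
```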

As in the previous posting, let’s look at the signals that get through the crossover, and the total summed output for three input frequencies. This is shown in Figures 3.3, 3.4, and 3.5.

Figure 3.3: Row 1: the input (1 kHz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output
Figure 3.4: Row 1: the input (10 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output
Figure 3.5: Row 1: the input (100 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output

If you take a look at Figures 3.3 and 3.4 it appears that the total summed output of the crossover is in phase with the input at very low and very high frequencies. However, this is actually misleading. Take a look at Figure 3.5 and you’ll see that, when the input signal is the same frequency as the crossover frequency, the summed output is shifted by 180º relative to the input signal.

Figure 3.6

If we compare the summed output to the input, they are in-phase at very low frequencies. As the frequency increases, the phase of the summed output of the crossover gets later and later, passing 180º at the crossover frequency and approaching a shift of 360º in the high frequencies.

In other words, a 4th-order Linkwitz-Riley crossover by itself, when you sum the outputs of the filters as shown in Figure 3.1, has the same response as a 4th-order minimum phase allpass filter.

One extra thing to notice is that, since the high-pass and low-pass paths are 360º apart, and (partly) since they’re -6 dB at the crossover frequency, the magnitude response of the summed total is flat.

Loudspeaker Crossovers: Part 2

One way to look at the behaviour of a signal when it’s sent through a crossover is to pretend that the loudspeaker isn’t part of the system. Once-upon-a-time, I probably would have phrased this differently and said something like “pretend that the loudspeaker is perfect”, but, now that I’m older, my opinions about the definition of “perfect” have changed.

So, we’ll take a signal, send it to a two-way crossover of some kind, and then just add the two signals back together. This shows us one view of the behaviour of the crossover, which is good enough to deal with the basics for now. In a later posting in this series, we’ll look at a more multi-dimensional and therefore realistic view of what’s happening.

Figure 2.1

The block diagram above shows the signal flow that I used for all of the following plots in this posting.

Butterworth, 2nd-order (12 dB/octave)

Although the block diagram above shows that we have a high-pass and a low-pass filter to separate the signal into two frequency bands, there are a lot of details missing about the specific characteristics of those filters. There are many ways to make a high-pass filter, for example…

One common crossover type uses 2nd-order Butterworth filters, both with the same cutoff frequency. One way to implement these is to use biquads to make low-pass and high-pass filters with Q = 1/sqrt(2).

Fig. 2.2: The individual magnitude and phase responses of the low-pass (in red) and high-pass (in black)

Before we look at the output of the entire crossover after the two signals have been summed, let’s talk about the red and the black curves in the plots above.

The magnitude responses should not come as a surprise. The fact that I’m using 2nd-order filters means that the slope of the attenuation will be 12 dB per octave (or 40 dB per decade) once you get far enough away from the cutoff frequency. The fact that they also have a Q of 1/sqrt(2) (approximately 0.707) means that they will attenuate the signal by 3 dB at the cutoff frequency, and that there is no “bump” in the slope of the magnitude response.
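If you want to verify those two numbers yourself, here’s a small sketch using the analytic magnitude response of the low-pass section (the function name is mine):

```python
import math

def butter2_lp_db(f, fc):
    """Magnitude (dB) of a 2nd-order Butterworth low-pass (Q = 1/sqrt(2))."""
    s = 1j * f / fc
    return 20 * math.log10(abs(1 / (s * s + math.sqrt(2) * s + 1)))

at_cutoff = butter2_lp_db(100.0, 100.0)  # -3 dB at the cutoff frequency
# Far above the cutoff, the slope settles to about -12 dB per octave:
slope = butter2_lp_db(6400.0, 100.0) - butter2_lp_db(3200.0, 100.0)
```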

However, the phase responses might be a little confusing. Let’s take those separately:

For the high-pass filter (the black line), you can see that in the high-frequency band, where the magnitude response is a flat line at 0 dB (which means that the level of the output is equal to the level of the input), the phase shift is 0º.

Another way to look at this is to put a sine wave into the system and see what comes out, as shown in Figure 2.3 below. The top plot shows the input to the two filters. Since this sine wave has a period of 1 ms, then it’s a 1000 Hz tone.

The second row of plots shows the magnitude responses of the low-pass filter (in red, on the left) and the high-pass filter (in black, on the right). Notice the levels of these two curves at a frequency of 1000 Hz.

The third row of plots shows the actual outputs of the two filters. For now, we’ll only look at the output of the high-pass filter on the right. There are three things to notice about this plot:

  • After about 1 ms, the amplitude of this sine tone is the same as the one in the top plot.
  • The phase of this sine tone is the same as the one in the top plot. In other words (for example), they both pass the 0 line, heading positive at Time = 1 ms.
  • The start of the sine wave is a little weird. Notice that the positive peak is lower than expected and the first negative trough is BELOW the maximum-negative amplitude (it’s below a value of -1).
    We’ll ignore this for now, and come back to it later.

The fourth row shows the output of the two filters when they have been added together. Notice here that the output is almost identical to the input because it’s essentially just the contribution of the high-pass filter. The low-pass filter has so little output that it’s practically irrelevant.

Figure 2.3: Row 1: the input (1 kHz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output

Let’s now look at what happens if we put in a low-frequency sine wave instead. This is shown in Figure 2.4.

Figure 2.4 Row 1: the input (10 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. (the dotted line shows the signal with its amplitude multiplied by 10 to “zoom in” on it) Row 4: the summed output

Notice now that the time scale is 100 times longer. The sine wave now has a period of 100 ms, so it’s a 10 Hz sine wave.

We’ll focus on the third row of plots again, still looking only at the output of the high-pass filter on the right. There are three things to notice about this plot:

  • After about 100 ms, the amplitude of this sine tone (the solid black line) is MUCH lower than the amplitude of the input. The dotted line is a “magnified” version of the same signal so that we can see it for the phase comparison.
  • The phase of this sine tone is shifted by 180º relative to the top plot. In other words (for example), at Time = 200 ms they both pass the 0 line, but this signal is going negative when the input is going positive.
  • The start of the sine wave is also weird, but differently so, with that spike at the beginning and the weird wiggle in the curve before it settles down.
    We’ll ignore this for now, and come back to it later.

If you go back and look at the low-pass filter’s output, then you’ll see basically the same behaviour, but at the opposite end of the frequency range.

And, again, the output is almost identical to the input because it’s essentially just the contribution of the low-pass filter. The high-pass filter has so little output that it’s practically irrelevant.

Now let’s look at what happens when the frequency of the input signal is on the cutoff frequency of the two filters – in other words, the crossover frequency.

Figure 2.5: Row 1: the input (100 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output

Now the sine wave has a period of 10 ms, so its frequency is 100 Hz.

Take a look at the third row of plots at Time = 10 ms.

The first thing to do is to compare the outputs of the two filters. The output of the low-pass filter on the left is negative, whereas the output of the high-pass filter on the right is positive. The outputs of the two filters are 180º out of phase with each other. This can also be seen in the plot back in Figure 2.2, where it’s shown that the difference between the red and the black phase response plots is 180º at all frequencies.

This is also why, once everything settles down, the sum of the two filters (the blue line on the bottom) is silence. The two signals have equal amplitude, and are 180º out of phase, so they cancel each other out.

Now compare those signal plots in the third row in Figure 2.5 to the input signal shown in the top plot. If you look at Time = 10 ms again (for example), you can see that the output of the low-pass filter is 90º behind the input. However, the output of the high-pass filter is 90º ahead of the input.

The fact that the phase of the output of the high-pass filter is ahead of its own input confuses many people, however, don’t panic. This does not mean that the output is ahead of the input in TIME. The high-pass filter cannot see into the future. The only reason its output can have a phase that precedes the phase of its input is if the sine wave has been playing for a long time (and, in this case a “long” time can be measured in milliseconds…).

This confusion is the result of two things:

  • People are typically taught the concept of phase as it relates to time. However, if you’re talking about a sine wave, then you are implying infinite time. In order for a signal to be a REAL sine wave, it must have always been playing and it must never stop. If it started or stopped, then there are other frequencies present, and so it’s not a theoretically-perfect sine wave.
  • We use the words “ahead” and “behind” or “earlier” and “later” to describe the phase relationships, and these words typically imply a time relationship.

Maybe a rough analogy that can help is to walk next to a friend, at the same speed, but do not synchronise your steps. You will both arrive at the same place at the same time, but at two different moments in the cycles of your footsteps.

Of course, if you make a crossover like this, it won’t work very well, since you get that cancellation at the crossover frequency when the two filters’ outputs are added together. If we plot the summed response’s magnitude and phase characteristics, they look like the plots shown in Figure 2.6.

Figure 2.6: the magnitude and phase responses of the total shown in Figure 2.5.

As you can see there, there is complete cancellation at the crossover frequency, and the phase response flips across that notch.

So, the solution with a 2nd-order Butterworth crossover is to assume that people won’t notice if you invert the polarity of the high-pass filter’s output. This is a good assumption that I will not argue with at all.

This polarity inversion “undoes” the 180º phase difference of the two filters seen in Figure 2.2, and the summed result is shown below in Figures 2.7 and 2.8.

Figure 2.7: the magnitude and phase responses of the total shown in Figure 2.8, which is the same as Figure 2.5 after the polarity of the HPF’s output has been inverted.
Figure 2.8: Row 1: the input (100 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. (Note that the polarity of the HPF’s output has been inverted) Row 4: the summed output

Now the outputs of the two filters appear to be in phase with each other. They are still 90º out of phase with the input, which means that their summed outputs are also 90º out of phase with the input. This can be seen in the bottom plots of Figures 2.7 and 2.8.

You’ll also notice that there is a 3 dB bump at the crossover frequency. This is because, at their cutoff frequencies, both filters attenuate by 3 dB (a linear gain of 0.707). When those two signals of equal amplitude and matching phase are added together, you get a magnitude that is 6 dB higher than either one alone (a linear gain of 1.41), which is 3 dB above the input. We’ll talk about this later when we start looking at the real world.
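Both the complete cancellation (without the inversion) and the +3 dB bump (with it) fall straight out of the analytic responses. A small Python check, with names of my own choosing:

```python
import math

def butter2_pair(f, fc):
    """Analytic responses of the 2nd-order Butterworth LP/HP pair."""
    s = 1j * f / fc
    den = s * s + math.sqrt(2) * s + 1
    return 1 / den, s * s / den

lp, hp = butter2_pair(100.0, 100.0)
cancelled = abs(lp + hp)                 # in polarity: complete null
bump_db = 20 * math.log10(abs(lp - hp))  # HP inverted: +3 dB at crossover
```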

Finally, take a look at the bottom plot in Figure 2.7. You can see there that the summed outputs of the two filters result in a phase shift that increases with frequency. In fact, when we look at a 2nd-order Butterworth crossover like this, without all the real-world implications of loudspeaker drivers that have their own characteristics and are separated in space, it can be seen that it acts as a 2nd-order minimum-phase allpass filter. This isn’t necessarily a bad thing, so don’t jump to conclusions this early…

We are STILL not going to talk about that weirdness at the beginning of the signal after it’s been filtered. That will come later.

Loudspeaker Crossovers: Part 1

What is a crossover?

A crossover is a set of filters that take an audio signal and separate it into different frequency portions or “bands”.

For example, possibly the simplest type of crossover will accept an audio signal at its input, and divide it into the high frequency and the low frequency components, and output those two signals separately. In this simple case, the filtering would be done with

  • a high-pass filter
    (which allows the high frequencies to pass through and increasingly attenuates the signal level as you go lower in frequency), and
  • a low-pass filter
    (which allows the low frequencies to pass through and increasingly attenuates the signal level as you go higher in frequency).

This would be called a “Two-way crossover” since it has two outputs.

Crossovers with more outputs (e.g. Three- or Four-way crossovers) are also common. These would probably use one or more band-pass filters to separate the mid-band frequencies.
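As an illustration of how a mid band might be made (this is my own sketch using 2nd-order Butterworth responses, not a recommendation for a real design):

```python
import math

def butter2_lp(f, fc):
    s = 1j * f / fc
    return 1 / (s * s + math.sqrt(2) * s + 1)

def butter2_hp(f, fc):
    s = 1j * f / fc
    return s * s / (s * s + math.sqrt(2) * s + 1)

def bandpass(f, f_low, f_high):
    """Mid band of a three-way split: a high-pass at the lower crossover
    frequency in series with a low-pass at the upper one."""
    return butter2_hp(f, f_low) * butter2_lp(f, f_high)

mid_db = 20 * math.log10(abs(bandpass(1000.0, 200.0, 5000.0)))  # ~0 dB
low_db = 20 * math.log10(abs(bandpass(50.0, 200.0, 5000.0)))    # attenuated
```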

Why do we need crossovers?

In order to understand why we might need a crossover in a loudspeaker, we need to talk about loudspeaker drivers, what they do well, and what they do poorly.

It’s nice to think of a loudspeaker driver like a woofer or a tweeter as a rigid piston that moves in and out of an enclosure, pushing and pulling air particles to make pressure waves that radiate outwards into the listening room. In many aspects, this simplified model works well, but it leaves out a lot of important information that can’t be ignored. If we could ignore the details, then we could just send the entire frequency range into a single loudspeaker driver and not worry about it. However, reality has a habit of making things difficult.

For example, the moving parts of a loudspeaker driver have a mass that is dependent on how big it is and what it’s made of. The loudspeaker’s motor (probably a coil of wire living inside a magnetic field) does the work of pushing and pulling that mass back and forth. However, if the frequency that you’re trying to produce is very high, then you’re trying to move that mass very quickly, and inertia will work against you. In fact, if you try to move a heavy driver (like a woofer) a lot at a very high frequency, you will probably wind up just burning out the motor (which means that you’ve melted the wire in the coil) because it’s working so hard.

Another problem is that of loudspeaker excursion: how far the driver moves in and out in order to make sound. Although it’s not commonly known, the acoustic output level of a loudspeaker driver is proportional to its acceleration (the change in its velocity over time, which depends on both its excursion and the frequency it’s producing). The short version of this relationship is that, if you want to maintain the same output level and you double the frequency, the driver’s excursion reduces to 1/4. In other words, if you’re playing a signal at 1000 Hz and the driver is moving in and out by ±1 mm, then if you change to 2000 Hz, the driver should move in and out by only ±0.25 mm. Conversely, if you halve the frequency to 500 Hz, you have to move the driver in and out with an excursion of ±4 mm. If you go to 1/10 of the frequency, the excursion has to be 100x the original value. For normal loudspeakers, this kind of range of movement is impractical, if not impossible.

(Two plots here show the excursion-vs-frequency relationship. Note that both of them show the same thing; the only difference is the scaling of the Y-axis.)

One last example is that of directivity. The width of the beam of sound that is radiated by a loudspeaker driver is heavily dependent on the relationship between its size (assuming that it’s a circular driver, then its diameter) and the wavelength (in air) of the signal that it’s producing. If the wavelength of the signal is big compared to the diameter of the driver, then the sound will be radiated roughly equally in all directions. However, if the wavelength of the signal is similar to the diameter of the driver, then it will emit more of a “beam” of sound that is increasingly narrow as the frequency increases.
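As a rough rule of thumb following the paragraph above (this is my own back-of-envelope sketch, not a directivity model from the posting), a circular driver starts to “beam” above the frequency where the wavelength in air equals its diameter:

```python
# Rough rule of thumb: beaming starts where wavelength = driver diameter.
SPEED_OF_SOUND = 343.0  # m/s, in air at roughly 20 degrees C

def beaming_frequency_hz(diameter_m):
    """Frequency at which the wavelength in air equals the driver diameter."""
    return SPEED_OF_SOUND / diameter_m

# A 250 mm (roughly 10") woofer vs. a 25 mm (1") tweeter dome:
print(round(beaming_frequency_hz(0.25)))   # 1372 Hz
print(round(beaming_frequency_hz(0.025)))  # 13720 Hz
```

This is one way to see why a big woofer and a small tweeter are each comfortable in a different frequency band.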

So, if you want to keep from melting your loudspeaker driver’s voice coil, you’ll have to increasingly attenuate its input level at higher frequencies. If you want to avoid trying to push and pull your tweeter too far in and out, you’ll have to increasingly attenuate its input level at lower frequencies. And if you’re worried about the directivity of your loudspeaker, you’ll have to use more than one loudspeaker driver and divide up the signal into different frequency bands for the various outputs.

Loudspeaker Crossovers: Part 0 (Introduction)

I recently received a question in my inbox:

In passive crossovers, many are phase incoherent,
meaning that the phase shift of one frequency will be different than another frequency.
Do you agree?  Am curious how this is dealt with in the active crossover’s of B&O products?

At first, I debated just sending a quick email back with a short answer saying something pithy. But while I was thinking about what to write, I realised that:

  • This is actually a really good question / topic
  • I haven’t posted anything about crossovers in a long time
  • I’ve learned a lot about crossovers since the last time I did post something
  • I still have a LOT more to learn about crossovers.

As a result, this will be the first in what I expect to be a long series of postings about loudspeaker crossovers, starting with basic questions like

  • Why do we use them?
  • What do we think they do?
  • What do they really do? and
  • How are the ones we implement these days different from the ones you read about in old textbooks?

As usual, I’ll probably get distracted and wind up going down more than one rabbit hole along the way… But that’s one of the reasons why I’m doing this – to find out where I wind up, and hopefully to meet some new rabbits along the way.

The Sound of Music

This episode of The Infinite Monkey Cage is worth a listen if you’re interested in the history of recording technologies.

There’s one comment in there by Brian Eno that I COMPLETELY agree with. He mentions that we invented a new word for moving pictures: “movies” to distinguish them from the live equivalent, “plays”. But we never really did this for music… Unless, of course, you distinguish listening to a “concert” from listening to a “recording” – but most of us just say “I’m listening to music”.

Bit depth conversion: Part 4

Converting floating point to fixed point

It is often the case that you have to convert a floating point representation to a fixed point representation. For example, you’re doing some signal processing like changing the volume or adding equalisation, and you want to output the signal to a DAC or a digital output.

The easiest way to do this is to just send the floating point signal into the DAC or the S/PDIF transmitter and let it look after things. However, in my experience, you can’t always trust this. (I’ll explain why in a later posting in this series.) So, if you’re a geek like me, then you do this conversion yourself in advance to ensure you’re getting what you think you’re getting.

To start, we’ll assume that, in the floating point world, you have ensured that your signal is scaled in level to have a maximum amplitude of ±1.0. In floating point, it’s possible to go much higher than this, and there’s no serious reason to worry about going much lower (see this posting). However, we’ll work with the assumption that we’re around that level.

So, if you have a 0 dB FS sine wave in floating point, then its maximum and minimum will hit ±1.0.

Then, we have to convert that signal with a range of ±1.0 to a fixed point system that, as we already know, is asymmetrical. This means that we have to be a little careful about how we scale the signal to avoid clipping on the positive side. We do this by multiplying the ±1.0 signal by 2^(nBits-1)-1 if the signal is not dithered. (Pay heed to that “-1” at the end of the multiplier.)

Let’s do an example of this, using a 5-bit output to keep things on a human scale. We take the floating point values and multiply each of them by 2^(5-1)-1 (or 15). We then round the results to the nearest integer value and save them as two’s complement binary values. This is shown below in Figure 1.

Figure 1. Converting floating point to a 5-bit fixed point value without dither.

As should be obvious from Figure 1, we will never hit the bottom-most fixed point quantisation level (unless the signal is asymmetrical and actually goes a little below -1.0).
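The non-dithered conversion can be sketched in a few lines. (This is in Python rather than the Matlab I normally use; `float_to_fixed` is just an illustrative name, and note that Python’s `round()` rounds ties to the nearest even integer, which differs slightly from some other environments.)

```python
# Scale a +/-1.0 floating point sample by 2^(nBits-1)-1 and round
# to the nearest integer (no dither).
def float_to_fixed(x, n_bits=5):
    scale = 2 ** (n_bits - 1) - 1  # 15 for a 5-bit output
    return round(x * scale)

print(float_to_fixed(1.0))   # 15: the top of the 5-bit range
print(float_to_fixed(-1.0))  # -15: the bottom level, -16, is never hit
print(float_to_fixed(0.2))   # 3
```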

If you choose to dither your audio signal, then you’re adding a white noise signal with an amplitude of ±1 quantisation level after the floating point signal is scaled and before it’s rounded. This means that you need one extra quantisation level of headroom to avoid clipping as a result of having added the dither. Therefore, you have to multiply the floating point value by 2^(nBits-1)-2 instead (notice the “-2” at the end there…) This is shown below in Figure 2.

Figure 2. Converting floating point to a 5-bit fixed point value with dither.
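The dithered version can be sketched the same way. (I’m assuming a uniform probability distribution for the dither noise here, since the posting only specifies white noise with an amplitude of ±1 quantisation level; the function name is again just illustrative.)

```python
import random

# Scale by 2^(nBits-1)-2 (note the -2: one extra quantisation level of
# headroom), add +/-1 quantisation level of dither, then round.
def float_to_fixed_dithered(x, n_bits=5):
    scale = 2 ** (n_bits - 1) - 2  # 14 for a 5-bit output
    return round(x * scale + random.uniform(-1.0, 1.0))

# Even at full scale, the results stay within the 5-bit range of -16..15:
values = [float_to_fixed_dithered(1.0) for _ in range(1000)]
print(min(values) >= -16 and max(values) <= 15)  # True
```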

Of course, you can choose to not dither the signal. Dither was a really useful thing back in the days when we only had 16 reliable bits to work with. However, now that 24-bit signals are normal, dither is not really a concern.

Bit depth conversion: Part 2

Binary concatenation and bit splitting

In Part 1, I talked about different options for converting a quantised LPCM audio signal, encoded with some number of bits into an encoding with more bits. In this posting, we’ll look at a trick that can be used when you combine these options.

To start, I made two signals:

  • “Signal 1” is a sinusoidal tone with a frequency of 100 Hz.
    It has an amplitude of ±1, but I then encoded it as a quantised 8-bit signal, so in Figure 1 it looks like it has an amplitude of ±127 (which is 2^(nBits-1)-1)
  • “Signal 2” is a sinusoidal tone with a frequency of 1 kHz and the same amplitude as Signal 1.

Both of these two signals are plotted on the left side of Figure 1, below. On the right, you can see the frequency content of the two signals as well. Notice that there is plenty of “garbage” at the bottom of those two plots. This is because I just quantised the signals without dither, so what you’re seeing there is the frequency-domain artefacts of quantisation error.

Figure 1. Two sinusoidal waveforms with different frequencies. Both are 8-bit quantised without dither.

If I look at the actual sample values of “Signal 1” for the first 10 samples, they look like the table below. I’ve listed them in both decimal values and their binary representations. The reason for this will be obvious later.

Sample number | Sample value (decimal) | Sample value (binary)
            1 |                      0 | 00000000
            2 |                      2 | 00000010
            3 |                      3 | 00000011
            4 |                      5 | 00000101
            5 |                      7 | 00000111
            6 |                      8 | 00001000
            7 |                     10 | 00001010
            8 |                     12 | 00001100
            9 |                     13 | 00001101
           10 |                     15 | 00001111

Let’s also look at the first 10 sample values for “Signal 2”

Sample number | Sample value (decimal) | Sample value (binary)
            1 |                      0 | 00000000
            2 |                     17 | 00010001
            3 |                     33 | 00100001
            4 |                     49 | 00110001
            5 |                     63 | 00111111
            6 |                     77 | 01001101
            7 |                     90 | 01011010
            8 |                    101 | 01100101
            9 |                    110 | 01101110
           10 |                    117 | 01110101

The signals I plotted above have a sampling rate of 48 kHz, so there are a LOT more samples after the 10th one… however, for the purposes of this posting, the ten values listed in the tables above are plenty.

At the end of the Part 1, I talked about the Most and the Least Significant Bits (MSBs and LSBs) in a binary number. In the context of that posting, we were talking about whether the bit values in the original signal became the MSBs (for Option 1) or the LSBs (for Option 3) in the new representation.

In this posting, we’re doing something different.

Both of the signals above are encoded as 8-bit signals. What happens if we combine them by just slamming their two values together to make 16-bit numbers?

For example, if we look at sample #10 from both of the tables above:

  • Signal 1, Sample #10 = 00001111
  • Signal 2, Sample #10 = 01110101

If I put those two binary numbers together, making Signal 1 the 8 MSBs and Signal 2 the 8 LSBs, then I get

00001111 01110101

Note that I’ve put a space between the two halves just to make it easier to see them. The actual 16-bit value is 0000111101110101.

Just to keep things adequately geeky, you should know that “slamming their values together” is not the correct term for what I’ve done here. It’s called binary concatenation.

Another way to think about what I’ve done is to say that I converted Signal 1 from an 8-bit to a 16-bit number by zero-padding, and then I added Signal 2 to the result.

Yet another way to think of it is to say that I added about 48 dB of gain to Signal 1 (20*log10(2^8) = about 48.164799306236993 dB of gain to be more precise…) and then added Signal 2 to the result. (NB. This is not really correct, as is explained below.)

However, when you’re working with the numbers inside the computer’s code, it’s easier to just concatenate the two binary numbers to get the same result.
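The concatenation and the later splitting can be sketched like this. (This is my own illustration in Python; the function names are made up, and I’m assuming the 8-bit samples are stored as two’s complement values, as in the tables above.)

```python
def concat8(msb_sample, lsb_sample):
    """Pack two 8-bit two's complement samples into one 16-bit word."""
    return ((msb_sample & 0xFF) << 8) | (lsb_sample & 0xFF)

def split16(word):
    """Recover the two signed 8-bit samples from the 16-bit word."""
    def to_signed8(b):
        return b - 256 if b > 127 else b
    return to_signed8((word >> 8) & 0xFF), to_signed8(word & 0xFF)

# Sample #10 from the tables above: 15 (00001111) and 117 (01110101)
combined = concat8(15, 117)
print(combined)           # 3957
print(split16(combined))  # (15, 117): the original values come back out
```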

If you do this, what do you get? The result is shown in Figure 2, below.

Figure 2. The binary concatenated result of Signal 1 and Signal 2

As you can see there, the numbers on the y-axis are MUCH bigger. This is because of the bit-shifting done to Signal 1. The MSBs of a 16-bit number are 256 times bigger in decimal world than those of an 8-bit number (because 2^8 = 256).

In other words, the maximum value in either Signal 1 or Signal 2 is 127 (or 2^(8-1)-1) whereas the maximum value in the combined signal is 32767 (or 2^(16-1)-1).

The table below shows the resulting first 10 values of the combined signal.

Sample number | Sample value (decimal) | Sample value (binary)
            1 |                      0 | 0000000000000000
            2 |                    529 | 0000001000010001
            3 |                    801 | 0000001100100001
            4 |                   1329 | 0000010100110001
            5 |                   1855 | 0000011100111111
            6 |                   2125 | 0000100001001101
            7 |                   2650 | 0000101001011010
            8 |                   3173 | 0000110001100101
            9 |                   3438 | 0000110101101110
           10 |                   3957 | 0000111101110101

Why is this useful? Well, up to now, it’s not. But we have one trick left up our sleeve: we can split them apart again. Take that column of numbers on the right side of the table above, cut each one into two 8-bit values, and ta-da! We get out the two signals that we started with!

Just to make sure that I’m not lying, I actually did all of that and plotted the output in Figure 3. If you look carefully at the quantisation error artefacts in the frequency-domain plots, you’ll see that they’re identical to those in Figure 1. (Although, if they weren’t, then this would mean that I made a mistake in my Matlab code…)

Figure 3. The two signals after they’ve been separated once again.

So what?

Okay, this might seem like a dumb trick. But it’s not. This is a really useful trick in some specific cases: transmitting audio signals is one of the first ones to come to mind.

Let’s say, for example, that you wanted to send audio over an S/PDIF digital audio connection. The S/PDIF protocol is designed to transmit two channels of audio with up to 24-bit LPCM resolution. Yes, you can do different things by sending non-LPCM data (like DSD over PCM (DoP) or Dolby Digital-encoded signals, for example) but we won’t talk about those.

If you use this binary concatenation and splitting technique, you could, for example, send two completely different audio signals in each of the audio channels on the S/PDIF. For example, you could send one 16-bit signal (as the 16 MSBs) and a different 8-bit signal (as the LSBs), resulting in a total of 24 bits.

On the receiving end, you split the 24-bit values into the 16-bit and 8-bit constituents, and you get back what you put in.

(Or, if you wanted to get really funky, you could put the two 8-bit leftovers together to make a 16-bit signal, thus transmitting three lossless LPCM 16-bit channels over a stream designed for two 24-bit signals.)
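The 16-bit-plus-8-bit packing for one 24-bit channel can be sketched the same way as before (again, my own illustrative Python, with made-up function names, assuming two’s complement samples):

```python
def pack24(sample16, sample8):
    """Pack a 16-bit and an 8-bit two's complement sample into a 24-bit word."""
    return ((sample16 & 0xFFFF) << 8) | (sample8 & 0xFF)

def unpack24(word):
    """Split the 24-bit word back into its 16-bit and 8-bit signed samples."""
    def to_signed(v, bits):
        return v - (1 << bits) if v >= (1 << (bits - 1)) else v
    return to_signed((word >> 8) & 0xFFFF, 16), to_signed(word & 0xFF, 8)

word = pack24(-12345, 101)
print(unpack24(word))  # (-12345, 101): both samples survive the round trip
```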

However, if you DON’T split them, and you just play the 24-bit signal into a system, then that 8-bit signal is so low in level that it’s probably inaudible (since it’s at least 93 dB below the peak of the “main” signal). So, no noticeable harm done!

Hopefully, now you can see that there are lots of potential uses for this. It could be a sneaky way for a record label to put watermarking into an audio signal, for example. Or you could use it to send secret messages across enemy lines, buried under a recording of the Alvin and the Chipmunks’ cover of “Achy Breaky Heart”. Or you could use it for squeezing more than two channels out of an S/PDIF cable for multichannel audio playback.

One small issue…

Just to be clear, I actually used Matlab and did all the stuff I said above to make those plots. I didn’t fake it. I promise!

But if you’re looking carefully, you might notice two things that I also noticed when I was writing this.

I said above that, by bit-shifting Signal 1 over by 8 bits in the combined signal, this makes it 48 dB louder than Signal 2. However, if you look at the frequency domain plot in Figure 2, you’ll notice that the 1 kHz tone is about 60 dB lower than the 100 Hz tone. You’ll also notice that there are distortion artefacts on the 1 kHz signal at 3 kHz, 5 kHz and so on – but they’re not there in the extracted signal in Figure 3. So, what’s going on?

To be honest, when I saw this, I had no idea, but I’m lucky enough to work with some smart people who figured it out.

If you go back to the figures in Part 1, you can see that the MSB of a sample value in binary representation is used as the “sign” of the value. In other words, if that first bit is 0, then it’s a positive value. If it’s a 1 then it’s a negative value. This is known as a “two’s complement” representation of the signal.

When we do the concatenation of the two sample values as I showed in the example above, the “sign” bit of the signal that becomes the LSBs of the combined signal no longer behaves as a +/- sign. So, the truth is that, although I said above that it’s like adding the two signals – it’s really not exactly the same.
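You can see the discrepancy with a couple of small numbers (my own illustrative sketch, with arbitrary example values):

```python
def concat8(s1, s2):
    """Concatenate two 8-bit two's complement samples into 16 bits."""
    return ((s1 & 0xFF) << 8) | (s2 & 0xFF)

s1, s2 = 3, -5  # arbitrary example values; s2 is negative

by_concat = concat8(s1, s2)
by_addition = (s1 << 8) + s2  # bit-shift s1 and genuinely add s2

print(by_concat)                # 1019: 3*256 + 251, because -5 is read as unsigned 251
print(by_addition)              # 763:  3*256 - 5
print(by_concat - by_addition)  # 256:  the offset added whenever s2 < 0
```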

If we take the signal combined through concatenation and subtract ONLY the bit-shifted version of Signal 1, the result looks like this:

Figure 4. The difference between the combined signals shown in Figure 3 and Signal 1, after it’s been bit-shifted (or zero-padded) by 8 LSBs.

Notice that the difference signal has a period of 1 ms, therefore its fundamental is 1 kHz, which makes sense because it’s a weirdly distorted version of Signal 2, which is a 1 kHz sine tone.

However, that fundamental frequency has a lower level than the original sine tone (notice that it shows up at about -60 dB instead of -48 dB in Figure 2). In addition, it has a DC offset (no negative values) and it’s got to have some serious THD to be that weird looking. Since it’s a symmetrical waveform, its distortion artefacts consist of only odd multiples of the fundamental.

So, when I stated above that you’re “just” adding the two signals together, and that there’s no harm done if you don’t separate them at the receiving end, that was a lie. But if the signal carrying the MSBs has enough bits, you’ll get away with it, since this pushes the second signal even further down in level.