The four crossover types we’ve looked at so far all use the same basic concept: take the input signal and divide it into different frequency bands using filters implemented in parallel. You send the input to a high-pass filter to create the high-frequency output, and you send the same input to a low-pass filter to create the low-frequency output.
All of the examples we’ve seen so far, because they are based on Butterworth sections, incur some kind of phase shift with frequency. We’ll talk about this more later. However, the fact that this phase shift exists bothers some people.
There are various ways to make a crossover that, when you sum its outputs, results in a total that is NOT phase-shifted relative to the input signal. The general term for this kind of design is a “Constant Voltage” crossover (see this AES paper by Richard Small for a good discussion of constant voltage crossover design).
Let’s look at just one example of a constant voltage crossover to see how it might be different from the ones we’ve looked at so far. To create this particular example, I take the input signal and filter it using a 2nd-order Butterworth high-pass. This is the high-frequency output of the crossover. To create the low-frequency output of the crossover, I subtract the high-frequency output from the input signal. This is shown in the block diagram below in Figure 5.1.
Figure 5.1. One example of a constant voltage crossover.
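If you’d like to experiment with this structure, here’s a minimal sketch of the block diagram in Figure 5.1 in Python with scipy. The 48 kHz sampling rate and 1 kHz crossover frequency are my own choices for illustration, not values taken from the figures.

```python
import numpy as np
from scipy import signal

fs = 48000.0   # sampling rate in Hz (an assumption for this sketch)
fc = 1000.0    # crossover frequency in Hz (also an assumption)

# The 2nd-order Butterworth high-pass: this is the high-frequency output.
b_hp, a_hp = signal.butter(2, fc, btype='highpass', fs=fs)

rng = np.random.default_rng(0)
x = rng.standard_normal(4800)          # any input signal will do

high = signal.lfilter(b_hp, a_hp, x)   # high-frequency output
low = x - high                         # low-frequency output, by subtraction

# Because the low output was made by subtraction, the sum of the two
# outputs is exactly the input again:
print(np.allclose(low + high, x))      # True
```

Because the reconstruction is exact by construction, this check holds regardless of which filter you choose for the high-pass section.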
As with the previous four crossovers, I’ve added the two outputs of the crossover back together to look at the total result.
Figure 5.2: the magnitude and phase responses of the two sections of the crossover.
Figure 5.2 shows the magnitude and phase responses of the high- and low-frequency portions of the crossover. One thing that’s immediately noticeable there is that the two portions are not symmetrical as they have been in the previous crossover types. The slopes of the filters don’t match, the low-pass component has a bump that goes above 0 dB before it starts dropping, and their phase responses do not have a constant difference independent of frequency. They’re about 180º apart in the low end, and only about 90º in the high end.
However, because the low-frequency output was created by subtracting the high-frequency component from the input, when we add them back together, we just get back what we put in, as can be seen in Figure 5.3.
Figure 5.3. The magnitude and phase responses of the summed output of the crossover shown in Figure 5.1.
Essentially, this shows us that Output = Input, which is, hopefully, not surprising.
If we then run our three sinusoidal signals through this crossover and look at the summed output, the results will look like Figures 5.4 to 5.6.
Figure 5.4: Row 1: the input (10 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output
Figure 5.5: Row 1: the input (100 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output
Figure 5.6: Row 1: the input (1 kHz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output
Notice in all three of those figures that the outputs and the inputs are identical, even though the individual behaviours of the two frequency-limited outputs might be temporarily weird (look at the start of the signals of the high-frequency output in Figures 5.4 and 5.6 for example…)
Now, don’t go jumping to conclusions… Just because the sum of the output is identical to the input of a constant voltage crossover does NOT make this the winner. We’re just getting started, and so far, we have only considered a very simple aspect of crossovers that, although necessary to understand them, is just the beginning of considering what they do in the real world.
Up to now, we have really only been thinking about crossovers in three dimensions: Frequency, Magnitude, and Phase. Starting in the next posting, we’ll add three more dimensions (X,Y, and Z of physical space) to see how, even a simple version of the real world makes things a lot more complicated.
A 2nd-order Linkwitz-Riley crossover is something like a hybrid of the previous two crossover types that I’ve described. If you’re building one, then the “helicopter view” block diagram looks just like the one for the 4th-order Linkwitz-Riley, but I’ve shown it here again anyway.
Figure 4.1
The difference between a 2nd-order and a 4th-order Linkwitz-Riley is in the details of exactly what’s inside those blocks called “HPF” and “LPF”. In the case of a 2nd-order crossover, each block contains a 1st-order Butterworth filter, and they all have the same cutoff frequency. (For a 4th-order Linkwitz-Riley, the filters are all 2nd-order Butterworths.)
Since each of those filters will attenuate the signal by 3 dB at the cutoff frequency, the total combined response for each section will be -6 dB at the crossover. This can be seen below in Figure 4.2. Also, the series combination of the two 1st-order Butterworths means that the high and low sections of the crossover will have a phase difference of 180º at all frequencies.
Figure 4.2
Since the two filter sections have a phase separation of 180º, we need to invert the polarity of the high-pass section. This means that, when the two outputs are summed as shown in Figure 4.1, the total magnitude response is flat, but the phase response is the same as that of a 1st-order allpass filter, as can be seen in Figure 4.3, below.
Figure 4.3
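As a rough numerical check of the construction just described (my own sketch, not the implementation used to make the figures), here’s a Python version with scipy. The 48 kHz sampling rate is an assumption; the 100 Hz crossover matches the figures.

```python
import numpy as np
from scipy import signal

fs = 48000.0   # sampling rate (an assumption)
fc = 100.0     # crossover frequency, matching the figures

# Each block in the diagram is a 1st-order Butterworth filter:
b_lp, a_lp = signal.butter(1, fc, btype='lowpass', fs=fs)
b_hp, a_hp = signal.butter(1, fc, btype='highpass', fs=fs)

w = np.array([fc / 10, fc, fc * 10])
_, h_lp = signal.freqz(b_lp, a_lp, worN=w, fs=fs)
_, h_hp = signal.freqz(b_hp, a_hp, worN=w, fs=fs)

# Two identical 1st-order filters in series means the response is squared:
H_lp = h_lp ** 2   # 2nd-order Linkwitz-Riley low-pass section
H_hp = h_hp ** 2   # 2nd-order Linkwitz-Riley high-pass section

# Each section is -6 dB at the crossover frequency (2 x -3 dB):
print(20 * np.log10(abs(H_lp[1])))   # approx -6.02

# The two sections are 180 degrees apart, so we flip the polarity of the
# high-pass before summing; the summed magnitude response is then flat:
print(abs(H_lp - H_hp))              # approx [1. 1. 1.]
```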
If we then look at the low-, mid-, and high-frequency sinusoidal signals that have been passed through the crossover, the results look like those shown below in Figures 4.4, 4.5, and 4.6.
Figure 4.4: Row 1: the input (10 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output
As can be seen in Figure 4.4, for a very low frequency, the output is the same as the input: the magnitude is identical (as we would expect based on the magnitude response plot shown in Figure 4.3), and the phase difference of the output relative to the input is 0º.
Figure 4.5: Row 1: the input (100 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output
At the crossover frequency, shown in Figure 4.5, the output has shifted in phase relative to the input by 90º, but their magnitudes still match.
Figure 4.6: Row 1: the input (1 kHz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output
At a high frequency, the phase has shifted by 180º relative to the input.
One last thing. The dotted plots in Figures 4.4 to 4.6 are the signals magnified by a factor of 10 to make them easier to see when they’re low in level. There are two interesting ones to look at:
the very beginning of the black plot on the right of Figure 4.4. Notice that this one starts with a positive spike before it settles down into a sinusoid.
the red plot on the left in Figure 4.6. Notice that the signal goes positive, and stays positive for the full 5 ms.
We will come back later to talk about both of these points. The truth is that they’re not really important for now, so we’ll pretend that they didn’t look too weird.
A 4th-order Linkwitz-Riley crossover is made using the same filters as in the 2nd-order Butterworth crossover described in the previous posting. The difference in implementation is that you use two 2nd-order filters in series. Again, all filters have the same cutoff frequency and, if you’re implementing them with biquads, the Q of all of them is 1/sqrt(2).
Figure 3.1
Since we have two high pass filters in series, then the total result is -6 dB at the cutoff frequency (since each of the two filters attenuates by 3 dB) and the slope of the filter is 24 dB per octave. This results in the magnitude and phase responses shown below in Figure 3.2.
Figure 3.2
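To check those numbers, here’s a small sketch in Python with scipy (a sketch under assumed values: 48 kHz sampling rate, 1 kHz crossover). Each section is built as two 2nd-order Butterworth filters in series, and we evaluate the responses at a few frequencies.

```python
import numpy as np
from scipy import signal

fs = 48000.0   # sampling rate (an assumption)
fc = 1000.0    # crossover frequency (an assumption)

# One 2nd-order Butterworth section of each kind (Q = 1/sqrt(2)):
b_lp, a_lp = signal.butter(2, fc, btype='lowpass', fs=fs)
b_hp, a_hp = signal.butter(2, fc, btype='highpass', fs=fs)

w = np.array([fc / 4, fc, fc * 4])
_, h_lp = signal.freqz(b_lp, a_lp, worN=w, fs=fs)
_, h_hp = signal.freqz(b_hp, a_hp, worN=w, fs=fs)

# Two identical sections in series means the response is squared:
H_lp = h_lp ** 2   # 4th-order Linkwitz-Riley low-pass
H_hp = h_hp ** 2   # 4th-order Linkwitz-Riley high-pass

# Each path is -6 dB at the crossover frequency (2 x -3 dB):
print(20 * np.log10(abs(H_lp[1])))   # approx -6.02

# ...and the straight sum of the two paths has a flat magnitude response:
print(abs(H_lp + H_hp))              # approx [1. 1. 1.]
```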
One important thing to notice now is that the phase responses of the two filters are 360º apart at all frequencies. This is different from the second-order Butterworth crossover, in which the two outputs are 180º apart. So we won’t need to flip the polarity of anything to compensate for the phase difference.
As in the previous posting, let’s look at the signals that get through the crossover, and the total summed output, for three input frequencies. This is shown in Figures 3.3, 3.4, and 3.5.
Figure 3.3: Row 1: the input (1 kHz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output
Figure 3.4: Row 1: the input (10 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output
Figure 3.5: Row 1: the input (100 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output
If you take a look at Figures 3.3 and 3.4 it appears that the total summed output of the crossover is in phase with the input at very low and very high frequencies. However, this is actually misleading. Take a look at Figure 3.5 and you’ll see that, when the input signal is the same frequency as the crossover frequency, the summed output is shifted by 180º relative to the input signal.
Figure 3.6
If we compare the summed output to the input, they are in-phase at very low frequencies. As the frequency increases, the phase of the summed output of the crossover gets later and later, passing 180º at the crossover frequency and approaching a shift of 360º in the high frequencies.
In other words, a 4th-order Linkwitz-Riley crossover by itself, when you sum the outputs of the filters as shown in Figure 3.1, has the same response as a 2nd-order allpass filter.
One extra thing to notice is that, since the high-pass and low-pass paths are 360º apart, and (partly) since they’re -6 dB at the crossover frequency, the magnitude response of the summed total is flat.
One way to look at the behaviour of a signal when it’s sent through a crossover is to pretend that the loudspeaker isn’t part of the system. Once-upon-a-time, I probably would have phrased this differently and said something like “pretend that the loudspeaker is perfect”, but, now that I’m older, my opinions about the definition of “perfect” have changed.
So, we’ll take a signal, send it to a two-way crossover of some kind, and then just add the two signals back together. This shows us one view of the behaviour of the crossover, which is good enough to deal with the basics for now. In a later posting in this series, we’ll look at a more multi-dimensional and therefore realistic view of what’s happening.
Figure 2.1
The block diagram above shows the signal flow that I used for all of the following plots in this posting.
Butterworth, 2nd-order (12 dB/octave)
Although the block diagram above shows that we have a high-pass and a low-pass filter to separate the signal into two frequency bands, there are a lot of details missing about the specific characteristics of those filters. There are many ways to make a high-pass filter, for example…
One common crossover type uses 2nd-order Butterworth filters, both with the same cutoff frequency. One way to implement these is to use biquads to make low-pass and high-pass filters with Q = 1/sqrt(2).
Fig. 2.2: The individual magnitude and phase responses of the low-pass (in red) and high-pass (in black)
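The curves in Figure 2.2 can be reproduced numerically. Below is a minimal sketch in Python with scipy (the 48 kHz sampling rate is my assumption; the 100 Hz crossover matches the later figures).

```python
import numpy as np
from scipy import signal

fs = 48000.0   # sampling rate (an assumption for this sketch)
fc = 100.0     # crossover frequency, matching the later figures

# 2nd-order Butterworth low-pass and high-pass filters (Q = 1/sqrt(2)):
b_lp, a_lp = signal.butter(2, fc, btype='lowpass', fs=fs)
b_hp, a_hp = signal.butter(2, fc, btype='highpass', fs=fs)

_, h_lp = signal.freqz(b_lp, a_lp, worN=np.array([fc]), fs=fs)
_, h_hp = signal.freqz(b_hp, a_hp, worN=np.array([fc]), fs=fs)

# Q = 1/sqrt(2) puts each filter at -3 dB at the cutoff frequency:
print(20 * np.log10(abs(h_lp[0])))                 # approx -3.01

# The two phase responses are 180 degrees apart:
print(abs(np.angle(h_lp[0] / h_hp[0], deg=True)))  # approx 180
```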
Before we look at the output of the entire crossover after the two signals have been summed, let’s talk about the red and the black curves in the plots above.
The magnitude responses should not come as a surprise. The fact that I’m using 2nd-order filters means that the slope of the attenuation will be 12 dB per octave (or 40 dB per decade) once you get far enough away from the cutoff frequency. The fact that they also have a Q of 1/sqrt(2) (approximately 0.707) means that they will attenuate the signal by 3 dB at the cutoff frequency, and that there is no “bump” in the slope of the magnitude response.
However, the phase responses might be a little confusing. Let’s take those separately:
For the high-pass filter (the black line), you can see that in the high-frequency band, where the magnitude response is a flat line at 0 dB (which means that the output level is equal to the input level), the phase shift is 0º.
Another way to look at this is to put a sine wave into the system and see what comes out, as shown in Figure 2.3 below. The top plot shows the input to the two filters. Since this sine wave has a period of 1 ms, then it’s a 1000 Hz tone.
The second row of plots shows the magnitude responses of the low-pass filter (in red, on the left) and the high-pass filter (in black, on the right). Notice the levels of these two curves at a frequency of 1000 Hz.
The third row of plots shows the actual outputs of the two filters. For now, we’ll only look at the output of the high-pass filter on the right. There are three things to notice about this plot:
After about 1 ms, the amplitude of this sine tone is the same as the one in the top plot.
The phase of this sine tone is the same as the one in the top plot. In other words (for example), they both pass the 0 line, heading positive at Time = 1 ms.
The start of the sine wave is a little weird. Notice that the first positive peak is lower than expected and the first negative trough goes BELOW the maximum negative amplitude (it’s below a value of -1). We’ll ignore this for now, and come back to it later.
The fourth row shows the output of the two filters when they have been added together. Notice here that the output is almost identical to the input because it’s essentially just the contribution of the high-pass filter. The low-pass filter has so little output that it’s practically irrelevant.
Figure 2.3: Row 1: the input (1 kHz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output
Let’s now look at what happens if we put in a low-frequency sine wave instead. This is shown in Figure 2.4.
Figure 2.4 Row 1: the input (10 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. (the dotted line shows the signal with its amplitude multiplied by 10 to “zoom in” on it) Row 4: the summed output
Notice now that the time scale is 100 times longer. The sine wave now has a period of 100 ms, so it’s a 10 Hz sine wave.
We’ll focus on the third row of plots again, still looking only at the output of the high-pass filter on the right. There are three things to notice about this plot:
After about 100 ms, the amplitude of this sine tone (the solid black line) is MUCH lower than the amplitude of the input. The dotted line is a “magnified” version of the same signal so that we can see it for the phase comparison.
The phase of this sine tone is shifted by 180º relative to the top plot. In other words (for example), at Time = 200 ms they both pass the 0 line, but this signal is going negative when the input is going positive.
The start of the sine wave is also weird, but differently so: there’s a spike at the beginning and a strange wiggle in the curve before it settles down. We’ll ignore this for now, and come back to it later.
If you go back and look at the low-pass filter’s output, you’ll see basically the same behaviour, but at the opposite end of the frequency range.
And, again, the output is almost identical to the input because it’s essentially just the contribution of the low-pass filter. The high-pass filter has so little output that it’s practically irrelevant.
Now let’s look at what happens when the frequency of the input signal is on the cutoff frequency of the two filters – in other words, the crossover frequency.
Figure 2.5: Row 1: the input (100 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. Row 4: the summed output
Now the sine wave has a period of 10 ms, so its frequency is 100 Hz.
Take a look at the third row of plots at Time = 10 ms.
The first thing to do is to compare the outputs of the two filters. The output of the low-pass filter on the left is negative, whereas the output of the high-pass filter on the right is positive. The outputs of the two filters are 180º out of phase with each other. This can also be seen in the plot back in Figure 2.2, where it’s shown that the difference between the red and the black phase response plots is 180º at all frequencies.
This is also why, once everything settles down, the sum of the two filters (the blue line on the bottom) is silence. The two signals have equal amplitude, and are 180º out of phase, so they cancel each other out.
Now compare those signal plots in the third row in Figure 2.5 to the input signal shown in the top plot. If you look at Time = 10 ms again (for example), you can see that the output of the low-pass filter is 90º behind the input. However, the output of the high-pass filter is 90º ahead of the input.
The fact that the phase of the output of the high-pass filter is ahead of its own input confuses many people, however, don’t panic. This does not mean that the output is ahead of the input in TIME. The high-pass filter cannot see into the future. The only reason its output can have a phase that precedes the phase of its input is if the sine wave has been playing for a long time (and, in this case a “long” time can be measured in milliseconds…).
This confusion is the result of two things:
People are typically taught the concept of phase as it relates to time. However, if you’re talking about a sine wave, then you are implying infinite time. In order for a signal to be a REAL sine wave, it must have always been playing and it must never stop. If it started or stopped, then there are other frequencies present, and so it’s not a theoretically perfect sine wave.
We use the words “ahead” and “behind” or “earlier” and “later” to describe the phase relationships, and these words typically imply a time relationship.
Maybe a rough analogy that can help is to walk next to a friend, at the same speed, but do not synchronise your steps. You will both arrive at the same place at the same time, but at two different moments in the cycles of your footsteps.
Of course, if you make a crossover like this, it won’t work very well, since you get that cancellation at the crossover frequency when the two filters outputs are added together. If we plot the summed response’s magnitude and phase characteristics, they look like the plots shown in Figure 2.6.
Figure 2.6: the magnitude and phase responses of the total shown in Figure 2.5.
As you can see there, there is complete cancellation at the crossover frequency, and the phase response flips across that notch.
So, the solution with a 2nd-order Butterworth crossover is to assume that people won’t notice if you invert the polarity of the high-pass filter’s output. This is a good assumption that I will not argue with at all.
This polarity inversion “undoes” the 180º phase difference of the two filters seen in Figure 2.2, and the summed result is shown below in Figure 2.7 and 2.8.
Figure 2.7: the magnitude and phase responses of the total shown in Figure 2.8, which is the same as Figure 2.5 after the polarity of the HPF’s output has been inverted.
Figure 2.8: Row 1: the input (100 Hz sine wave). Row 2: the magnitude responses of the two filters. Row 3: the outputs of the individual filters. (Note that the polarity of the HPF’s output has been inverted) Row 4: the summed output
Now the outputs of the two filters appear to be in phase with each other. They are still 90º out of phase with the input, which means that their summed output is also 90º out of phase with the input. This can be seen in the bottom plots of Figures 2.7 and 2.8.
You’ll also notice that there is a 3 dB bump at the crossover frequency. This is because, at their cutoff frequencies, both filters attenuate by 3 dB (a linear gain of 0.707). When those two signals of equal amplitude and matching phase are added together, you get a magnitude that is 6 dB higher (or a linear gain of 1.41). We’ll talk about this later when we start looking at the real world.
Finally, take a look at the bottom plot in Figure 2.7. You can see there that the summed outputs of the two filters result in a phase shift that increases with frequency. In fact, when we look at a 2nd-order Butterworth crossover like this, without all the real-world implications of loudspeaker drivers that have their own characteristics and are separated in space, it can be seen that, apart from the bump at the crossover, it acts like a 1st-order allpass filter. This isn’t necessarily a bad thing, so don’t jump to conclusions this early…
We are STILL not going to talk about that weirdness at the beginning of the signal after it’s been filtered. That will come later.
A crossover is a set of filters that take an audio signal and separate it into different frequency portions or “bands”.
For example, possibly the simplest type of crossover will accept an audio signal at its input, divide it into its high-frequency and low-frequency components, and output those two signals separately. In this simple case, the filtering would be done with
a high-pass filter (which allows high frequencies to pass through and increasingly attenuates the signal level as you go lower in frequency), and
a low-pass filter (which allows low frequencies to pass through and increasingly attenuates the signal level as you go higher in frequency).
This would be called a “Two-way crossover” since it has two outputs.
Crossovers with more outputs (e.g. Three- or Four-way crossovers) are also common. These would probably use one or more band-pass filters to separate the mid-band frequencies.
Why do we need crossovers?
In order to understand why we might need a crossover in a loudspeaker, we need to talk about loudspeaker drivers, what they do well, and what they do poorly.
It’s nice to think of a loudspeaker driver like a woofer or a tweeter as a rigid piston that moves in and out of an enclosure, pushing and pulling air particles to make pressure waves that radiate outwards into the listening room. In many aspects, this simplified model works well, but it leaves out a lot of important information that can’t be ignored. If we could ignore the details, then we could just send the entire frequency range into a single loudspeaker driver and not worry about it. However, reality has a habit of making things difficult.
For example, the moving parts of a loudspeaker driver have a mass that is dependent on how big it is and what it’s made of. The loudspeaker’s motor (probably a coil of wire living inside a magnetic field) does the work of pushing and pulling that mass back and forth. However, if the frequency that you’re trying to produce is very high, then you’re trying to move that mass very quickly, and inertia will work against you. In fact, if you try to move a heavy driver (like a woofer) a lot at a very high frequency, you will probably wind up just burning out the motor (which means that you’ve melted the wire in the coil) because it’s working so hard.
Another problem is that of loudspeaker excursion: how far the driver moves in and out in order to make sound. Although it’s not commonly known, the acoustic output level of a loudspeaker driver is proportional to its acceleration (its change in velocity over time, which depends on both its excursion and the frequency it’s producing). The short version of this relationship is that, if you want to maintain the same output level and you double the frequency, the driver’s excursion should drop to 1/4. In other words, if you’re playing a signal at 1000 Hz and the driver is moving in and out by ±1 mm, then if you change to 2000 Hz, the driver should move in and out by only ±0.25 mm. Conversely, if you halve the frequency to 500 Hz, you have to move the driver in and out with an excursion of ±4 mm. If you go to 1/10 of the frequency, the excursion has to be 100x the original value. For normal loudspeakers, this kind of range of movement is impractical, if not impossible.
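The inverse-square relationship in that example can be written as a one-line function. This is just the arithmetic from the paragraph above, using the text’s own reference point of ±1 mm at 1000 Hz:

```python
def excursion_mm(freq_hz, ref_freq_hz=1000.0, ref_excursion_mm=1.0):
    """Peak excursion needed to keep the same acoustic output level
    as at the reference frequency (excursion scales with 1/f^2)."""
    return ref_excursion_mm * (ref_freq_hz / freq_hz) ** 2

print(excursion_mm(2000))   # 0.25  (double the frequency -> 1/4 the excursion)
print(excursion_mm(500))    # 4.0   (half the frequency -> 4x)
print(excursion_mm(100))    # 100.0 (1/10 the frequency -> 100x)
```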
Note that both of these plots show the same thing. The only difference is the scaling of the Y-axis.
One last example is that of directivity. The width of the beam of sound that is radiated by a loudspeaker driver is heavily dependent on the relationship between its size (assuming that it’s a circular driver, then its diameter) and the wavelength (in air) of the signal that it’s producing. If the wavelength of the signal is big compared to the diameter of the driver, then the sound will be radiated roughly equally in all directions. However, if the wavelength of the signal is similar to the diameter of the driver, then it will emit more of a “beam” of sound that is increasingly narrow as the frequency increases.
So, if you want to keep from melting your loudspeaker driver’s voice coil you’ll have to increasingly attenuate its input level at higher frequencies. If you want to avoid trying to push and pull your tweeter too far in and out, you’ll have to increasingly attenuate its input level at lower frequencies. And if you’re worried about the directivity of your loudspeaker, you’ll have to use more than one loudspeaker driver and divide up the signal into different frequency bands for the various outputs.
In passive crossovers, many are phase incoherent, meaning that the phase shift of one frequency will be different than another frequency. Do you agree? Am curious how this is dealt with in the active crossover’s of B&O products?
At first, I debated just sending a quick email back with a short answer saying something pithy. But while I was thinking about what to write, I realised that:
This is actually a really good question / topic
I haven’t posted anything about crossovers in a long time
I’ve learned a lot about crossovers since the last time I did post something
I still have a LOT more to learn about crossovers.
As a result, this will be the first in what I expect to be a long series of postings about loudspeaker crossovers, starting with basic questions like
Why do we use them?
What do we think they do?
What do they really do? and
How are the ones we implement these days different from the ones you read about in old textbooks?
As usual, I’ll probably get distracted and wind up going down more than one rabbit hole along the way… But that’s one of the reasons why I’m doing this – to find out where I wind up, and hopefully to meet some new rabbits along the way.
In the April, 1968 issue of Wireless World, there is a short article titled “P.C.M. Copes with Everything”
It’s interesting reading the 57-year-old predictions in here. One has proven to be not-quite-correct:
While 2⁷ levels are quite adequate for telephonic speech, 2¹¹ or 2¹² need to be used for high quality music.
I doubt that anyone today would be convinced that 11- or 12-bit PCM would deserve the classification of “high quality”. Although some of my earliest digital recordings were made on a Sony PCM 2500 DAT machine, with an ADC that was only reliable down to about 12 or 13 bits, I wouldn’t try to pass those off as “high quality” recordings.
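As a rough sanity check of why 11 or 12 bits falls short: the theoretical signal-to-noise ratio of an N-bit PCM system with a full-scale sine wave is about 6.02 N + 1.76 dB (the standard textbook approximation, not something from the article):

```python
def pcm_snr_db(bits):
    """Theoretical SNR of an N-bit quantiser driven by a full-scale
    sine wave: approximately 6.02*N + 1.76 dB."""
    return 6.02 * bits + 1.76

print(round(pcm_snr_db(12), 1))   # 74.0 -- the article's "high quality music"
print(round(pcm_snr_db(16), 1))   # 98.1 -- 16-bit CD audio, for comparison
```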
But, towards the end of the article, it says:
The closing talk was given by A. H. Reeves, the inventor of p.c.m. Letting his imagination take over, he spoke of a world in the not too distant future where communication links will permit people to carry out many jobs from the comfort of their homes, conferences using closed-circuit television etc. For this, he said, reliable links capable of bit rates of the order of 10⁹ or 10¹⁰ bits will be required. Light is the most probable answer.
Impressive that, in 1968, Reeves predicted fibre optic connections to our houses and the ability to sit at home on Teams meetings (or Facetime or Zoom or Skype, or whatever…)
I had a little time at work today waiting for some visitors to show up and, as I sometimes do, I pulled an old audio book off the shelf and browsed through it. As usually happens when I do this, something interesting caught my eye.
I was reading the AES publication called “The Phonograph and Sound Recording After One-Hundred Years” which was the centennial issue of the Journal of the AES from October / November 1977.
In that issue of the JAES, there is an article called “Record Changers, Turntables, and Tone Arms – A Brief Technical History” by James H. Kogen of Shure Brothers Incorporated, and in that article he mentions US Patent Number 1,468,455 by William H. Bristol of Waterbury, CT, titled “Multiple Sound-Reproducing Apparatus”.
Before I go any further, let’s put the date of this patent in perspective. In 1923, record players existed, but they were wound by hand and ran on clockwork-driven mechanisms. The steel needle was mechanically connected to a diaphragm at the bottom of a horn. There were no electrical parts, since lots of people still didn’t even have electrical wiring in their homes: radios were battery-powered. Yes, electrically-driven loudspeakers existed, but they weren’t something you’d find just anywhere…
In addition, 3- or 2-channel stereo hadn’t been invented yet; Blumlein wouldn’t patent a method for encoding two channels on a record until 1931: 8 years in the future…
But, if we look at Bristol’s patent, we see a couple of astonishing things, in my opinion.
If you look at the top figure, you can see the record, sitting on the gramophone (I will not call it a record player or a turntable…). The needle and diaphragm are connected to the base of the horn (seen on the top right of Figure 3), looking very much like my old Telefunken Lido, shown below.
But, below that, at the bottom of Figure 3, is what looks like a modern-ish tonearm (item number 18) with a second tonearm connected to it (item number 27). Bristol refers to the pickups on these as “electrical transmitters”: this was “bleeding edge” emerging technology at the time.
So, why two pickups? First a little side-story.
Anyone who works with audio upmixers knows that one of the “tricks” that are used is to derive some signal from the incoming playback, delay it, and then send the result to the rear or “surround” loudspeakers. This is a method that has been around for decades, and is very easy to implement these days, since delaying audio in a digital system is just a matter of putting the signal into a memory and playing it out a little later.
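In a digital system, the whole trick is a few lines. Here’s a minimal sketch (my own illustration, not any particular upmixer’s implementation):

```python
import numpy as np

def delay(signal_in, n_samples):
    """Delay a signal by n_samples by padding the start with silence
    (the 'put it into memory, play it out a little later' trick)."""
    return np.concatenate([np.zeros(n_samples), signal_in])[:len(signal_in)]

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # stand-in for the derived signal
surround_feed = delay(x, 2)               # send this to the rear loudspeakers
print(surround_feed)                      # [0. 0. 1. 2. 3.]
```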
Now look at those two tonearms and their pickups. As the record turns, pickup number 20 in Figure 3 will play the signal first, and then, a little later, the same signal will be played by pickup number 26.
Then if you look at Figure 6, you can see that the first signal gets sent to two loudspeakers on the right of the figure (items number 22) and the second signal gets sent to the “surround” loudspeakers on the left (items number 31).
So, here we have an example of a system that was upmixing a surround playback even before 2-channel stereo was invented.
Mind blown…
NB. If you look at Figure 4, you can see that he thought of making the system compatible with the original needle in the horn. This is more obvious in Figures 1 and 2, shown below.
One of the things I have to do occasionally is to test a system or device to make sure that the audio signal that’s sent into it comes out unchanged. Of course, this is only one test on one dimension, but, if the thing you’re testing screws up the signal on this test, then there’s no point in digging into other things before it’s fixed.
One simple way to do this is to send a signal via a digital connection like S/PDIF through the DUT, then compare its output to the signal you sent, as is shown in the simple block diagram in Figure 1.
Figure 1: Basic block diagram of a Device Under Test
If the signal that comes back from the DUT is identical to the signal that was sent to it, then you can subtract one from the other and get a string of 0s. Of course, it takes some time to send the signal out and get it back, so you need to delay your reference signal to time-align them to make this trick work.
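The idea of delaying the reference and subtracting can be sketched as follows. This is an idealised, hypothetical version (the function name and the assumption that the latency is already known are mine; in practice you would measure the latency first, for example by cross-correlation):

```python
def null_test(reference, returned, latency):
    """Delay the reference by `latency` samples and subtract it from the
    signal that came back from the DUT. A bit-transparent device under
    test yields a residual of all zeros."""
    delayed = [0.0] * latency + reference[:len(returned) - latency]
    return [r - d for r, d in zip(returned, delayed)]

ref = [0.25, -0.5, 0.75, -1.0]
rtn = [0.0, 0.0] + ref          # a perfect loopback with 2 samples of latency
residual = null_test(ref, rtn, 2)
# residual is a string of 0s -> the device passed the signal unchanged
```

As the posting goes on to explain, the catch is that a real S/PDIF loop will not give you this string of zeros unless you are careful about quantisation.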
The problem is that, if you ONLY do what I described above (using something like the patcher shown in Figure 2) then it almost certainly won’t work.
Figure 2: The wrong way to do it
The question is: “why won’t this work?” and the answer has very much to do with Parts 1 through 4 of this series of postings.
Looking at the left side of the patcher, I’m creating a signal (in this case, it’s pink noise, but it could be anything) and sending it out the S/PDIF output of a sound card by connecting it to a DAC object. That signal connection is a floating point value with a range of ±1.0, and I have no idea how it’s being quantised to the (probably) 24 bits of quantisation levels at the sound card’s output.
That quantised signal is sent to the DUT, and then it comes back into a digital input through an ADC object.
Remember that the signal connection from the pink noise output across to the latency matching DELAY object is a floating point signal, but the signal coming into the ADC object has been converted to a fixed point signal and then back to a floating point representation.
Therefore, when you hit the subtraction object, you’re subtracting a floating point signal from what is effectively a fixed point quantised signal that is coming back in from the sound card’s S/PDIF input. Yes, the fixed point signal is converted to floating point by the time it comes out of the ADC object – but the two values will not be the same – even if you just connect the sound card’s S/PDIF output to its own input without an extra device out there.
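You can see the size of the problem with a small numerical sketch. The exact scaling a given sound card applies is unknown (that is the whole point of the posting), so the scale factor below is an assumption purely for illustration:

```python
def quantise(x, n_bits):
    """Round x (range ±1.0) to the nearest of 2^n_bits levels and scale
    back to a float, roughly what a sound card output stage might do.
    (The scale factor 2^(n_bits-1) is an assumption for illustration.)"""
    scale = 2 ** (n_bits - 1)
    return round(x * scale) / scale

x = 0.3                        # an arbitrary floating point sample
looped_back = quantise(x, 24)  # what comes back from a perfect loopback
residual = x - looped_back
# residual is tiny, but it is NOT exactly 0.0 -> the naive null test fails
```

Even with a perfect loopback, the residual is non-zero, so subtracting the raw floating point reference can never give you the string of zeros you were hoping for.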
In order to give this test method a hope of actually working, you have to do the quantisation yourself. This will ensure that the values that you’re sending out the S/PDIF output can be expected to match the ones you’re comparing them to internally. This is shown in Figure 3, below.
Figure 3: A better way to do it
Notice now that the original floating point signal is upscaled, quantised, and then downscaled before it’s output to the sound card or routed over to the comparison in the analysis section on the right. This all happens in a floating point world, but when you do the rounding (the quantisation) you force the floating point value to the one you expect when it gets converted to a fixed point signal.
This ensures that the (floating point) values that you’re using as your reference internally CAN match the ones that are going through your S/PDIF connection.
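The upscale/round/downscale step can be sketched like this. The key property is that it is idempotent: quantising an already-quantised value changes nothing, which is why the internal reference can null against the signal coming back through the loop. (The scale factor here is an assumption; the exact factor must match what the sound card actually does.)

```python
def pre_quantise(x, n_bits):
    """Upscale by 2^(n_bits-1), round to the nearest integer, and
    downscale again, forcing the float value onto the grid it will land
    on when converted to fixed point at the sound card's output."""
    scale = 2 ** (n_bits - 1)
    return round(x * scale) / scale

reference = pre_quantise(0.3, 16)
# reference is no longer exactly 0.3, but quantising it again changes
# nothing, so it CAN match the value that went through the S/PDIF loop
```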
In this example, I’ve set the bit depth to 16 bits, but I could, of course, change that to whatever I want. Typically I do this at the 24-bit level, since the S/PDIF signal supports up to 24 bits for each sample value.
Be careful here. For starters, this is a VERY basic test and just the beginning of a long series of things to check. In addition, some sound cards do internal processing (like gain or sampling rate conversion) that will make this test fail, even if you’re just doing a loop back from the card’s S/PDIF output to its own input. So, don’t copy-and-paste this patcher and just expect things to work. They might not.
But the patcher shown in Figure 2 definitely won’t work…
One small last thing
You may be wondering why I take the original signal and send it to the right side of the “-” object instead of making things look nice by putting it in the left side. This is because I always subtract my reference signal from the test signal and not the other way around. Doing this consistently means that I don’t have to interpret things differently each time, trying to figure out whether things are right-side-up or upside-down.
It is often the case that you have to convert a floating point representation to a fixed point representation. For example, you’re doing some signal processing like changing the volume or adding equalisation, and you want to output the signal to a DAC or a digital output.
The easiest way to do this is to just send the floating point signal into the DAC or the S/PDIF transmitter and let it look after things. However, in my experience, you can’t always trust this. (I’ll explain why in a later posting in this series.) So, if you’re a geek like me, then you do this conversion yourself in advance to ensure you’re getting what you think you’re getting.
To start, we’ll assume that, in the floating point world, you have ensured that your signal is scaled in level to have a maximum amplitude of ±1.0. In floating point, it’s possible to go much higher than this, and there’s no serious reason to worry about going much lower (see this posting). However, we work with the assumption that we’re around that level.
So, if you have a 0 dB FS sine wave in floating point, then its maximum and minimum will hit ±1.0.
Then, we have to convert that signal with a range of ±1.0 to a fixed point system that, as we already know, is asymmetrical. This means that we have to be a little careful about how we scale the signal to avoid clipping on the positive side. We do this by multiplying the ±1.0 signal by 2^(nBits-1)-1 if the signal is not dithered. (Pay heed to that “-1” at the end of the multiplier.)
Let’s do an example of this, using a 5-bit output to keep things on a human scale. We take the floating point values and multiply each of them by 2^(5-1)-1 (or 15). We then round the signals to the nearest integer value and save this as a two’s complement binary value. This is shown below in Figure 1.
Figure 1. Converting floating point to a 5-bit fixed point value without dither.
As should be obvious from Figure 1, we will never hit the bottom-most fixed point quantisation level (unless the signal is asymmetrical and actually goes a little below -1.0).
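The undithered conversion described above can be written out directly (the function name is mine; note that Python’s built-in `round` uses round-half-to-even, which is one of several valid rounding choices here):

```python
def float_to_fixed(x, n_bits):
    """Scale a ±1.0 floating point sample by 2^(n_bits-1)-1 and round to
    the nearest integer: the undithered case described in the text."""
    return round(x * (2 ** (n_bits - 1) - 1))

# 5-bit example from the text: the multiplier is 2^(5-1)-1 = 15
samples = [1.0, 0.5, 0.0, -0.5, -1.0]
codes = [float_to_fixed(s, 5) for s in samples]
# codes run from +15 down to -15; the bottom level (-16) is never reached
```

This also makes the point from Figure 1 concrete: with this scaling, ±1.0 maps to ±15, so the bottom-most level of the 5-bit two’s complement range (-16) is never used.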
If you choose to dither your audio signal, then you’re adding a white noise signal with an amplitude of ±1 quantisation level after the floating point signal is scaled and before it’s rounded. This means that you need one extra quantisation level of headroom to avoid clipping as a result of having added the dither. Therefore, you have to multiply the floating point value by 2^(nBits-1)-2 instead (notice the “-2” at the end there…) This is shown below in Figure 2.
Figure 2. Converting floating point to a 5-bit fixed point value with dither.
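The dithered version changes only the scale factor and adds the noise before rounding. This sketch uses rectangular (uniform) dither of ±1 quantisation level as described in the text, purely for simplicity; triangular (TPDF) dither is common in practice but is not what is shown here:

```python
import random

def float_to_fixed_dithered(x, n_bits):
    """Scale by 2^(n_bits-1)-2 (note the -2: one extra level of headroom),
    add uniform dither of ±1 quantisation level, then round."""
    scale = 2 ** (n_bits - 1) - 2
    dither = 2.0 * random.random() - 1.0   # uniform in ±1 LSB
    return round(x * scale + dither)

# 5-bit example: the multiplier is now 14, so even a full-scale +1.0
# sample plus worst-case dither still rounds to at most +15 -> no clipping
code = float_to_fixed_dithered(1.0, 5)
```

Had we kept the undithered multiplier of 15, a +1.0 sample plus positive dither could round up to +16, which does not exist in a 5-bit two’s complement system; that is the reason for the extra level of headroom.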
Of course, you can choose to not dither the signal. Dither was a really useful thing back in the days when we only had 16 reliable bits to work with. However, now that 24-bit signals are normal, dither is not really a concern.