Correlation Coefficient
The term ``correlation'' is frequently misused and, as a result, misunderstood in the field of audio engineering, so some discussion is required to define it. Generally speaking, the correlation of two audio signals is a measure of the relationship of these signals in the time domain [Fahy, 1989]. Specifically, given two two-dimensional variables (in the case of audio, the two dimensions are amplitude and time), their correlation coefficient, $r$, is calculated using their covariance $s_{xy}$ and their standard deviations $s_x$ and $s_y$ as is shown in Equation 5.1 [Neter et al., 1992]. The line over these three components indicates a time average, as is discussed below.
$$ r = \frac{\overline{s_{xy}}}{\overline{s_x}\;\overline{s_y}} \tag{5.1} $$
The standard deviation of a series of values indicates how far, on average, the individual values differ from the mean of all values. Specifically, it is the square root of the average of the squared differences between each individual value and the mean. For example, to find the standard deviation of a PCM digital audio signal, we begin by finding the average of all sample values. This will likely be 0, since audio signals typically do not contain a DC offset. The average is then subtracted from each sample and each result is squared. The average of these squares is calculated, and its square root is the standard deviation. When there is no DC component in an audio signal, its standard deviation is equal to its RMS value. In such a situation, it can be considered the square root of the average power of the signal.
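The procedure described above can be sketched in a few lines of Python. This is only an illustration: the function names, the 64-sample test sine wave and the 0.5 DC offset are assumptions made for the example and are not taken from the text.

```python
import math

def mean(values):
    """Arithmetic average of a list of numbers."""
    return sum(values) / len(values)

def std_dev(values):
    """Square root of the average squared difference from the mean."""
    m = mean(values)
    return math.sqrt(mean([(v - m) ** 2 for v in values]))

def rms(values):
    """Square root of the average of the squared values."""
    return math.sqrt(mean([v ** 2 for v in values]))

# Assumed test signal: exactly one cycle of a sine wave, so its mean is 0.
n = 64
signal = [math.sin(2 * math.pi * k / n) for k in range(n)]

# The same signal with an assumed DC offset of 0.5 added.
with_dc = [s + 0.5 for s in signal]

print("no DC:   std =", std_dev(signal), " rms =", rms(signal))
print("with DC: std =", std_dev(with_dc), " rms =", rms(with_dc))
```

Without the offset the two measures agree. With the offset, the standard deviation is unchanged while the RMS value rises, because the RMS value also counts the power in the DC component.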
The covariance of two series of values is an indication of whether they are interrelated. For example, if the average temperature for today's date is 19 °C and the average humidity is 50%, yet today's actual temperature and humidity are 22 °C and 65%, we can find whether there is an interdependent relationship between these two values, called the covariation [Neter et al., 1992]. This is accomplished by multiplying the differences of today's values from their respective averages, therefore (22 - 19) * (65 - 50) = 45. The result of this particular calculation, being a positive number, indicates that there is a positive relationship between the temperature and humidity today. In other words, both values deviated from their averages in the same direction. Had the covariation been negative, the two variables would have deviated in opposite directions. If the result is 0, then at least one of the variables equalled its average value. The covariance is the average of the covariations of two variables measured over a period of time. The difficulty with this measurement is that its scale changes according to the scale of the two variables being measured. Consequently, covariance values for different statistical samples cannot be compared. For example, even when both covariances have the same sign, we cannot tell whether the relationship between air temperature and humidity is stronger or weaker than the relationship between the left and right channels in a stereo audio recording.
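As a rough illustration of the covariation and covariance, and of the scale problem, consider the sketch below. The five days of readings are hypothetical values chosen so that their averages match the 19 °C and 50% used in the text. The conversion to Fahrenheit is an assumption added only to show how a change of units changes the covariance.

```python
def covariation(x, x_mean, y, y_mean):
    """Product of the deviations of one pair of values from their averages."""
    return (x - x_mean) * (y - y_mean)

def covariance(xs, ys):
    """Average of the covariations of two equally long series."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    products = [covariation(x, x_mean, y, y_mean) for x, y in zip(xs, ys)]
    return sum(products) / len(products)

# The single-day example from the text.
print("covariation today:", covariation(22, 19, 65, 50))

# Hypothetical readings (averages: 19 degrees C and 50 percent).
temps_c = [17.0, 19.0, 22.0, 18.0, 19.0]
humidity = [45.0, 50.0, 65.0, 40.0, 50.0]

# The same temperatures expressed in Fahrenheit.
temps_f = [t * 9.0 / 5.0 + 32.0 for t in temps_c]

cov_c = covariance(temps_c, humidity)
cov_f = covariance(temps_f, humidity)
print("covariance (Celsius):   ", cov_c)
print("covariance (Fahrenheit):", cov_f)
```

The weather is identical in both cases, yet the covariance differs by the factor of 9/5 introduced by the change of units. This is why raw covariance values from different measurements cannot be compared directly.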
Fortunately, if the standard deviations of the two signals are multiplied, the scale is identical to that of the covariance. Therefore, the correlation coefficient (the covariance divided by the product of the two standard deviations) can be considered to be a normalised covariance. The result is a value that can range from -1 to 1 where 1 indicates that the two signals have a positive linear relationship (in other words, they always have the same polarity if we remove any DC offset they might have). A correlation coefficient of -1 indicates that the two signals are negatively linearly related (therefore, they always have opposite polarities when we remove their DC offsets). In the case of wide-band signals, a correlation of 0 usually indicates that the two signals are either completely unrelated or separated in time by a delay greater than the averaging time.
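The normalisation can be sketched as follows. The function name and the short test signals are assumptions made for illustration. The sketch uses population statistics (dividing by the number of samples) and does not handle a silent signal, whose standard deviation of 0 would cause a division by zero.

```python
import math

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - x_mean) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - y_mean) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)  # undefined if either signal is constant

# Assumed test signal: a few samples of an arbitrary waveform.
left = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]

# Same shape, but attenuated and with a DC offset added.
right_scaled = [0.1 * v + 0.3 for v in left]

# Same shape, but amplified and polarity-inverted.
right_inverted = [-2.0 * v for v in left]

r_scaled = correlation(left, right_scaled)
r_inverted = correlation(left, right_inverted)
print("scaled copy:  ", r_scaled)
print("inverted copy:", r_inverted)
```

Neither the gain nor the DC offset affects the result. The scaled copy gives a value of 1 and the inverted copy gives -1, to within floating-point rounding.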
In the particular case of two sinusoidal waveforms with identical frequency and a constant phase difference $\omega\tau$, Equation 5.1 can be simplified to Equation 5.2 [Morfey, 2001].
$$ r = \cos(\omega\tau) \tag{5.2} $$
where the radian frequency $\omega$ is defined in Equation 5.3 [Strawn, 1985] and where $\tau$ is the time separation of the two sinusoids.
$$ \omega \equiv 2\pi f \tag{5.3} $$
where the symbol $\equiv$ denotes ``is defined as'' and $f$ is the frequency in Hz.
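The cosine result for two sinusoids can be checked with a short derivation. This is a sketch under stated assumptions: $\omega$ denotes the radian frequency and $\tau$ the time separation, the amplitudes $A$ and $B$ are introduced here for illustration, neither signal has a DC offset, and the time average is taken over a whole number of periods.

```latex
\begin{align*}
x(t) &= A \sin(\omega t), \qquad y(t) = B \sin\bigl(\omega (t - \tau)\bigr) \\
\overline{x\,y}
  &= A B \, \overline{\sin(\omega t)\,\sin(\omega t - \omega\tau)} \\
  &= \frac{A B}{2} \Bigl[ \cos(\omega\tau)
     - \overline{\cos(2\omega t - \omega\tau)} \Bigr]
   = \frac{A B}{2} \cos(\omega\tau) \\
\sqrt{\overline{x^2}} &= \frac{A}{\sqrt{2}}, \qquad
\sqrt{\overline{y^2}} = \frac{B}{\sqrt{2}} \\
r &= \frac{\tfrac{1}{2} A B \cos(\omega\tau)}
          {\tfrac{A}{\sqrt{2}} \cdot \tfrac{B}{\sqrt{2}}}
   = \cos(\omega\tau)
\end{align*}
```

The amplitudes cancel in the final step, and the double-frequency term vanishes only because the average spans complete periods.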
Further investigation of the topic of correlation highlights a number of interesting points. Firstly, two signals of identical frequency and phase have a correlation of 1. It is important to remember that this is true regardless of the amplitudes of the two signals [Welle, ]. Two signals of identical frequency and with a phase difference of 180° have a correlation of -1. Signals of identical frequency and with a phase difference of 90° have a correlation of 0. Finally, it is important to remember that the correlation between two signals is highly dependent on the amount of time used in the averaging process.
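These points can be checked numerically. The sketch below is illustrative only: the 48 kHz sampling rate, the 1 kHz test frequency, the amplitudes and the window lengths are all assumed values, and the helper names are hypothetical.

```python
import math

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - x_mean) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - y_mean) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

def sine(freq_hz, phase_deg, amplitude, sample_rate, n_samples):
    """Sampled sine wave with the given starting phase in degrees."""
    phase = math.radians(phase_deg)
    step = 2 * math.pi * freq_hz / sample_rate
    return [amplitude * math.sin(step * k + phase) for k in range(n_samples)]

fs = 48000    # assumed sampling rate in Hz
f = 1000.0    # assumed test frequency: 48 samples per period
n_long = 480  # averaging window of exactly 10 periods

reference = sine(f, 0, 1.0, fs, n_long)

# Different amplitude (0.25) on purpose: it should not affect the result.
results = {}
for phase_deg in (0, 90, 180):
    other = sine(f, phase_deg, 0.25, fs, n_long)
    results[phase_deg] = correlation(reference, other)
    print(phase_deg, "degrees: r =", results[phase_deg])

# Averaging over only a quarter of a period changes the answer for 90 degrees.
n_short = 12
r_short = correlation(reference[:n_short], sine(f, 90, 0.25, fs, n_short))
print("90 degrees, quarter-period window: r =", r_short)
```

Over a whole number of periods the three phase differences give values close to 1, 0 and -1. Over the short window, one signal is rising while the other is falling, so the 90° pair appears strongly negatively correlated, which illustrates the dependence on averaging time.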
Geoff Martin 2006-10-15