What is phase?
Phase is a measure of angle and position within a wave's cycle. In its purest form, phase shift is an offset in a signal's timing from the original, unshifted signal, measured in degrees or radians. Since sound is normally modeled as a sine wave, we visualize phase shift with sine waves.
You can see that a signal shifted by 90 degrees back in time (a negative shift) is constructive, while a shift by 180 degrees back is destructive. Below is an animation showing the range of values over which a shifted wave causes constructive and destructive interference. Notice the final byproduct in purple is a wave that has its own phase shift as a result of summing the original wave with its out-of-phase copy.
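To make that concrete, here is a minimal numpy sketch (the 100 Hz frequency is an arbitrary example value) that sums a sine wave with shifted copies of itself and reports the peak of the result:

```python
import numpy as np

f = 100.0                           # test frequency in Hz (arbitrary example)
t = np.linspace(0, 1 / f, 1000)     # one full cycle
original = np.sin(2 * np.pi * f * t)

for shift_deg in (90, 180):
    shifted = np.sin(2 * np.pi * f * t - np.radians(shift_deg))
    summed = original + shifted
    # Peak of 2.0 means fully constructive; near 0 means fully destructive
    print(f"{shift_deg} deg shift -> summed peak = {np.abs(summed).max():.3f}")
# 90 deg shift  -> summed peak = 1.414 (partially constructive)
# 180 deg shift -> summed peak = 0.000 (destructive)
```

The summed amplitude follows 2·cos(shift/2), which is why 180 degrees cancels completely while smaller shifts only attenuate.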
Phase (typically shorthand for phase shift) rears its ugly head in three primary fashions in audio: speaker alignment, mic placement when more than one mic is used, and signal processing, especially in DAW-based workspaces.
Differences in how phase shift is caused in circuits (digital or analog) and in physical sound waves
In physical sound, phase shifts can occur when speakers sit at different distances from a listening position. Even with coplanar speakers, one speaker could have a signal delay, causing the waves to arrive at different times. This can also happen when the response of a subwoofer is “out of phase,” slightly off in timing from the main speakers despite being coplanar.
In physical sound, it can also occur when an audio source arrives at two different mics at different times due to nonuniform distance between the origin of sound and the two mics. This is very typical in interviews where two individuals wear lapel mics and each lapel picks up both its owner and the other person in the room. Since the distances from the audio source to the two lapels are different, the result can be distinct interference and a “phasy” sound, hollow and/or echoing.
With instruments, this often happens subtly with drum mics, and it can even result from sound waves reflecting off the top of a semi-closed piano lid. The sound from the piano hits the mic, but waves also pass the mic, hit the lid, and bounce back into the microphone at a later time, causing comb filtering.
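To put rough numbers on this, here's a back-of-the-envelope sketch (the 0.5 m path difference is an assumed example value; 343 m/s is the speed of sound at room temperature). A path-length difference becomes a delay, and that delay fixes where the comb-filter notches land:

```python
SPEED_OF_SOUND = 343.0               # m/s at room temperature

path_difference_m = 0.5              # e.g. lapel-to-talker distances differing by 0.5 m
delay_s = path_difference_m / SPEED_OF_SOUND

# Summing a signal with a copy delayed by t produces nulls wherever the copy
# arrives 180 degrees out of phase: f = (2k + 1) / (2t) for k = 0, 1, 2, ...
first_notch_hz = 1 / (2 * delay_s)
print(f"delay = {delay_s * 1000:.2f} ms, first comb notch at {first_notch_hz:.0f} Hz")
# Subsequent notches repeat every 1/t Hz above the first one
print(f"notch spacing = {1 / delay_s:.0f} Hz")
```

With only half a meter of difference, the first notch lands squarely in the low mids, which is part of why these problems are so audible.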
In analog circuits, phase shift can come from microphones at different distances from the sound origin, but it can also come from delays introduced by components such as capacitors and inductors during processing. In fact, the very use of capacitors and inductors is why phase shift always occurs in analog EQ.
In digital systems, signals can be shifted by the number of processors in a chain and by non-linear-phase EQs. Most DAWs and digital mixers measure the overall signal delay introduced by plugin processing time and compensate the final signals to realign them, but phase shift within frequencies can still occur in some processors, especially less costly and less complex ones.
Analog circuits use phase shift to EQ: it is a necessary part of causing purposeful constructive or destructive interference in targeted frequencies. Analog EQs are IIR processors, which never fully reset after an impulse or signal hits the capacitor or inductor; hence the name infinite impulse response (IIR).
With digital systems, the processing is typically FIR, finite impulse response. This is accomplished by using a tapped delay line to offset a copy of the signal from the original, causing a shift in phase and constructive or destructive interference. In linear-phase EQ, however, the phase shift is compensated for in the processing. This is of course a very sophisticated process, but what we learn from it is that only in the digital world is it possible to remove phase shift from EQ. There are also digital EQ methods called minimum-phase EQ, which limit the phase shift introduced by EQ.
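Here's a sketch of that contrast using scipy (the sample rate, cutoff, filter order, and tap count are all arbitrary example values): a Butterworth IIR low-pass bends the phase by different amounts at different frequencies, while a symmetric FIR has perfectly linear phase, a constant delay a DAW can compensate for.

```python
import numpy as np
from scipy import signal

fs = 48000                               # sample rate in Hz (example value)
cutoff = 1000                            # low-pass cutoff in Hz (example value)

b_iir, a_iir = signal.butter(4, cutoff, fs=fs)   # IIR, like an analog EQ stage
taps = signal.firwin(101, cutoff, fs=fs)         # symmetric FIR: a tapped delay line

for name, b, a in (("IIR", b_iir, a_iir), ("FIR", taps, [1.0])):
    w, h = signal.freqz(b, a, worN=2048, fs=fs)
    phase = np.degrees(np.unwrap(np.angle(h)))
    for f_target in (250, 500, 1000):
        i = np.argmin(np.abs(w - f_target))
        print(f"{name} phase at {f_target:4d} Hz: {phase[i]:8.1f} deg")
# The FIR phase is exactly proportional to frequency (a constant 50-sample
# delay here, which a DAW can compensate); the IIR phase bends, delaying
# different frequencies by different amounts.
```

The price of that constant FIR delay is latency, which is one reason linear-phase EQ is the more expensive process.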
For further reading, see this excellent article on analog vs. digital EQ provided here.
An even more interesting article establishes that people really can't hear phase shift in most situations unless it results in comb filtering.
Here is a wonderful video on understanding phase.
Processing in DAWs can cause this too: with parallel tracks, adding EQ to one of the two tracks causes a phase shift, which produces comb filtering in the summed result. So if you parallel compress your drums, don't EQ the compressed bus unless you use a linear-phase EQ.
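A quick sketch of why (assuming scipy, with an arbitrary 4th-order high-pass standing in for the EQ on the parallel bus): sum the dry path with the filtered copy and look at the damage in the combined magnitude response.

```python
import numpy as np
from scipy import signal

fs = 48000
b, a = signal.butter(4, 2000, btype="high", fs=fs)   # EQ on the parallel bus (example)

w, h_eq = signal.freqz(b, a, worN=4096, fs=fs)
h_dry = np.ones_like(h_eq)                           # dry path: unity gain, zero phase

summed = h_dry + h_eq                                # what the mix bus hears
mag_db = 20 * np.log10(np.abs(summed) + 1e-12)
# Around the filter's transition region the two paths drift out of phase,
# so the sum dips below either path alone: comb-filter-style damage.
print(f"worst-case dip in the sum: {mag_db.min():.1f} dB")
```

With a latency-compensated linear-phase EQ in the parallel path instead, the two paths stay time-aligned and sum without that phase-induced dip.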
In DAWs, when using linear-phase EQ, watch out for pre-ringing introduced by the processor's phase compensation.
Pay attention to sampling rates in digital processors: higher sample rates typically give better EQ and phase response than lower rates. A sample rate of 44.1 kHz and a bit depth of 16 bits is fine for CDs, but when you send signals repeatedly through processors, the end result is more and more noise. A higher sample rate and bit depth lower the noise floor and give processors more information to work with, maintaining clarity in the final signal. EQ response will tend to be better overall with higher sampling rates and bit depths. Also, really capturing the “air” in a mix requires frequencies over 20 kHz; according to the Nyquist theorem, a 44.1 kHz sample rate captures content only up to 22.05 kHz, which may not be enough in the studio to fully capture the higher frequencies that contribute to “air.”
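The Nyquist arithmetic itself is one line (the sample rates below are just common standards):

```python
# Nyquist theorem: the highest representable frequency is half the sample rate
for sample_rate_hz in (44100, 48000, 96000):
    print(f"{sample_rate_hz} Hz -> captures up to {sample_rate_hz / 2:.0f} Hz")
# 44100 Hz -> captures up to 22050 Hz
# 48000 Hz -> captures up to 24000 Hz
# 96000 Hz -> captures up to 48000 Hz
```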
More on IIR and FIR processing
Wikipedia describes IIR and FIR filtering in this way:
“In practice, the impulse response, even of IIR systems, usually approaches zero and can be neglected past a certain point. However the physical systems which give rise to IIR or FIR responses are dissimilar, and therein lies the importance of the distinction. For instance, analog electronic filters composed of resistors, capacitors, and/or inductors (and perhaps linear amplifiers) are generally IIR filters. On the other hand, discrete-time filters (usually digital filters) based on a tapped delay line employing no feedback are necessarily FIR filters. The capacitors (or inductors) in the analog filter have a "memory" and their internal state never completely relaxes following an impulse (assuming the classical model of capacitors and inductors where quantum effects are ignored). But in the latter case, after an impulse has reached the end of the tapped delay line, the system has no further memory of that impulse and has returned to its initial state; its impulse response beyond that point is exactly zero.”
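You can watch that distinction numerically (a small sketch; the filter coefficients are arbitrary examples): feed one impulse into each kind of filter and look at where the response lands afterwards.

```python
import numpy as np
from scipy import signal

impulse = np.zeros(32)
impulse[0] = 1.0

# FIR: a 4-tap delay line with no feedback (coefficients are arbitrary examples)
fir_out = signal.lfilter([0.5, 0.3, 0.15, 0.05], [1.0], impulse)

# IIR: one feedback term keeps a fading "memory" of the impulse forever
iir_out = signal.lfilter([0.5], [1.0, -0.8], impulse)

print("FIR response after the last tap:", fir_out[4:8])   # exactly zero
print("IIR response, samples 4-7:", iir_out[4:8])         # decaying, never exactly zero
```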
Beyond the basics and into phase graphs
Now most audio engineers are at least somewhat aware of the basic concept of phase. What you may well not be aware of is how to read a phase graph in audio measurement software such as Smaart for system tuning, or the free alternative Open Sound Meter.
Bob McCarthy has one of the best videos online for understanding how to read phase over frequency graphs, but I found even his great videos to be lacking some clarity, so let me dive into the principles of the graph and how to read it.
Below is a phase over frequency plot with phase changes that continuously wrap the full 360 degrees.
Below you have a phase shift which starts at 0 Hz and continues until around 370 Hz, then becomes stable at 0 degrees of shift for higher frequencies.
Here we have three examples of phase which is not shifting. The waveform itself may be initially offset from 0 degrees, but the frequencies stay in phase with respect to each other.
First of all, you will notice a phase graph has y-values ranging from 180 down to -180. If an angled line runs from a vertical value of 180 down to -180, the signal has experienced -360 degrees of phase shift and is lagging behind the original signal. Different frequencies have different period lengths, which is why McCarthy states that the angle over a certain frequency span is related to the amount of delay. We will dive into understanding that formula in a minute. Extremely important: a horizontal line means no phase shift at all; the signal exits the speaker at the same time it arrives, with no shift introduced by the circuits. Most audio will show some wrapping of phase in the lower frequencies and less in the higher frequencies. Some of this comes from the HPF applied to the low frequencies, which introduces phase shift.
Why does it look like that?
To understand why the phase graph looks the way it does, it's helpful to understand it as the result of a cylindrical 3D spiral of radial phase. See the graphic below.
Now if you take the sleeve of the cylinder and unwrap it, you will see lines with a constant downward angle. This shows phase shift occurring. Because a single frequency can't shift phase without affecting neighboring frequencies, the phase shift occurs over an x-axis of frequency. Typically the x-axis is displayed on a logarithmic scale, as is usual for frequency displays in audio. When you read the phase shift between two points and it dead-ends at -180 degrees, it “wraps” back up to the top at 180 degrees and continues, since -180 and 180 are the same point on the 2D circle of phase. Phase shifts do not normally happen consistently and continuously over the frequency spectrum, but if they did, on a linear frequency display (as opposed to the typical logarithmic display) it would look like this...
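Here's a minimal sketch of the wrapping itself (assuming a pure 1 ms delay as an example value): compute the unwrapped phase of the delay, then fold it into the ±180 degree window an analyzer displays.

```python
import numpy as np

delay_ms = 1.0                                   # pure delay (example value)
freqs_hz = np.arange(0, 2001, 250)               # linear frequency axis

# A pure delay of T seconds shifts phase by -360 * f * T degrees
unwrapped = -360.0 * freqs_hz * (delay_ms / 1000.0)

# Analyzers fold this into the -180..180 window, producing the "wraps"
wrapped = (unwrapped + 180.0) % 360.0 - 180.0

for f, u, w in zip(freqs_hz, unwrapped, wrapped):
    print(f"{f:5.0f} Hz: unwrapped {u:8.1f} deg -> displayed {w:6.1f} deg")
# With a constant delay and a LINEAR frequency axis, the wraps land at even
# spacing (every 1000 Hz here); a log axis squeezes them together at the top.
```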
When it comes to system tuning, you will have separate phase graphs for mains and subs from a mic's measurement. In essence, you are trying to align the timing and then focus on aligning the phase shift in the neighborhood of the crossover frequency between your mains and subs, because where the signals overlap, they will interfere either constructively or destructively. You want constructive, so you want to try to align the phase shift occurring around 100 Hz between the two speakers.
We'll get into how you do this with my article on system tuning.
One of the most important concepts in phase graphs is that the slope of the graph indicates the lag or lead of the sound wave. If the slope angles down, as it normally will, the signal is lagging: it is being smeared back in time. If the slope angles up, the signal is leading the reference signal, which means your offset timing is not correct and your measurement mic appears to receive signal before it enters the system (this is only an appearance due to signal timing). If the phase graph is a flat line, no phase shift is occurring in that region.
Now McCarthy introduces a formula that converts the angle of the phase graph into an amount of delay in milliseconds. Below is a detailed explanation of why this in fact works. As a mathematician and educator, I find it concerning when an equation is presented without an explanation of why it works. If you care to use the formula in life, make sure you first understand why it works.
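As a numeric preview, here's the standard pure-delay identity the formula rests on (the graph readings in the sketch are made-up examples, not real measurements): a pure delay of T seconds tilts the phase trace by -360 · T degrees per Hz, so reading the change in phase over a change in frequency off the graph recovers the delay.

```python
# For a pure delay of T seconds, phase(f) = -360 * f * T (in degrees), so
#     T = -delta_phase / (360 * delta_freq)

phase_at_f1_deg = 180.0          # hypothetical readings off the graph
phase_at_f2_deg = -180.0
f1_hz, f2_hz = 100.0, 200.0

delta_phase = phase_at_f2_deg - phase_at_f1_deg   # -360 degrees of shift
delta_freq = f2_hz - f1_hz                        # over a 100 Hz span

delay_ms = -delta_phase / (360.0 * delta_freq) * 1000.0
print(f"{delta_phase:.0f} deg over {delta_freq:.0f} Hz -> {delay_ms:.1f} ms delay")
# -360 deg over 100 Hz -> 10.0 ms delay
```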
You can download and view my notes on how this works here.
I have attached an image of the notes below.