“If the only tool you have is a hammer, you tend to see every problem as a nail.”
This well-known quote is attributed to psychologist Abraham Maslow, who observed that the accessibility of a given tool tends to influence the approach humans take when solving a problem.
As engineers and researchers, we are not spared from the phenomenon of Maslow’s hammer, although many of us might like to think otherwise. In our education and training we’ve spent much time and effort learning a variety of practical and theoretical tools. And once they’re understood and mastered we often cling to them in ways that are not always beneficial. In the worst case, over-reliance on the most familiar engineering tools can serve to maintain misconceptions and inhibit progress.
However, it is also true that an uncritical eagerness to “think outside the box” can be equally harmful, particularly if it means adopting any kind of new and exciting tool without first considering its suitability for the problem at hand.
In light of this, the issue is not so much about whether the tools we are using are familiar, novel, simple or complex, but rather whether we are aware of all relevant aspects of the problem we are trying to solve and how those aspects relate to the tools in our toolbox. One might ask, ‘Am I using my understanding of the problem to define which tools I use, or am I letting the tools in my toolbox define which problem to solve?’
As a researcher, I am mostly in favor of the problem-oriented view, maintaining focus on what needs to be done before beginning to think about how to do it. Of course there are situations in which a more pragmatic, tool-oriented approach is well motivated, but my general view is that we should dare to let go of our favorite tools more often.
As an example of how overly tool-oriented thinking can keep us stuck in old paradigms, I will discuss the traditional use of digital linear filters in the audio industry, and the confusion that resulted when companies like Dirac entered the business in the early 2000s. But first, let’s take a slight detour by looking at what a digital linear filter really is, in the most general sense.
What is a digital filter?
The digital linear filter is one of the most widely applicable tools in audio signal processing. Its application stretches from basic operations such as a simple bass cut or treble boost, to advanced sound effects such as simulating the acoustics of a cathedral. A linear filter operates by altering the amplitude (gain) and the time delay (phase shift) of the individual frequency components of a signal.
The most general way of defining a digital linear filter is through its transfer function H(f), which is defined for frequencies f between 0 and fs/2, where fs is the sample rate, e.g., fs = 48 kHz. Furthermore, the transfer function is complex valued, which means that it consists of a magnitude part |H(f)|, called the magnitude response, and a phase part ∠H(f), called the phase response, determining the gain and phase shift, respectively, at frequency f (see Figures 1 and 2).
Figure 1: Magnitude response of a typical digital linear filter
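To make this concrete, here is a minimal sketch in Python (the one-pole filter and its coefficients are invented purely for illustration; freqz is SciPy’s standard routine for evaluating a digital filter’s frequency response):

```python
import numpy as np
from scipy.signal import freqz

fs = 48000  # sample rate in Hz

# A simple one-pole low-pass filter; coefficients chosen only for illustration
b = [0.1]         # feed-forward (input) coefficients
a = [1.0, -0.9]   # feedback (recursive) coefficients

# Evaluate the complex transfer function H(f) on a grid from 0 to fs/2
f, H = freqz(b, a, worN=1024, fs=fs)

magnitude_db = 20 * np.log10(np.abs(H))  # magnitude response |H(f)| in dB
phase_rad = np.angle(H)                  # phase response, angle of H(f), in radians

# At f = 0 this filter's gain is 0.1 / (1 - 0.9) = 1, i.e. 0 dB
print(f"gain at DC: {magnitude_db[0]:.2f} dB")
```

The two arrays magnitude_db and phase_rad are exactly the two curves one would plot in Figures 1 and 2.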
An alternative but equivalent definition of a linear filter is through its impulse response h(t), which is the output signal that results when the filter is excited with a unit pulse—a spike-like signal of very short duration—at the input (see Figure 3).
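In code, this equivalence is easy to demonstrate: feed a unit pulse through a filter, and the output is its impulse response h(t). A brief sketch (the recursive one-pole filter here is again an invented example):

```python
import numpy as np
from scipy.signal import lfilter

# Unit pulse: a single one followed by zeros
pulse = np.zeros(64)
pulse[0] = 1.0

# An illustrative recursive filter: y[n] = 0.1*x[n] + 0.9*y[n-1]
b, a = [0.1], [1.0, -0.9]
h = lfilter(b, a, pulse)  # the sampled impulse response h(t)

# The response decays geometrically: 0.1, 0.09, 0.081, 0.0729, ...
print(h[:4])
```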
The degree to which a certain specified transfer function H(f) can be realized as a filter depends on how the filter is implemented. For instance, a highly detailed transfer function generally requires more computational resources than a smooth, low-resolution transfer function. The broadest categories of digital linear filter implementations are IIR (infinite impulse response) and FIR (finite impulse response) filters. An IIR filter is recursive, which means that its output depends on past outputs as well as past and present inputs. The output of an FIR filter depends only on past and present inputs.
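The distinction can be spelled out as two tiny difference equations. The sketch below implements both from scratch (coefficient values are arbitrary illustrative choices; production code would use an optimized library routine instead):

```python
def fir_filter(x, b):
    """FIR: y[n] = sum_k b[k]*x[n-k] -- past and present inputs only."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * x[n - k]
        y.append(acc)
    return y

def iir_filter(x, b, a):
    """IIR: recursive -- the output also depends on past outputs.
    Implements a[0]*y[n] = sum_k b[k]*x[n-k] - sum_{k>=1} a[k]*y[n-k]."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * x[n - k]
        for k, ak in enumerate(a[1:], start=1):
            if n - k >= 0:
                acc -= ak * y[n - k]
        y.append(acc / a[0])
    return y

pulse = [1.0] + [0.0] * 7

# FIR: the impulse response is finite -- zero after len(b) samples
print(fir_filter(pulse, b=[0.5, 0.5]))

# IIR: the impulse response never exactly reaches zero (geometric decay here)
print(iir_filter(pulse, b=[0.1], a=[1.0, -0.9]))
```

Feeding a unit pulse to each filter makes the naming obvious: the FIR output dies out after two samples, while the IIR output decays forever without ever reaching exactly zero.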
The traditional view of digital filters in audio
Filtering in audio has traditionally been thought of mainly in terms of emphasizing or attenuating the level of various frequency bands. For example, a music signal can be made to sound “crisp” or “bright” by boosting high frequencies, or “less boomy” by reducing the low midrange. A bad-sounding loudspeaker can be tuned to sound less colored by using peak/notch filters that compensate for the loudspeaker’s spectral irregularities.
Digital filters for such tasks are traditionally designed as IIR filters that are digital equivalents of classical analog equalizers. That is, filters are constructed as combinations of standard “cookbook” designs such as high-/low shelf, peak/notch, or high-/low-/bandpass filters. These classic parametric equalizers have well-known mathematical definitions. They are easy to implement, and their tuning parameters comply well with the musical intuition of most audio engineers. An example of this kind of cookbook filter, familiar to most people, is illustrated in Figure 4 below.
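For instance, the peaking filter of a parametric equalizer has a closed-form biquad design. The sketch below follows the formulas popularized by Robert Bristow-Johnson’s widely circulated “Audio EQ Cookbook” (the specific frequency, gain, and Q values are invented for illustration):

```python
import math

def peaking_eq(f0, gain_db, q, fs):
    """Biquad peaking-EQ coefficients (b, a), following the Audio EQ
    Cookbook formulas. f0: center frequency in Hz; gain_db: boost (+)
    or cut (-) in dB; q: quality factor (bandwidth); fs: sample rate."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)

    b = [1.0 + alpha * A, -2.0 * math.cos(w0), 1.0 - alpha * A]
    a = [1.0 + alpha / A, -2.0 * math.cos(w0), 1.0 - alpha / A]
    # Normalize so that a[0] == 1
    b = [bi / a[0] for bi in b]
    a = [ai / a[0] for ai in a]
    return b, a

# A +6 dB boost at 1 kHz with Q = 1, at a 48 kHz sample rate
b, a = peaking_eq(f0=1000.0, gain_db=6.0, q=1.0, fs=48000.0)
print("b =", b)
print("a =", a)
```

The parameters f0, gain and Q map directly onto the knobs of a classic parametric equalizer, which is a large part of why this filter family feels so intuitive to audio engineers.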
With this approach to filtering, it is natural that the main concern and criterion is the magnitude response of the filter, whereas the phase response is viewed as a secondary byproduct of minor importance.
In some cases, though, so-called linear phase filters have been used because they delay all frequencies by an equal amount of time, regardless of the magnitude response. The linear phase characteristic requires a digital FIR filter structure in order to be realized, and thus cannot be obtained with traditional analog circuits. The possibility of obtaining a linear phase response through the use of FIR filters is sometimes described as one of the main advantages of digital over analog filtering. However, in audio applications linear phase is by no means always a desirable property.
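The linear phase property comes from coefficient symmetry, which in turn forces a constant delay of (N−1)/2 samples. A brief sketch using SciPy’s window-method FIR designer (the cutoff frequency and filter length are illustrative choices):

```python
import numpy as np
from scipy.signal import firwin, group_delay

fs = 48000
numtaps = 101  # odd length -> a "type I" linear phase FIR filter

# A 2 kHz low-pass designed with the window method
h = firwin(numtaps, cutoff=2000.0, fs=fs)

# Linear phase comes from symmetric taps: h[n] == h[numtaps-1-n]
assert np.allclose(h, h[::-1])

# Every passband frequency is delayed by (numtaps - 1) / 2 = 50 samples
w, gd = group_delay((h, [1.0]), w=512)
print(np.round(gd[1:5], 6))  # constant 50-sample delay in the passband
```

At 48 kHz a 50-sample delay is only about 1 ms, but for long correction filters the corresponding latency, together with the symmetric “pre-ringing” of the impulse response, is one reason why linear phase is not automatically desirable in audio.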
The methods for digital linear filter design that are taught in textbooks and at our universities normally include the standard cookbook IIR filters mentioned above, along with algorithms for linear phase FIR filter design. This teaching approach has unfortunately led to the widespread misconception among audio engineers that these filter types are all that exist, and moreover that all FIR filters are by necessity of the linear phase type. As we shall see, these are indeed misconceptions, with the consequence that many highly complex problems are addressed using a far too limited set of tools.
So the question arises: What is the true potential of digital linear filtering in audio, and how can it be exploited in a meaningful way?
If we disregard some practical issues – such as how to realize a particular filter – and just for a moment consider the full range of possibilities hidden in the complex valued transfer function H(f), then digital linear filters suddenly become extremely versatile and a lot more powerful than commonly realized. This freedom comes with a great deal of responsibility, though: Unless you know exactly what you want and how to reach it, the probability of failure is infinitely greater than that of success. As an example, let’s consider the topic of sound field control and so-called room correction filters.
Sound field control and room correction
Assume that you can record the sound pressure with a microphone at a point in your living room, for example at your music listening “sweet spot.” Let’s call this recorded signal y(t). Moreover, assume you have defined some desired sound pressure signal d(t) that you would like to realize in this sweet spot. For example, d(t) could be a clear and undistorted version of your favorite music track. The equipment at your disposal for reaching this goal is the hi-fi stereo system in your living room, along with an arbitrarily powerful signal processor that’s connected to its input.
Next, consider the following questions:
- How should you go about ensuring that the recorded signal y(t) matches the desired signal d(t) as closely as possible?
- What is a reasonable definition of one sound pressure wave y(t) being “close to” or “similar to” another sound pressure wave d(t)?
- What if y(t) is not an exact copy of d(t), but contains some approximation error? How large an error can be tolerated? Are there certain types of errors that can be tolerated to a larger extent than others?
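The second question, in particular, admits many quantitative answers. As one illustration only (the signals and the weighting function below are invented), two signals can be compared in the frequency domain with a weight expressing how much we care about errors at each frequency:

```python
import numpy as np

def weighted_mse(y, d, fs, weight):
    """Frequency-weighted, normalized mean-square error between a
    recorded signal y and a desired signal d (illustrative metric)."""
    n = len(d)
    f = np.fft.rfftfreq(n, d=1.0 / fs)
    E = np.fft.rfft(np.asarray(y) - np.asarray(d))  # error spectrum
    w = weight(f)                                   # weight per frequency
    return np.sum(w * np.abs(E) ** 2) / np.sum(np.abs(np.fft.rfft(d)) ** 2)

fs = 48000
t = np.arange(fs) / fs
d = np.sin(2 * np.pi * 1000 * t)              # desired signal (1 kHz tone)
y = d + 0.01 * np.sin(2 * np.pi * 15000 * t)  # small high-frequency error

w_flat = lambda f: np.ones_like(f)                 # all errors count equally
w_rolloff = lambda f: np.where(f < 10000, 1.0, 0.1)  # discount errors above 10 kHz

print(weighted_mse(y, d, fs, w_flat))
print(weighted_mse(y, d, fs, w_rolloff))
```

The same physical error is judged very differently depending on the weighting, which is precisely why the choice of error criterion is itself a research question.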
These questions, if considered deeply and thoroughly, give rise to many subsequent questions, which taken together make up an entire field of research – a field in which our company has been deeply engaged for the past 15 years.
Note that, although the concept of “filter design” is not mentioned explicitly anywhere in those three questions, it turns out that a proper solution to the problem can actually be described as a linear filter having a complex valued transfer function H(f). Such a filter cannot, however, be described in the traditional audio engineering terms, such as “bass cut,” “less midrange,” or “a 1 kHz notch,” so it would mislead us if we were to look for the solution only among the well-known classical equalizer filters.
Actually, nothing in particular can be said a priori about its magnitude and phase responses or whether it should be realized using FIR or IIR techniques. Instead, the filter is better described as a controller, a term borrowed from the field of automatic control.
Thinking about it more abstractly, in control theory terms: Our audio system S has an input and a measured output quantity y(t), and we want to construct a controller C that produces a control signal u(t) at the input so as to steer the measured output y(t) as close as possible to the desired output signal d(t). This abstraction of the problem is illustrated in Figure 6. The controller’s transfer function H(f) will be determined by the electro-acoustic properties of your hi-fi system and your living room acoustics (here represented by the system block S), together with the various mathematical criteria and constraints that we choose to involve when answering the three questions above.
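A toy numerical sketch of this controller view follows (everything here is invented for illustration: a four-tap “room” impulse response, a pure-delay target, and an unconstrained least-squares FIR controller; real room correction adds measurements over a listening region, robustness constraints, regularization, and perceptual criteria):

```python
import numpy as np
from scipy.linalg import toeplitz

# Toy system impulse response s (loudspeaker + room), invented for illustration
s = np.array([1.0, 0.6, 0.3, 0.1])

L = 64        # controller (FIR) length
delay = 8     # target: a pure delay, d[n] = delta[n - delay]
n_out = L + len(s) - 1

# Convolution matrix: S @ c is the measured output when controller c drives the system
col = np.concatenate([s, np.zeros(n_out - len(s))])
row = np.zeros(L)
row[0] = s[0]
S = toeplitz(col, row)

d = np.zeros(n_out)
d[delay] = 1.0  # desired overall response

# Least-squares controller: minimize ||S c - d||^2
c, *_ = np.linalg.lstsq(S, d, rcond=None)

residual = np.linalg.norm(S @ c - d)
print(f"residual error: {residual:.2e}")
```

Note that nothing in this formulation mentions bass cuts or notches: the controller c simply emerges as the solution of an approximation problem, and its magnitude and phase responses are whatever the criterion dictates.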
The problem outlined here, and its control-theory inspired solution, constituted a great paradigm shift in the audio engineering field some 15-20 years ago when a handful of small companies like Dirac decided to explore the true potential of digital filtering and control in audio. In the beginning, the new approach met a great deal of resistance from engineers and audio enthusiasts, in large part because our filters didn’t fall into any of the well-known categories in the cookbook. Instead of taking an interest in the purpose and properties of the filter’s transfer function H(f), people tended to concentrate only on the aspects most familiar to them, such as whether an FIR or IIR structure was used to implement the filter, or whether a fixed-point or floating-point processor was used. In cases where an FIR structure was proposed for implementing H(f), the immediate assumption was that the filter was of the linear phase type and therefore not of interest.
A way forward: Drop the tools, look at the problem
In order to clear up such confusion, I propose the following working scheme, by which one can avoid the pitfalls of tool-oriented thinking. The key point is to separate the problem as such from questions related to the potential restrictions of the final engineering realization, at least in an initial phase of the work.
1. Know the problem. What is the problem to be solved – is a digital linear filter appropriate? If so, suppose we have no restrictions whatsoever regarding how to realize a linear filter for a given audio system. Then what transfer function H(f) would be the best or most preferred solution to your application? If this question cannot be clearly answered, it is an indication that we do not yet know our problem. We then need to stop and get to know the problem better before proceeding.
2. Filter realization. Given a desired filter transfer function H(f), can it be realized with a stable filter of finite computational complexity? If not, what would be the best possible realizable approximation of H(f)? Let’s call this best possible realizable approximation F(f).
3. Restrictions. What restrictions exist in our particular case? For example, it may be the case that our DSP platform can only implement FIR filters with fixed-point arithmetic. Or that we are restricted to using a DSP that has been hard-coded to implement only a few low-order parametric IIR filters.
4. Feasibility. Can the desired transfer function H(f), or its best possible realizable approximation F(f), be implemented using the DSP resources at hand, with all its computational restrictions?
a) If not, is it possible to relax F(f) from being the “best possible realizable approximation” to being merely an “acceptable approximation” of H(f)?
b) If alternative a) is not possible, then go back to step 1, and include the DSP restrictions already in the formulation of the basic problem. The solution of the restricted problem may be fundamentally different from the original unrestricted solution. In the worst case, it may turn out that a meaningful solution cannot be found under these restrictions.
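Steps 1 and 2 of this scheme can be sketched numerically: sample a desired complex transfer function H(f) on an FFT grid, convert it to an impulse response, and truncate and window it to obtain a realizable FIR approximation F(f). (This frequency-sampling route is only one of many design methods, and the target response below is invented purely for illustration.)

```python
import numpy as np

fs = 48000
nfft = 4096

# Step 1: an invented desired response -- a gentle low-frequency boost
# combined with a 1 ms delay (a linear phase term)
f = np.fft.rfftfreq(nfft, d=1.0 / fs)
H = (1.0 + 0.5 * np.exp(-f / 200.0)) * np.exp(-2j * np.pi * f * 0.001)

# Inverse FFT -> impulse response; irfft enforces a real-valued result
h_full = np.fft.irfft(H, n=nfft)

# Step 2: truncate and window to a realizable FIR length
numtaps = 512
h_fir = h_full[:numtaps] * np.hanning(2 * numtaps)[numtaps:]  # fade out the tail

# Measure how far the realizable F(f) is from the desired H(f)
F = np.fft.rfft(h_fir, n=nfft)
err_db = 20 * np.log10(np.max(np.abs(F - H)) / np.max(np.abs(H)))
print(f"worst-case deviation of F(f) from H(f): {err_db:.1f} dB")
```

If the resulting error is unacceptable at the available filter length, that is precisely the signal to move on to steps 3 and 4: either accept a looser approximation, or fold the length restriction back into the formulation of the basic problem.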
Following the above scheme, we can ensure that the right problem is solved, and that any restrictions put on the solution come from necessary real-world circumstances, and not from the engineer being limited by his toolbox.
– Lars-Johan Brännmark, Chief Scientist at Dirac Research