Mixing Secrets For The Small Studio - Additional Resources

Whistlestop Guide To Studio Production

Sound in Different Forms
Sinewaves and Audio Frequencies
Logarithmic Scales for Level and Pitch
Frequency Response
The Multitrack Recording Process
Audio Signals and Mixers
Signal Processing
Real-World Studio Setups: Something Old, Something New
Setting Up a Small Stereo Monitoring System
Jargon-Busting Glossaries

Neither ‘Mixing Secrets For The Small Studio’ nor ‘Recording Secrets For The Small Studio’ is a truly entry-level book, which means you’ll need the following basic background knowledge before you can get the best out of them. Important pieces of technical jargon are in bold-face to make them stand out. (It’s a fact of life that every engineer applies studio terminology slightly differently, so I hope that clarifying my own usage will help minimise confusion for readers.) I’ve tried to keep the text as concise as I can, but you’ll find that many key terms are linked to sources of further explanation.

Sound in Different Forms

Sound is a series of pressure waves moving through the air at the speed of sound (roughly 343 meters/second). Any vibrating object can create these pressure waves, and when they reach your eardrums, those vibrate in sympathy, triggering the hearing sensation. Transducers such as pickups and microphones can convert air-pressure waves, or the vibrations which triggered them, into an electrical signal, representing their movements over time as voltage fluctuations – often displayed for inspection purposes as a waveform of voltage level (vertical axis) against time (horizontal axis). Other transducers can convert this electrical signal back into air-pressure waves for listening or ‘monitoring’ purposes (via loudspeakers or headphones) or into some other form for storage/replay (e.g. the variable groove depths on a vinyl record or the variable flux levels on a magnetic tape).

Once sounds have been converted to electrical signals, it becomes possible to process and combine them using all manner of electronic circuitry. In addition, the voltage variations of an electrical signal can also be represented as a stream of numbers in digital form, whereupon digital signal-processing (DSP) techniques can be applied to them. Transferring any signal between the analogue domain (of electrical signals, vinyl grooves, magnetic flux, physical vibrations, and pressure waves) and the digital domain requires either an analogue-to-digital converter (ADC) or a digital-to-analogue converter (DAC). The fidelity of analogue-to-digital conversion is primarily determined by two statistics: the frequency with which the analogue signal’s voltage level is measured (the sample rate or sampling frequency) and the resolution (or bit depth) of each measurement (or sample), expressed in terms of the length of the binary number required to store it.
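To make the two conversion statistics a little more concrete, here is a minimal Python sketch of what an ADC does in principle: it measures a voltage at regular intervals (the sample rate) and rounds each measurement to a whole number that fits the chosen bit depth. The function name, the 1kHz test tone, and the CD-style figures of 44.1kHz and 16 bits are my own illustrative choices rather than anything prescribed above, and a real converter involves a good deal more (filtering, dither, and so on).

```python
import math

def sample_and_quantize(freq_hz, sample_rate, bit_depth, duration_s):
    """Measure an ideal sinewave at regular intervals and round each
    measurement to a signed integer of the given bit depth."""
    max_code = 2 ** (bit_depth - 1) - 1            # largest positive sample value
    n_samples = int(round(sample_rate * duration_s))
    samples = []
    for n in range(n_samples):
        t = n / sample_rate                         # time of this measurement
        voltage = math.sin(2 * math.pi * freq_hz * t)  # ideal value in -1..+1
        samples.append(round(voltage * max_code))   # quantize to an integer
    return samples

# Hypothetical example: one millisecond of a 1kHz tone at 44.1kHz / 16 bits.
cd_style = sample_and_quantize(1000, 44100, 16, 0.001)
print(len(cd_style), "samples, largest value", max(cd_style))
print("distinct levels available at 16 bits:", 2 ** 16)
```

Raising the sample rate gives more measurements per second, while raising the bit depth gives each measurement finer resolution.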

Sinewaves and Audio Frequencies

If an electrical signal’s waveform looks chaotic, what you hear will usually feature noisy sounds, whereas repeating waveform patterns are heard as pitched events such as musical notes or overtones. However, before I say any more about complex sounds, it’s useful first to understand the simplest repeating sound wave: a sinewave tone. Its pitch is determined by the number of times it repeats per second, referred to as its frequency and measured in Hertz (Hz). Roughly speaking, the human ear can detect sinewave tones across a 20Hz–20kHz frequency range (the audible frequency spectrum). Low-frequency tones are perceived as low pitched, while high-frequency tones are perceived as high pitched. Although a sinewave tone isn’t exactly thrilling to listen to on its own, it turns out that all the more interesting musical sounds can actually be broken down into a collection of different sinewave tones. The mixture of different sinewave components within any given complex sound determines its timbre.

One way to examine these sinewave components is to use a spectrum analyzer, a real-time display of a sound’s energy distribution across the audible frequency spectrum. On a spectrum analyzer, a simple sinewave tone shows up as a narrow peak, while real-world signals create a complex undulating plot. Narrow peaks in a complex spectrum-analyzer display indicate pitched components within the signal, while the distribution of energy across the frequency display determines the timbre of the sound – subjectively darker sounds are richer in low frequencies, whereas bright sounds tend to be strong on high frequencies. Here’s a short video of a spectrum analyser in action, to illustrate what I mean:

Although a single sinewave tone will be perceived as a pitched note, almost all real-world musical notes are actually made up of a harmonic series of related sinewaves. The lowest-frequency of these, the fundamental, determines the perceived pitch, while a series of overtones at multiples of the fundamental’s frequency determines the note’s timbre according to their relative levels.
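As a rough illustration of how a harmonic series shows up on a spectrum analyser, the following Python sketch builds a note from a 100Hz fundamental plus two overtones and then measures the level at each frequency using a plain discrete Fourier transform. The sample rate, the number of samples, and the relative levels of the harmonics are hypothetical values chosen to keep the sums small and tidy; a real analyser works on far more data and typically uses a fast Fourier transform instead.

```python
import cmath
import math

def harmonic_frequencies(fundamental_hz, count):
    """Fundamental plus overtones at whole-number multiples of it."""
    return [fundamental_hz * k for k in range(1, count + 1)]

def dft_magnitudes(samples):
    """Naive discrete Fourier transform, scaled so that a sinewave of
    amplitude A landing exactly on a bin reads as A."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = sum(samples[i] * cmath.exp(-2j * math.pi * k * i / n)
                  for i in range(n))
        mags.append(abs(acc) * 2 / n)
    return mags

SAMPLE_RATE = 800   # Hz (hypothetical, kept low so the DFT stays small)
N = 80              # number of samples, giving one analysis bin per 10Hz
levels = [1.0, 0.5, 0.25]                 # hypothetical relative levels
freqs = harmonic_frequencies(100, 3)      # 100Hz, 200Hz, 300Hz

signal = [sum(a * math.sin(2 * math.pi * f * i / SAMPLE_RATE)
              for a, f in zip(levels, freqs))
          for i in range(N)]

mags = dft_magnitudes(signal)
peaks = [k * SAMPLE_RATE / N for k, m in enumerate(mags) if m > 0.1]
print(peaks)
```

Changing the values in `levels` leaves the peaks at the same frequencies (so the pitch stays put) but alters their heights, which is the numerical counterpart of a change in timbre.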

Logarithmic Scales for Level and Pitch

Human hearing perceives both level and pitch in a roughly logarithmic way – in other words, we compare levels and pitches in terms of ratios. For example, the perceived pitch interval between a note with its fundamental at 100Hz and a note with its fundamental at 200Hz is the same as that between notes with fundamentals at 200Hz and 400Hz. Similarly, when dealing with sound in electrical form, the perceived volume difference on playback between signals peaking at 100mV and 200mV is roughly similar to that between signals peaking at 200mV and 400mV. In recognition of this, both pitch and signal level measurements are frequently made using a logarithmic scale. In the case of pitch, this is done in terms of traditional musical intervals: e.g. 200Hz is an octave above 100Hz, 200Hz is an octave below 400Hz, and so on. (More examples on this Wikipedia page.) In the case of signal levels this is done using decibels (dB): e.g. 200mV is 6dB higher in level than 100mV, 200mV is 6dB lower in level than 400mV – or, to express it in another commonly used form, a +6dB level change (‘+6dB gain’) takes 100mV to 200mV, whereas a -6dB level change (‘-6dB gain’) takes 400mV to 200mV.
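The arithmetic behind those figures can be sketched in a few lines of Python. For voltage levels, the change in decibels is 20 times the base-10 logarithm of the voltage ratio, and for pitch, the interval in octaves is the base-2 logarithm of the frequency ratio. The function names here are just illustrative, and note that doubling a voltage actually works out at about 6.02dB, which is conventionally rounded to 6dB.

```python
import math

def level_change_db(v_from, v_to):
    """Level change in decibels between two voltages (any common unit)."""
    return 20 * math.log10(v_to / v_from)

def interval_octaves(f_from, f_to):
    """Pitch interval in octaves between two frequencies."""
    return math.log2(f_to / f_from)

print(round(level_change_db(100, 200), 2))   # 100mV up to 200mV
print(round(level_change_db(400, 200), 2))   # 400mV down to 200mV
print(interval_octaves(100, 200))            # 100Hz up to 200Hz
print(interval_octaves(400, 200))            # 400Hz down to 200Hz
```

Because both scales deal in ratios, the 200mV-to-400mV step gives exactly the same decibel figure as the 100mV-to-200mV step, just as 200Hz to 400Hz is the same one-octave interval as 100Hz to 200Hz.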

On their own, decibel values can only be used to indicate changes in signal level, which is why they are often used to label audio ‘gain controls’ (such as faders) that are expressly designed for this purpose. However, it’s important to remember that decibels (just like musical intervals) are always relative. In other words, it’s meaningless to say that a signal level is ‘4.75dB’, much as it’s nonsensical to say that any isolated note is a major sixth, because the question is ‘4.75dB larger than what?’ or ‘a major sixth above what?’ Therefore, if you want to state absolute level values in terms of decibels, you need to express them relative to an agreed reference level, indicating this using a suffix. Common reference levels used for studio purposes include dBSPL (for acoustic sound pressure), dBu and dBV (for electrical signals) and dBFS (for digital signals), but those are by no means the only ones out there.
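Here is a small Python sketch of how a suffix pins a decibel figure to a reference level. The reference values are not given in the text above, so treat them as my additions: dBu is conventionally referenced to roughly 0.775 volts RMS, dBV to 1 volt RMS, and dBFS to the largest value the digital system can represent. The function names, and the choice to treat full scale as 2 to the power of (bit depth minus one), are illustrative assumptions.

```python
import math

DBU_REF_VOLTS = math.sqrt(0.6)   # about 0.775V RMS, the usual dBu reference
DBV_REF_VOLTS = 1.0              # 1V RMS, the usual dBV reference

def volts_to_dbu(volts_rms):
    return 20 * math.log10(volts_rms / DBU_REF_VOLTS)

def volts_to_dbv(volts_rms):
    return 20 * math.log10(volts_rms / DBV_REF_VOLTS)

def sample_to_dbfs(peak_sample, bit_depth):
    """Peak level of an integer sample relative to digital full scale."""
    full_scale = 2 ** (bit_depth - 1)   # assumed full-scale value
    return 20 * math.log10(abs(peak_sample) / full_scale)

print(round(volts_to_dbu(1.0), 2), "dBu")    # the same 1V signal...
print(round(volts_to_dbv(1.0), 2), "dBV")    # ...reads differently in dBV
print(round(sample_to_dbfs(16384, 16), 2), "dBFS")
```

The same 1V signal produces two different numbers depending on the suffix, which is exactly why a bare decibel figure with no reference tells you nothing about absolute level.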

Frequency Response

Any studio device will alter the nature of sound passing through it in some way, however small, and the nature of any such effect on a signal’s frequency balance is commonly expressed in terms of a frequency response graph, which shows the gain applied by the device across the frequency range. A device that left the frequency balance completely unchanged would show a straight horizontal frequency-response plot at the 0dB level. However, real-world equipment deviates somewhat from this ideal flat response – indeed, some devices deliberately warp their frequency-response curve for creative purposes.
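To give a feel for what a frequency-response plot contains, this Python sketch prints the gain in decibels at a handful of frequencies for two imaginary devices: an ideal one with a flat response, and one that behaves like a simple first-order low-pass filter. The low-pass formula is the standard textbook one for that filter type, but the 2kHz cutoff and the function names are hypothetical, and real equipment will have a more complicated curve.

```python
import math

def flat_response_db(freq_hz):
    """An ideal device: 0dB gain at every frequency."""
    return 0.0

def lowpass_response_db(freq_hz, cutoff_hz):
    """Gain of a first-order low-pass filter, in dB, at one frequency."""
    return -10 * math.log10(1 + (freq_hz / cutoff_hz) ** 2)

CUTOFF_HZ = 2000.0   # hypothetical cutoff frequency

for freq in (20, 200, 2000, 20000):
    print(freq, "Hz:",
          flat_response_db(freq), "dB flat,",
          round(lowpass_response_db(freq, CUTOFF_HZ), 2), "dB low-pass")
```

Plotting those gain values against frequency would give the familiar graph: a horizontal line at 0dB for the flat device, and a line that droops progressively at the high end for the low-pass one.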

The Multitrack Recording Process

Modern studio production revolves around the concept of multitrack recording, whereby you can capture different electrical signals on different recorder tracks, retaining the flexibility to process and blend them independently afterwards. Furthermore, multitrack recorders also allow you to overdub new signals to additional tracks while listening back to (monitoring) any tracks that have already been recorded, which enables complicated musical arrangements to be built up one instrument at a time if required.

An equally important cornerstone of the production process in many styles is the use of synthesizers (which generate audio signals electronically) and samplers (which can creatively manipulate selected sections of prerecorded audio). These can sometimes mimic the performances of live musicians, but more importantly provide the opportunity to design sounds that reach beyond the realms of the natural. Typically these devices are digitally controlled using MIDI (Musical Instrument Digital Interface) messages, which can be programmed/recorded in multitrack form, edited, and replayed using a MIDI sequencer.
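For the curious, a MIDI message is just a few bytes of control data rather than audio. The Python sketch below builds the three-byte ‘note on’ and ‘note off’ messages defined by the MIDI standard, and converts a MIDI note number into a frequency. The helper names are my own, and the frequency conversion assumes equal temperament with the A above middle C (note 69) tuned to 440Hz, which is a common convention rather than anything MIDI itself dictates.

```python
def note_on(channel, note, velocity):
    """Three-byte MIDI note-on message (channel 0-15, note/velocity 0-127)."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("value out of MIDI range")
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note):
    """Three-byte MIDI note-off message with a release velocity of zero."""
    if not (0 <= channel <= 15 and 0 <= note <= 127):
        raise ValueError("value out of MIDI range")
    return bytes([0x80 | channel, note, 0])

def note_to_hz(note):
    """Frequency of a MIDI note, assuming equal temperament and A = 440Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)

# Hypothetical example: play and release middle C (note 60) on the first channel.
print(note_on(0, 60, 100).hex(), note_off(0, 60).hex())
print(round(note_to_hz(60), 2), "Hz")
```

A sequencer track is essentially a timed list of messages like these, which is why MIDI parts can be edited so freely before being turned into audio.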

Although the production workflow in different musical styles can contrast radically, many people in the studio industry find it useful to discuss the progress of any given project in terms of a series of notional ‘stages’. Everyone has a slightly different view of what constitutes each stage, what they’re called exactly, and where the boundaries are between them, but roughly speaking they work out as follows:

Preproduction & Programming. The music is written and arranged. Fundamental synth/sampler parts may be programmed at this stage and musicians rehearsed in preparation for recording sessions.
Recording (or Tracking). The instruments and vocals required for the arrangement are recorded, either all at once or piecemeal via overdubbing. Audio editing and corrective processing may also be applied during this stage to refine the recorded tracks into their final form, in particular when pasting together (comping) the best sections of several recorded takes to create a single master take. MIDI-driven synth and sampler parts are bounced down to the multitrack recorder by recording their audio outputs.
Mixing (or Mixdown). All the recorded tracks are balanced and processed to create a commercial-sounding stereo mix.
Mastering. The mixdown file is further processed to adapt it to different release formats.

The most common professional studio setup for the recording stage involves two separate rooms, acoustically isolated from each other: a live room where musicians perform with their instruments; and a control room containing the bulk of the recording equipment, where the recording engineer can make judgments about sound quality without the direct sound from the performers interfering with what he’s hearing from his monitoring loudspeakers (or monitors). Where several performers are playing together in the live room, each with their own microphone, every mic will not only pick up the sound of the instrument/voice it’s pointing at, but will also pick up some of the sound from the other instruments in the room – something variously referred to as spill, leakage, bleed, or crosstalk, depending on who you speak to! In some studio setups, additional sound-proofed rooms (isolation booths) are provided to get around this and improve the separation of the signals.

Audio Signals and Mixers

A typical multitrack recording session can easily involve hundreds of different audio signals. Every audio source (microphones, pickups, synths, samplers) needs routing to its own track of the multitrack recorder, often through a recording chain of studio equipment designed to prepare it for capture. Each playback signal from the recorder will pass through its own monitoring chain, being blended with all the other tracks so that you can evaluate your work in progress via loudspeakers or headphones. Additional cue/foldback mixes may be required to provide personalised monitoring signals for each different performer during a recording session. Further mixes might also feed external (or outboard) effects processors, the outputs of which must be returned to the main mix so you can hear their results.

The way studios marshal all these signals is by using mixers (aka mixing desks, boards, or consoles). At its most basic, a mixer accepts a number of incoming signals, blends them together in some way, and outputs the resulting blended signal. Within the mixer’s architecture, each input signal passes through its own independent signal-processing path (or channel), which is furnished with a set of controls (the channel strip) for adjusting the level and sound character of that signal in the mixed output. In the simplest of mixers, each channel strip may have nothing more than a fader to adjust its relative level for a single output mix, but most real-world designs have many other features besides this:

If the main/master mix output is stereo (which it usually will be) then each mono channel will have a pan control (or pan pot) which adjusts the relative levels sent from that channel to the left and right sides of the main mix. If the mixer provides dedicated stereo channels, these may have a balance control instead, which sets the relative level of the stereo input signal’s left and right signal streams.
An independent monitor mix or control-room mix may be available for your studio loudspeakers. Although this will usually receive the master mix signal by default, you can typically also feed it with any subset of the input signals for closer scrutiny by activating per-channel solo buttons.
In addition to the faders that set each input signal’s level in the main mix, there may be controls for creating further auxiliary mixes too – perhaps labelled as cue sends (for the purposes of foldback) and effects sends (for feeding external effects processors).
There may be buttons on each channel strip that allow you to disconnect that channel from the main mix, routing it instead to a separate group or subgroup channel with its own independent output. This provides a convenient means of routing different input signals to different tracks of the multitrack recorder and of submixing several input signals together onto a single recorder track.
Audio metering may be built in, visually displaying the signal levels for various channels as well as for the group, monitor, and master mix signals.
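The core of what a mixer does with faders and pan controls can be sketched in Python as follows. Each channel’s samples are scaled by its fader gain (converted from decibels), split between left and right according to its pan position, and summed into the stereo mix. The function names are hypothetical, and the constant-power pan law used here is only one common choice; different mixers implement panning differently.

```python
import math

def db_to_gain(db):
    """Convert a fader setting in dB to a linear voltage multiplier."""
    return 10 ** (db / 20)

def pan_gains(pan):
    """Left/right gains for a pan position from -1 (left) to +1 (right),
    using a constant-power pan law (an assumption, not a universal rule)."""
    angle = (pan + 1) * math.pi / 4
    return math.cos(angle), math.sin(angle)

def mix_to_stereo(channels):
    """channels: list of (samples, fader_db, pan) tuples, one per channel."""
    length = max(len(samples) for samples, _, _ in channels)
    left = [0.0] * length
    right = [0.0] * length
    for samples, fader_db, pan in channels:
        gain = db_to_gain(fader_db)
        gain_l, gain_r = pan_gains(pan)
        for i, x in enumerate(samples):
            left[i] += x * gain * gain_l
            right[i] += x * gain * gain_r
    return left, right

# Hypothetical two-channel example: one signal hard left, one hard right.
left, right = mix_to_stereo([
    ([1.0, 1.0], 0.0, -1.0),
    ([0.5], 0.0, 1.0),
])
print(left, right)
```

An auxiliary mix such as a cue send would simply be another pass through the same summing process, using a separate set of level controls.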

Mixer channels that are conveying signals to the multitrack recorder for capture are often referred to as input channels, whereas those which blend together the multitrack recorder’s monitor outputs and send them to your loudspeakers/headphones are frequently called monitor channels. Some mixers just have a bunch of channels with identical functionality, and leave it up to you to decide which to use as input and monitor channels, while others have dedicated sections of input and monitor channels whose channel-strip facilities are specifically tailored for their respective tasks. Another design, the in-line mixer, combines the controls of both an input channel and a monitor channel within the same channel strip. This is popular in large-scale studio set-ups, because it creates a physically more compact control layout, provides ergonomic benefits for advanced users, and allows the two channels to share some processing resources. (Plus there’s the added benefit that it confuses the hell out of the uninitiated, which is always gratifying…)

Another specialised mixer, called a monitor controller, has evolved to cater for studios where several different playback devices and/or loudspeaker systems are available. It typically provides switches to select between the different audio sources and speaker rigs, as well as a master volume control for whichever speaker system is currently active.

Signal Processing

Beyond simply blending and routing signals, multitrack production invariably involves processing them as well. In some cases this may comprise nothing more than ‘preamplifying’ the signal to a suitable level for recording purposes, but there are several other processes that are frequently applied as well:

Spectral Shaping: Audio filters and equalisers may be used to adjust the levels of different frequencies relative to each other.
Dynamics: Tools such as compressors, limiters, gates, and expanders allow the engineer to control the level-contour of a signal over time in a semi-automatic manner.
Modulation Effects: A family of processes which introduce cyclic variations into the signal. Includes effects such as chorusing, flanging, phasing, vibrato, and tremolo.
Delay-based Effects: Another group of processes which involve overlaying one or more echoes onto the signal. Where these effects become complex, they can begin to artificially simulate the reverberation characteristics of natural acoustic spaces.

In some cases, such processing may be inserted into the signal path directly – rather than being fed from an independent effects send and then returned to the mix (a send-return configuration).
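As a simple example of a delay-based effect used in a send-return arrangement, the Python sketch below generates a series of decaying echoes from a signal and then blends that ‘wet’ effect signal back in alongside the unprocessed ‘dry’ one. The function and parameter names are hypothetical, and the delay is expressed in samples to keep things short; a practical effect would work in milliseconds and at a real sample rate.

```python
def echo(dry, delay_samples, feedback, wet_level):
    """Add repeating echoes to a list of samples.

    delay_samples: gap between echoes, in samples
    feedback:      how much of each echo feeds the next (0 = single echo)
    wet_level:     level of the effect return in the final mix
    """
    wet = [0.0] * len(dry)
    for i in range(delay_samples, len(dry)):
        # Each echo is the delayed input plus a quieter copy of the last echo.
        wet[i] = dry[i - delay_samples] + feedback * wet[i - delay_samples]
    # Send-return style: the dry signal passes through untouched and the
    # effect output is mixed in beside it.
    return [d + wet_level * w for d, w in zip(dry, wet)]

# A single click followed by silence, so the echoes are easy to see.
click = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(echo(click, 2, 0.5, 1.0))
```

Returning only the `wet` list instead would correspond to inserting the processor directly into the signal path with no dry signal retained.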

Real-World Studio Setups: Something Old, Something New

Although every recording studio needs to route, record, process, and mix audio signals, every engineer’s rig ends up being slightly different, either by virtue of the equipment chosen, or because of the way the gear is hooked up. One defining feature of many systems is the extent to which digital technology is used. While there are still some people who uphold the analogue-only studio tradition of the 1970s, the reliability, features, and pricing of DSP processing and data storage have increasingly drawn small studios toward hybrid systems. Standalone digital recorders and effects processors began this trend within otherwise analogue systems, but the advent of comparatively affordable digital mixers and ‘studio in a box’ digital multitrackers during the 1990s eventually allowed project studios to operate almost entirely in the digital domain, converting all analogue signals to digital data at the earliest possible opportunity and then transferring that data between different digital studio processors losslessly. These days, however, the physical hardware units of early digital studios have largely been superseded by Digital Audio Workstation (DAW) software, which allows a single general-purpose computer to emulate all their routing, recording, processing, and mixing functions at once, connecting to the analogue world where necessary via an audio interface: a collection of audio input/output (I/O) sockets, ADCs, and DACs.

A similar trajectory can be observed with synths and samplers. Although early devices were all-analogue designs, microprocessors quickly made inroads during the 1980s as the MIDI standard took hold. The low data bandwidth of MIDI messages and the plummeting price of personal computing meant that computer-based MIDI sequencing was already the norm 20 years ago, but in more recent years the synths and samplers themselves have increasingly migrated into that world too, in the form of software virtual instruments. As a result, most modern DAW systems integrate MIDI sequencing and synthesis/sampling facilities alongside their audio recording and processing capabilities, making it possible for productions to be constructed almost entirely within a software environment. In practice, however, most small studios occupy a middle ground between the all-analogue and all-digital extremes, combining old and new, analogue and digital, hardware and software – depending on production priorities, space/budget restrictions, and personal preferences.

Setting Up a Small Stereo Monitoring System

When it comes to audio engineering, the equipment you use for monitoring purposes is vital – the better you can hear what you’re doing, the faster you’ll learn how to improve your sonics. Loudspeakers are usually preferable over headphones, because the latter struggle to reproduce low frequencies faithfully and create an unnaturally wide ‘inside your head’ stereo sensation. That said, if you’ve got less than $750 (£500) to spend, my view is that top-of-the-range studio headphones will give you better results for the money than any combination of speakers and acoustic treatment in that price range. Whatever monitoring system you use, it will only become useful to you once you know how your favourite productions sound through it.

In general, it’s best to choose full-range speakers designed specifically for studio use, rather than general-purpose hi-fi models which tend to flatter sounds unduly. The speakers themselves should be firmly mounted (preferably on solid stands) and carefully positioned according to the manufacturer’s instructions – usually the shorter dimension of a rectangular room will give the better sound. To present a good stereo image, the two speakers and the listening ‘sweetspot’ should form an equilateral triangle, with the tweeter and woofer of each speaker vertically aligned and both tweeters angled toward the listener’s ears. Speaker systems with built-in amplification (‘active’ or ‘powered’ speakers) are not only convenient, but also offer sonic advantages because of the way the amplifier(s) can be matched to the specific speaker drivers.
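If you want to work out where the sweetspot falls for a given speaker spacing, the geometry of an equilateral triangle makes it straightforward, as this small Python sketch shows. The function names and the 1.5m spacing are hypothetical examples, and the calculation says nothing about where in the room the whole triangle should sit.

```python
import math

def sweetspot_distance(speaker_spacing_m):
    """Distance from the line joining the two speakers back to the
    listening position, for an equilateral-triangle layout."""
    return speaker_spacing_m * math.sqrt(3) / 2

def speaker_angle_degrees():
    """Angle of each speaker to either side of straight ahead, as seen
    from the listening position in an equilateral triangle."""
    return 30.0

spacing = 1.5   # metres between the speakers (hypothetical)
print(round(sweetspot_distance(spacing), 2), "m back from the speaker line")
print(speaker_angle_degrees(), "degrees off-centre for each speaker")
```

In an equilateral layout the distance from your head to each speaker equals the spacing between the speakers, which is handy to check with a tape measure.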

Although most small-studio speaker systems are nearfield models which are designed to operate within about 6 feet (2m) of the listener, room acoustics can still have an enormous impact on their tone and fidelity. As such, you should probably spend roughly the same amount of money on acoustic treatment as on the speaker system itself if you’re going to get the best bang for your buck. High-frequency reflections and room resonances can be absorbed very effectively with judiciously applied acoustic foam, but taming low-end problems requires more expensive and bulky acoustic measures. (For more detailed information on monitoring setup and technique, see Part 1 of Mixing Secrets For The Small Studio.)

Jargon-Busting Glossaries

If you encounter any studio-related technical term you don’t understand, you should find an explanation of it in one of the following well-maintained glossaries:

Los Senderos Studio’s Recording Glossary: A large and clearly written glossary, which is also nicely interlinked and illustrated. Try this in the first instance.
Sound On Sound Technical Glossary: Masses of useful info here, especially if you’re just starting out.
Audio Engineering Society Pro Audio Reference: A slightly more specialist glossary which complements the others quite well.