Recording in Stereo (Part 1)

Wonderful things can happen when a recording is made in stereo.  By definition, the term "stereo" implies three dimensions.  In contrast, my first exposures to so-called stereo recordings were strictly demonstrations of left-right phenomena with a guitar on the left and a piano on the right or a marching band appearing to move from one speaker to the other.

It was many years later that I first encountered the idea that stereo reproduction could move far beyond the appearance of sounds emanating from different speakers.  Indeed, I'd heard stereo recordings where the speakers, when properly placed, seemed to disappear entirely, leaving behind a sort of audio hologram displaying a sonic stage with width and depth (and sometimes a sense of height).  Instruments were spread all around the apparent stage and the instrumental images themselves seemed to have a sense of solidity to them.

These early exposures to a more realistic stereo coincided with my earliest experiences as an engineer working at my first job in a recording studio.  At the studio, I learned the more common way of creating stereo recordings which involved having microphones placed closely to each instrument, later to be combined by the engineer using a mixing console.  During the mix, the engineer would place the individual sound sources across the stereo stage (left-right) using the pan pots on the console.  These allow the sound to be placed full left or full right or anywhere in between.

Comparing the studio-generated stereo with those other recordings which I found to offer a more convincing illusion, I realized what we actually did in the studio was create multiple mono recordings played back via two speakers.  I started reading about how those more convincing records were made and found they tended to use far fewer microphones and none of the sounds was recorded in mono (i.e. with a single, closely placed microphone).  Further reading and my first experiments on my own taught me about the different approaches and philosophies used by other engineers in their work on stereo recordings.  This started my drift away from closely placed, multiple microphones.

I should say at this point that while convincing stereo is very pleasing and adds to our ability to hear complex music, my personal sonic priorities put this below being able to capture the true sound of an instrument.  Only when the true timbre the musician and instrument maker work so hard to attain is captured do I consider placement upon the stereo stage.  I should also say that, among my listening biases, dynamic integrity follows closely behind timbral and harmonic integrity.  One of the things I find to be the weak link in most recordings is compression of dynamic range, both the macro (difference between softest and loudest) and the micro, where a great deal of the musical emotion lies.  I have a strong aversion to any sort of dynamic compromise and find the recordings whose sound I prefer the most all leave performance dynamics intact.  In other words, I'd rather hear a great instrumental sound with vague stereo than a dried-up version that is well placed on the stage.

With this in mind, I began wondering why anyone would want to listen to a Steinway grand piano from under the closed lid with their ears just above the hammers.  And who would want to put their head inside a bass drum or right up against a Marshall stack or in the bell of a saxophone?  These were the places from which the studio microphones "listened".  When heard from these locations by human ears, the instruments sound thin and harmonically dry.  The lower frequencies and many harmonics just don't come to focus in the air that close to the instruments.  No wonder things recorded this way are destined to endure endless processing devices in the attempt to make them sound right.  I understood the desire for isolation between parts when multitracking but the sonic cost, to my ears, was way too high.  There had to be a better way.  There is.  More on this in Part 2 of this article.

As I read more about the different approaches to recording in stereo and the reasoning behind each approach and as I tried my own experiments, my own views on the subject took form.  One of the earliest methods, devised by Alan Blumlein in the early 1930s, used what is called a coincident pair of microphones.  A coincident pair uses two directional microphones.  Some directional microphones "hear" what is in front of them and downplay sounds coming from the sides or behind them.  Other directional mics hear both what is in front of them and what is behind them, downplaying sounds coming from the sides.  These differ from omnidirectional mics, which hear sounds coming from all directions.  By having the microphone pair placed one above the other and facing 90 degrees apart for left-right pickup (one mic hearing the left side of the stage and the other hearing the right side), the mics effectively coincided in space (at least horizontal space).  Placing the mics in this fashion will result in sounds from different parts of the stage arriving at both mics at the same time.  When sounds arrive at the mics at different times due to having to travel a different distance to reach each mic, phase differences result.  Blumlein sought to eliminate phase differences between microphones and the problems that could result when two channels with phase differences are combined for reproduction in mono.  Coincident pairs derive stereo from the intensity differences between channels.  Sounds arriving from the left will be louder in the left microphone than they will be in the right.  Sounds arriving from the right will be louder in the right microphone than they'll be in the left.  This "theoretically correct" method is still used today by many engineers.
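The intensity differences described above can be put into rough numbers.  The sketch below is not Blumlein's own math; it assumes idealized first-order microphones whose sensitivity follows omni_fraction + (1 - omni_fraction) * cos(angle), with figure-8 mics (omni_fraction = 0) turned 45 degrees to either side of center.  The function names and the sign convention (negative angles toward the left) are made up for illustration.

```python
import math


def first_order_gain(angle_deg, omni_fraction):
    """Sensitivity of an idealized first-order mic at angle_deg off its own axis.

    omni_fraction = 1.0 is omnidirectional, 0.5 is cardioid, 0.0 is figure-8.
    This is a textbook model, not a measurement of any real microphone.
    """
    return omni_fraction + (1.0 - omni_fraction) * math.cos(math.radians(angle_deg))


def level_difference_db(source_deg, half_angle_deg=45.0, omni_fraction=0.0):
    """Left-minus-right level in dB for a coincident pair.

    source_deg: source direction, 0 = straight ahead, negative = toward the left.
    half_angle_deg: how far each mic is turned off center (45 gives 90 degrees
    between the two mics).
    """
    # The left mic points at -half_angle_deg, the right mic at +half_angle_deg.
    left = abs(first_order_gain(source_deg + half_angle_deg, omni_fraction))
    right = abs(first_order_gain(source_deg - half_angle_deg, omni_fraction))
    return 20.0 * math.log10(left / right)
```

Under these assumptions, level_difference_db(-20.0) comes out at roughly 6.6 dB (a source 20 degrees left of center is louder in the left mic), and a source dead ahead gives 0 dB.  Setting omni_fraction to 1.0 gives 0 dB for every direction, since two omnis at the same point hear the same thing.  Directions that fall exactly in a mic's null (for example -45 degrees with the defaults) are best avoided in this simple model.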

A variation on this idea was the quasi-coincident spacing developed by the French radio network ORTF.  The ORTF method used a pair of directional microphones but added a small space (and hence time) element by placing the mics about 7 inches apart, to emulate the distance between human ears.  This was done with the understanding that a sound arriving at our ears from the left will arrive at the left ear sooner than it will arrive at the right ear because it has a shorter distance to travel.  Similarly, sounds arriving from the right will have a shorter distance to travel to our right ear, therefore they'll be heard sooner by the right ear than they will by the left ear.
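To get a feel for the size of the time element a 7 inch spacing introduces, here is a minimal sketch.  It assumes a speed of sound of 343 meters per second (air at about 20 C) and a source far enough away that its wavefront is effectively flat when it reaches the mics; the function name and parameters are illustrative only.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 C
METERS_PER_INCH = 0.0254


def arrival_time_difference_ms(spacing_inches, source_deg):
    """Far-field arrival-time difference between two mics, in milliseconds.

    source_deg: 0 = straight ahead, 90 = fully to one side.
    Assumes a distant source, so the extra path to the far mic is
    spacing * sin(angle).
    """
    spacing_m = spacing_inches * METERS_PER_INCH
    path_difference_m = spacing_m * math.sin(math.radians(source_deg))
    return 1000.0 * path_difference_m / SPEED_OF_SOUND_M_PER_S
```

With these assumptions, a sound arriving from fully to one side reaches the far mic of a 7 inch pair about half a millisecond late (arrival_time_difference_ms(7, 90) is roughly 0.52), and a sound from straight ahead arrives at both mics together.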

While Blumlein was working on his coincident pair technique, Bell Laboratories was experimenting with the use of omnidirectional microphones.  By using "spaced omnis", that is, leaving a larger distance between microphones, the engineers at Bell were able to capture a better sense of the space in which an acoustic event took place than the coincident technique could.  This increased sense of the acoustic space comes at the cost of some vagueness in the placement of individual images within that space.  Spaced omnis make deliberate use of the temporal cues (the phase or timing information) that coincident techniques seek to avoid.

The wide spacing generally used for spaced omnis tended to result in, among other things, a "hole in the middle" where instruments appeared to bunch up at the loudspeakers leaving a gap in the center.  In order to remedy this, some engineers would use a third omni mic placed in the center and mixed into the stereo pair (at a lower level) to help fill in the center.  A lot of great recordings have been made this way including the Living Presence series on Mercury, recorded by C. Robert Fine, and the Living Stereo series on RCA, recorded by Lewis Layton.  Engineers at Decca developed their own version of the three-omni technique, often referred to as the "Decca tree" due to the structure they devised to hold their mics.  Sometimes, the Decca engineers added an additional pair of more widely spaced "outrigger" mics.

As I got more experience as an engineer, I supplemented my multitrack studio work (recording, overdubbing, mixing and later cutting lacquer masters for vinyl) with experiments in direct-to-stereo recording using only a pair of microphones.  After some preliminary tests using directional microphones, I'd come to favor omnidirectional microphones because they always seemed to my ears to editorialize less than the directional mics I'd tried.  Omnis tend to have flatter, wider frequency response than directional mics with their attendant bass roll-offs and treble peaks.  The sound from a good omni (and by this I don't mean a mic with a pattern switch that includes an omni position, I mean a true, omni-only mic) always sounds more honest to my ears.

Since coincident omnis result in mono (they'd both hear the same thing) and quasi-coincident (ORTF) omnis result in "fat mono" (they'd both hear almost the same thing), a spaced pair was the natural way to go.  Besides, the better loudspeakers I'd heard always made a big deal about being phase (time) accurate.  With spaced omnis giving priority to timing information, I started to think about how others had used them.  What I found was the pair was generally spaced 12 or more feet apart.  I thought about how far apart my loudspeakers were for playback and how far apart they were in most of the good systems I'd heard.  I started with a 6 foot spacing between the mics, to more closely emulate the space between playback speakers and hopefully to avoid the hole in the middle.  Considering the time element, I reasoned the time a signal took to get from left mic to right mic should match the time it took to travel between the speakers in the listening room, hoping the reciprocity would get me closer to "being there" when listening to the result.

In many ways, the results of my early tests were very pleasing.  The instruments really sounded convincingly like what I heard in their presence during the recording.  Space, too, was reproduced beautifully.  With a small jazz ensemble recording, you could hear it when the saxophone player gently swayed from side to side during his solo.  Sometimes I got the feeling those B&K omni mics could almost tell you what color his sweater was!  Now that instrumental sounds were all set, issues with spatial reproduction started to come to the fore.  With subsequent recordings of more complex ensembles, I wasn't getting any hole in the middle but instruments that were positioned slightly off center still seemed to "pull" to the near speaker on playback.

My interest in astronomy and optics led to an understanding of the reason for the apparent pull to the near speaker on playback.  In some situations, light bends.  I believe the same was occurring with the directional information with my 6 foot spaced pair of omnis.  For the spaced omnis alone to give an accurate rendition of the spatial information, one would need a multitude of mics, each feeding its own dedicated speaker from a multitude of speakers.  In other words, an array of, say, 16 spaced omnis (a wall of microphones) feeding 16 similarly placed speakers (a wall of speakers) would do a better job and be considerably less subject to the bending effects.  But I don't know anyone who has 16 speakers set up to reproduce 16-track sources, so this would not be a practical way to distribute recordings.

In keeping with my interest in maintaining integrity in the time domain, I didn't want to use any additional microphones.  Each of our ears hears sounds only once and I didn't want to combine pickups from different points in space which I believe only confuses the timing information received by each ear.  This would be the audio equivalent of a "ghost" image in a television picture.  I sought to use only one signal per speaker/ear, therefore only a single pair of microphones.

In an effort to rethink the 6 foot spacing I had been using, I decided to try a number of variations and devised a test where I would record myself with a spaced pair of omnis.  I would walk around in the front of my room knocking two pieces of wood together, all the while announcing exactly where I was.  I made several recordings of myself doing this, using the 6 foot spacing as well as a number of others, all the way to a 7" "ear spacing".  On playback, I paid particular attention to the just off center area that had proven problematic with the 6 foot spacing.  Somewhere around 15", things seemed to gel.  All the qualities I liked were there with considerably less vagueness in the image.  My long held belief in recording with omnis spaced at 6 feet was being revised.

I started researching just how it is our brains perceive stereo and the cues required for localization (our ability to determine where a sound is coming from).  What I learned was our brains use three types of cues to determine localization:  intensity, time and frequency.  Then it dawned on me that if nature could have gotten by with fewer cues, it would have done so.  I began to consider what was needed to supply all three types of cues in a stereo recording.

Coincident techniques prioritize intensity differences between mics/ears (and to a much lesser extent, frequency differences).  Spaced omni techniques prioritize time differences (and to a much lesser extent, intensity differences).  I wanted a way to incorporate all three.  Some more reading informed me that Blumlein, before he went with his coincident method, experimented with using omnidirectional microphones separated by a baffle.  A Swiss engineer, Jurg Jecklin, revived the idea half a century later with his "Jecklin disk".  Jecklin too sought to incorporate all three types of cues nature requires for stereo localization.

The literature describing his work says Jecklin used a pair of omnis and like Blumlein, spaced them to emulate ear spacing.  My own experiments told me I was getting a better sense of the space with a 15" spacing.  I looked into the commercially available Jecklin disks sold to recordists and determined these, in addition to having the microphone mounts locked in at ear spacing, had surfaces that I deemed to be too reflective.  I didn't want to place my microphones near a surface that would bounce some signals back to the mics.  The bounced reflection would arrive at the mic slightly later in time due to the slightly longer path it has to travel, creating a variation of the television ghost image mentioned above.  The combination of the direct sound with the slightly delayed sound would also result in what is called a comb filtering effect.  Comb filtering will cause certain frequencies to cancel, altering the natural spectral (i.e. frequency) response of the microphones.  This led me to build my own variation on the theme, a 12" circle made of soft, absorbent material.  In trying one more wood knocking recording using this disk between my microphones, I heard the sonic images solidify in a way I'd not heard before.  Now my recorded announcements appeared to emanate from where I said I was when I made the recording.
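The comb filtering described above can be sketched numerically.  The example below assumes a single reflection as strong as the direct sound (the worst case, giving complete cancellation), a speed of sound of 343 meters per second, and a caller-supplied extra path length for the reflection; none of these figures come from measurements of any actual disk, and the function name is illustrative.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 C
METERS_PER_INCH = 0.0254


def comb_notch_frequencies_hz(extra_path_inches, max_hz=20000.0):
    """Frequencies where a direct sound and one reflection arrive half a cycle apart.

    extra_path_inches: how much farther the reflected sound travels than the
    direct sound (a hypothetical input, not a measured value).
    Cancellation occurs at odd multiples of 1 / (2 * delay).
    """
    if extra_path_inches <= 0:
        raise ValueError("extra_path_inches must be positive")
    delay_s = extra_path_inches * METERS_PER_INCH / SPEED_OF_SOUND_M_PER_S
    notches = []
    k = 0
    while True:
        frequency_hz = (2 * k + 1) / (2.0 * delay_s)
        if frequency_hz > max_hz:
            break
        notches.append(frequency_hz)
        k += 1
    return notches
```

As an illustration, a reflection that travels a hypothetical 2 inches farther than the direct sound would put its first notch near 3.4 kHz, with further notches at three and five times that frequency, all well inside the audible range.  A shorter extra path pushes the notches higher; a weaker reflection makes them shallower rather than moving them.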

I'd found my way to record in stereo, incorporating all three types of cues nature uses to inform us of where a sound is coming from:  intensity, timing and frequency.  The timing information provided by the omnis benefited from the disk-shaped baffle which provided increased intensity differences between mics as well as frequency discrimination between mics.  I have since heard it said Mr. Jecklin now uses an inter-mic spacing of 15"-16".

Part 2 covers using this stereo microphone array.