Kees A. Schouhamer Immink
Turing Machines Inc, Rotterdam, The Netherlands,
Institute for Experimental Mathematics, Essen, Germany.
Abstract –
An audio compact disc (CD) holds up to 74 minutes, 33 seconds of sound, just enough for a complete mono recording of Ludwig van Beethoven's Ninth Symphony (‘Alle Menschen werden Brüder’) at probably the slowest pace it has ever been played, during the Bayreuther Festspiele in 1951, conducted by Wilhelm Furtwängler. Each second of music requires about 1.5 million bits, which are represented as tiny pits and lands ranging from 0.9 to 3.3 micrometers in length. More than 19 billion channel bits are recorded as a spiral track of alternating pits and lands over a distance of 5.38 kilometers (3.34 miles), which is scanned at walking speed, 4.27 km per hour.
This year marks 25 years since Philips and Sony introduced the CD. In this jubilee article we discuss the crucial technical decisions that would determine the technical success or failure of the new medium.
In 1973, I started my work on servo
systems and electronics for the videodisc in the Optics group of Philips
Research in Eindhoven. The videodisc is a 30 cm diameter optical disc that can store up to 60 minutes of analog FM-modulated video and sound. It is like a DVD, but much larger, heavier, and less reliable. The launch of the videodisc in 1975 was a technical success, but a monumental marketing failure, since consumers showed no interest at all. After two years, Philips threw in the towel and withdrew the product from the market.
While my colleagues and I were working on the videodisc, two Philips engineers were asked to develop an audio-only disc based on optical videodisc technology. The two engineers were recruited from the audio department, since my research director believed that a sound-only disc was a trivial matter given a video-and-sound videodisc, and he refused to waste costly researchers’ time on it. In retrospect, given the long-forgotten videodisc and the CD’s great success, this seems a remarkable decision.
The audio engineers started by experimenting with an analog approach using
wide-band frequency modulation as in FM radio. Their experiments revealed that
the analog solution was scarcely more immune to dirt and scratches than a
conventional analog LP. Three years later they decided to look for a digital
solution. In 1976 and later, Philips and Sony independently demonstrated the
first prototypes of a digital disc using laser videodisc technology. In 1977,
Sony completed a prototype with a 30 cm diameter disc, the same as the
videodisc, and 60 minutes playing time [2]. In October 1979, a crucial
high-level decision was made to join forces in the development of a world audio
disc standard. Philips and Sony, although competitors in many areas, shared a
long history of cooperation, for instance in the joint establishment of the
compact cassette standard in the 1960s. In marketing the final products,
however, both firms would compete against each other again. Philips brought its
expertise and the huge videodisc patent portfolio to the alliance, and Sony
contributed its expertise in digital audio technology. In addition, both firms
had a significant presence in the music industry via CBS/Sony, a joint venture
between CBS Inc. and Sony Japan Records Inc. dating from the late 1960s, and
Polygram, a 50% subsidiary of Philips [4].
Within a few weeks, a joint task force
of experts was formed. As the only electronics engineer within the ‘Optics’
research group, I participated and dealt with servos, coding, and electronics
at large. In 1979 and 1980, a number of meetings, alternating between Tokyo and
Eindhoven, were held. The first meeting, in August 1979 in Eindhoven, and the
second meeting, in October 1979 in Tokyo, provided an opportunity for the engineers
to get to know each other and to learn each other’s main strengths. Both
companies had shown prototypes and it was decided to take the best of both
worlds. During the third technical meeting on December 20, 1979, both partners
wrote down their list of preferred main specifications for the audio disc.
Although there are many other
specifications, such as the dimensions of the pits, disc thickness, diameter of
the inner hole, etcetera, these are too technical to be discussed here.
As can be seen from the list, a lot of work had to be done as the partners agreed only on one item, namely the one-hour playing time. The other target parameters, sampling rate, quantization, and notably disc diameter look very similar, but were worlds apart.
Item | Philips | Sony
Sampling rate (kHz) | 44.0 - 44.5 | 44.1
Quantization | 14 bit | 16 bit
Playing time (min) | 60 | 60
Diameter (mm) | 115 | 100
EC Code | t.b.d. | t.b.d.
Channel Code | M3 | t.b.d.
t.b.d. = to be discussed
The Shannon-Nyquist sampling theorem dictates that in order to achieve
lossless sampling, the signal should be sampled with a frequency at least twice
the signal’s bandwidth. So for a bandwidth of 20 kHz a sampling frequency of at
least 40 kHz is required. A large number of people, especially young people,
are perfectly capable of hearing sounds at frequencies well above 20 kHz. That
is, in theory, all we can say. In 1978, each and every piece of digital audio
equipment used its own ‘well-chosen’ sampling frequency ranging from 32 to 50
kHz. Modern digital audio equipment accepts many different sampling rates, but
the CD task force opted for only one frequency, namely 44.1 kHz. This sampling
frequency was chosen mainly for logistical reasons, as will be discussed later,
once we have explained the state of the art of digital audio recording in 1979.
Towards the end of the 1970s, ‘PCM adapters’ were developed in Japan,
which used ordinary analog video tape recorders as a means of storing digital
audio data, since these were the only widely available recording devices with
sufficient bandwidth. The best commonly-available video recording format at the
time was the 3/4" U-Matic.
The presence of the PCM video-based
adaptors explains the choice of sampling frequency for the CD, as the number of
video lines, frame rate, and bits per line end up dictating the sampling
frequency one can achieve for storing stereo audio. The sampling frequencies of
44.1 and 44.056 kHz were the direct result of a need for compatibility with the
NTSC and PAL video formats. Essentially, since there were no other reliable
recording products available at that time that offered other sampling rates,
the Sony/Philips task force could only choose between 44.1 and 44.056 kHz,
at a resolution of 16 bits (or less).
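The dependence of the sampling frequency on the video format can be illustrated with a short calculation. The sketch below is not taken from the task force's documents: the line counts (245 usable lines per field for 60 Hz video, 294 for 50 Hz video) and the figure of three audio samples per channel per video line are commonly cited values for video-based PCM adaptors and are assumptions here, as is the function name.

```python
from fractions import Fraction


def pcm_adapter_rate(lines_per_field, fields_per_second, samples_per_line=3):
    """Audio sampling rate (Hz) when each usable video line carries a fixed
    number of samples per channel. All parameter values are assumptions."""
    return lines_per_field * fields_per_second * samples_per_line


# 60 fields/s monochrome-rate video (NTSC countries), assumed 245 usable lines
rate_60hz = pcm_adapter_rate(245, 60)

# 50 fields/s video (PAL countries), assumed 294 usable lines
rate_50hz = pcm_adapter_rate(294, 50)

# NTSC colour field rate is 60000/1001 fields/s, slightly below 60
rate_ntsc_color = pcm_adapter_rate(245, Fraction(60000, 1001))

print(rate_60hz)                          # 44100
print(rate_50hz)                          # 44100
print(round(float(rate_ntsc_color), 3))   # 44055.944
```

Under these assumptions both television systems land on exactly 44.1 kHz, while the NTSC colour field rate gives the 44.056 kHz alternative mentioned in the text.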
During the fourth meeting, held in Tokyo on March 18-19, 1980, Philips
accepted Sony’s original proposal of 16-bit resolution and a 44.1 kHz
sampling rate. 44.1 kHz rather than 44.056 kHz was chosen for
the simple reason that it was easier to remember. Philips dropped their wish to
use 14 bits resolution: they had no technical rationale as the wish for the 14
bits was in fact only based on the availability of their 14-bit digital-analog
converter. In other words, the Compact Disc sound quality equals the sound
quality of Sony’s PCM-1600 adaptor.
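As a rough check on the figure of "about 1.5 million bits" per second quoted in the abstract, the raw audio bit rate follows directly from the two agreed parameters. The sketch below assumes two channels (stereo) and counts only the audio samples; the much larger number of channel bits on the disc also includes error correction, channel coding, and other overhead, which are not modelled here. The constant and function names are illustrative.

```python
SAMPLE_RATE_HZ = 44_100     # agreed sampling rate
BITS_PER_SAMPLE = 16        # agreed resolution
CHANNELS = 2                # assumption: stereo


def audio_bits_per_second():
    """Raw audio bit rate, before any error correction or channel coding."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS


def audio_bytes(minutes, seconds=0):
    """Raw audio payload in bytes for a given playing time."""
    total_seconds = minutes * 60 + seconds
    return audio_bits_per_second() * total_seconds // 8


print(audio_bits_per_second())   # 1411200
print(audio_bytes(74, 33))       # 789037200
```

So a full 74 minutes and 33 seconds corresponds to roughly 789 million bytes of raw audio samples.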
Thus, quite remarkably, in recording practice, an audio CD starts life
as a PCM master tape, recorded on a U-Matic videotape cassette, where the audio
data is converted to digital information superimposed within a standard
television signal. The industry standard hardware to do this was the Sony
PCM-1600, the first commercial video-based 16-bit recorder, followed by the
PCM-1610 or PCM-1630 adaptors. Until the 1990s, only video cassettes could be
used as a means for exchanging digital sound from the studios to the CD
mastering houses. Later, Exabyte computer tapes, CD-Rs, and memory sticks
were used as transport media.
Coding systems
Coding techniques form the basis of
modern digital transmission and storage systems. There had been previous
practical applications of coding, especially in space communications, but the
Compact Disc was the first mass-market electronics product equipped with
fully-fledged error correction and channel coding systems. To gain an idea of
the types of errors, random versus burst errors, burst length distribution and
so on, we made discs that contained known coded sequences. Burst error length
distributions were measured for virgin, scratched, or dusty discs. The error
measurement was relatively simple, but scratching or fingerprinting a disc in
such a way that it can still be played is far from easy. How do you get a disc
with the right kind of sticky dust? During playing, most of the dust fell off
the disc into the player, and the optics engineers responsible for the player
were obviously far from happy with our dust experiments. The experimental discs
we used were handmade, and not pressed like commercial mass-produced polycarbonate discs. In retrospect, I think that
the channel characterization was a far from adequate instrument for the design
of the error correction control (ECC).
There were only two competing ECC
proposals to be studied. Experiments in Tokyo and Eindhoven (Japanese dust was not the same as Dutch
dust) were conducted to verify the
performance of the two proposed ECCs. Sony proposed a byte-oriented, rate 3/4,
Cross Interleaved Reed-Solomon Code (CIRC) [6]. Vries
of Philips designed an interleaved convolutional, rate 2/3, code having a basic
unit of information of 3-bit characters [9]. CIRC uses two short RS codes,
namely (32, 28, 5) and (28, 24, 5) RS codes, with a Ramsey-type interleaver. If a major burst error occurs and the ECC is
overloaded, it is possible to obtain an approximation of an audio sample by
interpolating the neighboring audio samples, so concealing uncorrectable
samples in the audio signal. CIRC has various nice features to make error
concealment possible, so extending the player's operation range [10]. CIRC showed both a much higher performance
and a higher code rate (and thus playing time), although it was extremely complicated to cast into
silicon at the time. Sony used a 16 kByte RAM for
data interleaving, which, then, cost around $50, and added significantly to the
sales price of the player. During the fifth meeting in Eindhoven, May 1980, the
partners agreed on the CIRC error correction code since our experiments had
shown its great resilience against mixtures of random and burst errors [11].
The fully correctable burst length is about 4,000 bits (around 2.5 mm of missing
data on the disc). The length of errors that can be concealed is about 12,000
bits (around 7.5 mm). However, the largest error burst we ever measured during
the many long days of disc channel characterization was 0.1 mm.
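The code parameters and the concealment idea described above can be made concrete with a small sketch. The first part computes the rate and the guaranteed correction power of an (n, k, d) block code using the standard relations (up to (d-1)/2 symbol errors at unknown positions, or d-1 erasures at known positions). The second part is a generic linear-interpolation concealer; it is an illustration of the principle only, not the concealment logic of any actual player, and all function names are hypothetical.

```python
from fractions import Fraction


def rs_params(n, k, d):
    """Rate and guaranteed correction power of an (n, k, d) block code."""
    return {
        "rate": Fraction(k, n),
        "max_errors": (d - 1) // 2,   # symbol errors at unknown positions
        "max_erasures": d - 1,        # symbol errors at known positions
    }


rs_32_28 = rs_params(32, 28, 5)
rs_28_24 = rs_params(28, 24, 5)
overall_rate = rs_32_28["rate"] * rs_28_24["rate"]   # 28/32 * 24/28


def conceal(samples, bad):
    """Replace samples flagged as uncorrectable by linear interpolation
    between the nearest good neighbours (hold the value at the edges)."""
    out = list(samples)
    n = len(out)
    i = 0
    while i < n:
        if not bad[i]:
            i += 1
            continue
        j = i
        while j < n and bad[j]:          # find the end of the bad run
            j += 1
        left = out[i - 1] if i > 0 else None
        right = out[j] if j < n else None
        for m in range(i, j):
            if left is None and right is None:
                out[m] = 0               # nothing to interpolate from: mute
            elif left is None:
                out[m] = right
            elif right is None:
                out[m] = left
            else:
                frac = (m - i + 1) / (j - i + 1)
                out[m] = round(left + (right - left) * frac)
        i = j
    return out


print(overall_rate)                                   # 3/4
print(conceal([0, 10, 999, 30, 40],
              [False, False, True, False, False]))    # [0, 10, 20, 30, 40]
```

The product of the two code rates reproduces the rate 3/4 quoted for CIRC.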
We also had to decide on the channel
code. This is a vital component as it has a considerable impact on both the
playing time and the quality of ‘disc handling’ or 'playability'. Servo systems
follow the track of alternating pits and lands in three dimensions, namely
radial, focal, and rotational speed. Everyday handling damage, such as dust,
fingerprints, and tiny scratches, not only affects retrieved data, but also
disrupts the servo functions. In the worst cases, the servos may skip tracks or get
stuck, and error correction systems become utterly worthless. A product with
such devastating weaknesses would remain a laboratory toy. A well-designed
channel code will make it possible to remove the major barriers related to
these playability issues.
The system designer should find a good
trade-off between long playing time and playability. Both partners proposed
some form of (d, k) runlength-limited (RLL)
codes, where d is the minimum number and k is the maximum number
of zeros between consecutive ones. The differences between the various
proposals were the code rate, runlength parameters d
and k, and the spectral content. The spectral content has a direct
bearing on the playability. In their prototype, Philips used the proprietary M3
channel code, a rate ½, d=1, k=5 code, with a well-suppressed
low-frequency spectral content [1]. M3 is a variation on the M2 code, which was developed in
the 1970s by Ampex Inc. for their digital video tape
recorder [5]. Sony started with a rate 1/3, d=5, RLL code, but since
that did not work, they changed horses halfway and proposed a proprietary rate
½, d=2, k=7 code, a type of code that had been used in magnetic
disk data storage. Neither Sony code had spectral suppression, and the
engineers had opposing views on how the servo and synchronization issue could
be solved. In May 1980, the choice of the channel code therefore remained open,
and ‘more study was needed’. Before continuing with the coding cliffhanger, we
take a musical break.
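For readers unfamiliar with (d, k) constraints, the definition given above translates into a few lines of code. This is a minimal sketch of a constraint checker, not an encoder for M3 or any other code discussed here; the test sequences are made up for illustration. The helper `mark_lengths` additionally assumes NRZI-style recording, in which each 'one' marks a transition between pit and land, so that a run of r zeros produces a pit or land of r + 1 channel bits.

```python
def zero_runs(bits):
    """Number of zeros between each pair of consecutive ones.
    Zeros before the first one and after the last one are ignored."""
    ones = [i for i, b in enumerate(bits) if b == 1]
    return [nxt - cur - 1 for cur, nxt in zip(ones, ones[1:])]


def satisfies_dk(bits, d, k):
    """True if every run of zeros between consecutive ones lies in [d, k]."""
    return all(d <= run <= k for run in zero_runs(bits))


def mark_lengths(d, k):
    """Shortest and longest pit/land length in channel bits,
    assuming NRZI-style recording (an assumption, see lead-in)."""
    return d + 1, k + 1


# Hypothetical sequences, not taken from any real code table.
example = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
print(zero_runs(example))                      # [2, 3]
print(satisfies_dk(example, d=2, k=7))         # True
print(satisfies_dk([1, 1, 0, 1], d=1, k=5))    # False: adjacent ones
print(mark_lengths(2, 7))                      # (3, 8)
```

A larger d lengthens the shortest pit or land, while a smaller k guarantees more frequent transitions for clock recovery.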
Playing time and disc diameter are
probably the parameters most visible for consumers. Clearly, these two are
related: a 5% increase in disc diameter yields 10% more disc area, and thus an
increase in playing time of 10%. Philips' top management made the proposal regarding
the disc diameter. They argued, 'The Compact Audio Cassette was a great success',
and, 'we don't think CD should be much larger'. The diagonal of the
Compact Audio Cassette, very popular at that time and also developed by
Philips, is 115 mm. The Philips prototype audio disc and player were based on
this idea, and the Philips team of engineers restated this view in the list of
preferred main parameters. Sony, no doubt with
portable players in mind, initially preferred a smaller 100 mm disc.
During the May 1980 meeting something
remarkable happened. The minutes of the May 1980 meeting in Eindhoven literally
read:
disc diameter: 120 mm,
playing time: 75 minutes,
track pitch: 1.45 µm,
can be achieved with the Philips M3
channel code. However, the negative points are: large numerical aperture needed
which entails smaller (production) margins, and the Philips’ M3 code might
infringe on Ampex M2.
Both disc diameter and playing time
differ significantly from the preferred values listed during the Tokyo meeting
in December 1979. So what happened during the six months? The minutes of the
meetings do not give any clue as to why the changes to playing time and disc
diameter were made. According to the Philips’ website with the ‘official’
history: "The playing time was determined posthumously by Beethoven".
The wife of Sony's vice-president, Norio Ohga,
decided that she wanted the composer's Ninth Symphony to fit on a CD. It was,
Sony’s website explains, Mrs. Ohga's favorite piece
of music. The Philips’ website proceeds:
“The performance by the Berlin
Philharmonic, conducted by Herbert von Karajan, lasted for 66 minutes. Just to
be quite sure, a check was made with Philips’ subsidiary, Polygram, to
ascertain what other recordings there were. The longest known performance lasted
74 minutes. This was a mono recording made during the Bayreuther
Festspiele in 1951 and conducted by Wilhelm Furtwängler. This therefore became the maximum playing time
of a CD. A diameter of 120 mm was required for this playing time”.
Everyday practice is less romantic than the pen of a public relations guru. At that time, Philips’ subsidiary Polygram, one of the world's largest distributors of music, had set up a CD plant in Hanover, Germany, that could produce large quantities of CDs with, of course, a diameter of 115 mm. Sony did not have such a facility yet. So if Sony had agreed on the 115 mm disc, Philips would have had a significant competitive edge in the music market. Ohga was aware of that, did not like it, and something had to be done. It was not about Mrs. Ohga’s great passion for music, but about money and competition between the two partners in the market. The decision regarding diameter and playing time was taken outside the group of experts responsible for the CD format. So I, a former member of that group, can only guess what happened on the upper floor. But something unforeseen happened: at the last minute we changed the code.
Popular literature, as exemplified by the
Philips website mentioned above, states that the disc diameter is a direct result
of the requested playing time, and that the extra 14 minutes of playing time for
Furtwängler’s Ninth subsequently
required the change from a 115 mm to a 120 mm disc. It suggests that
there are no other factors affecting playing time. Note that in May 1980, when
disc diameter and playing time were agreed, the channel code, a key factor
affecting playing time, was not yet settled. In the minutes of the May 1980
meeting, it was remarked that the above (diameter, playing time, and track
pitch) could be achieved with Philips' M3 channel code. In the meantime, though not mentioned in the minutes of the May
meeting, the author was experimenting with a new channel code, later coined EFM
[3]. EFM, a rate 8/17, d=2, code, made it possible to achieve a 30
percent higher information density than Philips' M3. Due to its good
spectral suppression, EFM also showed good resilience against disc handling
damage such as fingerprints, dust, and scratches. Note that a 30 percent
efficiency improvement is highly attractive, since, for example, the disc
diameter increase from 115 to 120 mm offers a mere 10 percent increase in
playing time.
A month later, in June 1980, we could
not choose the channel code, and again more study and experiments were needed.
Although experiments had shown the greater information density that could be
obtained with EFM, it was at first simply rejected. At the end of the
discussion, which at times was heated, the Sony people specifically
opposed the complexity of the EFM decoder, which then required 256 gates. My
remark that the CIRC decoder needed at least half a million gates and that the
extra 256 gates for EFM were irrelevant was jeered at. Then suddenly, during
the meeting, we received a phone call from the presidents of Sony and Philips, who
were meeting in Tokyo. We were running out of time, they said, and one week for
an extra, final, meeting in Tokyo was all the lads could get. Sony stated that
if the EFM hardware required fewer than 80 gates, they would accept it. I had a
week to reduce the gate count. I used the first Apple II computer in the lab,
which was much handier for such an interactive design using trial and error
than the IBM mainframe, especially as I had to walk to the IBM computer center
for every job. I succeeded in bringing the gate count down to just 52 gates,
and on June 19, 1980 in Tokyo, Sony agreed to EFM. The 30 percent extra
information density offered by EFM could have been used to reduce the diameter
to 115 mm or even 100 mm (with, of course, the requested 74 minutes and 33
seconds for playing Mrs. Ohga’s favorite Ninth).
However, such a change was not considered to be politically feasible, as the
powers that be had decided on 120 mm. The option to increase the playing time to 97
minutes was not even considered. We decided to improve the production margins
of player and disc by lowering the information density by 30 percent: the disc
diameter remained 120 mm, the track pitch was increased from 1.45 to 1.6 µm, and
the user bit length was increased from 0.5 to 0.6 µm. By increasing the bit size
in two dimensions, in a similar vein to large letters being easier to read, the
disc was easier to read, and could be introduced without too many technical
complications.
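The trade-offs in this section can be checked with back-of-the-envelope arithmetic. The sketch below uses only numbers quoted in the text; the one simplification, which is an assumption, is that playing time scales with the square of the disc diameter, ignoring the unrecorded inner and outer zones. The function names are illustrative.

```python
def area_gain(old_diameter_mm, new_diameter_mm):
    """Relative disc area, assuming area scales with diameter squared
    (simplification: the unrecorded inner zone is ignored)."""
    return (new_diameter_mm / old_diameter_mm) ** 2


def area_per_bit_um2(track_pitch_um, bit_length_um):
    """Disc area occupied by one user bit, in square micrometers."""
    return track_pitch_um * bit_length_um


# Going from a 115 mm to a 120 mm disc
diameter_gain = area_gain(115, 120)

# Relaxing track pitch 1.45 -> 1.6 um and user bit length 0.5 -> 0.6 um
relaxation = area_per_bit_um2(1.6, 0.6) / area_per_bit_um2(1.45, 0.5)

# Playing time if the 30 percent density gain had been spent on duration
nominal_minutes = 74 + 33 / 60
extended_minutes = nominal_minutes * 1.30

print(round(diameter_gain, 3))      # 1.089
print(round(relaxation, 3))         # 1.324
print(round(extended_minutes, 1))   # 96.9
```

Each bit thus received roughly 32 percent more disc area, in line with the approximately 30 percent quoted above, whereas the diameter change alone is worth about 9 percent, and spending the gain on duration would have given about 97 minutes.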
The maximum playing time of the CD was settled at 74 minutes and 33 seconds, but in practice the maximum playing time was determined by the playing time of the U-Matic video recorder, which was 72 minutes. Therefore, rather sadly, Mrs. Ohga’s favorite Ninth by Furtwängler could not be recorded in full on a single CD until 1988, when alternative digital transport media became available. On a slightly different note, Jimi Hendrix's Electric Ladyland, with a playing time of 75 minutes, was originally released as a 2-CD set in the early 1980s, but has been on a single CD since 1997.
The Sony/Philips task force stood on the
shoulders of the Philips’ engineers who created the laser videodisc technology
in the 1970s. Given the videodisc technology, the task force made choices
regarding various mechanical parameters such as disc diameter, pit dimensions,
and audio parameters such as sampling rate and resolution. In addition, two
basic patents were filed related to error correction, CIRC, and channel code,
EFM. CIRC, the Reed-Solomon ECC format, was completely engineered and developed
by Sony engineers, and EFM was completely created and developed by the author.
Let us take a look at the numbers. The
size of the task force varied per meeting, and the average number of attendees
listed on the minutes of the joint meetings is twelve. If the persons carrying
hierarchical responsibility for the CD project are excluded, then we find a very
small group of engineers who carried the technical responsibility for the
Compact Disc ‘Red Book’ standard.
Philips' corporate public relations
department, see The Inventor of the CD on Philips' website [7], states
that the CD was "too complex to be invented by a single individual",
and the "Compact Disc was invented collectively by a large group of people
working as a team". It persuades us to believe that progress is the
product of institutions, not individuals. Evidently, there were battalions of
very capable engineers, who further developed and marketed the product, and
success in the market depended on many other innovations. For example, the
solid-state physicists, who developed an inexpensive laser diode, a primary
enabling technology, made CD possible in practice. Credit should also be given
to the persons who designed the transparent Compact Disc storage case, the
'jewel box', which made a clever contribution to the visual appeal of the CD.
Philips and Sony agreed in a memorandum
dated June 1980, that their contributions to channel and error correction codes
are equal. Sony’s website with their 'official' history is entitled 'Our
contributions are equal' [8]. The website proceeds, “We avoid such comments as, ‘We developed this part and
that part’ and to emphasize that the disc's development was a joint effort by
saying, ‘Our contributions are equal’. The leaders of the task force convinced
the engineers to put their companies before individual achievements.” The myth building even went so far that the
patent applications for both CIRC and EFM were filed with joint Sony/Philips
inventors.
Everything else is gaslight
A favorite expression of audiophiles –particularly during the early period, when they were comparing both vinyl LP and CD versions of the same recordings– was: "It is as though a veil has been lifted from the music". Or, in the words of the famous Austrian conductor Herbert von Karajan, when he first heard CD audio: "Everything else is gaslight". Von Karajan was fond of the gaslight metaphor: he first conducted Der Rosenkavalier in 1956 with the soprano Elisabeth Schwarzkopf. Later, when he revived the opera in 1983 with Anna Tomowa, he referred to his 1956 cast as "gaslight", which rather upset Schwarzkopf.
Philips and Sony settled the
introduction of the new product to be on November 1, 1982. The moment the ink
of the “Red Book”, detailing the CD specifications, was dry, the race started,
and hundreds of developers in Japan and the Netherlands were on their way.
In early January 1982 it became clear that Philips was running
behind: the electronics were seriously delayed, and they asked Sony to postpone
the introduction. Sony rejected the delay, but agreed upon a two-step launch. Sony
would first market their CD players and discs in Japan, where Philips had no
market share, and half a year later, March 1983, the worldwide introduction
would take place by Philips and Sony. Philips Polygram could supply discs for
the Japanese market. This gave Philips some breathing space for the players,
but not enough as in order to make the new deadline, the first generation of
Philips CD players was equipped with Sony electronics.
The first CD players cost over $2000,
but just two years later it was possible to buy them for under $350. Five years after the introduction, sales of CDs were higher
than those of vinyl LPs. Yet this was no great achievement, as by 1980 sales of vinyl
records had been declining for many years and the music industry was all
but dead. A few years later, the Compact
Disc had completely replaced the vinyl LP and cassette tape. Compact Disc
technology was ideal for use as a low-cost, mass-data storage medium, and the
CD-ROM and record-once and re-writable media, CD-R and CD-RW, respectively,
were developed. In 1995, the CD was succeeded by DVD, which offers a six-fold
higher storage capacity. Now, 25 years after the introduction of the CD, home
cinema on DVD accounts for 70 percent of Hollywood's worldwide film revenue.
DVD has replaced VHS videotape. Hundreds of millions of players and more than
two hundred billion CD audio discs have been sold.
Acknowledgement
The author warmly acknowledges the
hospitality of the Rotterdam Radio Museum, where the photos of classic CD
players were made.
Further reading
[1] M.G. Carasso, W.J. Kleuters, and J.J. Mons, Method of coding data bits on a recording medium (M3 Code), US Patent 4,410,877, 1983.
[2] T. Doi, T. Itoh, and H. Ogawa, A Long-Play Digital Audio Disk System, AES Preprint 1442, Brussels, Belgium, March 1979.
[3] K.A.S. Immink and H. Ogawa, Method for Encoding Binary Data (EFM), US Patent 4,501,000, 1985.
[4] T. Kretschmer and K. Muehlfeld, Co-opetition in Standard-Setting: The Case of the Compact Disc.
[5] J.W. Miller, DC Free encoding for data transmission (M2 Code), US Patent 4,234,897, 1980.
[6] K. Odaka, Y. Sako, I. Iwamoto, T. Doi, and L. Vries, Error correctable data transmission method (CIRC), US Patent 4,413,340, 1983.
[8] Our contributions are equal, Sony’s historical website.
[9] L.B. Vries, The Error Control System of Philips Compact Disc, AES Preprint 1548, New York, Nov. 1979.
[10] K.A.S. Immink, ‘Reed-Solomon Codes and the Compact Disc’, in S.B. Wicker and V.K. Bhargava, Eds., Reed-Solomon Codes and Their Applications, IEEE Press, 1994.
[11] J. Nathan, SONY, The private life, Houghton Mifflin Co, 1999, p. 140. Quote:
‘At the final session of the first round, held in Tokyo March 1980, the two teams tested one another’s error correction systems on discs that had been scratched, marked with fingerprints, even dusted with chalk. The Philips system proved inadequate to the extreme conditions, and Sony was judged the winner. There were protests from the Philips team that the test conditions were extreme, but their manager agreed that the test had been fair, and that the Sony error correction mechanism was adopted.’
[12] K.A.S. Immink, ‘The CD Story’, AES Journal, pp. 458-465, May 1998.