The Compact Disc (CD): Compact disc encoding


A substantial amount of information is added to the audio data before the compact disc is recorded. FIG. 1 illustrates the encoding process and shows the various information to be recorded.

FIG. 1 The encoding process in compact disc production.

There are usually two audio channels with 16-bit coding, sampled at 44.1 kHz. So, the bit rate, after combining both channels, is:

44.1 × 10^3 × 16 × 2 = 1.4112 × 10^6 bit s^-1

CIRC encoding

Most of the errors which occur on a medium such as CD are random. However, from time to time burst errors may occur due to fingerprints, dust or scratches on the disc surface. To cope with both random and burst errors, Sony and Philips developed the cross-interleave Reed-Solomon error-correction code (CIRC). CIRC is a very powerful combination of several error correction techniques.

Table 1 Specification of CIRC system in the compact disc

FIG. 2 CIRC encoder.

It is useful to be able to measure an error-correcting system's ability to correct errors, and as far as the compact disc medium is concerned it is the maximum length of a burst error which is critical. Also, the greater the number of errors received, the greater the probability of some errors being uncorrectable. The proportion of received bits that are in error is defined as the bit error rate (BER). An important system specification, therefore, is the number of data samples per unit time which have to be interpolated (rather than corrected) for given BER values; this is called the sample interpolation rate.

The lower this rate is, the better the system. If burst errors cannot be corrected, a further important specification is the maximum length of a burst error which can be interpolated. Finally, it is important to know the number of undetected errors, which result in audible clicks. Any specification of an error-correcting system must take all these factors into account. Table 1 is a list of all relevant specifications of the CIRC system used in CD.

The CIRC principle is as follows (refer to FIG. 2):

• The audio signal is sampled (digitized) at the A/D converter and these 16-bit samples are split up into two 8-bit words called symbols.

• Six of the 16-bit samples from each channel, i.e., 24 8-bit symbols, are applied to the CIRC encoder and stored in a RAM.

• The first operation in the CIRC encoder is called scrambling.

The scrambling operation consists of a two-symbol delay for the even samples and a mixing up of the connections to the C2 encoder.

• The 24 scrambled symbols are then applied to the C2 encoder, which generates four 8-bit parity symbols called Q words.

The C2 encoder inserts the Q words between the 24 incoming symbols, so that at the output of the C2 encoder 28 symbols result.

• Between the C2 and the C1 encoders there are 28 8-bit delay lines with unequal delays. Due to the different delays, the sequence of the symbols is changed completely, according to a determined pattern.

• The C1 encoder generates a further four 8-bit parity symbols known as P words, resulting in a total of 32 8-bit symbols.

• After the C1 encoder, the even words are subjected to a one symbol delay, and all P and Q control words are inverted.

The resultant sequenced 32 8-bit symbols are called a frame; this is a CIRC-encoded signal that is applied to the EFM modulator. On playback, the CIRC decoding circuit restores the original 16-bit samples, which are then applied to the D/A converter.
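The effect of the unequal delay lines can be illustrated with a toy model. The following Python sketch is not the CIRC interleaver itself: it assumes 4 delay lines instead of 28 and a delay step of one frame, and the names `make_delay_lines`, `shift`, `N_LINES` and `STEP` are invented for this example. It only shows the principle that a burst wiping out one whole transmitted frame is spread, after de-interleaving, into single-symbol errors in several different frames.

```python
from collections import deque

def make_delay_lines(delays):
    """One FIFO per symbol position, pre-filled with `delay` empty slots."""
    return [deque([None] * d) for d in delays]

def shift(lines, frame):
    """Push one frame into the delay lines and return the delayed frame."""
    out = []
    for line, symbol in zip(lines, frame):
        line.append(symbol)
        out.append(line.popleft())
    return out

N_LINES, STEP, N_FRAMES = 4, 1, 12   # toy values, not the CD parameters

# Encoder side: line i is delayed by i * STEP frames.
enc = make_delay_lines([i * STEP for i in range(N_LINES)])
# Decoder side: complementary delays, so every line ends up equally delayed.
dec = make_delay_lines([(N_LINES - 1 - i) * STEP for i in range(N_LINES)])

# Each symbol is labelled (frame number, position) so it can be traced.
frames = [[(t, i) for i in range(N_LINES)] for t in range(N_FRAMES)]
sent = [shift(enc, f) for f in frames]

# A burst error destroys one complete transmitted frame.
sent[6] = ["X"] * N_LINES

received = [shift(dec, f) for f in sent]
errors_per_frame = [f.count("X") for f in received]
latency = (N_LINES - 1) * STEP

print(errors_per_frame)
```

In this toy run no de-interleaved frame contains more than one damaged symbol, although four adjacent symbols were destroyed on the channel. The price is a fixed latency of `(N_LINES - 1) * STEP` frames.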

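The C2 encoder outputs 28 8-bit symbols for 24 symbols at its input: it is therefore called a (24, 28) encoder. The C1 encoder outputs 32 symbols for 28 symbols input: it is a (28, 32) encoder. (In the more common (n, k) notation for block codes, where n is the codeword length and k the number of data symbols, these are (28, 24) and (32, 28) codes.)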
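The symbol counts through the encoder can be checked with a few lines of Python. This is a minimal sketch with assumptions: the function name `samples_to_symbols` is hypothetical, the sample values are arbitrary, and the high-byte-first and left-then-right ordering are chosen for illustration only. The parity symbols themselves are not computed.

```python
def samples_to_symbols(samples):
    """Split 16-bit sample values (0..65535) into 8-bit symbols.

    Assumption: the high byte is emitted before the low byte.
    """
    symbols = []
    for s in samples:
        if not 0 <= s <= 0xFFFF:
            raise ValueError("sample does not fit in 16 bits")
        symbols.append(s >> 8)      # upper 8 bits
        symbols.append(s & 0xFF)    # lower 8 bits
    return symbols

# Six left/right sample pairs (arbitrary example values).
pairs = [(0x1234, 0xABCD)] * 6
interleaved = [s for pair in pairs for s in pair]

symbols = samples_to_symbols(interleaved)
after_c2 = len(symbols) + 4   # four Q parity symbols added by C2
after_c1 = after_c2 + 4       # four P parity symbols added by C1

print(len(symbols), after_c2, after_c1)
```

Only 24 of the 32 symbols in each frame carry audio, which is where the factor 32/24 in the bit rate after CIRC encoding comes from.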

The bit rate at the output of the CIRC encoder is:

1.4112 × 10^6 × 32/24 = 1.8816 × 10^6 bit s^-1

The control word

One 8-bit control word is added to every 32-symbol block of data from the encoder. The compact disc standard defines eight additional channels of information or subcodes that can be added to the music information; these subcodes are called P, Q, R, S, T, U, V and W. At the time of writing, only the P and Q subcodes are commonly used:

• The P subcode is a simple music track separator flag that is normally 0 (during music and in the lead-in track), but is 1 at the start of each selection. It can be used for simple search systems. In the lead-out track, it switches between 0 and 1 in a 2 Hz rhythm to indicate the end of the disc.

• The Q subcode is used for more sophisticated control purposes; it contains data such as track number and time.

The other subcodes carry information relating to possible enhancements, such as text and graphics, but will not be discussed here.

FIG. 3 How one bit of each of the eight subcode words is present in every frame of information. A total of 98 frames must therefore be read to read all eight subcode words.

Each subcode word is 98 bits long and, as each bit of the control word belongs to a different subcode (i.e., P, Q, R, S, T, U, V, W), a total of 98 complete data blocks or frames must be read from the disc to read each subcode word. This is illustrated in FIG. 3.
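Collecting the subcode words from 98 consecutive control words amounts to a bit-wise demultiplexing. The Python sketch below is illustrative only: `demux_subcode` is a hypothetical name, and it assumes that P occupies the most significant bit of the control word and W the least significant.

```python
CHANNELS = "PQRSTUVW"

def demux_subcode(control_words):
    """Turn 98 control words (one per frame) into eight 98-bit subcode words.

    Assumption: bit 7 of each control word belongs to P, bit 0 to W.
    """
    if len(control_words) != 98:
        raise ValueError("a subcode block needs exactly 98 control words")
    return {
        name: [(word >> (7 - k)) & 1 for word in control_words]
        for k, name in enumerate(CHANNELS)
    }

# Example: a block in which only the Q bit is set in every frame.
block = [0x40] * 98
words = demux_subcode(block)
print(sum(words["Q"]), sum(words["P"]))
```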

After addition of the control word, the new data rate becomes:

1.8816 × 10^6 × 33/32 = 1.9404 × 10^6 bit s^-1

FIG. 4 Formats of data in the Q subcode: (a) overall format; (b) mode 1 data format in the lead-in track; (c) mode 1 data format in music and lead-out tracks.

The Q subcode and its usage

FIG. 4a illustrates the structure of the 98-bit Q subcode word. The R, S, T, U, V and W subcode words are similar. The first 2 bits are synchronizing bits, S0 and S1. They are necessary to allow the decoder to distinguish the control word in a block from the audio information, and always contain the same data.

The next 4 bits are control bits, indicating the number of channels and pre-emphasis used, as follows:

• 0000--two audio channels/no pre-emphasis

• 1000--four audio channels/no pre-emphasis

• 0001--two audio channels/with pre-emphasis

• 1001--four audio channels/with pre-emphasis

Four address bits indicate the mode of the data that follow. For the subcode, three modes are defined.

At the end of the subcode word, a 16-bit cyclic redundancy check code (CRCC), calculated on control, address and data information, is inserted for error detection. The CRCC uses the polynomial P(x) = x^16 + x^12 + x^5 + 1.
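The polynomial x^16 + x^12 + x^5 + 1 corresponds to the 16-bit constant 0x1021 (the x^16 term is implicit). A bit-by-bit check calculation over the 80 bits of control, address and data (10 bytes) might look like the Python sketch below. The initial register value of zero and the absence of any final inversion are assumptions made for this example, not statements about what is actually recorded on the disc, and the payload bytes are arbitrary.

```python
def crc16(data, poly=0x1021, init=0x0000):
    """Bitwise CRC with generator x^16 + x^12 + x^5 + 1, MSB first.

    Assumptions: register starts at `init` (zero here), no bit reflection,
    no final inversion.
    """
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

# Arbitrary 10-byte payload standing in for control + address + 72 data bits.
payload = bytes([0x01, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00])
check = crc16(payload)

# Appending the check word makes the remainder zero; any other result
# means the subcode word was read with errors.
remainder = crc16(payload + check.to_bytes(2, "big"))
print(hex(check), remainder)
```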

The three modes of data in Q subcode words are used to carry various information.

Mode 1 (address = 0001)

This is the most important mode, and the only one which is of use during normal playback. At least nine of 20 consecutive subcode words must carry data in mode 1 format. Two different situations are possible, depending on whether the subcode is in the lead-in track or not.

When in the lead-in track, the data are in the format illustrated in FIG. 4b. The 72-bit section comprises nine 8-bit parts:

• TNO--containing information relating to track number: two digits in BCD form (i.e., 2 × 4 bits). It is 00 during lead-in.

• POINT/PMIN/PSEC/PFRAME--containing information relating to the table of contents (TOC). They are repeated three times.

POINT indicates the successive track numbers, while PMIN, PSEC and PFRAME indicate the starting time of that track.

Furthermore, if POINT = A0, PMIN gives the physical track number of the first piece of music (PSEC and PFRAME are zero); if POINT = A1, PMIN indicates the last track on the disc, and if POINT = A2, the starting point of the lead-out track is given in PMIN, PSEC and PFRAME.

Table 2 shows the encoding of the TOC on a disc which contains six pieces of music.

• ZERO--8 bits, all zero.

Table 2 Table of contents (TOC) information, on a compact disc with six pieces of music
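The rules for POINT can be expressed as a small decoder. The following Python sketch is an illustration under assumptions: `bcd` and `read_toc` are hypothetical helper names, and the entries are invented values for a two-track disc, not the contents of Table 2.

```python
def bcd(byte):
    """Decode one byte holding two BCD digits (e.g. 0x42 -> 42)."""
    high, low = byte >> 4, byte & 0x0F
    if high > 9 or low > 9:
        raise ValueError(f"not a BCD byte: {byte:#04x}")
    return high * 10 + low

def read_toc(entries):
    """Build a table of contents from (POINT, PMIN, PSEC, PFRAME) tuples."""
    toc = {"tracks": {}}
    for point, pmin, psec, pframe in entries:
        if point == 0xA0:      # PMIN = first track number
            toc["first_track"] = bcd(pmin)
        elif point == 0xA1:    # PMIN = last track number
            toc["last_track"] = bcd(pmin)
        elif point == 0xA2:    # start of the lead-out track
            toc["lead_out"] = (bcd(pmin), bcd(psec), bcd(pframe))
        else:                  # ordinary track: POINT is the track number
            toc["tracks"][bcd(point)] = (bcd(pmin), bcd(psec), bcd(pframe))
    return toc

# Invented lead-in data for a disc with two pieces of music.
entries = [
    (0x01, 0x00, 0x02, 0x00),
    (0x02, 0x04, 0x15, 0x30),
    (0xA0, 0x01, 0x00, 0x00),
    (0xA1, 0x02, 0x00, 0x00),
    (0xA2, 0x09, 0x40, 0x12),
]
toc = read_toc(entries)
print(toc)
```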

In music and lead-out tracks, data are in the format illustrated in FIG. 4c. The 72-bit section now comprises:

• TNO--current track number: two digits in BCD form (01 to 99).

• POINT--index number within a track: two digits in BCD form (01 to 99).

If POINT = 00 it indicates a pause in between tracks.

• MIN/SEC/FRAME--indicates running time within a track: each part consists of two digits in BCD form. There are 75 frames in a second (00 to 74). Time is counted down during a pause, with a value zero at the end of the pause. During lead-in and lead-out tracks, the time increases.

• AMIN/ASEC/AFRAME--indicates the running time of the disc in the same format as above. At the start of the program area, it is set to zero.

• ZERO--8 bits, all zero.
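Because the time fields count 75 frames per second, converting between a minutes/seconds/frames triple and a plain frame count is simple arithmetic. The Python sketch below shows one way to do it; the function names are hypothetical and the example time is arbitrary.

```python
FRAMES_PER_SECOND = 75

def to_bcd(value):
    """Encode 0..99 as two BCD digits in one byte (e.g. 74 -> 0x74)."""
    if not 0 <= value <= 99:
        raise ValueError("BCD byte holds only 00 to 99")
    return ((value // 10) << 4) | (value % 10)

def msf_to_frames(minutes, seconds, frames):
    """Convert a MIN/SEC/FRAME time to a total count of subcode frames."""
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames

def frames_to_msf(total):
    """Inverse of msf_to_frames."""
    minutes, rest = divmod(total, 60 * FRAMES_PER_SECOND)
    seconds, frames = divmod(rest, FRAMES_PER_SECOND)
    return minutes, seconds, frames

total = msf_to_frames(1, 2, 3)                       # 1 min, 2 s, 3 frames
encoded = bytes(to_bcd(v) for v in frames_to_msf(total))
print(total, encoded.hex())
```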

FIG. 5 shows a timing diagram of P subcode and Q subcode status during complete reading of a disc containing four selections (of which selections three and four fade out and in consecutively without an actual pause).

FIG. 5 Timing diagram of P and Q subcodes.

Mode 2 (address = 0010)

If mode 2 data are present, at least one of 100 successive subcode words must contain them. Mode 2 carries the disc catalogue number and is of importance only to the manufacturer of the disc. The 98-bit Q subcode word in mode 2 is shown in FIG. 6. The structure is similar to that of mode 1, with the following differences:

• N1 to N13--catalogue number of the disc expressed in 13 digits of BCD, according to the UPC/EAN standard for bar coding. The catalogue number is constant for any one disc.

If no catalogue number is present, N1 to N13 are all zero, or mode 2 subcode words may not even appear.

• ZERO--these 12 bits are zero.
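Thirteen BCD digits occupy 52 bits, and with the 12 zero bits they fill exactly 8 bytes. A minimal Python sketch of this packing follows; the function names and the catalogue number are made up for the example, and the digit order (N1 first, high nibble first) is an assumption.

```python
def pack_catalogue(number):
    """Pack a 13-digit catalogue number into 8 bytes: 13 BCD digits + 12 zero bits."""
    if len(number) != 13 or not (number.isascii() and number.isdigit()):
        raise ValueError("catalogue number must be exactly 13 decimal digits")
    nibbles = [int(c) for c in number] + [0, 0, 0]   # three zero nibbles = 12 bits
    return bytes((nibbles[i] << 4) | nibbles[i + 1] for i in range(0, 16, 2))

def unpack_catalogue(field):
    """Recover the 13 digits from the 8-byte field."""
    nibbles = [n for byte in field for n in (byte >> 4, byte & 0x0F)]
    return "".join(str(n) for n in nibbles[:13])

field = pack_catalogue("5012345678900")   # invented number
print(field.hex(), unpack_catalogue(field))
```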

Mode 3 (address = 0011)

Like mode 2 data, if mode 3 is present, at least one of 100 successive subcode words will contain it.

Mode 3 is used to assign each selection a unique number, according to the 12-character International Standard Recording Code (ISRC), defined in DIN-31-621.

If no ISRC number is assigned, mode 3 subcode words are not present. During lead-in and lead-out tracks, mode 3 subcode words are not used, and the ISRC number may change only immediately after the track number (TNO) has changed.

The 98-bit Q subcode word in mode 3 is shown in FIG. 7.

The structure is similar to that of mode 1, with the following differences:

• I1 to I12--the 12 characters of the selection's ISRC number.

Characters I1 and I2 give the code corresponding to country.

Characters I3 to I5 give a code for the owner. Characters I6 and I7 give the year of recording. Characters I8 to I12 give the recording's serial number.

Characters I1 to I5 are coded in a 6-bit format according to Table 3, while characters I6 to I12 are 4-bit BCD numbers.

• 00--these 2 bits are zero.

• ZERO--these 4 bits are zero.

FIG. 6 Q subcode format with mode 2 data.

FIG. 7 Q subcode format with mode 3 data.

Table 3 Format of characters I1 to I5 in the ISRC code

FIG. 8 Timing diagram of EFM encoding and merging bits.

EFM encoding

EFM, or eight-to-fourteen modulation, is a technique which converts each 8-bit symbol into a 14-bit symbol, with the purpose of aiding the recording and playback procedure by reducing required bandwidth, reducing the signal's DC content and adding extra synchronization information. A timing diagram of signals at this stage of CD encoding is given in FIG. 8.

The procedure is to use 14-bit codewords to represent all possible combinations of the 8-bit code. An 8-bit code represents 256 (i.e., 2^8) possible combinations, as shown in Table 4. A 14-bit code, on the other hand, represents 16 384 (i.e., 2^14) different combinations, as shown in Table 5. Of the 16 384 14-bit codewords, only 256 are selected, having combinations which aid processing of the signal.

For instance, by choosing codewords which give low numbers of individual bit inversions (i.e., 1 to 0, or 0 to 1) between consecutive bits, the bandwidth is reduced. Similarly, by choosing codewords with only limited numbers of consecutive bits with the same logic level, overall DC content is reduced.

Table 4 An 8-bit code

Table 5 A 14-bit code

Table 6 Examples of 8-bit to 14-bit encoding

A ROM-based look-up table, say, can then be used to assign all 256 combinations of the 8-bit code to the 256 chosen combinations within the 14-bit code. Some examples are listed in Table 6.
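How few of the 16 384 patterns are suitable can be checked by brute force. The Python sketch below assumes the commonly stated EFM run-length rule, namely that every pair of ones in a codeword is separated by at least two zeros and that no run of zeros is longer than ten. The rule is an assumption of this example, and `satisfies_run_length` is a hypothetical name. The actual assignment of patterns to 8-bit values (Table 6) is not reproduced.

```python
def satisfies_run_length(word, bits=14, min_zeros=2, max_zeros=10):
    """Check one codeword against the assumed run-length rule."""
    pattern = format(word, f"0{bits}b")
    runs = pattern.split("1")        # zero runs: leading, inner..., trailing
    inner = runs[1:-1]               # only runs enclosed by two ones
    if any(len(run) < min_zeros for run in inner):
        return False
    return all(len(run) <= max_zeros for run in runs)

valid = [w for w in range(2 ** 14) if satisfies_run_length(w)]
print(len(valid))
```

Under these assumptions only a few hundred patterns survive, slightly more than the 256 that the look-up table needs.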

In addition to EFM, three extra bits, known as merging bits, are added to each 14-bit symbol, with the purpose of further lowering the DC content of the signal. The exact values of the merging bits depend on the adjacent symbols.

Finally, the data bits are changed from NRZ into NRZI (non return to zero inverted) format, by converting each positive-going pulse of the NRZ signal into a single transition. The resultant signal has a minimum length of 3T (i.e., three clock periods) and a maximum of 11T (i.e., 11 clock periods), as shown in FIG. 9.
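The NRZ-to-NRZI step and the resulting 3T and 11T limits can be reproduced with a short Python sketch. The bit sequence is constructed for the example (it is not taken from the EFM table), and the function names are hypothetical.

```python
def nrzi(bits, level=0):
    """Each 1 in the input toggles the output level; a 0 leaves it unchanged."""
    out = []
    for bit in bits:
        if bit:
            level ^= 1
        out.append(level)
    return out

def run_lengths(levels):
    """Lengths, in clock periods T, of the constant stretches in a waveform."""
    runs, count = [], 1
    for previous, current in zip(levels, levels[1:]):
        if current == previous:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return runs

# Ones separated by 2 zeros and by 10 zeros: the shortest and longest gaps.
bits = [1, 0, 0, 1] + [0] * 10 + [1, 0, 0]
runs = run_lengths(nrzi(bits))
print(runs)
```

Two zeros between ones give a 3T stretch and ten zeros give an 11T stretch, which matches the minimum and maximum lengths quoted above.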

The bit rate is now:

1.9404 × 10^6 × 17/8 = 4.12335 × 10^6 bit s^-1

The sync word

To the signal, comprising 33 symbols of 17 bits (i.e., a total of 561 bits), a 24-bit sync word and its three merging bits are added, giving 588 bits in total (FIG. 10). Sync words have two main functions: (1) they indicate the start of each frame; (2) the sync word frequency is used to control the player's motor speed.

The 588-bit-long signal block is known as an information frame.
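The make-up of the 588-bit information frame can be tallied as follows. This Python sketch simply restates the counts given in the text; the 24-bit sync word length is the value implied by 588 - 561 - 3, and the variable names are chosen for the example.

```python
AUDIO_SYMBOLS = 24      # six 16-bit samples per channel, two channels
Q_PARITY = 4            # added by the C2 encoder
P_PARITY = 4            # added by the C1 encoder
CONTROL = 1             # subcode control word
EFM_BITS = 14           # channel bits per 8-bit symbol
MERGING_BITS = 3
SYNC_BITS = 24          # implied by 588 - 561 - 3

symbols = AUDIO_SYMBOLS + Q_PARITY + P_PARITY + CONTROL
symbol_bits = symbols * (EFM_BITS + MERGING_BITS)
frame_bits = symbol_bits + SYNC_BITS + MERGING_BITS

print(symbols, symbol_bits, frame_bits)
```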

Final bit rate

The final bit rate, recorded on the CD, consequently becomes:

4.12335 × 10^6 × 588/561 = 4.3218 × 10^6 bit s^-1
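The successive bit rates quoted in this section form a chain in which each stage multiplies the previous rate by the ratio of output to input bits. The Python sketch below recomputes the chain with exact fractions to avoid rounding; the variable names are chosen for the example.

```python
from fractions import Fraction

audio = Fraction(44_100 * 16 * 2)                 # two 16-bit channels at 44.1 kHz
after_circ = audio * Fraction(32, 24)             # 24 symbols in, 32 out
after_control = after_circ * Fraction(33, 32)     # one control word per 32 symbols
after_efm = after_control * Fraction(17, 8)       # 14 EFM bits + 3 merging bits per 8
final = after_efm * Fraction(588, 561)            # sync word and its merging bits

for name, rate in [("audio", audio), ("CIRC", after_circ),
                   ("control word", after_control), ("EFM", after_efm),
                   ("final", final)]:
    print(f"{name:>12}: {float(rate) / 1e6:.5f} x 10^6 bit/s")
```

Only about a third of the recorded channel bits therefore represent the original audio samples.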

FIG. 9 Minimum and maximum pit length.

FIG. 10 Adding the sync word.

The frame frequency f_frame is:

f_frame = 4.3218 × 10^6 / 588 = 7350 Hz

And, as subcodes are in blocks of 98 frames, the subcode frequency f_subcode is:

f_subcode = 7350 / 98 = 75 Hz

Playing time is calculated by counting blocks of subcode (i.e., 75 blocks = 1 second). A 60-minute-long CD consequently contains:

60 × 60 × 7350 = 26 460 000 frames

As each frame comprises 33 × 8 = 264 bits of information, a 1-hour-long CD actually contains 6 985 440 000 bits of information, or 873 180 000 bytes! Of this, the subcode area (one 8-bit control word per frame) accounts for 26 460 000 bytes, i.e., some 26.5 Mbytes (about 212 Mbits). This gigantic data storage capacity of the CD medium is also used for more general purposes on the CD-ROM (compact disc read-only memory), which is derived directly from the audio CD.
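The capacity figures follow directly from the frame rate and can be recomputed as below. This Python sketch uses only numbers given in the text (7350 frames per second, 98 frames per subcode block, 33 symbols per frame, of which 24 are audio and one is the control word); the variable names are chosen for the example.

```python
FRAME_RATE = 7350                 # frames per second
FRAMES_PER_SUBCODE_BLOCK = 98
SYMBOLS_PER_FRAME = 33            # 24 audio + 8 parity + 1 control
AUDIO_SYMBOLS_PER_FRAME = 24

subcode_block_rate = FRAME_RATE // FRAMES_PER_SUBCODE_BLOCK   # blocks per second
frames_per_hour = 60 * 60 * FRAME_RATE

info_bits = frames_per_hour * SYMBOLS_PER_FRAME * 8
info_bytes = info_bits // 8
subcode_bytes = frames_per_hour * 1            # one control word per frame
audio_bytes = frames_per_hour * AUDIO_SYMBOLS_PER_FRAME

print(subcode_block_rate, frames_per_hour, info_bytes, subcode_bytes, audio_bytes)
```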
