Digital Audio Recording Systems: Digital audio tape (DAT) format


Digital audio processors that use conventional video recorders to store high-quality audio information have been developed and used for many years, but it was inevitable that a dedicated tape mechanism would be required to do the job in a more compact way.

Two main formats have been specified. The first format, known as rotary-head digital audio tape (R-DAT), is based on the same rotary-head principle as a video recorder, and so has the same limitations in portability. The second format, known as stationary-head digital audio tape (S-DAT), was developed under the name DCC, mentioned in the short history section.

R-DAT

One important difference between standard video recorder and R-DAT techniques is that in a video recorder the recorded signal is continuous; two heads on the drum make contact with the tape for 180° each (i.e., the system is said to have a 180° wrap angle, as shown in FIG. 2a), or 221° each (a 221° wrap angle, as in FIG. 2b). In the R-DAT system the digital audio signal is time-compressed, so the heads need to make contact with the tape for only a smaller proportion of the time (actually 50%), and a smaller wrap angle may be used (90°, as shown in FIG. 2c).

FIG. 1 DAT mechanism.

FIG. 2

FIG. 3 Simplified R-DAT track pattern.

Table 1

FIG. 4 Overwrite recording is used to ensure each track is as narrow as possible and no guard-band is required.

FIG. 5 R-DAT tape track format.

This means only a short length of tape is in contact with the drum at any one time. Tape damage is consequently reduced, and only a low tape tension is necessary with resultant increase in head life.
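The link between wrap angle and head contact can be checked with a little arithmetic. The sketch below is illustrative only: it assumes two heads on the drum, borrows the 30 mm drum diameter quoted later in this section, ignores the helix angle, and uses function names invented for this example.

```python
import math

def contact_fraction(wrap_angle_deg: float, heads: int = 2) -> float:
    """Fraction of a drum revolution during which at least one head touches the tape."""
    return min(1.0, heads * wrap_angle_deg / 360.0)

def wrapped_tape_length_mm(wrap_angle_deg: float, drum_diameter_mm: float = 30.0) -> float:
    """Arc length of tape wrapped around the drum (helix angle ignored)."""
    return math.pi * drum_diameter_mm * wrap_angle_deg / 360.0

# Video-style wraps keep a head on the tape all the time; the 90 degree wrap does not.
for angle in (180.0, 221.0, 90.0):
    print(f"{angle:5.0f} deg wrap -> contact fraction {contact_fraction(angle):.2f}")

# Wrapped length for the R-DAT case only (30 mm drum, 90 degree wrap).
print(f"tape on drum: {wrapped_tape_length_mm(90.0):.2f} mm")
```

With a 90° wrap the two heads together are in contact for half of each revolution, which matches the 50% figure above, and the wrapped arc of roughly 23.6 mm is close to the 23.5 mm track length quoted in this section.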

The R-DAT standard specifies three sampling frequencies:

• 48 kHz; this frequency is mandatory and is used for recording and playback.

• 44.1 kHz; this frequency, which is the same as for CD, is used for playback of pre-recorded tapes only.

• 32 kHz; this frequency is optional and three modes are provided.

32 kHz has been selected as it corresponds with the broadcast standard.

Quantization:

• A 16-bit linear quantization is the standard for all three sampling rates.

• A 12-bit non-linear quantization is provided for special applications such as long play mode at reduced drum speed, 1000 rpm (mode III) and U-channel applications.

FIG. 3 shows a simplified R-DAT track pattern.

The standard track width is 13.591 µm, the track length is 23.5 mm and the linear tape speed is 8.1 mm s^-1. By comparison, the tape speed of the analog compact cassette (TM) is 47.6 mm s^-1. The R-DAT figures result in a packing density of 114 Mbit s^-1 m^-2 (see Table 1).

Table 2

The R-DAT format specifies a track width of only 13.6 µm, but the head width is about 1.5 times this value, around 20 µm. A procedure known as overwrite recording is used, where one head partially records over the track recorded by the previous head, as illustrated in FIG. 4. This means that as much tape as possible is used; rotating-head recorders without this overwrite record facility must leave a guard-band between each track on the tape. Because of this, recorders using overwrite recording techniques are sometimes known as guard-bandless. To prevent crosstalk on playback (as each head is wide enough to pick up all of its own track and half of the next), the heads are set at azimuth angles of ±20°. This enables, as will be explained later, automatic track following (ATF).

These overwrite record and head azimuth techniques are fairly standard approaches to rotating-head video recording, and are used specifically to increase the recording density.

FIG. 5 shows the R-DAT track format on the tape, while Table 2 shows the track contents. Table 2 lists each part of a track and gives the recording angle, recording period and number of blocks allocated to each part. The frequencies of those parts that are not in digital-data form are also listed.

As specified in the standard, a head drum of 30 mm diameter is used, rotating at a speed of 2000 rpm. However, in future applications smaller drums with appropriate speeds can be used.

At this size and speed, the drum has a resistance to external disturbances similar to that of a gyroscope.

Under these conditions, the 2.46 Mbit s^-1 signal to be recorded, which includes audio as well as many other types of data, is compressed by a factor of 3 and processed at 7.5 Mbit s^-1. This enables the signal to be recorded continuously.
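The drum timing and the time compression can be worked through numerically. This is a minimal sketch using only the figures quoted in the text (2000 rpm, 90° wrap, 2.46 Mbit s^-1, factor 3); the variable names are chosen for this example.

```python
DRUM_RPM = 2000            # drum speed from the standard
WRAP_ANGLE_DEG = 90        # R-DAT wrap angle
SOURCE_RATE_MBIT = 2.46    # signal rate before time compression, Mbit/s
COMPRESSION = 3            # compression factor quoted in the text

rev_period_ms = 60_000 / DRUM_RPM                     # duration of one drum revolution
head_pass_ms = rev_period_ms * WRAP_ANGLE_DEG / 360   # time one head spends on the tape
burst_rate_mbit = SOURCE_RATE_MBIT * COMPRESSION      # rate while a head is writing

print(f"revolution: {rev_period_ms} ms, head pass: {head_pass_ms} ms")
print(f"compressed rate: {burst_rate_mbit:.2f} Mbit/s")
```

The product comes to 7.38 Mbit s^-1, slightly below the quoted 7.5 Mbit s^-1, so one or both of the quoted figures are presumably rounded.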

In order to overcome the well-known low-frequency problems of coupling transformers in the record/playback head, an 8/10 modulation channel code converts the 8-bit signals to 10-bit signals.

This channel coding also gives the benefit of reducing the range of wavelengths to be recorded. The resultant maximum wavelength is only four times the minimum wavelength. This allows overwriting, eliminating the need for a separate erase head.

The track outline is given in FIG. 5. Each helical track is divided into seven areas, separated by inter-block gaps. As can be seen, each track has one PCM area, which contains the modulated digital information (audio data and error codes) and is 128 blocks of 288 bits long. Table 2 lists all parts of a track.

The PCM area is separated from the other areas by an inter-block gap (IBG), three blocks long. An ATF area, five blocks long, is inserted on each side of the PCM area.

Table 3 PCM area format

FIG. 6 PCM and subcode data blocks.

Table 4 Bit assignment of ID codes

FIG. 7 Subcode data blocks.

Again, an IBG is inserted at both ends of the track, separating the ATF areas from the sub-1 and sub-2 areas (subcode areas), each eight blocks long. These subareas contain all the information on time code, tape contents, etc.

Finally, a margin area, 11 blocks long, is inserted at each end of the track to cover tolerances in the tape mechanism and head position.

A single track comprises 196 blocks of data, of which the major part is made up of 128 blocks of PCM data. Other important parts are the subcode blocks (sub-1 and sub-2, containing system data, similar to the CD subcode data), automatic track following (ATF) signals (to allow high-speed search) and the IBGs around the ATF signals (which mean that the PCM and subcode information can be overwritten independently without interference to surrounding areas). Parts are recorded successively along the track.

The PCM area format is shown in Table 3. PCM and subcode parts comprise similar data blocks, shown in FIG. 6. Each block is 288 bits long.
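The block counts above allow a rough cross-check of the recording rate. The sketch below is a back-of-envelope calculation, not part of the standard: it assumes that all 196 blocks of a track are laid down during one 90° head sweep at 2000 rpm, and that every block, whatever its content, occupies the equivalent of 288 bits.

```python
BLOCKS_PER_TRACK = 196     # total blocks in one track
PCM_BLOCKS = 128           # blocks in the PCM area
BITS_PER_BLOCK = 288       # block length in bits
DRUM_RPM = 2000
WRAP_ANGLE_DEG = 90

track_bits = BLOCKS_PER_TRACK * BITS_PER_BLOCK        # bits in one complete track
pcm_bits = PCM_BLOCKS * BITS_PER_BLOCK                # bits in the PCM area only
head_pass_s = (60 / DRUM_RPM) * WRAP_ANGLE_DEG / 360  # ASSUMED: one track per 90 degree sweep
burst_rate_mbit = track_bits / head_pass_s / 1e6      # rate while the head is on the tape

print(f"track: {track_bits} bits, PCM area: {pcm_bits} bits")
print(f"PCM share of track: {PCM_BLOCKS / BLOCKS_PER_TRACK:.1%}")
print(f"burst rate: {burst_rate_mbit:.3f} Mbit/s")
```

Under these assumptions the burst rate comes out at about 7.53 Mbit s^-1, in line with the 7.5 Mbit s^-1 processing rate quoted earlier.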

Each block comprises eight synchronization bits, the identification word (W1, 8 bits), the block address word (W2, 8 bits), an 8-bit parity word and 256 bits (32 × 8-bit symbols) of data. The ID code W1 contains control signals related to the main data. Table 4 shows the bit assignment of the ID codes. W2 contains the block address. The most significant bit (MSB) of the W2 word defines whether the data block is of PCM or subcode form. Where the MSB is zero, the block consists of PCM audio data, and the remainder of word W2, i.e., 7 bits, gives the block address within the track. The 7 bits therefore identify the absolute block address (as 2^7 is 128).

On the other hand, when the MSB of word W2 is 1, the block is of subcode form and data bits in the word are as shown in FIG. 7, where a further 3 bits are used to extend the W1 word subcode identity code, and the four least significant bits give the block address.

The P-word, block parity, is used to check the validity of the W1 and W2 words and is calculated as follows:

P = W1 ⊕ W2, where ⊕ signifies modulo-2 addition, as explained in Appendix 1.
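The header rules just described (MSB of W2, address width, block parity) can be sketched in code. This is an illustrative decoder only: the `BlockHeader` type and `parse_header` function are hypothetical names, and the meaning of the individual W1 bits (Table 4) is not interpreted.

```python
from dataclasses import dataclass

@dataclass
class BlockHeader:
    w1: int               # ID code, 8 bits (contents not interpreted here)
    w2: int               # block address word, 8 bits
    is_subcode: bool      # True when the MSB of W2 is 1
    block_address: int    # 7 bits for PCM blocks, 4 bits for subcode blocks
    parity_ok: bool       # True when P equals W1 xor W2

def parse_header(w1: int, w2: int, p: int) -> BlockHeader:
    """Decode the W1, W2 and P bytes that follow the sync pattern of a block."""
    for name, value in (("W1", w1), ("W2", w2), ("P", p)):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{name} must be an 8-bit value")
    is_subcode = bool(w2 & 0x80)                          # MSB: 0 = PCM, 1 = subcode
    block_address = w2 & (0x0F if is_subcode else 0x7F)   # 4 LSBs or 7 LSBs
    parity_ok = (w1 ^ w2) == p                            # modulo-2 addition is bitwise XOR
    return BlockHeader(w1, w2, is_subcode, block_address, parity_ok)

pcm = parse_header(0x12, 0x45, 0x12 ^ 0x45)
sub = parse_header(0x00, 0x8A, 0x8A)
print(pcm.is_subcode, pcm.block_address, pcm.parity_ok)
print(sub.is_subcode, sub.block_address, sub.parity_ok)
```

Because modulo-2 addition of two bytes is a bitwise exclusive-OR, a single corrupted bit in W1, W2 or P makes the check fail, which is what allows a player to reject a damaged header before using its address.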

FIG. 8 ATF signal frequencies.

Automatic track following

In the R-DAT system, no control track is provided. In order to obtain correct tracking during playback, a unique ATF signal is recorded along with the digital data.

The ATF track pattern is illustrated in FIG. 8. One data frame is completed in two tracks and one ATF pattern is completed in two frames (four tracks). Each frame has an A and a B track. A tracks are recorded by the head with +20° azimuth and B tracks are recorded by the head with -20° azimuth.

The ATF signal pattern is repeated over subsequent groups of four tracks. The frequencies of the ATF signals are listed in FIG. 8. The key to the operation lies in the fact that different frames hold different combinations and lengths. Furthermore, the ATF operation is based upon the use of the crosstalk signals, picked up by the wide head, which is 1.5 times the track width, and the azimuth recording. This method is called the area divided ATF.

As shown in FIG. 8, the ATF uses a pilot signal, f1; sync signal 1, f2; sync signal 2, f3; and an erase signal, f4.

When the head passes along the track in the direction of the arrow (V-head) and detects an f2 or f3 signal, the six adjacent pilot signals f1 on both sides are immediately compared, which results in a correction of the tracking when necessary.

The f2 and f3 signals thus act as sync signals to start the ATF servo operation.

The pilot signal f1 is a low-frequency signal (130.67 kHz). A low frequency is used because low-frequency signals are not affected by the azimuth setting, so crosstalk can be picked up and detected from both sides. The pilot signal f1 is positioned so that it does not overlap as the head scans across three successive tracks.
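The comparison of pilot crosstalk from the two neighboring tracks can be pictured as a simple difference signal. The function below is a hypothetical sketch of that idea, not the servo circuit of any actual recorder: the name `tracking_error`, the normalization and the sign convention are all assumptions made for illustration.

```python
def tracking_error(pilot_prev: float, pilot_next: float) -> float:
    """Normalized difference between the f1 crosstalk amplitudes picked up
    from the previous and the next track.

    Returns 0.0 when the head is centered (equal crosstalk from both sides).
    The sign shows which neighbor the head has drifted toward; the sign
    convention here is arbitrary.
    """
    total = pilot_prev + pilot_next
    if total <= 0:
        return 0.0          # no pilot detected, so no correction can be derived
    return (pilot_next - pilot_prev) / total

print(tracking_error(1.0, 1.0))   # centered head
print(tracking_error(0.5, 1.5))   # head displaced toward the next track
```

A servo would sample the two amplitudes when a sync signal (f2 or f3) is detected and drive the error toward zero.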

FIG. 9 ECC interleaving format.

Error correction

As with any digital recording format, the error-detection and error-correction scheme is very important. It must detect and correct errors in the digital audio data, as well as in subcodes, ID codes and other auxiliary data.

Types of errors that must be corrected are burst errors (dropouts caused by dust, scratches and head clogging) and random errors (caused by crosstalk from an adjacent track, traces of an imperfectly erased or overwritten signal, or mechanical instability).

Error-correction strategy

In common with other digital audio systems, R-DAT uses a significant amount of error-correction coding to allow error-free replay of recorded information. The error-correction code used is a double-encoded Reed-Solomon code.

FIG. 10 Data allocation.

These two Reed-Solomon codes produce C1 (32, 28) and C2 (32, 26) parity symbols, which are calculated on GF(2^8) by the polynomial:

[...]

C1 is interleaved on two neighboring blocks, while C2 is interleaved on one entire track of PCM data every four blocks (see FIG. 9 for the interleaving format).

In order to perform C1 and C2 decoding/encoding, one track's worth of data must be stored in memory.

One track contains 128 blocks consisting of 4096 (32 × 128) symbols. Of these, 1184 symbols (512 symbols C1 parity and 672 symbols C2 parity) are used for error correction, leaving 2912 data symbols (23,296 bits).

In fact, C1 encoding adds four symbols of parity to the 28 data symbols C1 (32, 28), while C2 encoding adds six symbols of parity to every 26 PCM data symbols C2 (32, 26).
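The symbol counts quoted above can be reproduced from the code parameters. The sketch below is one way of arriving at the same figures, under stated assumptions: it treats the track as 128 C1 codewords of 32 symbols, groups the remaining symbols into C2 codewords of 32, and applies the standard Reed-Solomon limits for a code with n - k parity symbols. The final payload check further assumes that one frame (two tracks) is written per drum revolution at 2000 rpm.

```python
BLOCKS_PER_TRACK = 128
SYMBOLS_PER_BLOCK = 32

def rs_capability(n: int, k: int) -> dict:
    """Standard limits of an (n, k) Reed-Solomon code with n - k parity symbols."""
    parity = n - k
    return {"parity": parity, "max_errors": parity // 2, "max_erasures": parity}

c1 = rs_capability(32, 28)   # 4 parity symbols per codeword
c2 = rs_capability(32, 26)   # 6 parity symbols per codeword

total_symbols = BLOCKS_PER_TRACK * SYMBOLS_PER_BLOCK      # 4096
c1_codewords = total_symbols // 32                        # 128
c1_parity = c1_codewords * c1["parity"]                   # 512
c2_codewords = (total_symbols - c1_parity) // 32          # 112
c2_parity = c2_codewords * c2["parity"]                   # 672
data_symbols = total_symbols - c1_parity - c2_parity      # 2912

# Payload check for 48 kHz, 16-bit, two-channel audio.
# ASSUMPTION: one frame (two tracks) per drum revolution at 2000 rpm.
samples_per_frame = 48_000 * 60 // 2000                   # samples per channel per frame
audio_bytes_per_frame = samples_per_frame * 2 * 2         # 2 channels x 2 bytes each
frame_capacity = 2 * data_symbols                         # data symbols in two tracks

print(c1, c2)
print(total_symbols, c1_parity, c2_parity, data_symbols)
print(audio_bytes_per_frame, frame_capacity)
```

On these assumptions a frame needs 5760 bytes of audio and offers 5824 data symbols, so the 48 kHz stereo signal fits with a small margin. The C1 code can correct up to two symbol errors (or four erasures) per codeword and C2 up to three (or six erasures).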

The main data allocation is shown in FIG. 10.

This double-Reed-Solomon code gives the format a powerful correction capability for random errors.

PCM data interleave

In order to cope with burst errors (e.g., head clogging, tape dropouts, etc.), PCM data are interleaved over two tracks, together called one frame, effectively turning burst errors into random errors which are correctable using the Reed-Solomon technique already described.

To interleave the PCM data, the contents of two tracks have first to be processed in a memory. The memory size required for one PCM interleave block is: (128 × 32) symbols × 8 bits × 2 tracks = 65,536 bits, which means a 128 kbit memory is required.
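The memory requirement is straightforward to verify. In the sketch below the first figure follows directly from the numbers in the text; the double-buffered figure rests on an assumption that is not stated in the source, namely that one frame is written into memory while the previous one is read out.

```python
BLOCKS_PER_TRACK = 128
SYMBOLS_PER_BLOCK = 32
BITS_PER_SYMBOL = 8
TRACKS_PER_FRAME = 2

frame_bits = BLOCKS_PER_TRACK * SYMBOLS_PER_BLOCK * BITS_PER_SYMBOL * TRACKS_PER_FRAME
frame_kbit = frame_bits // 1024          # size of one interleave block in kbit (1 kbit = 1024 bits)

# ASSUMPTION (not from the text): two buffers, one filling while the other empties.
double_buffer_kbit = 2 * frame_kbit

print(frame_bits, frame_kbit, double_buffer_kbit)
```

One interleave block therefore occupies 64 kbit, and a two-buffer arrangement would occupy 128 kbit.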

The symbols are interleaved, based on the following method, according to the respective number of the audio data symbol.

The interleaving format depends on whether a 16-bit or 12-bit quantization is used. The interleave format discussed here is for 16-bit quantization, the most important format.

FIG. 11 PCM data interleave format.

One 16-bit audio data word indicated as Ai or Bi is converted to two audio data symbols each consisting of 8 bits. The audio data symbol converted from the upper 8 bits of Ai or Bi is expressed as Aiu or Biu. The audio data symbol converted from the lower 8 bits of Ai or Bi is expressed as Ail or Bil.

Note: A stands for the left channel, B for the right channel.

If the audio data symbol is equal to Aiu or Ail, let a = 0.

If the audio data symbol is equal to Biu or Bil, let a = 1.

If the audio data symbol is equal to Aiu or Biu, let u = 0.

If the audio data symbol is equal to Ail or Bil, let u = 1.
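The symbol naming and the a and u flags can be illustrated with a short sketch. It covers only the splitting of 16-bit words into labeled 8-bit symbols, not the placement of those symbols on the two tracks; the `Symbol` type and `split_sample_pair` function are names invented for this example.

```python
from typing import NamedTuple

class Symbol(NamedTuple):
    value: int   # 8-bit audio data symbol
    a: int       # 0 = channel A (left), 1 = channel B (right)
    u: int       # 0 = upper 8 bits, 1 = lower 8 bits
    i: int       # index of the audio data word

def split_sample_pair(i: int, a_word: int, b_word: int) -> list[Symbol]:
    """Split the words Ai and Bi into the symbols Aiu, Ail, Biu and Bil."""
    symbols = []
    for a, word in enumerate((a_word, b_word)):
        word &= 0xFFFF                                 # keep 16 bits (handles negative samples)
        symbols.append(Symbol(word >> 8, a, 0, i))     # upper symbol: u = 0
        symbols.append(Symbol(word & 0xFF, a, 1, i))   # lower symbol: u = 1
    return symbols

for s in split_sample_pair(0, 0x1234, 0xABCD):
    print(f"value=0x{s.value:02X} a={s.a} u={s.u} i={s.i}")
```

Each pair of samples thus yields four symbols, and the triple (i, a, u) identifies every symbol uniquely within the interleave block.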

Tables 5a and 5b show an example of the data assignment for the two tracks (+ azimuth and - azimuth, respectively), for 16-bit sampled data words.

FIG. 12

FIG. 13 Subcode area format.

FIG. 14 Pack format.

Subcode

The subcode data capacity is about four times that of a CD and various applications will be available in the future. A subcode format which is essentially the same as the CD subcode format is currently specified for pre-recorded tapes.

The most important control bits, such as the sampling frequency bit and copy inhibit bit, are recorded in the PCM-ID area, so it is impossible to change these bits without rewriting the PCM data. As the PCM data are protected by the main error-correction process, subcodes requiring a high reliability are usefully stored here.

Data to allow fast accessing, program number, time code, etc. are recorded in subcode areas (sub-1 and sub-2) which are located at both ends of the helical tracks. These subcode areas are identical. FIG. 12 illustrates the sub-1 and sub-2 areas, along with the PCM area containing subcode information.

An example of the subcode area format is shown in FIG. 13.

Data are recorded in a pack format.

FIG. 14 shows the pack format, and the pack item codes are listed in Table 6. All the CD-Q channel subcodes are available to be used.

Each pack block comprises an item code of 4 bits, indicating what information is stored in the pack data area. The item code 0100 indicates that the related pack data is a table of contents (TOC) pack. This TOC is recorded repeatedly throughout the tape, in order to allow high-speed access and search (at 200 times normal speed). Every subcode data-block is controlled by an 8-bit C1 parity word allowing appropriate control of data validity.
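Reading the item code from a pack can be sketched as follows. The position of the item code is an assumption here, since the pack layout of FIG. 14 is not reproduced: the code is taken to occupy the upper four bits of the first byte of the pack. The function names are likewise hypothetical.

```python
ITEM_TOC = 0b0100   # item code given in the text for a table-of-contents pack

def item_code(pack: bytes) -> int:
    """Return the 4-bit item code.

    ASSUMPTION: the item code sits in the upper four bits of the first byte.
    """
    if not pack:
        raise ValueError("empty pack")
    return pack[0] >> 4

def is_toc_pack(pack: bytes) -> bool:
    """True when the pack carries table-of-contents data."""
    return item_code(pack) == ITEM_TOC

print(is_toc_pack(bytes([0x40, 0x00, 0x00])))   # first nibble 0100
print(is_toc_pack(bytes([0x10, 0x00, 0x00])))   # first nibble 0001
```

A search routine running at high speed would only need to recognize the item code to decide whether a pack is worth decoding further.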

Subcode data in the subarea can be rewritten or modified independently from the PCM data.

FIG. 15 shows an example of subcode information for pre-recorded tape. The figure shows the use of different codes and pack data on a tape, such as program time, absolute time, program number, etc.

Table 5a

Table 5b

FIG. 15

Table 6

FIG. 16

FIG. 17

Tape duplication

High-speed duplication of R-DAT tapes can be done by using the magnetic contact printing technique. In this method a master tape of the mirror type is produced on a master tape recorder (FIG. 16, 1).

The magnetic surfaces of the master tape and the copy tape are mounted in contact with each other on a printing machine, as shown in FIG. 16 (2).

By controlling the pressure of both tapes between pinch drum and bias head, the magnetizing process is performed, applying a magnetic bias to the contact area. Special tape and a special bias head are required (see Figures 17 and 18).

FIG. 18

FIG. 19a

FIG. 19b-c

Cassette

The cassette is a completely sealed structure and measures 73 mm × 54 mm × 10.5 mm. It weighs about 20 g (see Figures 19a-c).
