3.5. Technical appendix: digital audio

This section describes digitisation in more technical detail. It may be skipped if all you need is a basic overview.

3.5.1. Analogue and digital sound

When a sound is recorded onto a physical medium such as magnetic tape, it is first converted into an electrical signal by a microphone. This electrical signal is in turn converted into magnetism, which is used to magnetise a metal coating on the tape. This is an analogue process: the magnetism fluctuates continuously in sympathy with the fluctuations in air pressure which constitute the original sound. The tape contains a magnetic imprint, an analogous copy, of the sound. On playback, the magnetic fluctuations caused by the tape movement are converted back into electrical signals, and then ultimately back into sound by a loudspeaker.

A record player works in a similar way, except that here the electrical fluctuations are used to cut a groove in a disc. On playback, the groove creates a physical movement of the stylus, which is converted back into electrical signals and then into sound as before. In the case of early gramophone and phonograph recording, the sound pressure waves were converted directly into physical movement in order to make an imprint on the medium. No electrical signals were involved at all.

Digital audio works differently. It still requires the air pressure to be converted into an electrical signal, so it relies on some of the same technology (microphones and loudspeakers in particular). But what happens to the signal subsequently is different. In digital audio, very rapid measurements are made of the strength of the electrical signal. Each measurement corresponds to the amplitude of the sound wave at that instant. These measurements are made by a device called an analogue to digital converter, abbreviated to ADC, which produces a numerical value for each measurement. This process is called sampling and the measurements are called samples.

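The sampling process described above can be illustrated with a minimal sketch. It is not how any particular ADC works internally: the function names `sample_signal` and `tone` are hypothetical, the "electrical signal" is simulated by a 1 Hz sine wave, and the very low rate of 8 samples per second is chosen only so that the result is easy to read.

```python
import math


def sample_signal(signal, sample_rate_hz, duration_s):
    """Measure `signal` (a function of time in seconds) at regular intervals.

    Returns one measurement (sample) per sampling instant.
    """
    n_samples = int(round(sample_rate_hz * duration_s))
    return [signal(n / sample_rate_hz) for n in range(n_samples)]


def tone(t):
    """Hypothetical stand-in for the microphone signal: a 1 Hz sine wave."""
    return math.sin(2 * math.pi * 1.0 * t)


# Take 8 measurements over one second of the simulated signal.
samples = sample_signal(tone, sample_rate_hz=8, duration_s=1.0)
print([round(s, 3) for s in samples])
```

The resulting list of numbers is all that a digital system keeps: the continuous signal between the measurement instants is not stored.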
In most digital audio situations, samples are taken many thousands of times per second and the numbers are stored in electronic 'memory' chips, either inside a computer or in some dedicated device such as a digital recorder. Ultimately the data may be stored on the computer's hard disk or on magnetic tape. In the latter case the tape records magnetically encoded numerical data, not a direct magnetic imprint of the sound as an analogue tape recorder would.

To play back the sound, the process is done in reverse. A digital to analogue converter, or DAC, is used to convert the samples back into a fluctuating electrical signal, which can then be converted into air pressure waves by a loudspeaker.

3.5.2. Resolution and sound quality

The division of the sound into discrete samples means that there is some loss of information. The process fails to capture fluctuations in the signal which happen in the time interval between one sample and the next. Similarly, it fails to capture fluctuations which are smaller than its measurement steps. So the resolution of the sampling process determines how much information about the sound is captured and how much is lost.

Fortunately, our hearing has limited resolution. If sufficient information is present in the sound, we don't perceive any losses: they are there, but they are below the threshold at which our aural system is able to discriminate. So if we use a high enough resolution, we can capture enough sonic information to make it seem that all the sound is there.

The sampling resolution is determined by how finely the process measures time, and how finely it measures signal magnitude. The first aspect is called the sampling rate, and it corresponds to the number of samples taken per second. It is measured in hertz, the unit of frequency (abbreviated to Hz). One hertz corresponds to one sample per second, and one kilohertz (kHz) corresponds to one thousand samples per second.

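As a rough illustration of what a sampling rate means in terms of time, the sketch below converts a rate in hertz into the interval between samples and the number of samples taken over a given duration. The helper names `sample_interval_s` and `samples_needed` are hypothetical, and 44,100 Hz is included simply as a commonly used audio rate.

```python
def sample_interval_s(sample_rate_hz):
    """Time between consecutive samples, in seconds."""
    return 1.0 / sample_rate_hz


def samples_needed(sample_rate_hz, duration_s):
    """Number of samples taken over `duration_s` seconds."""
    return round(sample_rate_hz * duration_s)


for rate_hz in (1, 1_000, 44_100):
    interval_ms = sample_interval_s(rate_hz) * 1000
    per_minute = samples_needed(rate_hz, 60)
    print(f"{rate_hz} Hz: one sample every {interval_ms:.4f} ms, "
          f"{per_minute} samples per minute")
```

Any fluctuation in the signal that begins and ends within one such interval falls between two measurements and is not captured.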
The second aspect is variously called sample resolution, sample width, or bit width. It relates to the number of dynamic levels which the process can measure. A sampling process with low resolution might be able to measure 4 different dynamic levels, from the softest to the loudest. This would be a very limited sampling process. One with a high resolution might be able to measure 100,000 levels or more.

The sample resolution is determined by the number of binary digits (or bits) used to measure and store the data. If more bits are used, more dynamic levels can be measured and stored, because each additional bit doubles the number of levels. If 2 bits are used, this gives 4 levels. If 3 bits are used, this gives 8 levels. If 16 bits are used, this gives 65,536 levels.

The minimum resolution for basic, full-bandwidth reproduction of sound (i.e. where for many purposes the losses can be considered imperceptible) uses a sampling rate of 44.1 kHz and a 16-bit sample width. Many professional recording systems use higher resolutions.
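The relationship between bit width and the number of dynamic levels can be sketched as follows. This is a simplified illustration, not a description of any real converter: the names `level_count` and `quantise` are hypothetical, the signal is assumed to lie between -1.0 and 1.0, and the levels are assumed to be evenly spaced across that range.

```python
def level_count(bits):
    """Number of distinct levels that `bits` binary digits can represent."""
    return 2 ** bits


def quantise(value, bits):
    """Round `value` (assumed to lie in [-1.0, 1.0]) to the nearest level.

    Assumes evenly spaced levels spanning the whole range; real
    converters may lay out their levels differently.
    """
    n = level_count(bits)
    step = 2.0 / (n - 1)                 # distance between adjacent levels
    index = round((value + 1.0) / step)  # nearest level, counted from the bottom
    index = max(0, min(n - 1, index))    # clamp out-of-range input
    return -1.0 + index * step


for bits in (2, 3, 16):
    approx = quantise(0.3, bits)
    print(f"{bits} bits: {level_count(bits)} levels, "
          f"0.3 is stored as {approx:.6f} (error {abs(approx - 0.3):.6f})")
```

With only 2 bits the stored value is visibly different from the original, whereas with 16 bits the error is far smaller, which is the sense in which a greater bit width loses less information.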