Friday, 3 February 2012

Analog to Digital Conversion

The voice frequency (VF) band, or voice band, ranges from approximately 300 Hz to 3.3 kHz. After adding guard bands (a guard band is an unused part of the spectrum between adjacent bands, left empty to prevent interference) of 0-300 Hz and 3.3-4 kHz, the net bandwidth required for voice transmission is 4 kHz. Now we need to convert this analog signal into a digital one, and the concept behind digitizing a sound signal is the Nyquist theorem. Working at Bell Labs, Harry Nyquist discovered that it is not necessary to capture (or send) the entire analog waveform; samples of the wave taken at regular intervals are enough to recreate the original. He also found that, to reconstruct the original waveform without losing information, the sampling rate must be at least twice the highest frequency in the signal.
This concept became the foundation for PCM (Pulse Code Modulation). PCM is fundamental to digital transmission and switching technology. As mentioned above, the highest frequency in the voice channel is 4 kHz (4000 cycles per second). According to the Nyquist criterion, the sampling rate should be at least twice the highest frequency in the signal, so here the sampling rate is 2 × 4000 = 8000 samples per second. Next, the amplitude of each sample is measured against a logarithmic scale. This measurement process is called quantization.
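To see why the sampling rate has to be at least twice the highest frequency, here is a minimal Python sketch. It assumes pure cosine tones and an 8000 samples/sec rate; the function name `sample_cosine` and the 3 kHz / 5 kHz test tones are illustrative choices, not something taken from a standard. A tone above the 4 kHz limit produces exactly the same samples as a tone below it, so the two cannot be told apart after sampling.

```python
import math

SAMPLE_RATE_HZ = 8000  # 2 x 4000 Hz, the rate derived in the text


def sample_cosine(freq_hz, num_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return num_samples samples of a unit-amplitude cosine at freq_hz."""
    return [
        math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(num_samples)
    ]


in_band = sample_cosine(3000, 16)   # below the 4 kHz limit
too_high = sample_cosine(5000, 16)  # above the 4 kHz limit

# Compare the two sample sequences value by value.
indistinguishable = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(in_band, too_high)
)
print(indistinguishable)  # True: the 5 kHz tone looks like a 3 kHz tone
```

This is why the voice signal is limited to 4 kHz before it is sampled at 8000 samples per second.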
There are two PCM algorithms defined within CCITT (now ITU-T) G.711, called "A-Law" and "Mu-Law". Mu-Law PCM is used in North America and Japan, and A-Law is used in most other countries. In both A-Law and Mu-Law PCM, 8-bit codewords are used to represent each sample's amplitude (the voltage of the voice signal at the sampling instant). To attain this 8-bit quantization, a logarithmic scale is used. The y-axis of this scale is divided into 16 segments called chords (8 on the positive side and 8 on the negative side). Within each chord are 16 uniform quantization intervals, or steps. The length of the steps depends on the chord number. For example, chord 1 has 16 steps of length Δ each, and chord 2 has 16 steps of length 2Δ each. In general, the step size in the n-th chord is 2^(n−1)·Δ (as shown in the Figure below).
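The doubling rule for step sizes can be tabulated with a few lines of Python. This is only a sketch of the idealized rule described above, with Δ set to 1 and chords numbered 1 to 8; the names `step_size` and `chord_span` are made up for illustration, and the exact G.711 tables differ in some details.

```python
STEPS_PER_CHORD = 16
NUM_CHORDS = 8  # chords per polarity (positive or negative side)


def step_size(chord, delta=1):
    """Step size in the n-th chord (1-based): 2^(n-1) * delta."""
    return 2 ** (chord - 1) * delta


def chord_span(chord, delta=1):
    """Amplitude range covered by one chord: 16 steps of equal size."""
    return STEPS_PER_CHORD * step_size(chord, delta)


for n in range(1, NUM_CHORDS + 1):
    print(f"chord {n}: step = {step_size(n)} delta, span = {chord_span(n)} delta")

# Total range covered by the 8 chords on one side of zero, in units of delta.
total_span = sum(chord_span(n) for n in range(1, NUM_CHORDS + 1))
print(total_span)  # 4080
```

Note how the last chord alone covers about half of the total range, while the first chord covers the smallest amplitudes with the finest steps.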
The 8 bits of each codeword are arranged as shown below.


 
The left-most bit is known as the "sign" bit or "polarity" bit; it is 1 for positive values and 0 for negative values (in both PCM types). This is the Most Significant Bit (MSB) and is transmitted first. The next three bits indicate the chord value, and the final four bits denote the step value (as shown in Figure 1). Defining an 8-bit pattern for a sample value in this way is called encoding.
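The bit layout just described (1 sign bit, 3 chord bits, 4 step bits) can be sketched in Python as follows. The function names `pack_codeword` and `unpack_codeword` are hypothetical, and the chord field is assumed to be stored as 0 to 7 (so chord n in the text would be stored as n − 1). Real G.711 encoders also invert some of the bits before transmission, which this sketch leaves out.

```python
def pack_codeword(positive, chord, step):
    """Build an 8-bit codeword: sign (1 bit) | chord (3 bits) | step (4 bits)."""
    if not 0 <= chord <= 7:
        raise ValueError("chord must fit in 3 bits (0-7)")
    if not 0 <= step <= 15:
        raise ValueError("step must fit in 4 bits (0-15)")
    sign = 1 if positive else 0  # 1 for positive, 0 for negative
    return (sign << 7) | (chord << 4) | step


def unpack_codeword(codeword):
    """Split an 8-bit codeword back into (positive, chord, step)."""
    positive = bool((codeword >> 7) & 0x1)
    chord = (codeword >> 4) & 0x7
    step = codeword & 0xF
    return positive, chord, step


word = pack_codeword(positive=True, chord=2, step=5)
print(format(word, "08b"))    # 10100101
print(unpack_codeword(word))  # (True, 2, 5)
```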
Why use a logarithmic scale for quantization? Dividing the amplitude range of the voice signal into equal positive and negative steps is not an efficient way to encode voice into PCM, because it does not take advantage of a natural property of the human voice: voices produce low-amplitude signals most of the time (people seldom shout on the telephone). That is, most of the energy in human speech is concentrated at the lower end of its dynamic range. To reproduce voice with the highest quality, the quantization process must take this into account. The voice coder therefore arranges the chords and steps so that most of them lie at the low-amplitude end of the encoding range. In this scheme the step sizes are not all equal: they are smaller for lower-amplitude signals. Distributing the quantization levels (chords) according to a logarithmic function instead of a linear one gives finer resolution, that is, smaller quantization steps, at lower signal amplitudes (as shown in Figure 3). The result is higher-fidelity reproduction of voice.
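The benefit for quiet signals can be checked numerically. The sketch below compares an 8-bit uniform quantizer with an 8-bit quantizer applied after logarithmic compression. It assumes samples normalized to the range −1 to 1 and uses the continuous Mu-Law curve with μ = 255, which the chord-and-step scheme approximates; the function names and the "quiet" and "loud" test ranges are illustrative choices.

```python
import math

MU = 255      # Mu-Law compression parameter
LEVELS = 256  # 8 bits per sample


def compress(x):
    """Continuous Mu-Law compression of x in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y):
    """Inverse of compress()."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


def quantize_uniform(x, levels=LEVELS):
    """Map x in [-1, 1] to the midpoint of one of `levels` equal intervals."""
    step = 2.0 / levels
    index = min(int((x + 1.0) / step), levels - 1)
    return -1.0 + (index + 0.5) * step


def quantize_log(x):
    """Compress, quantize uniformly, then expand back."""
    return expand(quantize_uniform(compress(x)))


def max_error(quantizer, samples):
    """Largest absolute quantization error over the given samples."""
    return max(abs(quantizer(x) - x) for x in samples)


quiet = [i / 10000 for i in range(1, 501)]       # amplitudes 0.0001 to 0.05
loud = [0.8 + i / 10000 for i in range(0, 1001)]  # amplitudes 0.8 to 0.9

print("quiet:", max_error(quantize_uniform, quiet), max_error(quantize_log, quiet))
print("loud: ", max_error(quantize_uniform, loud), max_error(quantize_log, loud))
```

For the quiet samples the logarithmic quantizer should show the smaller worst-case error, and for the loud samples the larger one. That is the trade-off described above: resolution is moved to the low amplitudes where most speech energy lies.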
Finally, since each sample is represented by 8 bits, the number of bits that must be transmitted per second (to recreate the original 4 kHz band signal) is 8000 samples/sec × 8 bits/sample = 64,000 bits per second, or 64 kbps. The minimum bit rate required to carry your voice in digital form with this scheme is therefore 64 kbps.
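The whole calculation fits in a few lines of Python. The function name `pcm_bit_rate` is just an illustrative choice; the numbers are the ones used in this post.

```python
def pcm_bit_rate(highest_freq_hz, bits_per_sample):
    """Bit rate of a PCM stream sampled at the Nyquist rate."""
    sample_rate = 2 * highest_freq_hz  # samples per second
    return sample_rate * bits_per_sample


voice_bit_rate = pcm_bit_rate(highest_freq_hz=4000, bits_per_sample=8)
print(voice_bit_rate)  # 64000 bits per second, i.e. 64 kbps
```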



