The voice frequency (VF) band, or voice band, ranges from approximately 300 Hz to 3.3 kHz. After adding
the guard bands (a guard band is an unused part of the spectrum between adjacent
bands, reserved to prevent interference) of 0-300 Hz and 3.3-4 kHz, the total bandwidth
required for voice transmission is 4 kHz. Next we need to convert
this analog signal into a digital one. The concept behind digitizing a sound
signal is the Nyquist theorem. Working at Bell Labs, Harry Nyquist discovered that
it is not necessary to capture (or send) the entire analog waveform. Samples
of the wave taken at regular intervals are enough to recreate the
original. He also found that, in order to reconstruct the original waveform
without losing information, the sampling rate must be at least
twice the signal bandwidth.
This concept became the foundation
for PCM (Pulse Code Modulation). PCM is fundamental to digital
transmission and switching technology. As mentioned above, the highest
frequency in the voice channel is 4 kHz (4000 cycles per second). According
to the Nyquist criterion, the sampling
rate should be at least twice the highest frequency in the
signal, so here the sampling rate is 8000 samples per second. After this,
the amplitude of each sample is measured on a logarithmic scale. This measurement
process is called quantization.
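The sampling arithmetic above can be written out as a minimal Python sketch. The function and constant names are illustrative assumptions, not part of any standard.

```python
def nyquist_rate(highest_frequency_hz):
    """Minimum sampling rate (samples/sec) for a signal band-limited
    to highest_frequency_hz, per the Nyquist criterion."""
    return 2 * highest_frequency_hz

# Assumption: the 4 kHz voice channel described above (voice band plus guard bands).
VOICE_CHANNEL_HZ = 4000

sampling_rate = nyquist_rate(VOICE_CHANNEL_HZ)
sample_period_us = 1_000_000 / sampling_rate  # time between samples, in microseconds

print(sampling_rate, sample_period_us)  # 8000 125.0
```

In other words, one sample is taken every 125 microseconds.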
There are two PCM algorithms defined within CCITT (now ITU-T) G.711, called "A-Law" and "Mu-Law". Mu-Law PCM is used in North America and Japan, and A-Law is used in most other countries. In both A-Law and Mu-Law PCM, 8-bit codewords are used to represent each sample's amplitude (the voltage of the voice signal at the sampling instant). To attain this 8-bit quantization, a logarithmic scale is used. The y-axis of this scale is divided into 16 segments called chords (8 on the positive side and 8 on the negative side). Within each chord are 16 uniform quantization intervals, or steps. The size of the steps depends on the chord number. For example, chord 1 has 16 steps of size Δ each, and chord 2 has 16 steps of size 2Δ each. In general, the step size in the n-th chord is 2^(n-1)Δ (as shown in the figure below).
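The chord and step-size rule above can be tabulated with a short Python sketch. It covers only the positive side and assumes the chords are laid end to end starting from zero amplitude. The names `chord_table` and `STEPS_PER_CHORD` are hypothetical, and the exact segment edges in G.711 differ slightly from this simplified layout.

```python
STEPS_PER_CHORD = 16

def chord_table(delta=1.0, chords=8):
    """Return (chord number, step size, lower edge, upper edge) per chord,
    using the rule step size = 2^(n-1) * delta for the n-th chord."""
    table = []
    lower = 0.0
    for n in range(1, chords + 1):
        step = 2 ** (n - 1) * delta
        upper = lower + STEPS_PER_CHORD * step
        table.append((n, step, lower, upper))
        lower = upper  # next chord starts where this one ends (assumption)
    return table

for n, step, lower, upper in chord_table():
    print(f"chord {n}: step {step:g}, covers {lower:g} to {upper:g}")
```

With Δ = 1, the eight chords together span 16 × (1 + 2 + ... + 128) = 4080 units. The first chord covers only 16 of them, and the last covers 2048.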
Now the representation of the 8 bits is as shown below.
The
left-most bit is known as the "sign" bit or "polarity" bit,
and is a 1 for positive values and a 0 for negative values (both PCM types).
This is the Most Significant Bit (MSB) and is transmitted first. The next three
bits indicate the chord value, and the final four bits denote the step
value (as shown in Figure 1). This way of defining an 8-bit pattern for a
sample value is called encoding.
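The bit layout just described (1 sign bit, 3 chord bits, 4 step bits) can be illustrated with a small Python sketch. The helper names are hypothetical. The chord is taken as a 0-7 index (chords 1-8 in the text), which is an assumption. The extra bit inversions that G.711 applies before transmission are omitted.

```python
def pack_codeword(positive, chord, step):
    """Build an 8-bit codeword: sign (1 bit) | chord (3 bits) | step (4 bits)."""
    if not 0 <= chord <= 7:
        raise ValueError("chord index must fit in 3 bits (0-7)")
    if not 0 <= step <= 15:
        raise ValueError("step index must fit in 4 bits (0-15)")
    sign = 1 if positive else 0  # 1 = positive, 0 = negative
    return (sign << 7) | (chord << 4) | step

def unpack_codeword(codeword):
    """Split an 8-bit codeword back into (positive, chord, step)."""
    positive = bool((codeword >> 7) & 0b1)
    chord = (codeword >> 4) & 0b111
    step = codeword & 0b1111
    return positive, chord, step

codeword = pack_codeword(True, 3, 5)
print(format(codeword, "08b"))    # 10110101
print(unpack_codeword(codeword))  # (True, 3, 5)
```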
Now why use a
logarithmic scale for the quantization process? Dividing the amplitude of the
voice signal into equal positive and negative steps is not an efficient way to encode voice into PCM, because it does not take advantage
of a natural property of the human voice: voices create low-amplitude
signals most of the time (people seldom shout on the telephone). That is, most
of the energy in the human voice is concentrated in the lower end of the voice's
dynamic range. To create the highest-quality voice reproduction from PCM, the
quantization process must take into account the fact that most voice signals
are of low amplitude. To do this, the voice coder arranges the chords
and steps so that most of them are in the low-amplitude end of the total
encoding range. In this modified scheme, the step sizes are not all equal. Step
sizes are smaller for lower-amplitude signals. Quantization levels (chords)
distributed according to a logarithmic function, instead of a linear function,
give finer resolution, or smaller quantization steps, at lower signal amplitudes
(as shown in Figure 3). Therefore, higher-fidelity reproduction of voice is
achieved.
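The benefit of logarithmic quantization for quiet signals can be checked numerically. The Python sketch below is a hedged illustration, not the G.711 algorithm. It uses the continuous mu-law companding curve with μ = 255 rather than the piecewise chord approximation, and all function names are assumptions. It compares the RMS quantization error of an 8-bit uniform quantizer with that of an 8-bit quantizer applied after logarithmic compression, for a sine wave at 1% of full scale.

```python
import math

MU = 255  # companding parameter (assumption: continuous mu-law curve)

def compress(x):
    """Map x in [-1, 1] onto a logarithmic scale in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def expand(y):
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def quantize_uniform(x, levels=256):
    """Uniform quantizer over [-1, 1]; returns the midpoint of the interval."""
    step = 2 / levels
    index = min(levels - 1, int((x + 1) / step))
    return -1 + (index + 0.5) * step

def quantize_log(x):
    """Compress, quantize uniformly with 256 levels, then expand."""
    return expand(quantize_uniform(compress(x)))

def rms_error(quantizer, amplitude, n=1000):
    """RMS quantization error over one cycle of a sine of the given amplitude."""
    total = 0.0
    for k in range(n):
        x = amplitude * math.sin(2 * math.pi * k / n)
        total += (quantizer(x) - x) ** 2
    return math.sqrt(total / n)

quiet = 0.01  # a low-amplitude signal, 1% of full scale
print("uniform:", rms_error(quantize_uniform, quiet))
print("log    :", rms_error(quantize_log, quiet))
```

For the quiet signal the logarithmic quantizer gives a clearly smaller error, because its steps near zero are much finer than the uniform step of 1/128.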
And finally, as each sample is
represented using 8 bits, the total number of bits that need to be transmitted per
second (to recreate the original 4 kHz voice channel) is 8000
samples/sec * 8 bits/sample = 64000 bits per second, or 64 kbps.
Therefore the bit rate required to carry one voice channel in
digital form is 64 kbps.
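The same arithmetic as a tiny Python sketch. The function name `pcm_bit_rate` is a hypothetical helper for illustration only.

```python
def pcm_bit_rate(samples_per_second, bits_per_sample):
    """Bit rate in bits per second of an uncompressed PCM stream."""
    return samples_per_second * bits_per_sample

bit_rate_bps = pcm_bit_rate(8000, 8)
print(bit_rate_bps, "bps =", bit_rate_bps // 1000, "kbps")  # 64000 bps = 64 kbps
```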