Audio/Image Compression
(Application of Discrete Fourier Cosine Transforms)
Audio Compression
Audio compression is a
technique for retaining relevant audio information in a smaller storage
format. This can be achieved in a number of ways; the broad categories
of techniques are as follows:
- Loss-less compression -
This technique exploits repeating patterns in the audio file to reduce the
size of the data file. The simplest and most obvious method is to check
whether a stereo file contains only mono data. If so, both audio channels
are identical and can, therefore, be transmitted as one channel rather
than two.
- Frequency filtering -
This technique transforms the time-dependent audio file into the
frequency domain by use of a Fast Fourier Transform (FFT).
Non-existent frequencies (loss-less compression) or insignificant
frequencies (lossy compression) are removed from the audio data,
and the result is stored in a format that takes advantage of the missing
frequencies. The common MP3 format uses this technique to
compress files by a factor of 8 to 10. MP3 converters have a
compression parameter which allows you to increase the compression
ratio at the expense of sound quality. MP3 files are typically
encoded as stereo. Other formats, such as WMA, have a means of
identifying mono audio and can, therefore, provide additional compression
by transmitting only one channel of compressed data.
- Sampling rate and bit resolution - All data formats used by
computers are inherently digital. As a result, there is a limit
to the faithful capture and reproduction of sound, which is analog in
nature. The average human ear responds to frequencies from
20 Hz to 20 kHz, and a sampled signal can only represent frequencies
up to half the sampling frequency, so most audio data is recorded at a
sampling frequency of 44.1 kHz. To capture the dynamic range (loudness)
of the sound at any instant of time, the analog signal must be converted
into a binary number, which corresponds to a sound intensity. If properly
adjusted, the dynamic range is subdivided into 2^N levels,
where N is the number of bits used to represent the sampled
sound. Audio CD uses 16 bits, which allows for 65,536 different
levels to represent the sound intensity at an instant of time.
Reducing the sampling rate reduces the size of the data file, but
eliminates higher frequencies. Although this is not desirable in
most cases, higher frequencies are not as crucial when dealing with
human speech. As a result, sampling at 8 to 10 kHz is
sufficient. Reducing the number of bits also allows for smaller
audio files. However, each bit removed halves the number of levels:
dropping from 16 bits to 8 bits reduces 65,536 levels to only 256.
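The first and third techniques in the list above lend themselves to a short worked example. The following Python sketch is illustrative only: the function names (is_dual_mono, quantization_levels, pcm_size_bytes) are hypothetical, and it assumes uncompressed audio stored as plain lists of integer samples.

```python
def is_dual_mono(left, right):
    """Return True if both stereo channels carry identical samples,
    in which case only one channel needs to be stored or transmitted."""
    return len(left) == len(right) and all(l == r for l, r in zip(left, right))


def quantization_levels(bits):
    """Number of distinct intensity levels available with N bits (2^N)."""
    return 2 ** bits


def pcm_size_bytes(sample_rate_hz, bits, channels, seconds):
    """Size of uncompressed audio: samples/s * bits/sample * channels * s / 8."""
    return sample_rate_hz * bits * channels * seconds // 8


# One minute of CD-style audio: 44.1 kHz, 16 bits, stereo.
cd_minute = pcm_size_bytes(44_100, 16, 2, 60)

# One minute of speech-grade audio: 8 kHz, 8 bits, mono.
speech_minute = pcm_size_bytes(8_000, 8, 1, 60)

print(cd_minute, speech_minute)          # sizes in bytes
print(quantization_levels(16), quantization_levels(8))
print(is_dual_mono([1, 2, 3], [1, 2, 3]))
```

Under these assumptions, a minute of CD-style audio takes 10,584,000 bytes, while the speech-grade settings need 480,000 bytes, roughly 22 times less, before any further compression is applied.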
Whenever lossy compression techniques are used, information is lost and
the quality of reproduced audio is compromised. The goal of
compression technology is to minimize perceived deterioration in the
audio signal and to minimize the amount of stored or transmitted
information.
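The trade-off between lost information and stored data can be seen in a small frequency-filtering sketch. This is a minimal illustration, not how MP3 or any real codec works: it uses a slow, naive discrete Fourier transform in place of an FFT, and the function names and the keep_fraction threshold are assumptions chosen for the example.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse transform; returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def drop_weak_frequencies(samples, keep_fraction=0.05):
    """Zero every frequency bin weaker than keep_fraction of the strongest bin."""
    spectrum = dft(samples)
    cutoff = keep_fraction * max(abs(c) for c in spectrum)
    return [c if abs(c) >= cutoff else 0 for c in spectrum]


# Test signal: a strong tone plus a much weaker one (1% of the amplitude).
n = 64
signal = [math.sin(2 * math.pi * 4 * t / n) + 0.01 * math.sin(2 * math.pi * 20 * t / n)
          for t in range(n)]

kept = drop_weak_frequencies(signal)
nonzero = sum(1 for c in kept if c != 0)      # bins that must be stored
restored = idft(kept)
max_error = max(abs(a - b) for a, b in zip(signal, restored))

print(nonzero, "of", n, "frequency bins kept; max error", round(max_error, 4))
```

Here only the two bins belonging to the strong tone survive out of 64, and the reconstruction differs from the original by at most about 0.01, the amplitude of the discarded weak tone. A real encoder decides what is "insignificant" with a model of human hearing rather than a fixed threshold.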
Reducing the number of bits
In this exercise, an audio file is re-encoded with different
numbers of bits per sample. Using fewer bits introduces a quantization
error in the intensity of the sound. The sound is from 2001: A Space Odyssey.
Click on each of the files to hear the change in the sound
quality.
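The effect of reducing the number of bits can also be sketched in code. The example below is an assumption about how such files could be produced, not the procedure actually used for this exercise: it keeps only the top bits of each signed 16-bit sample by shifting, and the names requantize and tone are hypothetical.

```python
import math


def requantize(sample_16bit, bits):
    """Keep only the top `bits` bits of a signed 16-bit sample.

    The discarded low bits are replaced by zeros, so the result stays on
    the original 16-bit scale but can take at most 2**bits values.
    """
    shift = 16 - bits
    return (sample_16bit >> shift) << shift


# A synthetic full-scale 16-bit tone (stand-in for a real recording).
tone = [round(32767 * math.sin(2 * math.pi * t / 1000)) for t in range(1000)]

for bits in (16, 12, 8, 4):
    reduced = [requantize(s, bits) for s in tone]
    levels_used = len(set(reduced))
    worst_error = max(abs(a - b) for a, b in zip(tone, reduced))
    print(bits, "bits:", levels_used, "levels used, worst error", worst_error)
```

With this scheme the error on any sample is always smaller than 2^(16 - bits), so at 8 bits a sample can be off by up to 255 out of a full-scale value of 32,767, which is heard as added noise.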
Image Compression
For an example of compression using the JPEG format, see the page on JPEG Compression.