AUDIO - WordPress.com · 3/9/2017

AUDIO Muhammad Aminul Akbar

WHAT IS SOUND?

Sound is a (pressure) wave which is created by a vibrating object.

HOW IS SOUND PRODUCED?

The vibrations of a vibrating object set particles in the surrounding medium (typically air) into vibrational motion.

Molecules in the air are disturbed, one bumping against another.

An area of high pressure moves through the air as a wave.

Thus a wave representing the changing air pressure can be used to represent sound.

• The sound wave is referred to as a longitudinal wave.

• The result of longitudinal waves is the creation of compressions and rarefactions within the air.

THE SOUND WAVE

WAVELENGTH, AMPLITUDE, AND FREQUENCY OF A WAVE

Amplitude: the maximum distance of a vibrating particle in a medium from its equilibrium position.

The frequency f of a wave is measured as the number of complete back-and-forth vibrations of a particle of the medium per unit of time.

1 Hertz = 1 vibration/second

Depending on the medium, sound travels at some speed c, which defines the wavelength l:

l = c / f
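The relation above can be checked with a short Python sketch (the function name is ours; 343 m/s is the approximate speed of sound in air at 20 °C):

```python
# Wavelength from the relation l = c / f
def wavelength(c, f):
    """Wavelength in meters for sound speed c (m/s) and frequency f (Hz)."""
    return c / f

# Concert A (440 Hz) in air at roughly 343 m/s:
print(wavelength(343, 440))   # ~0.78 m
```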

MEASURING THE INTENSITY OF SOUND

Normally, sound intensity is measured as a ratio relative to some standard intensity I0. We define the relative sound intensity level as:

SL(dB) = 10 * log10( I / I0 )

where I is the intensity of the sound expressed in watts per square meter and I0 is the reference intensity, defined to be 10^-12 W/m^2. This value of I0 is the threshold (minimum audible sound intensity) of hearing at 1 kHz for a young person under the best circumstances.

EXAMPLES:

If we measured a sound intensity to be 100 times greater than the threshold reference, what would be the sound level expressed in dB?

SL(dB) = 10 * log10( (100 * 10^-12) / 10^-12 ) = 10 * log10(100) = 20 dB

EXAMPLE 2

The threshold of pain is about 120 dB. How many times greater in intensity (in W/m^2) is this?

Inverting the formula: I = I0 * 10^(120/10) = 10^-12 * 10^12 = 1 W/m^2, i.e. 10^12 times the threshold intensity.
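Both examples can be verified with a few lines of Python (function names are ours):

```python
import math

I0 = 1e-12   # reference intensity, W/m^2

def intensity_to_db(i):
    """Relative sound intensity level: SL(dB) = 10 * log10(i / I0)."""
    return 10 * math.log10(i / I0)

def db_to_intensity(db):
    """Invert the dB formula to recover an absolute intensity in W/m^2."""
    return I0 * 10 ** (db / 10)

print(intensity_to_db(100 * 1e-12))   # Example 1: 20.0 dB
print(db_to_intensity(120))           # Example 2: ~1.0 W/m^2
```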

DIGITIZATION OF SOUND

Microphones and video cameras produce analog signals (continuous-valued voltages).

To get audio or video into a computer, we have to digitize it (convert it into a stream of numbers). This requires analog-to-digital conversion.

SAMPLING AND QUANTIZATION

1. Sampling - divide the horizontal axis (the time dimension) into discrete pieces

2. Quantization - divide the vertical axis (signal strength) into discrete levels

SAMPLING

The rate at which sampling is performed is called the sampling frequency

Frequencies over 22.05 kHz are filtered out before sampling is done (when sampling at 44.1 kHz).

The sampling rate must be at least twice the highest frequency component of the sound (Nyquist Theorem).

The human voice can reach approximately 4 kHz. For audio, typical sampling rates range from 8 kHz (8,000 samples per second) to 48 kHz.

How many samples to take?

11.025 kHz -- Speech (telephone: 8 kHz)

22.05 kHz -- Low-grade audio (WWW audio, AM radio)

44.1 kHz -- CD quality

QUANTIZATION

Sampling in the amplitude or voltage dimension is called quantization.

Typical uniform quantization rates are 8-bit and 16-bit

8-bit quantization divides the vertical axis into 256 levels, and 16-bit divides it into 65,536 levels.

NYQUIST'S SAMPLING THEOREM

Suppose we are sampling a sine wave in figure below. How often do we need to sample it to figure out its frequency?

NYQUIST'S SAMPLING THEOREM

If we sample at 1 time per cycle, the signal can appear to be a constant.

NYQUIST'S SAMPLING THEOREM

If we sample at 1.5 times per cycle, the signal can appear to be a lower-frequency sine wave.

Now if we sample at twice the signal frequency (the Nyquist rate), we start to make some progress. In this case (at these sample points) we get a sawtooth wave that begins to crudely approximate a sine wave.

Nyquist rate -- for lossless digitization, the sampling rate should be at least twice the maximum frequency component. Indeed, many times more is better.
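The "one sample per cycle" case above can be demonstrated in Python (the signal frequency and sample counts are arbitrary choices):

```python
import math

def sample_sine(f, fs, n=8):
    """Take n samples of a unit sine of frequency f at sampling rate fs."""
    return [math.sin(2 * math.pi * f * k / fs) for k in range(n)]

# Sampling a 1 kHz sine at exactly 1 kHz hits the same phase every time,
# so every sample comes out (numerically) zero -- it looks like a constant:
print(sample_sine(1000, 1000))

# At 8 kHz (four times the Nyquist rate) the samples trace the sine shape.
print(sample_sine(1000, 8000))
```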

NYQUIST'S SAMPLING THEOREM

The Nyquist sampling theorem guarantees that the sampled data contains all the information of the original signal.

Voice data (speech) is limited to below 4000 Hz.

This requires 8000 samples per second (2 x 4000 Hz, the Nyquist rate).

Telephone systems can digitize voice with 128 or 256 levels.

These levels are called quantization levels.

With 128 levels, each sample takes 7 bits (2^7 = 128).

With 256 levels, each sample takes 8 bits (2^8 = 256).

8000 samples/sec x 7 bits/sample = 56 Kbps for a single voice channel.

8000 samples/sec x 8 bits/sample = 64 Kbps for a single voice channel.
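The two bit-rate figures follow mechanically from the Nyquist rate and the bits per sample (the function name is ours):

```python
import math

def voice_channel_bps(max_freq_hz, levels):
    """Bit rate = Nyquist sample rate (2 * max freq) times bits per sample."""
    sample_rate = 2 * max_freq_hz
    bits_per_sample = int(math.log2(levels))
    return sample_rate * bits_per_sample

print(voice_channel_bps(4000, 128))   # 56000 bps = 56 Kbps
print(voice_channel_bps(4000, 256))   # 64000 bps = 64 Kbps
```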


QUANTIZATION INTERVAL

If Vmax is the maximum positive (and negative) signal amplitude and n is the number of binary bits used, then the magnitude of the quantization interval, q, is defined as follows:

q = 2 * Vmax / 2^n

For example, what if we have 8 bits and the values range from -1000 to +1000? Then q = 2000 / 2^8 = 2000 / 256 ≈ 7.8125.
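The worked example above, in code form:

```python
def quantization_interval(v_max, n_bits):
    """q = 2 * Vmax / 2^n: the full range split into 2**n_bits levels."""
    return 2 * v_max / 2 ** n_bits

print(quantization_interval(1000, 8))   # 7.8125
```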

SIGNAL-TO-QUANTIZATION-NOISE RATIO (SQNR)

For digital signals, we must take into account the fact that only quantized values are stored; we effectively force all continuous voltage values into only 256 different values (for 8-bit quantization).

Inevitably, this introduces a roundoff error. Although it is not really “noise,” it is called quantization noise (or quantization error).

The actual signal may differ from the code word by up to plus or minus q/2, where q is the size of the quantization interval.
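A minimal sketch of uniform quantization and its roundoff error (the ±1 V range and the sample value are our own choices, not from the slides):

```python
def quantize(x, v_max=1.0, n_bits=8):
    """Snap x in [-v_max, v_max] to the nearest of 2**n_bits uniform levels."""
    q = 2 * v_max / 2 ** n_bits          # quantization interval
    level = round((x + v_max) / q)       # nearest level index
    level = min(level, 2 ** n_bits - 1)  # keep the top edge in range
    return level * q - v_max

x = 0.1234
q = 2 * 1.0 / 2 ** 8
print(abs(x - quantize(x)) <= q / 2)     # roundoff error stays within q/2: True
```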

[Figure: quantization intervals and the resulting error]

LINEAR VS. NON-LINEAR QUANTIZATION

In linear quantization, each code word represents a quantization interval of equal length.

In non-linear quantization, more bits are used to represent samples at some levels, and fewer for samples at other levels.

For sound, it is more important to have a finer-grained representation (i.e., more bits) for low amplitude signals than for high because low amplitude signals are more sensitive to noise. Thus, non-linear quantization is used.
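As an illustration of this non-linear spacing, here is a sketch of μ-law companding, the curve used in telephone PCM (assuming μ = 255 as in North American/Japanese telephony; the slides do not name a specific curve):

```python
import math

MU = 255.0   # assumed mu-law parameter

def mu_law_compress(x):
    """Map x in [-1, 1] non-linearly so low amplitudes get finer resolution."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

# A weak signal is stretched across a much larger share of the output range:
print(mu_law_compress(0.01))   # ~0.23
print(mu_law_compress(0.5))    # ~0.88
```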

LINEAR VS. NON-LINEAR QUANTIZATION

[Figure: quantizing a signal with 16 levels (0-15), without and with nonlinear encoding. The original signal is sampled into PAM pulses (3.2, 3.9, 2.8, 3.4, 1.2, 4.2), quantized into PCM pulses (3, 4, 3, 3, 1, 4, with quantization error), and encoded as the PCM output 011 100 011 011 001 100. With nonlinear encoding, the quantizing levels are spaced more finely where the signal is weak.]

PULSE CODE MODULATION (PCM)

PCM uses non-linear quantization encoding: the amplitude spacing of the levels is not linear.

There are more quantization steps at low amplitudes.

This reduces the overall signal distortion.

PULSE CODE MODULATION (PCM)

Teknik Elektro, Universitas Brawijaya

DELTA MODULATION (DM)

In Delta Modulation, the analog signal is tracked.

The analog input is approximated with a staircase function.

The staircase moves up or down by one level at each sample interval.

Bit 1 is used to represent a rise in the signal's voltage level, and bit 0 a fall. -> The output of DM is a single bit for each sample.

Delta Modulation is also used in various data compression techniques, e.g. interframe coding techniques for video.
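The one-bit-per-sample tracking above can be sketched directly (the step size and input samples are arbitrary choices):

```python
def delta_modulate(samples, step=0.1):
    """Track the input with a staircase; emit 1 for an up step, 0 for down."""
    bits, approx = [], 0.0
    for s in samples:
        if s >= approx:
            bits.append(1)
            approx += step   # staircase rises one level
        else:
            bits.append(0)
            approx -= step   # staircase falls one level
    return bits

print(delta_modulate([0.05, 0.15, 0.25, 0.2, 0.1]))   # [1, 1, 1, 0, 0]
```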

DELTA MODULATION

[Figure: staircase function tracking the analog input, with the resulting Delta Modulation bit output]

HOW ABOUT MEMORY SPACE?

AUDIO: a sequence of microphone readings on several channels.

Readings (samples) are normally taken at 11,000, 22,050, or 44,100 per second and may be 8-, 12-, or 16-bit values.

Q: How much memory is needed to store a 5-minute audio recording using 2 channels and 16 bits per sample?

HOW ABOUT MEMORY SPACE?

KB = 1024 bytes, MB = 1,048,576 bytes, GB = 1,073,741,824 bytes

If Fs = 11,000 Hz:

Nsamples = 5 (min) x 60 (sec/min) x 11,000 (samples/sec) = 3,300,000 samples

Nbits = 16 (bits/sample) x 3,300,000 samples = 52,800,000 bits = 6,600,000 bytes ≈ 6.295 MB

Stereo (2 channels) = 6.295 MB x 2 ≈ 12.6 MB

If Fs = 44,100 Hz:

Nsamples = 5 (min) x 60 (sec/min) x 44,100 (samples/sec) = 13,230,000 samples

Nbits = 16 (bits/sample) x 13,230,000 samples = 211,680,000 bits = 26,460,000 bytes ≈ 25.234 MB

Stereo = 25.234 MB x 2 ≈ 50.47 MB -- one song on an audio CD
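The same arithmetic as a reusable function (the name is ours):

```python
def audio_size_mb(seconds, sample_rate, bits_per_sample=16, channels=2):
    """Uncompressed audio size in MB (1 MB = 1,048,576 bytes)."""
    total_bits = seconds * sample_rate * bits_per_sample * channels
    return total_bits / 8 / 1_048_576

print(round(audio_size_mb(5 * 60, 11_000), 1))   # 12.6 MB
print(round(audio_size_mb(5 * 60, 44_100), 1))   # 50.5 MB
```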

RAW DIGITAL AUDIO

The higher Fs is, the better the audio recording quality, and the more bits are needed!

The more bits per sample, the better the audio recording quality, and the more bits are needed! (and vice versa)

Digital audio quality is linear in the memory it requires!

AUDIO FILE FORMAT

Uncompressed Audio Formats

1. PCM

2. WAV

Lossy Compressed Audio Formats

1. MP3

2. AAC

Lossless Compressed Audio Formats

1. FLAC

2. ALAC

PCM

PCM stands for Pulse-Code Modulation, a digital representation of raw analog audio signals.

There is no compression involved. The digital recording is a close-to-exact representation of the analog sound.

PCM is the most common audio format used in CDs and DVDs

WAV

WAV stands for Waveform Audio File Format (also called Audio for Windows at some point but not anymore). It’s a standard that was developed by Microsoft and IBM back in 1991.

A lot of people assume that all WAV files are uncompressed audio files, but that’s not exactly true. WAV is actually just a Windows container for audio formats. This means that a WAV file can contain compressed audio, but it’s rarely used for that.

Most WAV files contain uncompressed audio in PCM format. The WAV file is just a wrapper for the PCM encoding, making it more suitable for use on Windows systems. However, Mac systems can usually open WAV files without any issues.
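Python's standard-library `wave` module works directly with this PCM-in-WAV layout. A small round-trip sketch ("demo.wav" is just a scratch filename):

```python
import wave

# Write one second of silent CD-quality stereo PCM into a WAV container...
with wave.open("demo.wav", "wb") as w:
    w.setnchannels(2)        # stereo
    w.setsampwidth(2)        # 2 bytes = 16 bits per sample
    w.setframerate(44100)    # CD sample rate
    w.writeframes(b"\x00" * 2 * 2 * 44100)

# ...then read the PCM parameters back from the file header.
with wave.open("demo.wav", "rb") as w:
    print(w.getnchannels(), w.getsampwidth() * 8, w.getframerate())  # 2 16 44100
```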

MP3

MP3 stands for MPEG-1 Audio Layer 3. It was released back in 1993 and quickly exploded in popularity, eventually becoming the most popular audio format in the world for music files.

The main pursuit of MP3 is to cut out the sound data that lies beyond the hearing range of most people, reduce the quality of sounds that are harder to hear, and then compress all remaining audio data as efficiently as possible.

AAC

AAC stands for Advanced Audio Coding. It was developed in 1997 as the successor to MP3.

The compression algorithm used by AAC is much more advanced and technical than MP3, so when you compare a particular recording in MP3 and AAC formats at the same bitrate, the AAC one will generally have better sound quality.

AAC is the standard audio compression method used by YouTube, Android, iOS, iTunes, later Nintendo portables, and later PlayStations.

FLAC

FLAC stands for Free Lossless Audio Codec.

FLAC can compress an original source file by up to 60% without losing a single bit of data.

FLAC is an open-source and royalty-free format rather than a proprietary one.

FLAC is supported by most major programs and devices and is the main alternative to MP3 for CD audio. With it, you basically get the full quality of raw uncompressed audio in half the file size

ALAC

ALAC stands for Apple Lossless Audio Codec.

It was developed and launched in 2004 as a proprietary format but eventually became open source and royalty-free in 2011.

iTunes and iOS both provide native support for ALAC and no support at all for FLAC.

MIDI

MIDI, which dates from the early 1980s, is an acronym that stands for Musical Instrument Digital Interface.

MIDI is a protocol that enables computers, synthesizers, keyboards, and other musical devices to communicate with each other.

Components of a MIDI system: synthesizer, sequencer, track, channel, timbre, pitch, voice, path.
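As a small illustration of the protocol these components speak, here is the 3-byte Note On message from the MIDI 1.0 specification (the helper function is ours):

```python
def note_on(channel, note, velocity):
    """Raw bytes of a MIDI Note On: status 0x90 | channel, note, velocity."""
    return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])

# Middle C (note 60) at velocity 100 on channel 1 (wire channel 0):
print(note_on(0, 60, 100).hex())   # '903c64'
```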

ROLE OF MIDI

MIDI carries music notes (among other capabilities), so it is useful for inventing, editing, and exchanging musical ideas that can be encapsulated as notes.

One strong capability of MIDI-based musical communication is that a single MIDI instrument can control other MIDI instruments in a master-slave relationship: the other MIDI instruments must play the same music, in part, as the master instrument, which allows interesting music.

MIDI is aimed at music, which can then be altered as the “user” wishes.

THAT’S ALL
