classification of voiced and unvoiced fourier transform

8/10/2019 classification of voiced and unvoiced fourier transform


Indian Institute of Technology, Indore

EE202 : SIGNAL AND SYSTEMS

Instructor: Dr. R. B. Pachori

Group No. 19

Group Members (Name, Roll No.):

1. Aditi Kanjolia, 1200202

2. Keerthana Sravanthi, 120...313

Email ID: [email protected]



Contents

Problem Statement and Objective

Introduction

MATLAB Code

Implementation

Bibliography



PROBLEM STATEMENT AND OBJECTIVE

CLASSIFICATION OF VOICED AND UNVOICED SPEECH SIGNAL USING FOURIER TRANSFORM



Introduction

Speech is an acoustic signal produced by the speech production system. From our understanding of signals and systems, the characteristics of a system depend on its design. A linear time-invariant system is completely characterized by its impulse response. However, the nature of the response depends on the type of input excitation applied to the system. A similar phenomenon occurs in the production of speech.

Based on the input excitation, speech production can be broadly categorized into three cases. In the first case the input excitation is nearly periodic, in the second case it is random and noise-like, and in the third case there is no excitation to the system. Accordingly, the speech signal can be broadly divided into three regions: voiced, unvoiced and silence.

Our aim is to classify speech as voiced or unvoiced.

Voiced sounds consist of a fundamental frequency and its harmonic components, produced by the vocal cords (vocal folds). The vocal tract modifies this excitation signal, creating formant (pole) and sometimes anti-formant (zero) frequencies. In purely unvoiced sounds, the excitation signal has no fundamental frequency and therefore no harmonic structure. Instead, the airflow is forced through a vocal tract constriction, which can occur at several places between the glottis and the mouth. Some sounds are produced with a complete stoppage of airflow followed by a sudden release, producing an impulsive turbulent excitation that is often followed by a more protracted turbulent excitation. Unvoiced sounds are also usually quieter and less steady than voiced ones.

Voiced sounds, e.g., a, b, are produced essentially by vibrations of the vocal cords and are oscillatory. Therefore, over short periods of time, they are well modelled by sums of sinusoids. This makes the short-time Fourier transform a useful tool for speech processing. Unvoiced sounds such as s, sh are more noise-like, as shown in the figure below. They have a wide-band spectrum.
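The contrast between a sum of sinusoids and a noise-like signal can be illustrated with synthetic data. The following is a minimal sketch, not part of the recorded experiment: the 120 Hz fundamental, the five harmonics with 1/k amplitudes, the white-noise model and the peak-to-mean measure are all assumptions chosen for illustration.

```matlab
% Illustrative sketch: synthetic stand-ins, not recorded speech.
Fs = 11025;                 % sampling frequency in Hz (same value as in this report)
N  = Fs;                    % one second of samples, so FFT bins are 1 Hz apart
t  = (0:N-1)' / Fs;         % time axis in seconds

% Voiced-like signal: fundamental f0 plus a few harmonics (assumed values).
f0 = 120;                   % assumed fundamental frequency in Hz
voiced = zeros(N, 1);
for k = 1:5
    voiced = voiced + (1/k) * sin(2*pi*k*f0*t);
end

% Unvoiced-like signal: white Gaussian noise (assumed model).
unvoiced = randn(N, 1);

% Magnitude spectra of both signals.
V = abs(fft(voiced));
U = abs(fft(unvoiced));

% Peak-to-mean ratio: large for a line spectrum, small for a flat one.
peakinessVoiced   = max(V) / mean(V);
peakinessUnvoiced = max(U) / mean(U);

fprintf('voiced-like   peak/mean = %.1f\n', peakinessVoiced);
fprintf('unvoiced-like peak/mean = %.1f\n', peakinessUnvoiced);
```

The harmonic signal concentrates its energy in a few bins, while the noise spreads it over the whole band, so the first ratio should come out far larger than the second.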


Figure- Distinction between voiced and unvoiced speech.

For many speech applications, it is important to distinguish between voiced and unvoiced speech. There are many ways of doing it. We will use a basic method to do this classification, and it is based on the concept of formants and the use of Fourier Transform.

Formants-

Wikipedia defines Formants as "the spectral peaks of the sound spectrum of the voice". In speech science and phonetics, formant is also used to mean an acoustic resonance of the human vocal tract. It is often measured as an amplitude peak in the frequency spectrum of the sound, though in vowels spoken with a high fundamental frequency, as in a female or child voice, the frequency of the resonance may lie between the widely-spread harmonics and hence no peak is visible.
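The idea of a formant as a spectral peak caused by a resonance can be sketched with a synthetic source-filter model. This is only an illustration under stated assumptions: the impulse-train excitation, the 105 Hz fundamental, the single two-pole resonator at 735 Hz and the pole radius 0.97 are hypothetical values, not measurements from this report.

```matlab
% Illustrative sketch: one synthetic "formant" shaping a periodic excitation.
Fs = 11025;                       % sampling frequency in Hz
N  = Fs;                          % one second, so FFT bin k corresponds to k-1 Hz
period = 105;                     % excitation period in samples, i.e. f0 = 105 Hz (assumed)
excitation = zeros(N, 1);
excitation(1:period:N) = 1;       % impulse train standing in for glottal pulses

% Two-pole resonator standing in for a single vocal-tract resonance.
Fres  = 735;                      % assumed resonance frequency in Hz
r     = 0.97;                     % pole radius; closer to 1 gives a sharper peak
theta = 2*pi*Fres/Fs;             % pole angle in radians
a = [1, -2*r*cos(theta), r^2];    % denominator coefficients of the resonator
y = filter(1, a, excitation);     % speech-like output signal

% Magnitude spectrum up to Fs/2, then locate the strongest peak.
Y = abs(fft(y));
half = Y(1:floor(N/2)+1);
[~, idx] = max(half);
peakHz = (idx - 1) * Fs / N;      % convert bin index to Hz

fprintf('strongest spectral peak near %d Hz\n', round(peakHz));
```

Choosing a larger assumed fundamental, so that no harmonic falls close to `Fres`, tends to make the resonance harder to see as a peak, which is the effect described above for high-pitched voices.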



Fourier Transform-

The Fourier transform, named after Joseph Fourier, is a mathematical transformation

employed to transform signals between time domain and frequency domain, which has

many applications in physics and engineering.

The Fourier Transform decomposes a function into a sum of sinusoidal basis functions. Each of these basis functions is a complex exponential of a different frequency. The Fourier Transform therefore gives us a unique way of viewing a function: as a sum of simple sinusoids.

The Fourier Series shows how to rewrite a periodic function as a sum of sinusoids. The Fourier Transform extends this idea to non-periodic functions.

The Fourier Transform of a function g(t) is defined by:

$$G(f) = \int_{-\infty}^{\infty} g(t)\, e^{-2\pi i f t}\, dt \tag{1}$$

The result is a function of f, the frequency. The magnitude of G(f) indicates how strongly the frequency f is present in g(t), and G(f) is often called the spectrum of g. In addition, g can be obtained from G via the inverse Fourier Transform:

$$g(t) = \int_{-\infty}^{\infty} G(f)\, e^{2\pi i f t}\, df \tag{2}$$

Equation [2] states that we can obtain the original function g(t) from the function G(f) via the inverse Fourier transform. As a result, g(t) and G(f) form a Fourier pair: they are distinct representations of the same underlying entity. We can write this equivalence via the following symbol:

$$g(t) \;\overset{\mathcal{F}}{\longleftrightarrow}\; G(f) \tag{3}$$
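The transform pair can be checked numerically with its discrete counterpart, MATLAB's `fft` and `ifft`. The sketch below is illustrative only: the 1000 Hz sampling rate and the 50 Hz test cosine are assumed values, and the discrete transform of a finite sampled signal stands in for the continuous integral.

```matlab
% Illustrative sketch: discrete analogue of the transform pair g <-> G.
Fs = 1000;                     % assumed sampling frequency in Hz
N  = 1000;                     % number of samples (1 s), giving 1 Hz bin spacing
t  = (0:N-1)' / Fs;            % time axis in seconds
f0 = 50;                       % assumed test frequency in Hz
g  = cos(2*pi*f0*t);           % time-domain signal

G = fft(g);                    % forward transform of the sampled signal
f = (0:N-1)' * Fs / N;         % frequency of each FFT bin in Hz

% A real cosine at f0 shows two peaks: at f0 and at Fs - f0 (the negative frequency).
[~, order] = sort(abs(G), 'descend');
peakFreqs = sort(f(order(1:2)));

gBack = real(ifft(G));         % inverse transform recovers the original samples
maxError = max(abs(gBack - g));

fprintf('peaks at %g Hz and %g Hz, reconstruction error %.2e\n', ...
        peakFreqs(1), peakFreqs(2), maxError);
```

The two spectral peaks and the near-zero reconstruction error show that the time-domain and frequency-domain descriptions carry the same information.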



Given below is a table of a few example consonants with their classification. Their phonetic transcriptions are given in parentheses.

Voiced                        Unvoiced
b   book (bʊk)                p   please (pliz)
v   vanilla (vənIlə)          f   five (faIv)
ð   they (ðeI)                θ   thirty (θɜːti)
d   dish (dIʃ)                t   ten (tɛn)
z   zero (z )                 s   sir (sɜː)
ʒ   genre (ʒɒnrə)             ʃ   she (ʃi)



MATLAB CODE

We use MATLAB code to carry out the experiment. We record some sounds using the wavrecord command, compute the Fast Fourier Transform of each of them using the fft command, and then classify each as a voiced or unvoiced speech signal.

The MATLAB code is as follows:

>> Fs = 11025;                          % Sampling frequency in Hz

>> y = wavrecord(Fs, Fs, 'int16');      % Record Fs samples (1 second) of sound

>> figure, plot(y)                      % Plot the signal amplitude in the time domain

>> figure, plot(abs(fft(double(y))))    % Plot the magnitude spectrum (frequency domain)
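The spectrum plot above is drawn against the FFT bin index, and the upper half mirrors the lower half for a real signal. A possible refinement, sketched here under assumptions, labels the axis in Hz and keeps only the range from 0 to Fs/2. A synthetic two-tone signal stands in for the recording so that the sketch runs without a microphone; the tone frequencies and amplitudes are hypothetical.

```matlab
% Illustrative sketch: label the spectrum in Hz instead of FFT bin index.
Fs = 11025;                          % sampling frequency in Hz
% Stand-in for the recording (assumption). In recent MATLAB releases, where
% wavrecord is not available, a recording could instead be made with:
%   rec = audiorecorder(Fs, 16, 1); recordblocking(rec, 1); y = getaudiodata(rec);
t = (0:Fs-1)' / Fs;
y = 0.6*sin(2*pi*200*t) + 0.3*sin(2*pi*400*t);

N = length(y);
Y = abs(fft(double(y))) / N;         % magnitude spectrum, scaled by signal length
nHalf = floor(N/2) + 1;              % keep bins from 0 Hz up to Fs/2
Yhalf = Y(1:nHalf);
if mod(N, 2) == 0
    Yhalf(2:end-1) = 2 * Yhalf(2:end-1);  % even N: the Fs/2 bin has no mirror image
else
    Yhalf(2:end) = 2 * Yhalf(2:end);      % odd N: every non-DC bin has a mirror image
end
fHz = (0:nHalf-1)' * Fs / N;         % frequency axis in Hz

figure, plot(fHz, Yhalf)
xlabel('Frequency (Hz)'), ylabel('Magnitude')
title('Single-sided magnitude spectrum')
```

With this scaling, the height of each peak approximates the amplitude of the corresponding sinusoid, which makes spectra of different recordings easier to compare.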



IMPLEMENTATION

The above code was run on recordings of some vowels and consonants (A, P, B, S, Z, T and D).

The results are shown below:

A



P

Figure P speech signal in time domain.



Figure P speech signal in frequency domain.

B

Figure B speech signal in time domain.



Figure B speech signal in frequency domain.

S

Figure S speech signal in time domain.



Figure S speech signal in frequency domain.

Z

Figure Z speech signal in time domain.



Figure Z speech signal in frequency domain.

T

Figure T speech signal in time domain.



Figure T speech signal in frequency domain.

D

Figure D speech signal in time domain.



Figure D speech signal in frequency domain.
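One way frequency-domain plots like these could be turned into an automatic decision is to measure how much of the spectral energy lies at low frequencies. The sketch below is a hypothetical rule, not necessarily the procedure used in this experiment: the 1000 Hz cutoff, the 0.5 threshold and the synthetic test signals are all assumptions.

```matlab
% Illustrative sketch: a simple spectral rule using assumed parameters.
Fs = 11025;                       % sampling frequency in Hz
N  = Fs;                          % one second of samples
t  = (0:N-1)' / Fs;

% Synthetic stand-ins for recordings (assumed test signals).
voicedLike   = sin(2*pi*150*t) + 0.5*sin(2*pi*300*t);  % low-frequency harmonics
unvoicedLike = randn(N, 1);                            % broadband noise
signals = {voicedLike, unvoicedLike};
names   = {'harmonic signal', 'noise signal'};

cutoffHz  = 1000;                 % assumed boundary between low and high bands
threshold = 0.5;                  % assumed decision threshold on the energy fraction

isVoiced = false(1, numel(signals));
for k = 1:numel(signals)
    x = double(signals{k});
    P = abs(fft(x)).^2;                       % energy per FFT bin
    nHalf = floor(numel(x)/2) + 1;            % bins from 0 Hz up to Fs/2
    fHz = (0:nHalf-1)' * Fs / numel(x);       % frequency of each kept bin
    P = P(1:nHalf);
    lowFraction = sum(P(fHz <= cutoffHz)) / sum(P);
    isVoiced(k) = lowFraction > threshold;    % mostly low-frequency energy: voiced
    fprintf('%s: low-band energy fraction %.2f\n', names{k}, lowFraction);
end
```

For real recordings the cutoff and threshold would need tuning, and silent segments would have to be removed first, since this rule only separates harmonic-dominated from noise-dominated spectra.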

BIBLIOGRAPHY-

Signals and Systems, Oppenheim and Willsky

Signals and Systems Using MATLAB, Luis F. Chaparro

Separation of Voiced and Unvoiced using Zero crossing rate and Energy of the Speech Signal, Bachu R.G., Kopparthi S., Adapa B., Barkana B.D.

Web sources- Wikipedia, Saakshat Lab, IITG.
