Speech Recog Report - For Merge

8/4/2019 Speech Recog Report - For Merge


PART-I

Preliminary Investigation



1.1 Introduction

This report is a system analysis and design project that studies Global Positioning System (GPS) software receiver technology. In this project we studied how a GPS receiver works and how it processes the received signal to obtain the desired time and position. We start with GPS and its various components, processes and the receiver tracking system. This system makes it possible to track the location of anything that carries a GPS receiver. The receiver converts the incoming signal into digital form. The process involves many models and theories that make GPS successful.

GPS is used in a large number of areas, for example mobile phone tracking, vehicle tracking systems, information services using automated calls, defence and robotics. It also supports human-computer interaction and provides a way to make use of satellite communication.

The ultimate goal of the technology is a system that can determine time and location with 100% accuracy. Even after years of research in this area, the best GPS software applications still cannot determine location with 100% accuracy. Some applications can achieve over 95% accuracy when environmental factors are constant.

Computer software that tracks the location of real-world objects enables users to make use of the signals broadcast by the satellites.



2. Objective

To study the Global Positioning System and the hardware components and software it uses. In this project our aim is to study:

The working of a mobile phone GPS receiver

The hardware components of GPS

The software used in a GPS receiver

The algorithms used by that software



3. Problem definition

Software GPS receivers can provide full access to the baseband signal processing inside the receiver channels. They have therefore become a key component when investigating and developing advanced GPS signal processing techniques.

This report considers a pure software GPS receiver developed in the PLAN Group of the University of Calgary. It decodes the signals from the satellites.

The receiver performs the following tasks:

Selecting one or more satellites

Acquiring GPS signals

Measuring and tracking

Recovering navigation data



4. Working of GPS

For those who are unfamiliar with the term, GPS stands for Global Positioning System, and is a way of locating a receiver in three-dimensional space anywhere on the Earth, and even in orbit about it.

GPS is arguably one of the most important inventions of our time, and has

so many different applications that many technologies and ways of working

are continually being improved in order to make the most of it.

To understand exactly why it is so useful and important, we should first look at how GPS works and, more importantly, at the technological achievements that have driven the development of this fascinating positioning system.

4.1 Signals

In order for GPS to work, a network of satellites was placed into orbit around planet Earth, each broadcasting a specific signal, much like a normal radio signal. This signal can be received by a low-cost, low-technology aerial, even though the signal is very weak.

Rather than carrying an actual radio or television program, the signals that are broadcast by the satellites carry data that is passed from the aerial to the GPS software, where it is decoded and used.


The information is specific enough that the GPS software can identify the satellite, its location in space, and calculate the time that the signal took to travel from the satellite to the GPS receiver.

Using different signals from different satellites, the GPS software is able to calculate the position of the receiver. The principle is very similar to that used in orienteering: if you can identify three places on your map, take a bearing to each of them, and draw three lines on the map, then you will find out where you are on the map.

The lines will intersect, and, depending on the accuracy of the bearings, the

triangle that they form where they intersect will approximate your position,

within a margin of error.

GPS software performs a similar kind of exercise, using the known positions of the satellites in space, and measuring the time that the signal has taken to travel from the satellite to Earth.

The result of the trilateration (the term used when distances are used instead of bearings) of at least three satellites, assuming that the clocks are all synchronized, enables the software to calculate, within a margin of error, where the device is located in terms of its latitude (north-south), longitude (east-west) and distance from the center of the Earth.
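As a rough illustration of trilateration, the Python sketch below works in two dimensions, like the map analogy: it converts a signal travel time into a distance and then finds the point whose distances to three known reference points match the measured ranges. The function names, the flat 2-D geometry and the reference coordinates are assumptions made for illustration only; a real receiver works in three dimensions with satellite positions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def range_from_travel_time(seconds):
    """Distance covered by a radio signal in the given travel time."""
    return C * seconds


def trilaterate_2d(p1, r1, p2, r2, p3, r3):
    """Locate a point from its distances r1..r3 to three known points p1..p3.

    Subtracting the first circle equation from the other two removes the
    squared unknowns and leaves two linear equations in x and y.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("reference points are collinear; no unique fix")
    # Cramer's rule for the 2x2 linear system.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y
```

With hypothetical reference points at (0, 0), (10, 0) and (0, 10) and ranges measured from the point (3, 4), the function returns that point to within rounding error. Noisy ranges would give an approximate position, which corresponds to the margin of error mentioned above.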



4.2 Timing & Correction

In a perfect world, the accuracy should be absolute, but there are many different factors which prevent this. Principally, it is impossible to ensure that the clocks are all synchronized.

Since the satellites each contain atomic clocks which are extremely accurate, and certainly accurate with respect to each other, we can assume that most of the problem lies with the clock inside the GPS unit itself.

Keeping the cost of the technology down to a minimum is a key part of the

success of any consumer device, and it is simply not possible to fit each

GPS unit with an atomic clock costing tens of thousands of dollars. Luckily,

in creating the system, the designers designed GPS to work whether the receiver's clock is accurate or not.

There are a few solutions. The one that was chosen uses a fourth satellite to provide a cross-check in the trilateration process. Trilateration from three signals pinpoints a location, so if the receiver's clock were exact, the range to a fourth satellite would pass through that same point. When the clock is off, the fourth range does not intersect with the calculated location.

This indicates to the GPS software that there is a discrepancy, and so it performs an additional calculation to find a single correction that it can apply to all the signals so that the four ranges intersect.


Usually, this is as simple as subtracting the same amount (a fraction of a second, for example) from each of the calculated travel times of the signals. Thus, the GPS software can also update its own internal clock, which means that we have not only an accurate positioning device but also, in effect, an atomic-clock time reference in the palm of our hands.
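One common textbook way to express this correction is to treat the receiver clock error as a fourth unknown alongside the three position coordinates and to solve for all four from four measured ranges. The sketch below does this with a Newton iteration in plain Python. It is an illustration under stated assumptions, not the method of any particular receiver: the function names are hypothetical, the starting guess is the center of the Earth, and effects such as atmospheric delay and satellite clock error are ignored.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def solve_position_and_clock(sats, pseudoranges, iterations=20):
    """Estimate receiver position (m) and clock bias (s) from four satellites.

    Each measured range is modelled as: true distance + C * clock_bias.
    """
    x = y = z = 0.0   # start at the center of the Earth (assumption)
    b = 0.0           # clock bias expressed in metres (C * seconds)
    for _ in range(iterations):
        rows, residuals = [], []
        for (sx, sy, sz), measured in zip(sats, pseudoranges):
            d = math.dist((x, y, z), (sx, sy, sz))
            # Partial derivatives of (d + b) with respect to x, y, z and b.
            rows.append([(x - sx) / d, (y - sy) / d, (z - sz) / d, 1.0])
            residuals.append(measured - (d + b))
        dx, dy, dz, db = solve_linear(rows, residuals)
        x, y, z, b = x + dx, y + dy, z + dz, b + db
    return (x, y, z), b / C
```

The returned bias is the amount of time to subtract from every measured travel time, which is the single shared correction described above.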

4.3 Mapping

Knowing where the device is in space is one thing, but it is fairly useless

information without something to compare it with. Thus, the mapping part

of any GPS software is very important; it is how GPS works out possible routes, and allows the user to plan trips in advance.

In fact, it is often the mapping data which elevates the price of the GPS

solution; it must be accurate and updated reasonably frequently. There are,

however, several kinds of map, and each is intended for different users,

with different needs.

Road users, for example, require that their mapping data contains accurate

information about the road network in the region that they will be traveling

in, but will not require detailed information about the lie of the land; they do not really worry about the height of hills and so forth.

On the other hand, hiking GPS users might wish to have a detailed map of

the terrain, rivers, hills and so forth, and perhaps tracks and trails, but not

roads. They might also like to adorn their map with specific icons of things that they find along the way and that they wish to keep a record of, not to mention waypoints: locations to make for on their general route.

Finally, marine users need very specific information relating to the sea bed, navigable channels, and other pieces of maritime data that enable them to

navigate safely. Of course, the sea itself is reasonably featureless, but

underneath quite some detail is needed to be sure that the boat will not

become grounded.

Fishermen also use marine GPS to locate themselves and track the movement of shoals of fish, both in real time and to predict where they will

be the next day. The advent of GPS fixing has also meant that co-operative

fishing has become much easier, where there are several boats all relaying

their locations to each other while they locate the best fishing waters.

Special kinds of marine GPS, known as fishfinders, also combine several functions in one to help fishermen. A fishfinder comprises GPS and also

sonar, along with advanced tracking functions and storage for various kinds

of fishing and maritime information.

5. Requirements of GPS

5.1 Hardware components

Antenna

RF Board

RF Front End

RF/IF down-conversion board (with FPGA)

DSP Board

DSP

5.2 Software components

Firmware

RF Board FPGA

DSP Board FPGA

SW

Signal Processing SW

Navigation SW

Hardware components:



2: Architecture of Signal Tap

Antenna

The GPS antenna combines a planar antenna and a frequency converter, which translates the high-frequency phase-modulated spread spectrum signal of the GPS system to an intermediate frequency. This way a standard coaxial cable (e.g. RG58) can be used for the connection with the GPS clock, and a distance of up to 300 meters (with RG58) or even 700 meters (with a low-loss cable type like RG213) between receiver and antenna is possible without an additional amplifier.

Ambient temperature: -40 ... 65 °C

Warranty: three-year warranty

RoHS status of the product: this product is fully RoHS compliant.

WEEE status of the product: this product is handled as a B2B category product. In order to secure WEEE-compliant waste disposal it has to be returned to the manufacturer. Any transportation expenses for returning this product (at

2 RF BOARD

An RF board is a radio-frequency printed circuit board. The operating frequency of an RF board is normally between 300 MHz and 3 GHz, or higher, so an ordinary FR4 board usually cannot meet the requirements and a special material is needed to achieve the high-frequency performance; boards made this way are called RF boards. An RF board performs well at high frequency because its material has a tight dielectric tolerance and low loss.

An RF board is ideal for applications with higher operating frequency requirements. The fabrication process is similar to that of FR4, but because of the material characteristics it is much harder to metalize the through holes (copper plating), and other process steps are also more complex than for FR4, so unique handling methods and experienced workers are needed.

3 RF FRONT END:

In a radio receiver circuit, the RF front end is a generic term for all the

circuitry between the antenna and the first intermediate frequency (IF)

stage. It consists of all the components in the receiver that process the

signal at the original incoming radio frequency (RF), before it is converted

to a lower intermediate frequency (IF). In microwave and satellite receivers

it is often called the low-noise block (LNB) or low-noise downconverter

(LND) and is often located at the antenna, so that the signal from the

antenna can be transferred to the rest of the receiver at the more easily

handled intermediate frequency.

For most super-heterodyne architectures, the RF front end consists of:

An impedance matching circuit to match the input impedance of the receiver with the antenna, so the maximum power is transferred from the antenna;

A 'gentle' band-pass filter (BPF) to reduce input noise and image frequency response;

An RF amplifier, often called the low-noise amplifier (LNA). Its primary responsibility is to increase the sensitivity of the receiver by amplifying weak signals without contaminating them with noise, so they are above the noise level in succeeding stages. It must have a very low noise figure (NF).

The mixer, which mixes the incoming signal with the signal from a local oscillator (LO) to convert the signal to the intermediate frequency (IF).
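The mixing step in the last item can be sketched numerically. An ideal mixer multiplies the incoming signal by the local oscillator signal, and the product of two sinusoids contains their difference and sum frequencies; the filter after the mixer keeps the difference as the IF. In the Python sketch below the RF value is the GPS L1 carrier frequency, while the local oscillator frequency and the function names are assumptions chosen only for illustration.

```python
import math


def mixer_products(f_rf_hz, f_lo_hz):
    """Return the (difference, sum) frequencies produced by an ideal mixer."""
    return abs(f_rf_hz - f_lo_hz), f_rf_hz + f_lo_hz


def mix_sample(f_rf_hz, f_lo_hz, t):
    """Mixer output at time t: the product of the RF and LO sinusoids."""
    rf = math.cos(2 * math.pi * f_rf_hz * t)
    lo = math.cos(2 * math.pi * f_lo_hz * t)
    return rf * lo


def products_sample(f_rf_hz, f_lo_hz, t):
    """The same output written as half-amplitude difference and sum tones."""
    f_if, f_sum = mixer_products(f_rf_hz, f_lo_hz)
    return 0.5 * (math.cos(2 * math.pi * f_if * t)
                  + math.cos(2 * math.pi * f_sum * t))


F_L1 = 1575.42e6   # GPS L1 carrier frequency in Hz
F_LO = 1571.328e6  # hypothetical local oscillator frequency in Hz

if __name__ == "__main__":
    f_if, f_sum = mixer_products(F_L1, F_LO)
    print(f"IF = {f_if / 1e6:.3f} MHz, sum product = {f_sum / 1e6:.3f} MHz")
```

The two sample functions agree at every instant, which is the identity cos(a)cos(b) = 0.5[cos(a - b) + cos(a + b)] that makes the frequency translation work.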

RF/IF DOWN CONVERSION:

The LBC-4000 L-Band IF to 70 MHz IF (140 MHz optional) indoor converter is a 1RU 19-inch chassis with two front-panel-accessible up converter or down converter modules. It contains two diode OR-ed internal power supplies, for increased reliability, and microprocessor-based Monitor & Control (M&C) functions.

The LBC-4000 up converter module translates a 70 MHz IF input signal (140 MHz optional) up to a user-selected frequency at L-Band (950 to 2000 MHz). The L-Band output can drive the input of the Comtech EF Data MBT-4000 block up converter or other RF equipment with an L-Band input.

The LBC-4000 down converter module translates an L-Band (950 to 2000 MHz) IF input signal down to a user-selected frequency in the 70 MHz (140 MHz optional) IF band. The LBC-4000 can be locked to an internal reference or an external 5 or 10 MHz reference signal. The LBC-4000 is an excellent choice for interfacing legacy 70 or 140 MHz equipment to quad-band or tri-band block converters.

DSP BOARD:

DSP boards or digital signal processor computer boards are central to the

implementation of high-performance industrial systems. They collect and

process digital data from many sources, and distribute the results to other



elements of the system. There are three main sources of data in a real

system: signals (in and out from the DSP processor), messages to

communicate with system controllers, and messages to communicate with

other DSP boards. Important features of DSP boards include a fast

processor and good communication channels as DSP boards need to collect

and distribute data from/to many different sources.

Computer backplane or bus choices for DSP boards include PCI, ISA or EISA, PCMCIA, PC/104, Mac PCI, Sun SBus, PMC bus, PXI bus,

Multibus, STD bus, VME bus, VXI or MXI bus, and DT-connect I and II

interface. PCI is a local bus system designed for high-end computer

systems. ISA is a standard for I/O buses that was set back in 1984 when

IBM was the standard. PCMCIA devices (PC Cards) are credit-card-sized

peripherals predominantly used in laptop computers. PC/104 gets its name

from the desktop personal computers designed by IBM (PCs), and from the number of pins used to connect the cards together (104). Mac PCI is a local

bus standard developed by the Intel Corporation. Designed by Sun in 1989,

the SBus board was the standard I/O inter-connect for Sun computers,

which typically run under the Solaris or SunOS flavor of the UNIX

operating system. The PMC Bus is actually a form factor, not a bus -- it is

electrically the same as the PCI Bus, but the shape of the card and the bus

connectors are different. PXI is a superset of CompactPCI and adds timing

and triggering functions, imposes requirements for documenting

environmental tests, and establishes a standard Windows-based software

framework. STD bus is often referred to as the "Blue Collar Bus" because of its rugged design and small size; it was originally designed for factory and industrial environments. It uses a 16-bit architecture. VME

bus is a 32-bit bus used in industrial, commercial and military

applications. Motorola developed the VME standard, with others, in the

late 1970s. DT-connect I and II is Data Translation's DT-Connect

Interface.

Important processor or DSP performance specifications to consider for DSP

boards include number of processors, clock speed, floating point

performance, integer performance, operations, maximum addressable

memory, and operating temperature. General features and options to

consider when looking for DSP boards include real-time clock, interrupt

controller, memory management unit, dual port memory, and direct

memory access. Communications options include serial I/O ports, parallel

I/O ports, on board A/D converter, and on board D/A converter. Some DSP

boards can accept daughter boards and some DSP boards are daughterboards. An important environmental parameter to consider when searching

for DSP boards is the operating temperature.

DSP:-

Digital signal processing algorithms typically require a large number of

mathematical operations to be performed quickly and repetitively on a set

of data. Signals (perhaps from audio or video sensors) are constantly

converted from analog to digital, manipulated digitally, and then converted

again to analog form, as diagrammed below. Many DSP applications have

constraints on latency; that is, for the system to work, the DSP operation must be completed within some fixed time, and deferred (or batch) processing is not viable.

[Figure: A simple digital processing system]

Most general-purpose microprocessors and operating systems can execute DSP algorithms successfully, but are not suitable for use in portable devices

such as mobile phones and PDAs because of power supply and space

constraints. A specialized digital signal processor, however, will tend to

provide a lower-cost solution, with better performance, lower latency, and

no requirements for specialized cooling or large batteries.

The architecture of a digital signal processor is optimized specifically for

digital signal processing. Most also support some of the features as an

applications processor or microcontroller, since signal processing is rarely

the only task of a system. Some useful features for optimizing DSP

algorithms are outlined below.

SOFTWARE COMPONENTS:-

FIRMWARE:

Firmware is software that is embedded in hardware. You can update your

firmware in most GPS receivers. Firmware is the software that controls

how hardware works and responds to inputs. It's called firmware instead of software because users generally aren't supposed to play around with it. But you're not just any old user, are you? Almost all electronic hardware contains some form of firmware. A television remote control contains firmware that controls what signals are sent via IR depending on what button is pressed. A cell phone contains a lot of firmware controlling cell

access, phone books, security, and much, much more.

A GPS contains a lot of firmware controlling many of the key functions of

the device (as shown in Figure 6-1):

Reception of satellite data

Decoding of positional information

Processing of data

Conversion of data into different formats

Interpretation and display of information

External communication with devices

Storing and managing route/waypoint data

RFPGA:-

The FPGA (Field-Programmable Gate Array) implementation of an adaptive filter for narrowband interference excision in Global Positioning Systems is described. The algorithm implemented is a delayed LMS (Least Mean Squares) adaptive algorithm improved by incorporating a leakage factor, rounding and constant resetting of the filter weights. This was necessary as the original adaptive algorithm had stability problems: the filter weights did not remain fixed, and tended to drift until they overflowed, causing the filter response to degrade. Each model was first tested in Simulink, implemented in VHDL (VHSIC Hardware Description Language) and then downloaded to an FPGA board for final testing. Experimental measurements of anti-jam margins were obtained.
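To make the role of the leakage factor concrete, here is a minimal floating-point sketch of a leaky LMS predictor in Python. It is not the FPGA design described above: the delayed weight update, fixed-point rounding and periodic weight reset are not modelled, and the function name, tap count, step size and leakage value are assumptions chosen for illustration. The filter predicts the narrowband interference from past samples and outputs the prediction error, which is the input with the predictable (narrowband) part removed.

```python
import math
import random


def leaky_lms_excise(samples, taps=8, mu=0.01, leakage=1e-4):
    """Remove predictable narrowband content with a leaky LMS predictor.

    Each weight is shrunk slightly on every update (the leakage), which
    keeps the weights from drifting without bound.
    """
    weights = [0.0] * taps
    past = [0.0] * taps  # past[0] is the previous sample, past[1] the one before
    output = []
    for x in samples:
        predicted = sum(w * p for w, p in zip(weights, past))
        error = x - predicted        # wideband residue: the filter output
        output.append(error)
        shrink = 1.0 - mu * leakage  # leakage pulls every weight toward zero
        weights = [shrink * w + mu * error * p for w, p in zip(weights, past)]
        past = [x] + past[:-1]
    return output


def mean_power(values):
    """Average squared value of a sequence."""
    return sum(v * v for v in values) / len(values)


def demo_input(n=5000, seed=0):
    """A unit-amplitude interferer plus weak wideband noise (synthetic data)."""
    rng = random.Random(seed)
    return [math.cos(2 * math.pi * 0.05 * k) + rng.gauss(0.0, 0.1)
            for k in range(n)]
```

Running `leaky_lms_excise(demo_input())` and comparing `mean_power` of the last thousand input and output samples shows the interferer being strongly attenuated once the weights have converged, while the wideband component passes through.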

Single-channel adaptive filtering techniques have been shown to be effective for mitigating multiple narrowband interferences to GPS systems (Robert, 1999, Landry et al., 1997). Since they can be seamlessly inserted between the existing GPS antenna and receiver, they offer a cost-effective solution that involves minimum system disruption. However, to become a fully practical solution, the size and power demands of their hardware implementation should be minimised. FPGAs (Field-Programmable Gate Arrays) offer the potential for achieving the goals of small size, weight and power consumption, and in this paper the implementation of an adaptive filter using an FPGA device is described.

In Section 2 an experimental system, termed mini-

GISMO, is described and an overview of the system

architecture is presented. The use of interpolation and

decimation filters within the FPGA is also described.

The main adaptive algorithm implemented is the delayed LMS (Least Mean Squares) adaptive algorithm (Haykin, 2002). As discussed in Section 3 this algorithm is well suited to FPGA implementations. However, particularly in the presence of strong interferences, the original adaptive algorithm had stability problems (Sethares et al., 1986): on convergence, the filter weights did not remain fixed, and tended to drift until they overflowed, causing the filter response to degrade. In Section 4 it is shown that incorporating a leakage term (Nascimento et al., 1999) and rounding instead of truncating resulted in the weights remaining near the optimal values. However, this solution introduced memory effects, which produced a second null when the interference frequency was changed. Resetting the weights every second removed this problem and appeared to have the least effect on stability, as a short pulse in the output every second didn't cause any undesirable results in this algorithm. Also, the bit allocations were optimised to reduce the quantisation error. By reducing the quantisation noise power a smaller leakage factor is required to stabilise the adaptive algorithm, resulting in a slower drift of the weight towards

DIGITAL SIGNAL :-

Digital signal processing has traditionally been done using enhanced

microprocessors. While the high volume of generic product provides a low cost

solution, the performance falls seriously short for many applications. Until recently,

the only alternatives were to develop custom hardware (typically board level or

ASIC designs), buy expensive fixed-function processors (e.g. an FFT chip), or use an array of microprocessors.

Signal processing:

The antenna preamplifier of a GPS receiver generally converts the incoming signal (see Figure 1 below) to a signal of a lower frequency. This intermediate frequency is obtained by mixing the incoming signal with a pure sinusoidal signal generated by the local oscillator (the quartz "clock"). The resulting beat frequency is the difference between the original (Doppler-shifted) received carrier frequency and the local oscillator frequency. The intermediate or beat frequency is then processed by the signal tracking e

NAVIGATIONAL SIGNAL PROCESSING:

Digital signal processing is the processing of digitised discrete-time sampled signals. Processing is done by general-purpose computers or by digital circuits such as ASICs, field-programmable gate arrays or specialized digital signal processors (DSP chips). Typical arithmetical operations include fixed-point and floating-point, real-valued and complex-valued, multiplication and addition. Other typical operations supported by the hardware are circular buffers and look-up tables. Examples of algorithms are the fast Fourier transform (FFT), finite impulse response (FIR) filter, infinite impulse response (IIR) filter, and adaptive filters such as the Wiener and Kalman filters.
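Two of the items named above, the FIR filter and the circular buffer, fit together naturally, and a small Python sketch may help show how. The class below keeps the most recent input samples in a fixed-size buffer and overwrites the oldest one on each step, so no samples are ever shifted in memory. The class name and the two-tap averaging coefficients in the usage note are illustrative assumptions, not part of any particular DSP library.

```python
class CircularFIR:
    """FIR filter whose delay line is a fixed-size circular buffer."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.buf = [0.0] * len(self.coeffs)  # delay line, initially silent
        self.head = 0                        # slot that receives the next sample

    def step(self, x):
        """Push one input sample and return one output sample."""
        n = len(self.buf)
        self.buf[self.head] = x              # overwrite the oldest sample
        acc = 0.0
        idx = self.head
        for c in self.coeffs:                # coeffs[0] weights the newest sample
            acc += c * self.buf[idx]
            idx = (idx - 1) % n              # walk backwards in time, wrapping
        self.head = (self.head + 1) % n
        return acc

    def run(self, samples):
        """Filter a whole sequence, one sample at a time."""
        return [self.step(x) for x in samples]
```

For example, `CircularFIR([0.5, 0.5]).run([1.0, 2.0, 3.0, 4.0])` averages each sample with its predecessor and returns `[0.5, 1.5, 2.5, 3.5]`. The multiply-accumulate loop in `step` is the operation that DSP hardware is built to execute quickly.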

Statistical signal processing: analyzing and extracting information from signals and noise based on their stochastic properties

Audio signal processing: for electrical signals representing sound, such as speech or music

Speech signal processing: for processing and interpreting spoken words

Image processing: in digital cameras, computers, and various imaging systems

Video processing: for interpreting moving pictures

Array processing: for processing signals from arrays of sensors

Time-frequency signal processing: for processing non-stationary signals

Filtering: used in many fields to process signals

Software based receiver:

Global Navigation Satellite Systems have become a necessary tool for navigation and positioning in both civilian and military applications. The Global Positioning System (GPS) is a satellite-based navigation system. It is based on computing the range from the receiver to multiple satellites by multiplying the time delay that a GPS signal needs to travel from each satellite to the receiver by the velocity of light. GPS has already been used widely in both the civilian and military communities for positioning, navigation, timing and other position-related applications. The system has already proved its reliability, availability and good accuracy for many applications. Because of this, Europe is going to launch a new satellite-based navigation system called Galileo. There is also a proposal to launch the Quasi-Zenith Satellite System for navigation in Japan.

It is necessary to simulate and analyze new signal structures for the development of new

satellite-based navigation systems. In the research community, many researchers come outwith

new ideas and algorithms for improving GPS accuracy by mitigating or minimizing various types of errors and effects such as multipath. However, it is quite difficult to implement user-developed algorithms in current hardware-based GPS receivers, because these receivers contain ASICs that offer very little user flexibility. Software-based GPS receivers are therefore necessary, at least in the research community, for easy and quick implementation, simulation and analysis of algorithms, parameters and threshold values. Since CPU processing power is increasing while its cost falls, it is now possible to build real-time software-based GPS receivers, at least for static or low-dynamic environments. In line with Moore's Law, CPU power keeps increasing, and we hope that this trend will continue so that real-time, all-environment software-based GPS receivers become possible. In this paper, we briefly introduce the architecture of a software-based GPS receiver (SGR) and its signal processing technique, and give some examples of simulation using an SGR.

2 SOFTWARE-BASED GPS RECEIVER ARCHITECTURE

The architecture of a conventional GPS receiver is shown in Figure 1. It consists of an RF front-end and a signal processor, both built on IC chips. The outputs of the signal processor are either displayed directly on the receiver display unit or fed to a PC for further processing or integration with other devices. Since all the signal processing is done inside the hardware chips, users have limited ability to change parameters or install new algorithms. Figure 2 shows the architecture of a software-based GPS receiver (SGR). It consists of an RF front-end device, which is still a hardware component; the rest of the signal processing is done in a high-level programming language such as C/C++ or Matlab. Comparing Figure 1 and Figure 2, the only difference is that the hardware signal processing components are replaced by software tools. We still need the RF front-end because present CPUs cannot process the signal directly from the antenna at 1.5 GHz. Figure 3 shows the merits and demerits of hardware-based and software-based receivers. A hardware-based receiver is the fastest at signal processing but has the least flexibility, whereas a software-based receiver has the most flexibility but is the slowest. There are also products using FPGA-based receivers, which are a compromise between the two.


GPS SIGNAL PROCESSING:



The L1 band GPS signal is transmitted at about 1.5 GHz (1575.42 MHz). Since the receiver cannot process the signal directly at this frequency, the RF front-end device down-converts it from 1.5 GHz to a much lower frequency of about 4 MHz, called the Intermediate Frequency (IF). During this conversion the signal is also digitized (A/D conversion): it is quantized to 1 bit, 2 bits or more per sample and sampled at some frequency, e.g. 16 MHz. We use the down-converted signal for further processing. The first task of signal processing is to identify the visible satellites by finding each satellite's code phase and Doppler frequency. The code phase gives the position in the data at which the C/A code begins.

Since the satellites are moving all the time (and the receiver may also be moving), there is always some Doppler frequency shift. The coarse estimation of code phase and Doppler frequency is called acquisition. For acquisition, we generate the C/A code for the satellite and modulate it onto a carrier wave. This receiver-generated signal is then correlated with the incoming signal, and the correlation value is evaluated to decide whether the satellite is visible. If the satellite is judged to be visible, its code phase and Doppler frequency are noted. Once acquisition has been completed successfully, we know which satellites are visible at that time.
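The acquisition search described above can be sketched in a few lines of Python. This is only an illustrative sketch under simplifying assumptions that are not part of the report: the signal is noise-free complex baseband (the IF is assumed to be already removed), there is one sample per chip, only PRN 1 is searched, and the function names (`ca_code_prn1`, `make_signal`, `acquire`) are hypothetical. A practical SGR would work on real IF samples and usually replaces the inner loop with an FFT-based search.

```python
import cmath
import math

CHIP_RATE = 1.023e6  # C/A chips per second; assumed equal to the sample rate here


def ca_code_prn1():
    """Return the 1023-chip C/A code of PRN 1 as a list of +1/-1 values."""
    g1 = [1] * 10
    g2 = [1] * 10
    chips = []
    for _ in range(1023):
        # PRN 1 combines G2 stages 2 and 6 (list indices 1 and 5).
        bit = g1[9] ^ g2[1] ^ g2[5]
        chips.append(1 - 2 * bit)  # bit 0 -> +1, bit 1 -> -1
        fb1 = g1[2] ^ g1[9]  # G1 feedback: 1 + x^3 + x^10
        fb2 = g2[1] ^ g2[2] ^ g2[5] ^ g2[7] ^ g2[8] ^ g2[9]  # G2 feedback
        g1 = [fb1] + g1[:9]
        g2 = [fb2] + g2[:9]
    return chips


def make_signal(code, code_phase, doppler_hz):
    """Noise-free test signal: delayed code on a Doppler-shifted carrier."""
    n = len(code)
    return [
        code[(i - code_phase) % n]
        * cmath.exp(2j * math.pi * doppler_hz * i / CHIP_RATE)
        for i in range(n)
    ]


def acquire(signal, code, doppler_bins):
    """Serial search; returns (peak power, code phase, Doppler) of the best cell."""
    n = len(code)
    best = (0.0, None, None)
    for fd in doppler_bins:
        # Carrier wipe-off for this Doppler hypothesis.
        wiped = [
            s * cmath.exp(-2j * math.pi * fd * i / CHIP_RATE)
            for i, s in enumerate(signal)
        ]
        re = [w.real for w in wiped]
        im = [w.imag for w in wiped]
        for shift in range(n):
            # Local replica delayed by `shift` chips (circular rotation).
            local = code[-shift:] + code[:-shift] if shift else code
            i_sum = sum(a * b for a, b in zip(re, local))
            q_sum = sum(a * b for a, b in zip(im, local))
            power = i_sum * i_sum + q_sum * q_sum
            if power > best[0]:
                best = (power, shift, fd)
    return best
```

For example, calling `acquire(make_signal(code, 200, 1000.0), code, [-2000.0, -1000.0, 0.0, 1000.0, 2000.0])` with `code = ca_code_prn1()` returns a peak at code phase 200 and Doppler 1000 Hz. With real, noisy data the peak would additionally be compared against a threshold before declaring the satellite visible.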

In the next step, we track the visible satellites continuously to fine-tune the code phase and Doppler frequency. This process is called tracking. Tracking removes the C/A code and the carrier wave from the GPS signal, so the remaining signal consists of navigation data and some noise. From this output we can extract the navigation data parameters that are necessary to compute the pseudorange from the receiver to the satellite. Please refer to [2] for details on GPS signal processing. Figure 4 (a) shows raw GPS data collected from the antenna and down-converted to IF. This data looks just like noise, and no information can be obtained from it until we perform acquisition and tracking. This is because the GPS signal level is below the noise level. Figure 4 (b) shows the result of acquisition from the raw data shown in Figure 4 (a). The acquisition output shows the code phase (beginning point of the C/A code) and the Doppler frequency. Figure 5 shows tracking results. The tracking output yields the navigation data bits, as shown in Figure 6.



SGR AS RESEARCH AND SIMULATION TOOL

We mentioned earlier that an SGR is much more flexible than a conventional receiver. Here we discuss, with some examples, how this flexibility is used to extract information that is otherwise not available from a conventional GPS receiver. Figure 7 shows some of the fundamental parameters of signal processing in an SGR. The IF frequency and the sampling frequency are fixed for a particular front-end device. By changing these two values, we can use the same software tool with different types of front-end device that acquire the GPS signal from the antenna. Below we discuss some of these flexibilities point by point.



Weak Signal Processing:

The Doppler frequency search step, the acquisition integration time (in code periods), the noise bandwidth and the tracking integration time (in code periods) all depend on the signal quality. If the signal level is normal, we can use a 1000 Hz Doppler frequency step and a 1 ms (one code period) integration time for acquisition.

However, if the signal is weak, we need to reduce the Doppler frequency search step and increase the integration time in acquisition. For example, if we integrate raw data for 3 ms for acquisition, then we need to reduce the Doppler frequency search step to about 300 Hz. This increases the processing load, but it helps in detecting weak signals. We also need to increase the integration time in the tracking loop. Changing parameter values in this way is not possible in a conventional GPS receiver. Figure 8 shows an example in which the integration time is increased from 1 ms to 3 ms. With a 1 ms integration time, the correlation peak is not clear enough to decide whether the satellite is visible. With 3 ms, a very clear correlation peak appears and we can decide that the satellite has been detected.

Figure 7: Basic parameters that can be changed by a user in SGR for various types of signal processing and simulation.

Figure 8: (a) Signal acquisition using 1 ms integration time. The result is unclear, with multiple peaks. (b) Signal acquisition using 3 ms integration time with the same data as in (a). The correlation peak is now quite clear, and a decision can be made regarding the visibility of the satellite.
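The trade-off between integration time and Doppler search step can be made concrete with a small calculation. The sketch below assumes a rule of thumb of one Doppler bin per 1/T Hz (T being the integration time), which is roughly consistent with the 1000 Hz and 300 Hz figures quoted above, and a search range of ±5 kHz. Both are assumptions made for illustration, and `doppler_search_plan` is a hypothetical helper name.

```python
def doppler_search_plan(integration_ms, search_range_hz=5000):
    """Return (step_hz, number_of_bins, relative_cost) for one acquisition pass.

    Assumes a Doppler step of 1/T (T = integration time) and a symmetric
    search range of +/- search_range_hz. Both values are illustrative.
    """
    step_hz = 1000.0 / integration_ms
    # Integer ceiling of search_range_hz / step_hz, avoiding float round-off.
    bins_per_side = -(-(search_range_hz * integration_ms) // 1000)
    num_bins = 2 * bins_per_side + 1
    # Cost grows with the number of bins and with the data integrated per bin.
    relative_cost = num_bins * integration_ms
    return step_hz, num_bins, relative_cost
```

Under these assumptions, going from 1 ms to 3 ms raises the number of Doppler bins from 11 to 31 and the relative cost from 11 to 93, which is why longer integration makes the search slower even though it helps with weak signals.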

Multipath Mitigation Technique

In spite of continuing improvements in GPS receiver and antenna technology, multipath has remained a major source of error in GPS positioning. In order to minimize the error due to multipath, we need to understand multipath behaviour and the corresponding signal characteristics. To do so, we can analyze the signal using various types of correlators (narrow, wide, etc.) by defining the chip delay (listed in Figure 7) between the early and late chips. We can compute the correlation peak for every code period. Without multipath, the correlation peak would appear as a perfect triangle. With multipath, the two sides of the triangle are neither symmetrical nor straight lines: the shape and amplitude of the triangle are deformed by the amount of multipath and by other noise. By analysing the shape of the correlation peak we can therefore estimate the amount of multipath and develop a technique to minimize or mitigate it. In this regard, we are conducting research using left-hand and right-hand circularly polarized GPS antennas to analyze how the reflected signal (which causes multipath) affects the correlation peak. Refer to [1] for details of this experiment. Figure 9 shows a correlation peak obtained by processing the raw GPS signal shown in Figure 4.

Figure 9: Correlation peak computed from the raw GPS signal for 0.5 chip delay. The peak is not a perfect triangle because of the effect of multipath.
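The deformation of the correlation triangle can be illustrated with a simplified model. The sketch below assumes an ideal triangular correlation function one chip wide on each side, a single in-phase reflection, and real-valued correlation. The function names and the reflection parameters (half the direct amplitude, 0.5 chip delay) are invented for illustration and are not taken from the experiment in [1].

```python
def ideal_correlation(tau_chips):
    """Ideal C/A code correlation: a unit triangle, zero beyond +/- 1 chip."""
    return max(0.0, 1.0 - abs(tau_chips))


def correlation_with_multipath(tau_chips, rel_amplitude, delay_chips):
    """Direct path plus one in-phase reflection (simplified, real-valued)."""
    reflected = rel_amplitude * ideal_correlation(tau_chips - delay_chips)
    return ideal_correlation(tau_chips) + reflected


def early_minus_late(corr, tau_chips, spacing_chips=0.5):
    """Early-minus-late discriminator for a given early-late chip spacing."""
    half = spacing_chips / 2.0
    return corr(tau_chips - half) - corr(tau_chips + half)
```

With no multipath, `early_minus_late(ideal_correlation, 0.0)` is zero because the triangle is symmetric. With the assumed reflection, `early_minus_late(lambda t: correlation_with_multipath(t, 0.5, 0.5), 0.0)` is -0.25, so a tracking loop that drives this value to zero would settle away from the true code phase. Changing `spacing_chips` corresponds to switching between wide and narrow correlators.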

Remote Sensing using GPS Signal:

Recently, GPS signals have been used for remote sensing. GPS signals are transmitted in two different bands, at about 1.2 GHz and 1.5 GHz, which is similar to microwave remote sensing. They are transmitted with right-hand circular polarization. When the signal is reflected by an object, its polarization may change from right-hand to left-hand and vice versa. By observing the reflected signal with two antennas, one right-hand and one left-hand polarized, we can infer the type of object that reflected the GPS signal. Soil moisture and wind velocity have been estimated using this technique. Refer to [3] for details on this research. This type of analysis requires a software-based receiver, so that the received signal can be processed with different parameter values using our own algorithms. The reflected signals are much weaker than the direct signal, so a conventional receiver cannot be used. We also need to compute many intermediate values, such as the shape and amplitude of the correlation peak, rather than the position of the GPS antenna itself. This is possible only in software-based receivers.

Besides the analyses and simulations listed above, we need software-based receivers for analyzing noise and interference (jamming), simulating new codes, studying the limitation imposed by the navigation data bit length, and many other things. In the current GPS signal, the navigation data bit length is 20 ms. This restricts integration during the tracking process to at most 20 ms. However, for tracking very weak signals we need to integrate over a longer period. We therefore need to see what happens if the navigation data bit length is changed from 20 ms to some other value in a new design. Alternatively, the signal can carry a data-less component in one of its phases, as implemented in the new GPS signals. This helps the receiver process weak signals and hence makes indoor positioning possible. All of this can be simulated with software-based receivers. In an SGR we can also generate different types of signals for interference analysis, which helps us understand how signals of different types and strengths affect GPS signal processing. For example, we can simulate the effect of a TV signal on GPS, or analyze the effect of other GNSS signals on GPS and vice versa. Again, this is possible only in software-based receivers.
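The 20 ms restriction mentioned above can be seen in a toy calculation. The sketch below assumes ideal, noise-free 1 ms prompt correlator outputs whose sign follows the navigation data bit. The sample values and the function names are hypothetical, and the non-coherent sum is shown only as one common workaround, not as the method used in this report.

```python
def coherent_sum(prompt_ms):
    """Add the 1 ms correlator outputs directly (sign-sensitive)."""
    return sum(prompt_ms)


def noncoherent_sum(prompt_ms, block_ms=20):
    """Integrate coherently within each block, then add the squared results."""
    total = 0.0
    for start in range(0, len(prompt_ms), block_ms):
        block = sum(prompt_ms[start:start + block_ms])
        total += block * block
    return total


# 40 ms of ideal outputs: a data bit of +1 (20 ms) followed by a bit of -1.
SAMPLES = [1.0] * 20 + [-1.0] * 20
```

Here `coherent_sum(SAMPLES[:20])` is 20.0, but `coherent_sum(SAMPLES)` is 0.0 because the bit transition cancels the accumulated energy, while `noncoherent_sum(SAMPLES)` still gives 800.0. A data-less signal component has no such sign changes, which is why it allows longer integration.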

PART-II



Flow chart of GPS working:

Models for GPS:

SIGNAL ANALYSIS TOOL:

Some speech recognizers can dynamically adjust to the voice of a speaker, and often can store adaptation data for that voice for future use. The speaker data may also include lists of the words most often spoken by the user.

Speech Recognizer configuration:

It holds some standard settings and functions for the recognizer.

Lexicons:

A lexicon holds the pronunciations of the words referenced by the grammar.

Other speech processing capabilities:

Language recognition, speaker identification and speaker verification. These capabilities may be associated with the recognizer.

1.7.3 Speech Recognizer :-

It is the software that performs the tasks involved in speech recognition. Speech recognizer software may be available as a free product or may have to be bought. The software varies from platform to platform.

e.g., For Windows:

Dragon NaturallySpeaking, Microsoft Speech Recognition, VoiceAssist for Windows from Creative Labs.

For LINUX:

IN CUBE Pure Speech



Myers HMM Software

For integrated circuit and dedicated hardware:

Speech commander Voice control system Recognition

1.8 Applications

Speech recognition is an emerging technology in computer science. It has some weaknesses, but despite them it is used in many areas to solve problems. These are listed below:

Playing back simple information: In many call centres, customers need quick information and do not actually want to speak to a live operator. Speech recognition is useful for providing such information.

Call steering: With speech recognition, callers can choose a self-service route or simply say what they want and be directed to the correct department or individual.

Defence uses: Speech recognition is also used in defence applications, to perform an action quickly by responding to voice rather than to button presses or other input methods.

Artificial intelligence: Speech recognition, itself a part of artificial intelligence, is used in many AI applications and is most useful in robotics, for interacting with robots and machines.

Hands-free computing: Speech recognition provides a user interface in which the user interacts with the computer by dictation.

Language learning: A person who wants to learn a new language can use a speech recognition system.

People with disabilities: People with physical disabilities can benefit from speech recognition systems. It is especially useful for computer users who have difficulty using their hands or who are paralysed.

Court reporting: Speech recognition can take over the work of the court reporter.



1.9 Feasibility Study

Economical feasibility :

To design a speech recognition system we require the following:-

1. Microphone (Rs. 600 - 5000)

2. Sound card (Rs. 1200 - 25000)

3. Computer (min. 400 MHz processor, Rs. 15000 or above)

4. Good programmers.

The minimum hardware cost is therefore Rs. 16800 (600 + 1200 + 15000) plus the programmers' pay, so the system is economically feasible.

Social Feasibility:

The system can be kept socially acceptable by restricting its vocabulary to appropriate words.

Technical feasibility:

The microphones available today are sufficient.

The processing speed of today's processors is more than enough.

The sound cards available can perform A/D conversion very efficiently.

PART-II

System Analysis



2.1 Components of speech recognition system:

FIG 2.1 : components of speech recognition system

Speech recognition can be described in terms of three activities:

representation,

modelling and

searching

Three models are used to recognize speech and to match the input against the correct word. These models are:-

Acoustic Model

Lexical Model

Language Model

2.1.1 Acoustic Model:

In this type of model, a stored pattern (representation) is kept for each word. The pattern obtained after processing the input is matched against these stored patterns, and the stored pattern with the minimum acoustic difference from the input is selected. Every candidate pattern also has an associated probability of occurring in speech, and the word with the maximum probability is chosen. This type of model therefore uses pattern matching to recognize speech.
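The idea of choosing the stored pattern with the minimum acoustic difference can be sketched as a nearest-template search. Everything specific in this sketch is an assumption made for illustration: the three-element feature vectors, the vocabulary, and the use of Euclidean distance as the "acoustic difference". Real systems compare sequences of feature vectors and weight the result with probabilities.

```python
import math

# Hypothetical stored patterns: one short feature vector per vocabulary word.
TEMPLATES = {
    "yes": [0.9, 0.1, 0.4],
    "no": [0.2, 0.8, 0.5],
    "stop": [0.5, 0.5, 0.9],
}


def acoustic_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(features, templates=TEMPLATES):
    """Return the word whose stored pattern is closest to the input features."""
    return min(
        templates,
        key=lambda word: acoustic_distance(features, templates[word]),
    )
```

For instance, `recognize([0.85, 0.15, 0.35])` selects "yes", because that template lies closest to the input vector.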

2.1.2 Lexical Model:-

Lexical means related to words or to a dictionary. The lexical model is a neural-network-based approach to modelling the lexicon of a language with a limited amount of training data. The training data is a database of the language together with the phone set of the language. The neural network learns how the phones of the language vary in different contexts. The trained network is capable of recognizing the pronunciation of a word given its native phonetic composition.



Example:

Consider the following words:

START S-T-AA-R-TD

STARTING S-T-AA-R-DX-IX-NG

STARTED S-T-AA-R-DX-IX-DD

STARTUP S-T-AA-R-T-AX-PD

START-UP S-T-AA-R-T-AX-PD

FIG 2.2 Lexical Tree Structure of above words
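A lexical tree for the five words listed above can be sketched as a prefix tree (trie) over phones, so that words sharing the prefix S-T-AA-R share the same nodes. The pronunciations are copied from the example. The dictionary-based node layout and the function names are assumptions made for this sketch.

```python
LEXICON = {
    "START": "S-T-AA-R-TD",
    "STARTING": "S-T-AA-R-DX-IX-NG",
    "STARTED": "S-T-AA-R-DX-IX-DD",
    "STARTUP": "S-T-AA-R-T-AX-PD",
    "START-UP": "S-T-AA-R-T-AX-PD",
}


def build_lexical_tree(lexicon):
    """Build a prefix tree in which each node corresponds to one phone."""
    root = {"children": {}, "words": []}
    for word, pronunciation in lexicon.items():
        node = root
        for phone in pronunciation.split("-"):
            node = node["children"].setdefault(
                phone, {"children": {}, "words": []}
            )
        node["words"].append(word)  # word ends at this node
    return root


def count_nodes(node):
    """Number of phone nodes below `node`."""
    return sum(1 + count_nodes(child) for child in node["children"].values())


def lookup(tree, phones):
    """Return the words whose pronunciation is exactly `phones`."""
    node = tree
    for phone in phones:
        node = node["children"].get(phone)
        if node is None:
            return []
    return node["words"]
```

The tree built from `LEXICON` has 12 phone nodes, compared with 33 phones if every pronunciation were stored separately, which is the saving a lexical tree search exploits.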



2.1.3 Language Model :-

The language model attempts to capture the behaviour of the language. It aims to predict the occurrence of specific word sequences in the language. From the perspective of the recognition system, the language model helps to narrow down the search space to valid combinations of words. Most speech recognition systems use stochastic language models (SLMs). SLMs use the N-gram language model, in which the probability of occurrence of a word is assumed to depend only on the previous N-1 words.

Language models help a speech recognizer work out how likely a word sequence is, independently of the acoustics. Many candidates can be eliminated, and other words can be given higher probabilities. This lets the recognizer make the right guess when two different sentences sound the same.

For example:

It's fun to recognize speech?

It's fun to wreck a nice beach?

Another statistical model widely used in speech recognition is the Hidden Markov Model (see Appendix A).
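The way an N-gram model separates the two sound-alike sentences can be sketched with a bigram model (N = 2). The training sentences below are invented purely for illustration, add-one smoothing is an assumed choice for handling unseen word pairs, and the function names are hypothetical.

```python
import math
from collections import Counter

# Tiny invented training corpus (an assumption for this sketch).
TRAINING = [
    "it is fun to recognize speech",
    "we recognize speech with a computer",
    "it is fun to walk on a nice beach",
]


def train_bigram(sentences):
    """Count single words and adjacent word pairs, with a start marker."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Log-probability of a sentence under the add-one smoothed bigram model."""
    vocab_size = len(unigrams) + 1  # one extra slot for unseen words
    words = ["<s>"] + sentence.split()
    return sum(
        math.log((bigrams[(w1, w2)] + 1) / (unigrams[w1] + vocab_size))
        for w1, w2 in zip(words, words[1:])
    )


def best_sentence(candidates, unigrams, bigrams):
    """Pick the candidate word sequence the language model finds most likely."""
    return max(candidates, key=lambda s: log_prob(s, unigrams, bigrams))
```

Given the candidates "it is fun to recognize speech" and "it is fun to wreck a nice beach", `best_sentence` picks the first, because the word pairs "to wreck" and "wreck a" never occur in the training data.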



2.2 Flow chart of the System

Working of the Speech Recognition System

FIG 2.3 Flow Chart of the System



In the matching and comparison step, we may obtain two or more candidate units (words, phones or utterances, depending on the approach in use). These matched units are stored in memory, and the various models are applied to select the appropriate unit, which forms the recognized output. Depending on the result of the matching and comparison step, the corresponding action can be performed.



2.3 Data Flow Diagrams For Speech Recognition:

2.3.1 Level 0 DFD:

FIG 2.4 : Level 0 Data Flow diagram for speech Recognition



2.3.2 Level 1 DFD:

FIG 2.5 : Level 1 Data Flow Diagram for Speech Recognition



2.3.3 Level 2 DFD:



FIG 2.6 : Level 2 Data Flow Diagram for Pattern matching



2.4 Training data types:-

The speech grammar can be designed in different ways, which vary with the size of the grammar and the accuracy with which you want speech to be recognized. The models are:-

1. Whole-word models :-

The features of whole words are stored in the grammar, so when features are extracted from the sound signal, whole-word features are calculated and compared. This type of model is suitable for small-vocabulary recognition, and a high accuracy rate can be attained with it.

2. Phone models :-

Here the small set of speech sounds that can be distinguished by speakers of a particular language is used for the speech grammar. This is suitable for large-vocabulary recognition, but the accuracy rate of this model is very low.

3. Syllable models :-

In this model, units larger than a phone are used for feature extraction. This model can be used for a large language grammar with a high accuracy rate.



PART-III

System Design



3.1 Interface Design :

This project report is a case study of an existing speech recognition system. As the reference system we have taken Dragon NaturallySpeaking, a software package for speech recognition on Windows. The latest version of this software is version 10.0, which has the following interface design:-

Icon Design:

FIG 3.1 : Startup Interface Design:

FIG 3.2 : Training Interface Design:



3.2 Using the interface of Dragon NaturallySpeaking :-

When you start Dragon NaturallySpeaking, you have to perform the following steps:-

Create a user profile for the user. The interface is as shown in FIG 3.3.

If you selected training when creating the user, you have to dictate some text into your chosen microphone. This helps the software to recognize the user.

After this, the software prompts the user to check the microphone by dictating a little text.

After this, the software creates some user files and prepares itself for first use.



3.3 Utilities of Dragon NaturallySpeaking software:

3.3.1 Tasks performed by Dragon NaturallySpeaking software :-

Dragon NaturallySpeaking performs the following tasks:-

Speech to text conversion

Execution of built-in commands that perform certain tasks

Dictation

3.3.2 Additional tools in Dragon NaturallySpeaking software:-

- Add a new user

- Add a new command

- Manage users (creating, deleting, other changes)

- Train a user

This software is claimed to be 97% accurate, which is why it is the most widely used speech recognition package.

3.4 Technical Features of Dragon NaturallySpeaking:

- Sampling: 512 samples (16 kHz sampling rate)

- 30 ms window for frequency domain analysis.

- It is programmed in the C language and uses the Hidden Markov Model and Viterbi search.

- It contains the following basic files:

-mdef.c definition of the basic phones on the basis of HMM, in the form of a matrix.

-dict.c the pronunciation dictionary

-lextree.c lexical tree search

-hmm.h implementation of HMM using Viterbi search.

PART-IV

Appendices



Appendix A

Hidden Markov Model (HMM):-

An HMM is a statistical model in which the system being modelled is assumed to be a Markov process with unobserved (hidden) states. Hidden Markov models are especially known for their application in temporal pattern recognition such as speech, handwriting and gesture recognition. For speech recognition, the input and output of an HMM are:

Input: A sequence of feature vectors.

Output: The words with the highest probability of having been spoken.

An HMM consists of the following four things:-

States (words, phones or syllables)

State transition probabilities

Symbol emission probabilities

Observations (features of the signal)

With an HMM we find the most probable states (words) on the basis of the observations (audio input).



FIG A.1 Probabilistic parameters of a hidden Markov model (example)

x: states

y: possible observations

a: state transition probabilities

b: output probabilities

The diagram below shows the general architecture of an instantiated HMM. Each oval shape represents a random variable that can adopt any of a number of values. The random variable x(t) is the hidden state at time t (with the model from the above diagram, x(t) ∈ {x1, x2, x3}). The random variable y(t) is the observation at time t (y(t) ∈ {y1, y2, y3, y4}). The arrows in the diagram (often called a trellis diagram) denote conditional dependencies.

From the diagram, it is clear that the conditional probability distribution of the hidden variable x(t) at time t, given the values of the hidden variable x at all times, depends only on the value of the hidden variable x(t − 1). This is called the Markov property. Similarly, the value of the observed variable y(t) depends only on the value of the hidden variable x(t) (both at time t).

FIG A.2

There are three main functions in an HMM

1. Evaluation :-

Given the observation sequence O and the model λ, how do we efficiently compute P(O | λ), the probability of the observation sequence given the model?

Enumerate all possible state sequences S of length T.

Sum up the probabilities of all these sequences: P(O | λ) = Σ_S P(O | S, λ) · P(S | λ).

Probability of path S (calculated for all paths): P(O | S, λ) · P(S | λ), where P(S | λ) is the state sequence probability.
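The evaluation problem can be sketched on a toy model. The sketch below computes P(O | λ) in two ways: by enumerating every state sequence, as described above, and with the forward recursion, which reaches the same value without enumerating paths. The two-state model and all of its probabilities are invented for illustration and have nothing to do with a trained speech model.

```python
from itertools import product

# Toy model (all numbers are assumptions made for this sketch).
STATES = ("x1", "x2")
START = {"x1": 0.6, "x2": 0.4}
TRANS = {"x1": {"x1": 0.7, "x2": 0.3}, "x2": {"x1": 0.4, "x2": 0.6}}
EMIT = {
    "x1": {"y1": 0.5, "y2": 0.4, "y3": 0.1},
    "x2": {"y1": 0.1, "y2": 0.3, "y3": 0.6},
}


def brute_force_likelihood(obs):
    """Sum the joint probability of obs over every possible state sequence."""
    total = 0.0
    for path in product(STATES, repeat=len(obs)):
        p = START[path[0]] * EMIT[path[0]][obs[0]]
        for prev, cur, o in zip(path, path[1:], obs[1:]):
            p *= TRANS[prev][cur] * EMIT[cur][o]
        total += p
    return total


def forward_likelihood(obs):
    """Forward algorithm: same result, cost linear in the sequence length."""
    alpha = {s: START[s] * EMIT[s][obs[0]] for s in STATES}
    for o in obs[1:]:
        alpha = {
            s: EMIT[s][o] * sum(alpha[r] * TRANS[r][s] for r in STATES)
            for s in STATES
        }
    return sum(alpha.values())
```

For the observation sequence y1, y2, y3 both functions give about 0.03628. The enumeration visits 2^3 = 8 paths here, and the number of paths grows exponentially with the sequence length, which is why the forward recursion is the practical choice.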

2. Decode :-

Find the sequence of hidden states that most probably generated an observed sequence.

Given the parameters of the model and a particular output sequence, find the state sequence that is most likely to have generated that output sequence.

This requires finding a maximum over all possible state sequences.
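Decoding is usually done with the Viterbi algorithm, which finds the maximum over all state sequences without enumerating them. The sketch below is a minimal version that keeps, for each state, the best path ending in that state. The two-state model and its probabilities are invented for illustration only.

```python
# Toy model (all numbers are assumptions made for this sketch).
STATES = ("x1", "x2")
START = {"x1": 0.6, "x2": 0.4}
TRANS = {"x1": {"x1": 0.7, "x2": 0.3}, "x2": {"x1": 0.4, "x2": 0.6}}
EMIT = {
    "x1": {"y1": 0.5, "y2": 0.4, "y3": 0.1},
    "x2": {"y1": 0.1, "y2": 0.3, "y3": 0.6},
}


def viterbi(obs, states=STATES, start=START, trans=TRANS, emit=EMIT):
    """Return (probability, state path) of the most likely hidden sequence."""
    # best[s] holds the best (probability, path) among paths ending in state s.
    best = {s: (start[s] * emit[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        new_best = {}
        for s in states:
            prob, path = max(
                ((best[r][0] * trans[r][s], best[r][1]) for r in states),
                key=lambda candidate: candidate[0],
            )
            new_best[s] = (prob * emit[s][o], path + [s])
        best = new_best
    return max(best.values(), key=lambda candidate: candidate[0])
```

For the observations y1, y2, y3 this returns the path x1, x1, x2 with probability of about 0.01512. In a speech recognizer the states would be phones or words and the observations would be feature vectors.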

3. Learning :-

Adjust the model parameters to maximize the joint probability of the training data.

First make an initial guess of the parameters (which may be entirely wrong).

Refine the guess by assessing how well it fits the given data and attempting to reduce the resulting errors.

Feed in sample speech data along with the phonemes of the spoken words.



Appendix B

Digitizing the Analog signal:

The signal must be in digital form so that the computer can process it. An analog signal is converted to a digital signal using the following steps:

The band-limited signal is first sampled, converting the analog signal into a discrete-time, continuous-amplitude signal.

The amplitude of each sample is quantised into 2^n levels, where n is the number of bits used to represent a sample.

The discrete amplitude levels are represented (encoded) as distinct binary words, each of n bits.

This process is shown in the following figure:

FIG B.1 : Block Diagram for digitizing an analog signal
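The three steps (sampling, quantisation into 2^n levels, and encoding into n-bit words) can be sketched as follows. The sketch assumes a uniform quantiser over the amplitude range -1 to +1 and represents the analog signal as a Python function of time. The function names, the 1 kHz test tone and the 8 kHz sampling rate used in the usage note are illustrative choices only.

```python
import math


def sample(signal, sample_rate_hz, duration_s):
    """Evaluate the continuous-time signal at evenly spaced instants."""
    n = int(round(sample_rate_hz * duration_s))
    return [signal(k / sample_rate_hz) for k in range(n)]


def quantize(x, n_bits):
    """Map an amplitude in [-1, 1] to one of 2**n_bits levels (0-based)."""
    levels = 2 ** n_bits
    index = int((x + 1.0) / 2.0 * levels)
    return min(max(index, 0), levels - 1)  # clamp out-of-range inputs


def encode(level, n_bits):
    """Represent a quantisation level as a binary word of n_bits bits."""
    return format(level, "0{}b".format(n_bits))


def digitize(signal, sample_rate_hz, duration_s, n_bits):
    """Sample, quantise and encode a signal; returns a list of binary words."""
    return [
        encode(quantize(x, n_bits), n_bits)
        for x in sample(signal, sample_rate_hz, duration_s)
    ]
```

For example, `digitize(lambda t: math.sin(2 * math.pi * 1000 * t), 8000, 0.001, 3)` produces eight 3-bit words for one period of a 1 kHz tone, the first of which is "100" because an amplitude of 0 falls in the middle of the eight levels.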



The process of converting a continuous-time, continuous-amplitude signal to a discrete-time, continuous-amplitude signal is called Sampling.

The process of converting a discrete-time, continuous-amplitude signal to a discrete-time, discrete-amplitude signal is called Quantization.

Sampling is done by multiplying the input signal with a periodic train of unit-amplitude pulses, as shown:

FIG B.2 : Sampling Analog Signal

Sampling is carried out at a sampling frequency of at least 2·F_M, where F_M is the maximum frequency component of the input signal. A signal sampled in this way can be accurately reconstructed at the receiver end.



Quantisation of the signal is done using a step size that starts very small and is increased whenever the signal value increases, as shown:

FIG B.3 Quantization of Signal



Appendix C

History

The foundation of speech recognition lies in the Turing test, proposed by Alan Turing (1950). The Turing test asks whether a computer can think. It has three participants: one computer and two humans. The participants are separated from each other and converse with each other. One of the human participants acts as an interrogator, and each of the remaining two tries to convince the interrogator that it is the human and the other is not. This test led many developers to do research on speech recognition.

AT&T Bell Laboratories developed a primitive device that could recognize speech in the 1950s.

In the 1960s, researchers turned their focus towards a series of smaller goals that would aid in developing the larger speech recognition system. As a first step, developers created a device that worked with discrete speech.

In the 1970s, work began on continuous speech recognition, which does not require the user to pause between words. This technology became functional during the 1980s and is still being developed and refined today.

Technological advances have made speech recognition software and devices more functional and user-friendly. Today speech recognition has an accuracy of more than 90%. The error rates of various types of recognition are:

FIG C.1 : Error rates of different types of speech recognition



Conclusion

In this project we discussed the basic concepts that are used in speech recognition. Speech recognition engines all work in a similar manner. The following can be concluded from the study:-

Knowledge of the language and of its linguistics is required.

Most speech recognition packages use the Hidden Markov Model to implement speech recognition.

Recognized speech can be used in many ways, e.g., speech to text conversion, speech production, language learning, information extraction, etc.

This project allows us to compare the accuracy that can be achieved by applying different models.

We can specify the best hardware and software requirements for speech recognition. Dragon NaturallySpeaking can be used efficiently with the following specification:

Intel Pentium IV processor of 1.5 GHz or higher speed.

1 GB of RAM

Windows XP or a later version of Windows.

Creative microphone with ambient noise removal.

Intel sound card with a 16 kHz sample rate and a signal-to-noise ratio of 100 dB.



Bibliography

[1] Jurafsky & Martin, Speech and Language Processing, 2nd edition.

[2] Seymour Lipschutz and Marc Lipson, Schaum's Outlines: Discrete Mathematics, 3rd edition.

[3] Taub and Schilling, Principles of Digital Communication.

[4] http://www.faqs.org/docs/Linux-HOWTO/Speech-Recognition-HOWTO.html

[5] http://en.wikipedia.org/wiki/Speech_recognition

[6] http://cslu.cse.ogi.edu/HLTsurvey/ch1node4.html

[7] www.ee.ic.ac.uk/hp/staff/pnaylor/notes/recog.pdf




