7/31/2019 AI SPEECH
Abstract and full paper on Artificial Intelligence For Speech Recognition
TUESDAY, MARCH 22, 2011 | Tags: abstract of MCA in IEEE format, MBA/MCA PPT abstracts
Artificial Intelligence For Speech Recognition
(Abstract)
DEFINITION: It is the science and engineering of making intelligent machines,
especially intelligent computer programs.
APPLICATIONS: Game Playing, Speech Recognition, Understanding Natural Language, Computer Vision, Expert Systems, Robotics
SPEECH RECOGNITION:
Artificial intelligence involves two basic ideas. First, it involves studying the thought processes of human beings. Second, it deals with representing those processes via machines (like computers, robots, etc.).
One of the main benefits of a speech recognition system is that it lets the user do other work simultaneously. The user can concentrate on observation and manual operations, and still control the machinery by voice input commands.
A number of algorithms for speech enhancement have been proposed. These include the following:
1. Spectral subtraction of DFT coefficients
2. MMSE techniques to estimate the DFT coefficients of corrupted speech
http://www.creativeworld9.com/2011/03/abstract-and-full-paper-on-artificial_22.htmlhttp://www.creativeworld9.com/2011/03/abstract-and-full-paper-on-artificial_22.htmlhttp://www.creativeworld9.com/search/label/abstract%20of%20MCA%20in%20ieee%20formathttp://www.creativeworld9.com/search/label/MBA%2FMCA%20%20PPT%20ABSTRACTShttp://www.creativeworld9.com/search/label/MBA%2FMCA%20%20PPT%20ABSTRACTShttp://digg.com/submit?url=http://www.creativeworld9.com/2011/03/abstract-and-full-paper-on-artificial_22.html&title=Abstract%20and%20full%20paper%20on%20Artificial%20Intelligence%20For%20Speech%20Recognitionhttp://digg.com/submit?url=http://www.creativeworld9.com/2011/03/abstract-and-full-paper-on-artificial_22.html&title=Abstract%20and%20full%20paper%20on%20Artificial%20Intelligence%20For%20Speech%20Recognitionhttp://www.stumbleupon.com/submit?url=http://www.creativeworld9.com/2011/03/abstract-and-full-paper-on-artificial_22.html&title=Abstract%20and%20full%20paper%20on%20Artificial%20Intelligence%20For%20Speech%20Recognitionhttp://www.stumbleupon.com/submit?url=http://www.creativeworld9.com/2011/03/abstract-and-full-paper-on-artificial_22.html&title=Abstract%20and%20full%20paper%20on%20Artificial%20Intelligence%20For%20Speech%20Recognitionhttp://www.stumbleupon.com/submit?url=http://www.creativeworld9.com/2011/03/abstract-and-full-paper-on-artificial_22.html&title=Abstract%20and%20full%20paper%20on%20Artificial%20Intelligence%20For%20Speech%20Recognitionhttp://del.icio.us/post?url=http://www.creativeworld9.com/2011/03/abstract-and-full-paper-on-artificial_22.html&title=Abstract%20and%20full%20paper%20on%20Artificial%20Intelligence%20For%20Speech%20Recognitionhttp://del.icio.us/post?url=http://www.creativeworld9.com/2011/03/abstract-and-full-paper-on-artificial_22.html&title=Abstract%20and%20full%20paper%20on%20Artificial%20Intelligence%20For%20Speech%20Recognitionhttp://www.creativeworld9.com/2011/03/abstract-and-full-paper-on-artificial_22.htmlhttp://www.creativeworld9.com/2
011/03/abstract-and-full-paper-on-artificial_22.htmlhttp://www.creativeworld9.com/search/label/abstract%20of%20MCA%20in%20ieee%20formathttp://www.creativeworld9.com/search/label/MBA%2FMCA%20%20PPT%20ABSTRACTShttp://digg.com/submit?url=http://www.creativeworld9.com/2011/03/abstract-and-full-paper-on-artificial_22.html&title=Abstract%20and%20full%20paper%20on%20Artificial%20Intelligence%20For%20Speech%20Recognitionhttp://www.stumbleupon.com/submit?url=http://www.creativeworld9.com/2011/03/abstract-and-full-paper-on-artificial_22.html&title=Abstract%20and%20full%20paper%20on%20Artificial%20Intelligence%20For%20Speech%20Recognitionhttp://del.icio.us/post?url=http://www.creativeworld9.com/2011/03/abstract-and-full-paper-on-artificial_22.html&title=Abstract%20and%20full%20paper%20on%20Artificial%20Intelligence%20For%20Speech%20Recognition7/31/2019 AI SPEECH
2/17
3. Spectral equalization to compensate for convolutive distortions
4. Spectral subtraction combined with spectral equalization
CONCLUSION: This speaker recognition technology can serve many uses. It
helps physically challenged skilled persons, who can carry out their work by
voice without pushing any buttons. ASR technology is also used in military
weapons and in research centres, and nowadays by CID officers, who use it
to track criminal activities.
INDEX
Concepts
1. INTRODUCTION
2. DEFINITION
3. HISTORY
4. FOUNDATION
5. SPEAKER INDEPENDENCY
6. ENVIRONMENTAL INFLUENCE
7. SPEAKER SPECIFIC FEATURES
8. SPEECH RECOGNITION
9. APPLICATIONS
10. GOAL
11. CONCLUSION
12. BIBLIOGRAPHY
Artificial Intelligence For Speech Recognition
Introduction:
Artificial intelligence involves two basic ideas. First, it involves
studying the thought processes of human beings. Second, it deals with
representing those processes via machines (like computers, robots, etc.).
AI is the behavior of a machine which, if performed by a human being,
would be called intelligent. It makes machines smarter and more useful,
and is less expensive than natural intelligence.
Natural language processing (NLP) refers to artificial intelligence
methods of communicating with a computer in a natural language like
English. The main objective of an NLP program is to understand input and
initiate action.
Definition:
It is the science and engineering of making intelligent machines,
especially intelligent computer programs.
AI stands for Artificial Intelligence. Intelligence itself is hard to define,
but AI can be described as the branch of computer science dealing with the
simulation of machines exhibiting intelligent behavior.
History:
Work started soon after World War II.
The name was coined in 1956.
Several names that were proposed are:
Complex Information Processing
Heuristic Programming
Machine Intelligence
Computational Rationality
Foundation:
Philosophy (428 B.C.-present)
Mathematics (c.800-present)
Economics (1776-present)
Neuroscience (1861-present)
Psychology (1879-present)
Computer Engineering (1940-present)
Control theory and cybernetics (1948-present)
Linguistics (1957-present)
Speaker independency:
The speech quality varies from person to person. It is therefore
difficult to build an electronic system that recognizes everyone's voice. By
limiting the system to the voice of a single person, the system becomes not
only simpler but also more reliable. The computer must be trained to the
voice of that particular individual. Such a system is called a speaker-
dependent system.
Speaker independent systems can be used by anybody, and can
recognize any voice, even though the characteristics vary widely from one
speaker to another. Most of these systems are costly and complex. Also,
these have very limited vocabularies.
It is important to consider the environment in which the speech
recognition system has to work. The grammar used by the speaker and
accepted by the system, the noise level, the noise type, the position of the
microphone, and the speed and manner of the user's speech are some factors
that may affect the quality of speech recognition.
Environmental influence:
Real applications demand that the performance of the recognition system
be unaffected by changes in the environment. However, it is a fact that
when a system is trained and tested under different conditions, the
recognition rate drops unacceptably. We need to be concerned about the
variability present when different microphones are used in training and
testing, and specifically during development of procedures. Such care can
significantly improve the accuracy of recognition systems that use desktop
microphones.
Acoustical distortions can degrade the accuracy of recognition systems.
Obstacles to robustness include additive noise from machinery, competing
talkers, reverberation from surface reflections in a room, and spectral
shaping by microphones and the vocal tracts of individual speakers. These
sources of distortions fall into two complementary classes; additive noise
and distortions resulting from the convolution of the speech signal with an
unknown linear system.
A number of algorithms for speech enhancement have been proposed.
These include the following:
1. Spectral subtraction of DFT coefficients
2. MMSE techniques to estimate the DFT coefficients of corrupted
speech
3. Spectral equalization to compensate for convolutive distortions
4. Spectral subtraction combined with spectral equalization.
Although relatively successful, all these methods depend on the assumption
of independence of the spectral estimates across frequencies. Improved
performance can be obtained with an MMSE estimator in which correlation
among frequencies is modeled explicitly.
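As an illustration of technique 1 above, spectral subtraction of DFT coefficients can be sketched in a few lines of Python. This is a simplified single-frame sketch with an oracle noise estimate, not a production enhancement system; in practice the noise magnitude must itself be estimated from speech-free frames.

```python
import numpy as np

def spectral_subtraction(noisy_frame, noise_mag, floor=0.01):
    """Subtract an estimated noise magnitude spectrum from a noisy
    speech frame, keeping the noisy phase (a common simplification)."""
    spectrum = np.fft.rfft(noisy_frame)
    mag, phase = np.abs(spectrum), np.angle(spectrum)
    # Subtract the noise estimate; clamp to a small spectral floor
    # so magnitudes never go negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(noisy_frame))

# Toy example: a sine "speech" frame corrupted by white noise.
rng = np.random.default_rng(0)
t = np.arange(256) / 8000.0
speech = np.sin(2 * np.pi * 440 * t)
noise = 0.3 * rng.standard_normal(256)
noise_mag = np.abs(np.fft.rfft(noise))  # oracle noise estimate, demo only
enhanced = spectral_subtraction(speech + noise, noise_mag)
```

With the oracle noise estimate the enhanced frame is substantially closer to the clean sine than the noisy input was; the spectral floor is what prevents the "musical noise" of negative magnitudes.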
Speaker-specific features:
Speaker identity correlates with the physiological and behavioral
characteristics of the speaker. These characteristics exist both in the vocal
tract characteristics and in the voice source characteristics, as well as in the
dynamic features spanning several segments.
The most common short-term spectral measurements currently used are
the cepstral coefficients derived from Linear Predictive Coding (LPC), and
their regression coefficients. A spectral envelope reconstructed from a
truncated set of cepstral coefficients is much smoother than one
reconstructed directly from LPC coefficients.
Therefore, it provides a more stable representation from one repetition to
another of a particular speaker's utterances.
As for the regression coefficients, typically the first and second order
coefficients are extracted at every frame period to represent the spectral
dynamics.
These coefficients are derivatives of the time function of the spectral
coefficients and are called the delta and delta-delta spectral coefficients,
respectively.
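The delta coefficients described above are commonly computed as a least-squares slope over a small window of frames. The sketch below follows that standard textbook formulation; the window half-width N = 2 is a typical choice, not a value taken from this paper.

```python
import numpy as np

def delta(features, N=2):
    """First-order regression (delta) coefficients.

    features: (num_frames, num_coeffs) array of spectral coefficients.
    N: half-width of the regression window.
    """
    num_frames = len(features)
    # Pad by repeating edge frames so every frame has a full window.
    padded = np.pad(features, ((N, N), (0, 0)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, N + 1))
    deltas = np.empty_like(features, dtype=float)
    for t in range(num_frames):
        acc = np.zeros(features.shape[1])
        for n in range(1, N + 1):
            acc += n * (padded[t + N + n] - padded[t + N - n])
        deltas[t] = acc / denom
    return deltas

# Delta-delta coefficients are simply the delta of the deltas:
# delta(delta(cepstra))
```

For a feature trajectory that rises linearly over time, the interior delta values come out as exactly the per-frame slope, which is the intended behaviour of a regression coefficient.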
Speech Recognition:
The user communicates with the application through the appropriate input
device i.e. a microphone. The Recognizer converts the analog signal into
digital signal for the speech processing. A stream of text is generated after
the processing. This source-language text becomes input to the Translation
Engine, which converts it to the target language text.
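The recognizer's first stage, converting the analog microphone signal to a digital one, amounts to sampling and quantization. A minimal sketch, using a synthetic sine wave in place of a real microphone:

```python
import numpy as np

SAMPLE_RATE = 16000  # samples per second, a common rate for speech

def quantize_16bit(analog):
    """Quantize samples in [-1.0, 1.0] to signed 16-bit PCM values."""
    clipped = np.clip(analog, -1.0, 1.0)
    return np.round(clipped * 32767).astype(np.int16)

# A synthetic "analog" signal standing in for the microphone input.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE          # one second of audio
analog = 0.5 * np.sin(2 * np.pi * 200 * t)        # 200 Hz tone
digital = quantize_16bit(analog)                  # stream fed to the recognizer
```

The resulting int16 stream is what the speech-processing stage would consume; everything after this point (feature extraction, decoding, translation) operates on such digital frames.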
Salient Features:
Input Modes
Through Speech Engine
Through soft copy
Interactive Graphical User Interface
Format Retention
Fast and standard translation
Interactive Preprocessing tool:
Spell checker
Phrase marker
Proper noun, date and other package-specific identifiers
Input Formats
.txt, .doc, .rtf
User friendly selection of multiple output
Online thesaurus for selection of contextually appropriate
synonym
Online word addition, grammar creation and updating facility
Personal account creation and inbox management
Applications:
One of the main benefits of a speech recognition system is that it lets
the user do other work simultaneously. The user can concentrate on
observation and manual operations, and still control the machinery by voice
input commands.
Another major application of speech processing is in military
operations. Voice control of weapons is an example. With reliable speech
recognition equipment, pilots can give commands and information to the
computers by simply speaking into their microphones - they don't have to
use their hands for this purpose.
Another good example is a radiologist scanning hundreds of X-rays,
ultrasonograms and CT scans, and simultaneously dictating conclusions to a
speech recognition system connected to word processors. The radiologist
can focus his attention on the images rather than writing the text.
Voice recognition could also be used on computers for making airline
and hotel reservations. A user simply needs to state his requirements: to make
a reservation, cancel a reservation, or make enquiries about schedules.
Ultimate Goal:
The ultimate goal of Artificial Intelligence is to build a person, or,
more humbly, an animal.
Conclusion:
By using this speaker recognition technology we can achieve many
benefits. It helps physically challenged skilled persons, who can carry out
their work without pushing any buttons. This ASR technology is also used in
military weapons and in research centres. Nowadays the technology is also
used by CID officers, who use it to track criminal activities.
Bibliography:
www.google.co.in (Artificial intelligence for speech recognition)
www.google.com
www.howstuffworks.com
www.ieeexplore.ieee.org
Seminar Report On SIXTH SENSE TECHNOLOGY
Although miniaturized versions of computers help us to connect to the digital world even while we are
travelling, there isn't yet any device that gives a direct link between the digital world and our
physical interaction with the real world. Usually information is stored traditionally on paper or on a
digital storage device. Sixth sense technology helps to bridge this gap between tangible and non-tangible
world. Sixth Sense device is basically a wearable gestural interface that connects the physical world around
us with digital information and lets us use natural hand gestures to interact with this information. The sixth
sense technology was developed by Pranav Mistry, a PhD student in the Fluid Interfaces Group at the MIT
Media Lab. The sixth sense technology has a Web 4.0 view of human and machine interactions. Sixth Sense
integrates digital information into the physical world and its objects, making the entire world your
computer. It can turn any surface into a touch-screen for computing, controlled by simple hand gestures. It
is not a technology which is aimed at changing human habits but causing computers and other machines to
adapt to human needs. It also supports multi-user and multi-touch provisions. The Sixth Sense device is a mini-
projector coupled with a camera and a cell phone, which acts as the computer and your connection to the
Cloud, all the information stored on the web. The current prototype costs around $350. The Sixth Sense
prototype is used to implement several applications that have shown the usefulness, viability and flexibility
of the system.
2. DEFINITION
Sixth Sense' is a wearable gestural interface that augments the physical world around
us with digital information and lets us use natural hand gestures to interact with that information. By using
a camera and a tiny projector mounted in a pendant like wearable device, 'Sixth Sense' sees what you see
and visually augments any surfaces or objects we are interacting with. It projects information onto surfaces, walls, and physical objects around us, and lets us interact with the projected information through natural
hand gestures, arm movements, or our interaction with the object itself. 'Sixth Sense' attempts to free
information from its confines by seamlessly integrating it with reality, and thus making the entire world
your computer. All of us are aware of the five basic senses: seeing, feeling, smelling, tasting and hearing.
But there is also another sense, called the sixth sense. It is basically a connection to something greater than
what our physical senses are able to perceive. To a layman, it would be something supernatural. Some
might just consider it to be a superstition or something psychological. But the invention of sixth sense
technology has completely shocked the world. Although it is not widely known as of now, the time is not
far when this technology will change our perception of the world.
The Sixth Sense prototype comprises a pocket projector, a mirror and a camera.
The hardware components are coupled in a pendant-like mobile wearable device. Both the projector and
the camera are connected to the mobile computing device in the user's pocket. The device projects visual
information, enabling surfaces, walls and physical objects around the wearer to be used as interfaces; while
the camera recognizes and tracks the user's hand gestures and physical objects using computer-vision
based techniques. The software program processes the video stream data captured by the camera and
tracks the locations of the colored markers at the tips of the user's fingers using simple computer-vision
techniques. The movements and arrangements of these fiducials are interpreted into gestures that act as
interaction instructions for the projected application interfaces. The maximum number of tracked fingers is
only constrained by the number of unique fiducials, thus Sixth Sense also supports multi-touch and multi-
user interaction.
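The fiducial-tracking step can be illustrated with a pure-NumPy sketch that thresholds a frame for a marker colour and takes the centroid of the matching pixels. A real implementation would typically work in HSV space with a vision library such as OpenCV; the RGB threshold and the toy frame below are only illustrative.

```python
import numpy as np

def track_marker(frame, lower, upper):
    """Return the (row, col) centroid of pixels whose RGB values fall
    inside [lower, upper], or None if no pixel matches."""
    mask = np.all((frame >= lower) & (frame <= upper), axis=-1)
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        return None
    return ys.mean(), xs.mean()

# Toy frame: black image with a red "marker cap" blob centred at (40, 60).
frame = np.zeros((100, 100, 3), dtype=np.uint8)
frame[38:43, 58:63] = (220, 30, 30)
center = track_marker(frame, lower=(150, 0, 0), upper=(255, 80, 80))
```

Tracking one centroid per marker colour over successive frames yields the motion trajectories that the gesture interpreter consumes.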
The Sixth Sense prototype implements several applications that demonstrate the
usefulness, viability and flexibility of the system. The map application lets the user navigate a map
displayed on a nearby surface using hand gestures, similar to gestures supported by multi-touch based
systems, letting the user zoom in, zoom out or pan using intuitive hand movements. The drawing
application lets the user draw on any surface by tracking the fingertip movements of the user's index finger.
Sixth Sense also recognizes the user's freehand gestures (postures). For example, it implements a gestural
camera that takes photos of the scene the user is looking at by detecting the framing gesture. The user can
stop by any surface or wall and flick through the photos he/she has taken. Sixth Sense also lets the user
draw icons or symbols in the air using the movement of the index finger and recognizes those symbols as
interaction instructions. For example, drawing a magnifying glass symbol takes the user to the map
application, while drawing an @ symbol lets the user check his mail. The Sixth Sense system also augments
physical objects the user is interacting with by projecting more information about these objects directly
on them. For example, a newspaper can show live video news, or dynamic information can be provided on a
regular piece of paper. The gesture of drawing a circle on the user's wrist projects an analog watch. The
current prototype system costs approximately $350 to build.
The device sees what we see but it lets out information that we want to know while
viewing the object. It can project information on any surface, be it a wall, table or any other object and uses
hand / arm movements to help us interact with the projected information. The device brings us closer to
reality and assists us in making right decisions by providing the relevant information, thereby, making the
entire world a computer.
The world has shrunk. Distances have dissolved. Communication lines and
interaction with countless systems have been rendered feasible. However this technological overhaul has
been peripheral and not so much related to the human body; researchers and innovators have constantly
grappled with the issue of bridging the gaps which limit human-environment contact. This device,
tentatively named the Sixth Sense, is a wearable machine that assists unexplored interactions between the
real and the virtual sphere of data. It consists of certain commonly available components, which are
intrinsic to its functioning. These include a camera, a portable battery-powered projection system coupled
with a mirror and a cell phone. All these components communicate to the cell phone, which acts as the
communication and computation device. The entire hardware apparatus is encompassed in a pendant-
shaped mobile wearable device. Basically the camera recognizes individuals, images, pictures, gestures one
makes with their hands and the projector assists in projecting any information on whatever type of surface
is present in front of the person. The usage of the mirror is significant as the projector dangles pointing
downwards from the neck. To bring out variations on a much higher plane, in the demo video which was
broadcast to showcase the prototype to the world, Mistry uses colored caps on his fingers so that it
becomes simpler for the software to differentiate between the fingers when driving various applications. The
software program analyses the video data captured by the camera and also tracks down the locations of the
colored markers by utilizing simple computer-vision techniques. One can have any number of hand gestures
and movements as long as they are all reasonably identified and differentiated for the system to interpret them,
preferably through unique and varied motions. This is possible because the Sixth Sense device supports multi-
touch and multi-user interaction. There was once a clear divide between the virtual world and the real
world, but that line is getting blurrier every day.
3. GESTURES
The software recognizes three kinds of gestures:
Multi-Touch Gestures:
Like the ones we see on the iPhone, where we touch the screen and make
the map move by pinching and dragging.
Freehand Gestures:
Like taking a picture, or the Namaste gesture to start the projection on the wall.
ICONIC Gestures:
Drawing an icon in the air. For example, whenever we draw a star, it shows us the weather details;
when we draw a magnifying glass, it shows us the map.
4. COMPONENTS
The devices which are used in Sixth Sense technology are:
1. Camera
2. Coloured Marker
3. Mobile component
4. Projector
5. Mirror
Camera:
It captures the image of the object in view and tracks the user's hand gestures. The camera recognizes
individuals, images, pictures, and gestures that the user makes with his hand. The camera then sends this data to a
smart phone for processing. Basically the camera forms a digital eye which connects to the world of digital
information.
Coloured Marker:
There are colour markers placed at the tips of the user's fingers. Marking the user's fingers with red, yellow,
green and blue coloured tape helps the webcam to recognize the hand gestures. The movements and arrangement
of these markers are interpreted into gestures that act as interaction instructions for the projected
application interfaces.
Mobile Component:
The SixthSense device consists of a web-enabled smart phone which processes the data sent by the camera.
The smart phone searches the web and interprets the hand gestures with the help of the colored markers placed
at the finger tips.
Projector:
The information that is interpreted through the smart phone can be projected onto any surface. The
projector projects the visual information, enabling surfaces and physical objects to be used as interfaces.
The projector has a battery with about 3 hours of battery life.
A tiny LED projector displays the data sent from the smart phone on any surface in view: an object, wall or
person. The downward-facing projector projects the image onto a mirror.
Mirror:
The usage of a mirror is important as the projector dangles pointing downwards from the neck. The mirror
reflects the image onto the desired surface. Thus the digital image is finally freed from its confines and placed
in the physical world.
5. HOW IT WORKS
The Sixth Sense prototype comprises a pocket projector, a mirror and a camera.
The hardware components are coupled in a pendant-like mobile wearable device. Both the projector and the
camera are connected to the mobile computing device in the user's pocket. The projector projects visual
information enabling surfaces, walls and physical objects around us to be used as interfaces; while the
camera recognizes and tracks the user's hand gestures and physical objects using computer-vision based
techniques. The software program processes the video stream data captured by the camera and tracks the
locations of the colored markers at the tips of the user's fingers using simple computer-vision techniques.
The movements and arrangements of these fiducials are interpreted into gestures that act as
interaction instructions for the projected application interfaces. The maximum number of tracked fingers is
only constrained by the number of unique fiducials; thus Sixth Sense also supports multi-touch and multi-
user interaction.
The technology in itself is nothing more than the combination of some stunning
technologies. The technology is mainly based on hand gesture recognition, image capturing, processing, and
manipulation, etc. The software of the technology uses the video stream, which is captured by the camera,
and also tracks the location of the tips of the fingers to recognize the gestures. This process is done using
some techniques of computer vision. However, instead of requiring you to be in front of a big screen like
Tom Cruise, Sixth Sense can do its magic, and a lot more, everywhere. The camera recognizes objects
around you instantly, with the micro-projector overlaying the information on any surface, including the
object itself or your hand. Then, you can access or manipulate the information using your fingers. The key
here is that Sixth Sense recognizes the objects around you, displaying information automatically and letting
you access it in any way you want, in the simplest way possible. Clearly, this has the potential of becoming
the ultimate "transparent" user interface for accessing information about everything around us. If they can
get rid of the colored finger caps and it ever goes beyond the initial development phase, that is. But as it is
now, it may change the way we interact with the real world and truly give everyone complete awareness of
the environment around us.
The Sixth Sense technology works as follows:
1. It captures the image of the object in view and tracks the user's hand gestures.
2. There are colour markers placed at the tips of the user's fingers. Marking the user's fingers with red, yellow,
green and blue coloured tape helps the webcam to recognize the hand gestures. The movements and
arrangement of these markers are interpreted into gestures that act as interaction instructions for the
projected application interfaces.
3. The smart phone searches the web and interprets the hand gestures with the help of the coloured markers
placed at the finger tips.
4. The information that is interpreted through the smart phone can be projected onto any surface.
5. The mirror reflects the image onto the desired surface.
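The steps above can be sketched as a simple pipeline of stages. Every function here is a placeholder standing in for the real hardware or vision component, so the sketch shows only the data flow, not a working implementation.

```python
def capture_frame(camera):
    """Step 1: grab an image of the scene (placeholder for the camera)."""
    return camera["frame"]

def locate_markers(frame):
    """Step 2: find the coloured finger markers in the frame (placeholder)."""
    return frame["markers"]

def interpret_gesture(markers):
    """Steps 2-3: map marker movements to an interaction instruction
    (here a trivial lookup table stands in for the recognizer)."""
    return {"pinch": "zoom", "drag": "pan"}.get(markers, "none")

def render(instruction):
    """Steps 4-5: project the response via the mirror onto a surface."""
    return f"projecting: {instruction}"

# Data flow for one frame of input:
camera = {"frame": {"markers": "pinch"}}
output = render(interpret_gesture(locate_markers(capture_frame(camera))))
```

The point of the sketch is the ordering: camera output feeds the marker tracker, the tracker feeds the gesture interpreter on the phone, and the interpreter's result is handed to the projector.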
6. TECHNOLOGIES RELATED TO SIXTH SENSE TECHNOLOGY
Augmented Reality:
Augmented Reality is a visualization technology that allows the user to
experience virtual content added over the real world in real time. With the help of advanced AR
technology, information about the user's surrounding real world becomes interactive and digitally
usable. Artificial information about the environment and the objects in it can be stored and retrieved as an
information layer on top of the real-world view. When we compare the spectrum between virtual reality,
which creates immersive, computer-generated environments, and the real world, augmented reality is
closer to the real world. Augmented reality adds graphics, sounds, haptic feedback and smell to the natural
world as it exists. Both video games and cell phones are driving the development of augmented reality. The
augmented systems will also superimpose graphics for every perspective available and try to adjust to every
movement of the user's head and eyes. The three basic components of an augmented reality system are the
head-mounted display, the tracking system and the mobile computer for the hardware. The main goal of this new
technology is to merge these three components into a highly portable unit, much like a combination of a
high-tech Walkman and an ordinary pair of eyeglasses. The head-mounted display used in augmented
reality systems will enable the user to view superimposed graphics and text created by the system. Another
component of an augmented reality system is its tracking and orientation system. This system pinpoints the
user's location in reference to his surroundings and additionally tracks the user's eye and head movements.
Augmented reality systems will need highly mobile computers; as of now, few computers are
portable enough to provide this.
Gesture Recognition:
It is a technology which is aimed at interpreting human gestures with the
help of mathematical algorithms. Gesture recognition basically focuses on emotion
recognition from the face and hand gesture recognition. Gesture recognition enables humans to
interact with computers in a more direct way, without using any external interfacing devices. It can provide
a much better alternative to text user interfaces and graphical user interfaces, which require a
keyboard or mouse to interact with the computer. Interfaces which depend solely on gestures require
precise hand-pose tracking. Early versions of the gesture recognition process used special hand gloves
which provide information about hand position, orientation and flex of the fingers. In the SixthSense
device, coloured bands are used for this purpose. Once the hand pose has been captured, the gestures can be
recognised using different techniques. Neural network approaches and statistical templates are the
techniques commonly used for recognition. These techniques have high accuracy, usually
showing accuracies of more than 95%. Time-delay neural networks can also be used for real-time
recognition of gestures.
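In its simplest form, a statistical-template recogniser of the kind mentioned above reduces to nearest-centroid matching of feature vectors. The sketch below is a generic illustration with made-up gesture features, not SixthSense's actual software.

```python
import numpy as np

class TemplateGestureRecognizer:
    """Nearest-centroid matching: each gesture class is represented by
    the mean of its training feature vectors; an input is labelled with
    the closest centroid."""

    def __init__(self):
        self.centroids = {}

    def train(self, label, examples):
        self.centroids[label] = np.mean(examples, axis=0)

    def classify(self, features):
        return min(self.centroids,
                   key=lambda lbl: np.linalg.norm(features - self.centroids[lbl]))

# Toy features, e.g. (mean dx, mean dy) of fingertip motion per gesture.
rec = TemplateGestureRecognizer()
rec.train("swipe_right", [[1.0, 0.0], [0.9, 0.1]])
rec.train("swipe_up",    [[0.0, 1.0], [0.1, 0.9]])
result = rec.classify(np.array([0.8, 0.2]))
```

A neural-network approach would replace the centroid distance with a learned decision boundary, but the train/classify structure stays the same.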
Computer Vision:
Computer vision is the technology in which machines are able to
interpret/extract necessary information from an image. Computer vision technology includes various fields
like image processing, image analysis and machine vision. It includes certain aspects of artificial intelligence
techniques like pattern recognition. Machines which implement computer vision techniques require
image sensors which detect electromagnetic radiation, usually in the form of visible or infrared light.
Computer vision finds application in various fields of interest. One such field is biomedical
image processing. It is also used in autonomous vehicles such as SUVs. The computer vision technique basically
includes four processes.
1. Recognition: One of the main tasks of computer vision is to determine whether a particular
object contains useful data or not.
2. Motion Analysis: Motion analysis includes several tasks related to the estimation of motion, where an image
sequence is processed continuously to estimate the velocity at each point of the image or in the 3D scene.
3. Scene Reconstruction: Computer vision employs several methods to recreate a 3D model of a scene from
the available images of it.
4. Image Restoration: The main aim of this step is to remove noise from a given image. The simplest
methods use simple filters such as low-pass or median filters. To obtain better quality images,
more complex methods such as inpainting are used.
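The median filter named in step 4 can be sketched in a few lines of plain Python, assuming the image is a simple list-of-lists of grey-level integers (a real implementation would use an image library):

```python
# Minimal 3x3 median filter for image restoration: each interior pixel is
# replaced by the median of its 3x3 neighbourhood, which removes isolated
# "salt-and-pepper" noise. Border pixels are left unchanged for simplicity.
def median_filter(img):
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]          # copy; borders stay as-is
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = sorted(img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]          # median of the 9 values
    return out

noisy = [[10, 10, 10],
         [10, 255, 10],   # a single "salt" noise pixel
         [10, 10, 10]]
print(median_filter(noisy)[1][1])  # the outlier is replaced by 10
```

Unlike a low-pass (averaging) filter, the median filter removes the outlier without blurring it into its neighbours, which is why it is preferred for impulse noise.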
Radio Frequency Identification:
Radio frequency identification (RFID) systems transmit the identity of
an object wirelessly, using radio waves. The main purpose of a radio frequency identification
system is to enable the transfer of data via a portable device, commonly known as a
tag. The data sent by the tag is received and processed by a reader according to the needs of the
application. This data may contain identification or location information, or specifics about the
product that has been tagged, for example price, colour, date of
purchase, etc. This technology gained importance because of its ability to track moving objects. A typical
radio frequency tag consists of a microchip attached to a radio antenna mounted on a substrate. To
retrieve the data from the tag, a reader is needed. A typical radio frequency reader consists of two antennas
that emit radio waves and at the same time are capable of accepting the signals returned by the tag. The reader then
passes the information it has received to a computing device in digital form, and the computer then
interprets and processes this digital data. Radio frequency identification techniques are widely used in
fields such as asset tracking, supply chain management, manufacturing and payment systems. One of the
major advantages of devices using radio frequency technology over other similar devices is that RFID
devices need not be positioned precisely relative to the scanner. Still, there are certain areas of
concern for this technology: two known problems are tag collision and reader collision.
Reader collision occurs when the signals from two or more readers overlap, while tag collision
occurs when many tags respond in a small area.
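One common countermeasure to tag collision, not detailed in the text, is a framed slotted-ALOHA style anti-collision round: each tag picks a random time slot, and only slots in which exactly one tag answers are read successfully. The sketch below simulates one such round; the tag names and slot count are illustrative.

```python
import random

# Simulate one framed slotted-ALOHA anti-collision round: tags that pick
# the same slot collide and must retry in a later round; tags alone in a
# slot are read successfully.
def aloha_round(tags, num_slots, rng):
    slots = {}
    for tag in tags:
        slots.setdefault(rng.randrange(num_slots), []).append(tag)
    # a slot is readable only if exactly one tag answered in it
    return [entries[0] for entries in slots.values() if len(entries) == 1]

rng = random.Random(42)           # fixed seed so the run is repeatable
tags = [f"TAG-{i}" for i in range(8)]
read = aloha_round(tags, num_slots=16, rng=rng)
print(len(read), "of", len(tags), "tags read without collision")
```

In practice a reader repeats such rounds, silencing tags already read, until every tag in range has been inventoried.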
7. APPLICATION
The sixth sense technology finds many applications in the modern world. Sixth sense devices bridge the gap by bringing the digital world into the real world, in the process allowing
users to interact with information without the help of any machine interface. Prototypes of the
sixth sense device have demonstrated the viability, usefulness and flexibility of this new technology. According
to its developers, the extent of use of this new device is limited only by the imagination of
human beings. Some practical applications of the sixth sense technology are given below.
Viewing Map:
With the help of a map application, the user can call up any map of his/her choice
and navigate through it by projecting the map onto any surface. Using thumb and index finger
movements, the user can zoom in, zoom out or pan the selected map.
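The zoom gesture described above can plausibly be reduced to one ratio: the current thumb-to-index distance divided by the distance when the pinch began. The function below is a hedged sketch of that mapping; the fingertip coordinates are hypothetical tracked marker positions, not the actual SixthSense implementation.

```python
import math

# Pinch-to-zoom sketch: the zoom factor is the ratio of the current
# thumb-to-index fingertip distance to the distance at the start of the
# pinch. A factor > 1 means zoom in; < 1 means zoom out.
def pinch_zoom(start_thumb, start_index, thumb, index):
    d0 = math.dist(start_thumb, start_index)
    d1 = math.dist(thumb, index)
    return d1 / d0

factor = pinch_zoom((0, 0), (100, 0), (0, 0), (150, 0))
print(factor)  # fingers moved apart: zoom in by 1.5x
```

Panning works the same way but uses the displacement of the hand's centroid between frames instead of the fingertip separation.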
Taking Pictures:
Another application of sixth sense devices is the implementation of a gestural
camera. This camera takes a photo of the scene the user is looking at by detecting the framing gesture.
After taking the desired number of photos, the user can project them onto any surface and then use gestures to
sort through, organise and resize them.
Drawing Application:
The drawing application allows the user to draw on any surface by
tracking the fingertip movements of the user's index finger. The pictures drawn by the user can be
stored and projected on any other surface, and the user can shuffle through the various pictures and drawings
using hand gestures.
Making Calls:
Calls can be made with the help of a sixth sense device. The device
projects a keypad onto the user's palm, and that virtual keypad can be used to call anyone.
Interacting with physical objects:
The SixthSense system also helps the user interact with everyday physical objects in a
better way. It augments physical objects by projecting additional information about them directly onto
their surfaces. For example, the gesture of drawing a circle on the user's wrist projects an analog watch onto the user's
hand. Similarly, a newspaper can show live video news, or dynamic information can be provided on a
regular piece of paper.
Getting Information:
Sixth sense devices can be used to obtain various kinds of information relating to
everyday life simply by interacting with objects.
Product Information:
Sixth sense technology uses marker technology or image recognition
techniques to recognise an object we pick up in our hand and then provide information relating to that
product.
Book Information:
By holding a book and shuffling through its pages, the sixth sense provides
Amazon ratings for that book, reviews and other relevant information related to it.
Flight Updates:
With the help of sixth sense technology, it is no longer necessary to log into
any website to check the status of a flight. The system recognises your boarding pass and lets you
know whether the flight is on time or not.
Information About People:
With the help of face recognition techniques, sixth sense devices are able to
provide information about people when we meet them. The sensor detects the face and checks the
database for the relevant information. The system then projects relevant information about the person,
such as what they do and where they work.
8. ADVANTAGES
Portable:
One of the main advantages of the sixth sense device is its small size and
portability. It can easily be carried around without any difficulty. The prototype of the sixth sense is
designed in such a way that it gives great importance to the portability factor: all the components are light in
weight, and the smart phone easily fits into the user's pocket.
Supports multi-touch and multi-user interaction:
Multi-touch and multi-user interaction is another added feature of sixth sense
devices. Multi-sensing allows the user to interact with the system with more than one finger
at a time. Sixth sense devices also incorporate multi-user functionality, which is typically useful for large
interaction scenarios such as interactive table tops and walls.
Cost effective:
The cost incurred in constructing the sixth sense prototype is quite low, as it was
made from parts collected from common devices; a typical sixth sense device costs up to $300.
Sixth sense devices have not yet been made on a large scale for commercial purposes. Once that happens, it is
almost certain that the device will cost much less than the current price.
Connectedness between real world and digital world:
Forming a connection between the real world and the digital world is the main aim
of sixth sense technology.
Data access directly from the machines in real time:
With the help of a sixth sense device, the user can easily access data from any machine in
real time. The user does not require any separate machine-human interface to access the data. Data access
through recognition of hand gestures is much easier and more user-friendly than a text user interface
or graphical user interface, which requires a keyboard or mouse.
Mind map the idea anywhere:
With the advent of the sixth sense device, the requirement of a dedicated platform or screen to
analyse and interpret data has become obsolete. The information can be projected onto any surface and
worked with and managed at the user's convenience.
Open source software:
The software that is used to interpret and analyse the data collected by the device
is made open source. This enables other developers to contribute to the development of the system.
9. CONCLUSION
Sixth Sense recognises the objects around us, displays information automatically and lets
us access it in any way we need. The prototype implements several applications that demonstrate the
usefulness, viability and flexibility of the system, allowing us to interact with this information via natural
hand gestures. This makes it the ultimate transparent user interface for accessing information about
everything around us.