AI SPEECH




Abstract and full paper on Artificial Intelligence For Speech Recognition


    Artificial Intelligence For Speech Recognition

    (Abstract)

    DEFINITION: It is the science and engineering of making intelligent machines,

    especially intelligent computer programs.

APPLICATIONS: Game playing, speech recognition, understanding natural language, computer vision, expert systems, robotics.

    SPEECH RECOGNITION:

Artificial intelligence involves two basic ideas. First, it involves studying the thought processes of human beings. Second, it deals with representing those processes via machines (like computers, robots, etc.).

One of the main benefits of a speech recognition system is that it lets the user do other work simultaneously. The user can concentrate on observation and manual operations, and still control the machinery by voice input commands.

A number of algorithms for speech enhancement have been proposed. These include the following:

1. Spectral subtraction of DFT coefficients

2. MMSE techniques to estimate the DFT coefficients of corrupted speech


3. Spectral equalization to compensate for convolutive distortions

4. Spectral subtraction and spectral equalization.

CONCLUSION: Speaker recognition technology has many uses. It helps physically challenged but skilled persons, who can carry out their work without pushing any buttons. ASR technology is also used in military weapons systems and in research centres. Nowadays it is also used by CID officers, who use it to track criminal activities.

    INDEX

    Concepts

    1. INTRODUCTION

    2. DEFINITION

    3. HISTORY

    4. FOUNDATION

    5. SPEAKER INDEPENDENCY

    6. ENVIRONMENTAL INFLUENCE

    7. SPEAKER SPECIFIC FEATURES

    8. SPEECH RECOGNITION

    9. APPLICATIONS

    10. GOAL

    11. CONCLUSION

    12. BIBLIOGRAPHY

    Artificial Intelligence For Speech Recognition


    Introduction:

    Artificial intelligence involves two basic ideas. First, it involves

    studying the thought processes of human beings. Second, it deals with

    representing those processes via machines (like computers, robots, etc.).

    AI is behavior of a machine, which, if performed by a human being,

    would be called intelligent. It makes machines smarter and more useful,

    and is less expensive than natural intelligence.

    Natural language processing (NLP) refers to artificial intelligence

    methods of communicating with a computer in a natural language like

    English. The main objective of a NLP program is to understand input and

    initiate action.

    Definition:

    It is the science and engineering of making intelligent machines,

    especially intelligent computer programs.

AI means Artificial Intelligence. Intelligence itself is hard to define, but AI can be described as the branch of computer science dealing with the simulation of machines exhibiting intelligent behavior.

    History:

Work started soon after World War II.

The name was coined in 1956.

Several other names that were proposed include:

Complex Information Processing

Heuristic Programming

Machine Intelligence

Computational Rationality

    Foundation:

    Philosophy (428 B.C.-present)


    Mathematics (c.800-present)

    Economics (1776-present)

    Neuroscience (1861-present)

    Psychology (1879-present)

    Computer Engineering (1940-present)

    Control theory and cybernetics (1948-present)

    Linguistics (1957-present)

    Speaker independency:

    The speech quality varies from person to person. It is therefore

difficult to build an electronic system that recognizes everyone's voice. By

    limiting the system to the voice of a single person, the system becomes not

    only simpler but also more reliable. The computer must be trained to the

voice of that particular individual. Such a system is called a speaker-dependent system.

    Speaker independent systems can be used by anybody, and can

    recognize any voice, even though the characteristics vary widely from one

    speaker to another. Most of these systems are costly and complex. Also,

    these have very limited vocabularies.

    It is important to consider the environment in which the speech

    recognition system has to work. The grammar used by the speaker and

accepted by the system, noise level, noise type, position of the

microphone, and the speed and manner of the user's speech are some factors

    that may affect the quality of speech recognition.

    Environmental influence:

    Real applications demand that the performance of the recognition system

    be unaffected by changes in the environment. However, it is a fact that

    when a system is trained and tested under different conditions, the

    recognition rate drops unacceptably. We need to be concerned about the

    variability present when different microphones are used in training and

    testing, and specifically during development of procedures. Such care can


    significantly improve the accuracy of recognition systems that use desktop

    microphones.

    Acoustical distortions can degrade the accuracy of recognition systems.

    Obstacles to robustness include additive noise from machinery, competing

    talkers, reverberation from surface reflections in a room, and spectral

    shaping by microphones and the vocal tracts of individual speakers. These

sources of distortion fall into two complementary classes: additive noise

    and distortions resulting from the convolution of the speech signal with an

    unknown linear system.

    A number of algorithms for speech enhancement have been proposed.

    These include the following:

    1. Spectral subtraction of DFT coefficients

    2. MMSE techniques to estimate the DFT coefficients of corrupted

    speech

3. Spectral equalization to compensate for convolutive distortions

    4. Spectral subtraction and spectral equalization.

    Although relatively successful, all these methods depend on the assumption

    of independence of the spectral estimates across frequencies. Improved

performance can be obtained with an MMSE estimator in which correlation

    among frequencies is modeled explicitly.
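To make the first of these methods concrete, a minimal spectral-subtraction sketch in Python/NumPy is given below. It assumes the leading frames of the recording contain only noise, and the frame length and noise-frame count are illustrative choices rather than values prescribed here.

    # Minimal sketch of spectral subtraction (method 1 above).
    # Assumes the first few frames of the signal contain only noise;
    # frame_len and noise_frames are illustrative choices.
    import numpy as np

    def spectral_subtraction(signal, frame_len=256, noise_frames=5):
        """Suppress stationary additive noise by subtracting an estimated
        noise magnitude spectrum from each frame's DFT magnitude."""
        num_frames = len(signal) // frame_len
        frames = signal[:num_frames * frame_len].reshape(num_frames, frame_len)
        spectra = np.fft.rfft(frames, axis=1)

        # Estimate the noise magnitude spectrum from the leading frames.
        noise_mag = np.abs(spectra[:noise_frames]).mean(axis=0)

        # Subtract the noise magnitude, clip negative values to zero,
        # and reuse the original phase.
        clean_mag = np.maximum(np.abs(spectra) - noise_mag, 0.0)
        clean_spectra = clean_mag * np.exp(1j * np.angle(spectra))

        return np.fft.irfft(clean_spectra, n=frame_len, axis=1).reshape(-1)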

    Speaker-specific features:

    Speaker identity correlates with the physiological and behavioral

    characteristics of the speaker. These characteristics exist both in the vocal

    tract characteristics and in the voice source characteristics, as also in the

    dynamic features spanning several segments.

The most common short-term spectral measurements currently used are the cepstral coefficients derived from Linear Predictive Coding (LPC) and their regression coefficients. A spectral envelope reconstructed from a truncated set of cepstral coefficients is much smoother than one reconstructed directly from the LPC coefficients.


    Therefore, it provides a more stable representation from one repetition to

another of a particular speaker's utterances.

    As for the regression coefficients, typically the first and second order

    coefficients are extracted at every frame period to represent the spectral

    dynamics.

These coefficients are derivatives of the time functions of the cepstral coefficients and are called the delta- and delta-delta-cepstral coefficients, respectively.
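As an illustration, the sketch below computes delta and delta-delta coefficients as simple frame-to-frame differences of a (frames x coefficients) feature matrix; the central-difference formula is a simplification of the regression typically fitted over several neighbouring frames.

    # Minimal sketch: delta and delta-delta coefficients from per-frame
    # cepstral features, using a central difference as a stand-in for the
    # regression fit used in practice.
    import numpy as np

    def delta(features):
        """First-order time derivative of per-frame feature coefficients."""
        d = np.zeros_like(features)
        d[1:-1] = (features[2:] - features[:-2]) / 2.0   # central difference
        d[0], d[-1] = d[1], d[-2]                        # pad the edge frames
        return d

    def delta_delta(features):
        """Second-order (acceleration) coefficients."""
        return delta(delta(features))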

    Speech Recognition:

    The user communicates with the application through the appropriate input

device, i.e., a microphone. The Recognizer converts the analog signal into a digital signal for speech processing. A stream of text is generated after

    the processing. This source-language text becomes input to the Translation

    Engine, which converts it to the target language text.
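A rough sketch of this pipeline is shown below. It uses the third-party Python package speech_recognition for the microphone capture and recognition steps (an assumption, not a component specified here), and translate_to_target() is a hypothetical stand-in for the Translation Engine.

    # Illustrative microphone -> recognizer -> translation pipeline using the
    # third-party speech_recognition package (pip install SpeechRecognition).
    # translate_to_target() is a hypothetical placeholder for the Translation Engine.
    import speech_recognition as sr

    def translate_to_target(text, target_language="hi"):
        # A real system would call a machine translation engine here.
        return f"[{target_language}] {text}"

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:                  # analog speech input
        recognizer.adjust_for_ambient_noise(source)  # compensate for background noise
        audio = recognizer.listen(source)            # digitized audio signal

    source_text = recognizer.recognize_google(audio)   # speech -> source-language text
    print(translate_to_target(source_text))            # source text -> target-language text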

    Salient Features:

    Input Modes

    Through Speech Engine

    Through soft copy

    Interactive Graphical User Interface

    Format Retention

Fast and standard translation

Interactive preprocessing tool

    Spell checker.

    Phrase marker

    Proper noun, date and other package specific identifier

    Input Format

.txt, .doc, .rtf


    User friendly selection of multiple output

    Online thesaurus for selection of contextually appropriate

    synonym

    Online word addition, grammar creation and updating facility

    Personal account creation and inbox management

    Applications:

One of the main benefits of a speech recognition system is that it lets

the user do other work simultaneously. The user can concentrate on

    observation and manual operations, and still control the machinery by voice

    input commands.

    Another major application of speech processing is in military

    operations. Voice control of weapons is an example. With reliable speech

    recognition equipment, pilots can give commands and information to the

computers by simply speaking into their microphones - they don't have to

    use their hands for this purpose.

    Another good example is a radiologist scanning hundreds of X-rays,

    ultra sonograms, CT scans and simultaneously dictating conclusions to a

    speech recognition system connected to word processors. The radiologist

    can focus his attention on the images rather than writing the text.

    Voice recognition could also be used on computers for making airline

and hotel reservations. A user simply needs to state his request: make a reservation, cancel a reservation, or make enquiries about the schedule.

    Ultimate Goal:

The ultimate goal of Artificial Intelligence is to build a person, or,

    more humbly, an animal.

    Conclusion:


Speaker recognition technology has many uses. It helps physically challenged but skilled persons, who can carry out their work without pushing any buttons. ASR technology is also used in military weapons systems and in research centres. Nowadays it is also used by CID officers, who use it to track criminal activities.

    Bibliography:

    www.google.co.in/Artificial intelligence for speech recognition

    www.google.com

www.howstuffworks.com

www.ieeexplore.ieee.org

    Seminar Report On SIXTH SENSE TECHNOLOGY


1. INTRODUCTION

Although miniaturized versions of computers help us to connect to the digital world even while we are

travelling, there isn't as of now any device which gives a direct link between the digital world and our

physical interaction with the real world. Usually the information is stored traditionally on paper or a

    digital storage device. Sixth sense technology helps to bridge this gap between tangible and non-tangible

    world. Sixth Sense device is basically a wearable gestural interface that connects the physical world around

us with digital information and lets us use natural hand gestures to interact with this information. The sixth

    sense technology was developed by Pranav Mistry, a PhD student in the Fluid Interfaces Group at the MIT

    Media Lab. The sixth sense technology has a Web 4.0 view of human and machine interactions. Sixth Sense

    integrates digital information into the physical world and its objects, making the entire world your

    computer. It can turn any surface into a touch-screen for computing, controlled by simple hand gestures. It

    is not a technology which is aimed at changing human habits but causing computers and other machines to

    adapt to human needs. It also supports multi user and multi touch provisions. Sixth Sense device is a mini-

projector coupled with a camera and a cell phone, which acts as the computer and your connection to the

    Cloud, all the information stored on the web. The current prototype costs around $350. The Sixth Sense

    prototype is used to implement several applications that have shown the usefulness, viability and flexibility

    of the system.

    2. DEFINITION

'Sixth Sense' is a wearable gestural interface that augments the physical world around

    us with digital information and lets us use natural hand gestures to interact with that information. By using

    a camera and a tiny projector mounted in a pendant like wearable device, 'Sixth Sense' sees what you see

and visually augments any surfaces or objects we are interacting with. It projects information onto surfaces, walls, and physical objects around us, and lets us interact with the projected information through natural

    hand gestures, arm movements, or our interaction with the object itself. 'Sixth Sense' attempts to free

    information from its confines by seamlessly integrating it with reality, and thus making the entire world

your computer. All of us are aware of the five basic senses: seeing, feeling, smelling, tasting and hearing.

    But there is also another sense called the sixth sense. It is basically a connection to something greater than

what our physical senses are able to perceive. To a layman, it would be something supernatural. Some

might just consider it to be a superstition or something psychological. But the invention of sixth sense

technology has completely shocked the world. Although it is not widely known as of now, the time is not

    far when this technology will change our perception of the world.

    The Sixth Sense prototype is comprised of a pocket projector, a mirror and a camera.

    The hardware components are coupled in a pendant-like mobile wearable device. Both the projector and

the camera are connected to the mobile computing device in the user's pocket. The device projects visual

    information, enabling surfaces, walls and physical objects around the wearer to be used as interfaces; while

    the camera recognizes and tracks the user's hand gestures and physical objects using computer-vision

    based techniques. The software program processes the video stream data captured by the camera and


tracks the locations of the colored markers at the tips of the user's fingers using simple computer-vision

    techniques. The movements and arrangements of these fiducials are interpreted into gestures that act as

    interaction instructions for the projected application interfaces. The maximum number of tracked fingers is

    only constrained by the number of unique fiducials, thus Sixth Sense also supports multi-touch and multi-

    user interaction.

    The Sixth Sense prototype implements several applications that demonstrate the

    usefulness, viability and flexibility of the system. The map application lets the user navigate a map

    displayed on a nearby surface using hand gestures, similar to gestures supported by multi-touch based

    systems, letting the user zoom in, zoom out or pan using intuitive hand movements. The drawing

application lets the user draw on any surface by tracking the fingertip movements of the user's index finger.

Sixth Sense also recognizes the user's freehand gestures (postures). For example, it implements a gestural

    camera that takes photos of the scene the user is looking at by detecting the framing gesture. The user can

    stop by any surface or wall and flick through the photos he/she has taken. Sixth Sense also lets the user

    draw icons or symbols in the air using the movement of the index finger and recognizes those symbols as

interaction instructions. For example, drawing a magnifying glass symbol takes the user to the map application, while drawing an @ symbol lets the user check his mail. The Sixth Sense system also augments

physical objects the user is interacting with by projecting more information about these objects directly

    on them. For example, a newspaper can show live video news or dynamic information can be provided on a

regular piece of paper. The gesture of drawing a circle on the user's wrist projects an analog watch. The

current prototype system costs approximately $350 to build.

The device sees what we see, but it brings out the information that we want to know while

    viewing the object. It can project information on any surface, be it a wall, table or any other object and uses

    hand / arm movements to help us interact with the projected information. The device brings us closer to

    reality and assists us in making right decisions by providing the relevant information, thereby, making the

    entire world a computer.

    The world has shrunk. Distances have dissolved. Communication lines and

    interaction with countless systems have been rendered feasible. However this technological overhaul has

    been peripheral and not so much related to the human body; researchers and innovators have constantly

    grappled with the issue of bridging the gaps which limit the human-environment contact. This device,

tentatively named the Sixth Sense, is a wearable machine that assists unexplored interactions between the

    real and the virtual sphere of data. It consists of certain commonly available components, which are

    intrinsic to its functioning. These include a camera, a portable battery-powered projection system coupled

    with a mirror and a cell phone. All these components communicate to the cell phone, which acts as the

    communication and computation device. The entire hardware apparatus is encompassed in a pendant-

    shaped mobile wearable device. Basically the camera recognizes individuals, images, pictures, gestures one

    makes with their hands and the projector assists in projecting any information on whatever type of surface

    is present in front of the person. The usage of the mirror is significant as the projector dangles pointing

    downwards from the neck. To bring out variations on a much higher plane, in the demo video which was

    broadcasted to showcase the prototype to the world, Mistry uses colored caps on his fingers so that it

    becomes simpler for the software to differentiate between the fingers, demanding various applications. The

    software program analyses the video data caught by the camera and also tracks down the locations of the


colored markers by utilizing simple computer vision techniques. One can have any number of hand gestures

    and movements as long as they are all reasonably identified and differentiated for the system to interpret it,

preferably through unique and varied gestures. This is possible only because the Sixth Sense device supports multi-

    touch and multi-user interaction. There was once a clear divide between the virtual world and the real

    world, but that line is getting blurrier every day.

    3. GESTURES

    The software recognizes three kinds of gestures:

    Multi-Touch Gestures:

Like the ones we see on the iPhone, where we touch the screen and make

    the map move by pinching and dragging.

    Freehand Gestures:

Like when you make a framing gesture to take a picture, or a Namaste gesture to start the projection on the wall.

    ICONIC Gestures:

Drawing an icon in the air. For example, whenever we draw a star, it shows us the weather details.

When we draw a magnifying glass, it shows us the map.
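As a toy illustration of how a multi-touch style pinch gesture could be interpreted from two tracked fingertip positions, consider the sketch below; the distance threshold and example coordinates are illustrative assumptions.

    # Toy interpretation of a pinch gesture from tracked thumb and index
    # fingertip positions in two consecutive frames. The threshold is arbitrary.
    import math

    def interpret_pinch(prev_thumb, prev_index, thumb, index, threshold=10.0):
        """Compare fingertip separation between frames to decide zoom direction."""
        def dist(a, b):
            return math.hypot(a[0] - b[0], a[1] - b[1])

        change = dist(thumb, index) - dist(prev_thumb, prev_index)
        if change > threshold:
            return "zoom in"       # fingers moving apart
        if change < -threshold:
            return "zoom out"      # fingers moving together
        return "no change"

    print(interpret_pinch((100, 100), (140, 100), (90, 100), (160, 100)))  # zoom in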

    4. COMPONENTS

The devices which are used in sixth sense technology are:

    1. Camera

    2. Coloured Marker

    3. Mobile component

    4. Projector

    5. Mirror

    Camera:

It captures the image of the object in view and tracks the user's hand gestures. The camera recognizes

    individuals, images, pictures, gestures that user makes with his hand. The camera then sends this data to a

    smart phone for processing. Basically the camera forms a digital eye which connects to the world of digital

    information.

Coloured Marker:

There are colour markers placed at the tips of the user's fingers. Marking the user's fingers with red, yellow, green

    and blue coloured tape helps the webcam to recognize the hand gestures. The movements and arrangement

of these markers are interpreted into gestures that act as interaction instructions for the projected

    application interfaces.

    Mobile Component:


The SixthSense device consists of a web-enabled smart phone which processes the data sent by the camera.

The smart phone searches the web and interprets the hand gestures with the help of the coloured markers placed

    at the finger tips.

    Projector:

The information that is interpreted through the smart phone can be projected onto any surface. The

    projector projects the visual information enabling surfaces and physical objects to be used as interfaces.

The projector itself contains a battery which has 3 hours of battery life.

A tiny LED projector displays the data sent from the smart phone on any surface in view - an object, wall or

person. The downward-facing projector projects the image onto a mirror.

    Mirror:

    The usage of a mirror is important as the projector dangles pointing downwards from the neck. The mirror

reflects the image onto the desired surface. Thus finally the digital image is freed from its confines and placed

    in the physical world.

    5. HOW IT WORKS

The Sixth Sense prototype is comprised of a pocket projector, a mirror and a camera.

The hardware components are coupled in a pendant-like mobile wearable device. Both the projector and the

camera are connected to the mobile computing device in the user's pocket. The projector projects visual

    information enabling surfaces, walls and physical objects around us to be used as interfaces; while the

    camera recognizes and tracks user's hand gestures and physical objects using computer-vision based

    techniques. The software program processes the video stream data captured by the camera and tracks the

locations of the colored markers at the tips of the user's fingers using simple computer-vision techniques.

The movements and arrangements of these fiducials are interpreted into gestures that act as

interaction instructions for the projected application interfaces. The maximum number of tracked fingers is

    only constrained by the number of unique fiducials, thus Sixth Sense also supports multi-touch and multi-

    user interaction.

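A minimal sketch of the marker-tracking step described above is given below. It assumes OpenCV (version 4) and a standard webcam, neither of which is prescribed by the report; the HSV range is an illustrative guess for a single red marker and would need tuning for real lighting, whereas SixthSense itself tracks four such markers.

    # Minimal sketch: track one coloured fingertip marker in a webcam stream
    # by HSV colour thresholding (OpenCV 4 assumed). The colour range is an
    # illustrative guess for a red marker.
    import cv2
    import numpy as np

    cap = cv2.VideoCapture(0)                                         # webcam stream
    lower, upper = np.array([0, 120, 120]), np.array([10, 255, 255])  # red-ish HSV range

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, lower, upper)         # keep only marker-coloured pixels
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        if contours:
            c = max(contours, key=cv2.contourArea)    # largest blob = the marker
            m = cv2.moments(c)
            if m["m00"] > 0:
                x, y = int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])
                cv2.circle(frame, (x, y), 8, (0, 255, 0), 2)  # mark the tracked fingertip
        cv2.imshow("marker", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    cap.release()
    cv2.destroyAllWindows()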

    The technology in itself is nothing more than the combination of some stunning

    technologies. The technology is mainly based on hand gesture recognition, image capturing, processing, and

    manipulation, etc. The software of the technology uses the video stream, which is captured by the camera,

    and also tracks the location of the tips of the fingers to recognize the gestures. This process is done using

    some techniques of computer vision. However, instead of requiring you to be in front of a big screen like

Tom Cruise, Sixth Sense can do its magic - and a lot more - everywhere. The camera recognizes objects

    around you instantly, with the micro-projector overlaying the information on any surface, including the


    object itself or your hand. Then, you can access or manipulate the information using your fingers. The key

    here is that Sixth Sense recognizes the objects around you, displaying information automatically and letting

    you access it in any way you want, in the simplest way possible. Clearly, this has the potential of becoming

    the ultimate "transparent" user interface for accessing information about everything around us. If they can

    get rid of the colored finger caps and it ever goes beyond the initial development phase, that is. But as it is

    now, it may change the way we interact with the real world and truly give everyone complete awareness of

    the environment around us.

    The Sixth Sense technology works as follows:

1. It captures the image of the object in view and tracks the user's hand gestures.

2. There are colour markers placed at the tips of the user's fingers. Marking the user's fingers with red, yellow,

    green and blue coloured tape helps the webcam to recognize the hand gestures. The movements and

arrangement of these markers are interpreted into gestures that act as interaction instructions for the

    projected application interfaces.

3. The smart phone searches the web and interprets the hand gestures with the help of the coloured markers

placed at the fingertips.

4. The information that is interpreted through the smart phone can be projected onto any surface.

5. The mirror reflects the image onto the desired surface.

    6. TECHNOLOGIES RELATED TO SIXTH SENSE TECHNOLOGY

    Augmented Reality:

Augmented Reality is a visualization technology that allows the user to

experience virtual content added over the real world in real time. With the help of advanced AR

    technology the information about the surrounding real world of the user becomes interactive and digitally

usable. Artificial information about the environment and the objects in it can be stored and retrieved as an information layer on top of the real world view. When we compare the spectrum between virtual reality,

    which creates immersive, computer-generated environments, and the real world, augmented reality is

    closer to the real world. Augmented reality adds graphics, sounds, haptic feedback and smell to the natural

    world as it exists. Both video games and cell phones are driving the development of augmented reality. The

augmented systems will also superimpose graphics for every perspective available and try to adjust to every

    movement of the users head and eyes. The three basic components of an augmented reality system are the

    head-mounted display, tracking system and mobile computer for the hardware. The main goal of this new

    technology is to merge these three components into a highly portable unit much like a combination of a

high tech Walkman and an ordinary pair of eyeglasses. The head-mounted display used in augmented

    reality systems will enable the user to view superimposed graphics and text created by the system. Another

    component of an augmented reality system is its tracking and orientation system. This system pinpoints the

user's location in reference to his surroundings and additionally tracks the user's eye and head movements.

Augmented reality systems will need highly mobile computers. As of now, few computers are available

that can satisfy this requirement.

    Gesture Recognition:


    It is a technology which is aimed at interpreting human gestures with the

    help of mathematical algorithms. Gesture recognition technique basically focuses on the emotion

recognition from the face and hand gesture recognition. Gesture recognition enables humans to

    interact with computers in a more direct way without using any external interfacing devices. It can provide

a much better alternative to text user interfaces and graphical user interfaces, which require a

keyboard or mouse to interact with the computer. Interfaces which depend solely on gestures require

precise hand pose tracking. In the early versions of the gesture recognition process, special hand gloves were used

which provide information about hand position, orientation and flex of the fingers. In the SixthSense

devices, coloured bands are used for this purpose. Once the hand pose has been captured, the gestures can be

    recognised using different techniques. Neural network approaches or statistical templates are the

commonly used techniques for recognition purposes. These techniques have a high accuracy, usually

showing an accuracy of more than 95%. Time-dependent neural networks can also be used for real-time

    recognition of the gestures.
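The sketch below shows one toy version of the statistical-template idea: a recorded gesture trajectory is matched to the nearest stored template by average point-to-point distance. The template shapes, point count and example trajectory are illustrative assumptions; a real system would also resample and normalize trajectories.

    # Toy statistical-template gesture classifier: match a 16-point trajectory
    # to the closest stored template by mean point-to-point distance.
    import numpy as np

    templates = {
        "circle": np.array([[np.cos(t), np.sin(t)] for t in np.linspace(0, 2 * np.pi, 16)]),
        "swipe":  np.array([[t, 0.0] for t in np.linspace(-1, 1, 16)]),
    }

    def classify(trajectory):
        """Return the name of the template whose points are closest on average."""
        traj = np.asarray(trajectory, dtype=float)
        scores = {name: np.linalg.norm(traj - tpl, axis=1).mean()
                  for name, tpl in templates.items()}
        return min(scores, key=scores.get)

    print(classify([[t, 0.05] for t in np.linspace(-1, 1, 16)]))  # -> "swipe"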

Computer Vision:

Computer vision is the technology in which machines are able to

    interpret/extract necessary information from an image. Computer vision technology includes various fields

    like image processing, image analysis and machine vision. It includes certain aspect of artificial intelligence

    techniques like pattern recognition. The machines which implement computer vision techniques require

    image sensors which detect electromagnetic radiation which are usually in the form of ultraviolet rays or

light rays. Computer vision finds itself applicable in various fields of interest. One such field is biomedical

image processing. It is also used in autonomous vehicles like SUVs. The computer vision technique basically

    includes four processes.

1. Recognition: One of the main tasks of the computer vision technique is to determine whether the particular

object contains useful data or not.

    2. Motion Analysis: Motion analysis includes several tasks related to estimation of motion where an image

    sequence is processed continuously to detect the velocity at each point of the image or in the 3D scene.

    3. Scene Reconstruction: Computer vision technique employs several methods to recreate a 3D image from

    the available images of a scene.

4. Image Restoration: The main aim of this step is to remove noise from a given image. The simplest

    method involves using simple filters like low pass or median filters. In order to get better quality images

    more complex methods like Inpainting are used.
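As a small illustration of the image-restoration step, the sketch below applies a median filter with OpenCV; the file name and the 5x5 kernel size are illustrative assumptions, and heavier damage would call for the more complex methods such as inpainting mentioned above.

    # Minimal sketch of image restoration with a median filter (OpenCV).
    # The input file name and the 5x5 kernel size are illustrative choices.
    import cv2

    noisy = cv2.imread("noisy_scene.png")      # hypothetical noisy input image
    if noisy is None:
        raise SystemExit("could not read noisy_scene.png")

    restored = cv2.medianBlur(noisy, 5)        # removes salt-and-pepper style noise
    cv2.imwrite("restored_scene.png", restored)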

    Radio Frequency Identification:

    Radio frequency identification systems transmit the identity of

an object wirelessly, using radio waves. The main purpose of a radio frequency identification

system is to enable the transfer of data via a portable device. The portable device is commonly known as

a tag. The data sent by the tag is received and processed by a reader according to the needs of the

application. The data sent by the tag contains various information: the identification or location of the

tagged object, or specifics about the product that has been tagged, for example price, colour, date of


purchase, etc. This technology gained importance because of its ability to track moving objects. A typical

    radio frequency tag consists of a microchip attached to a radio antenna which is mounted on a substrate. To

    retrieve the data from the tag a reader is needed. A typical radio frequency reader consists of two antennas

that emit radio waves and at the same time are capable of accepting the signals from the tag. The reader then

    passes the information that it has received to a computer device in digital form. The computer then

    interprets this digital data and processes it. Radio frequency identification techniques are widely used in

    the fields like asset tracking, supply chain management, manufacturing, payment systems etc. One of the

    major advantages of devices using radio frequency technology over other similar devices is that RFID

devices need not be positioned precisely relative to the scanner. But there are still certain areas of

concern for this technology. Some problems related to this technology are tag collision and reader collision.

    Usually the reader collision occurs when the signals from two or more readers overlap, while tag collision

    occurs when many tags are present in a small area.

    7. APPLICATION

The sixth sense technology finds a lot of applications in the modern world. The sixth sense devices bridge the gap by bringing the digital world into the real world and in that process allowing

    the users to interact with the information without the help of any machine interfaces. Prototypes of the

    sixth sense device have demonstrated viability, usefulness and flexibility of this new technology. According

to the words of its developers, the extent of use of this new device is only limited by the imagination of

human beings. Some practical applications of the sixth sense technology are given below:

    Viewing Map:

    With the help of a map application the user can call upon any map of his/her choice

and navigate through it by projecting the map onto any surface. By using thumb and index finger

    movements the user can zoom in, zoom out or pan the selected map.

    Taking Pictures:

    Another application of sixth sense devices is the implementation of a gestural

camera. This camera takes a photo of the location the user is looking at by detecting the framing gesture.

    After taking the desired number of photos we can project them onto any surfaces and then use gestures to

    sort through those photos and organize and resize them.

    Drawing Application:

The drawing application allows the user to draw on any surface by

tracking the fingertip movements of the user's index finger. The pictures that are drawn by the user can be

stored and replaced on any other surface. The user can also shuffle through various pictures and drawings by

using hand gesture movements.

    Making Calls:

We can make calls with the help of the sixth sense device. The sixth sense device is used

to project the keypad onto your palm and then, using that virtual keypad, we can make calls to anyone.


    Interacting with physical objects:

    The SixthSense system also helps to interact with physical objects we use in a

better way. It augments physical objects by projecting more information about these objects directly on

them. For example, a gesture of drawing a circle on the user's wrist projects an analog watch on the user's

    hand. Similarly a newspaper can show live video news or dynamic information can be provided on a

    regular piece of paper

    Getting Information:

    Sixth sense devices can be used for getting various information relating to our

everyday life by coming in contact with objects.

    Product Information:

    Sixth sense technology uses marker technology or image recognition

    techniques to recognize the objects we pick in our hand and then provide information relating to that

    product.

    Book Information:

    By holding and shuffling through the book pages, the sixth sense provides

    Amazon ratings on that book, other reviews and other relevant things related to the book.

    Flight Updates:

    With the help of the sixth sense technology it is no longer required to log into

    any sites for checking the status of the flights. The system will recognize your boarding pass and let you

    know whether the flight is on time or not.

    Information About People:

    With help of face recognition techniques the sixth sense devices are able to

    provide information about the people when we meet them. The sensor detects the face and checks the data

    base for the relevant information. The system will then project the relevant information about a person like

what they do, where they work, and so on.

    8. ADVANTAGES

    Portable:

    One of the main advantages of the sixth sense devices is its small size and

    portability. It can be easily carried around without any difficulty. The prototype of the sixth sense is

designed in such a way that it gives more importance to the portability factor. All the devices are light in

weight and the smart phone can easily fit into the user's pocket.

    Support multi touch and multi user interaction:

    Multi touch and multi user interaction is another added feature of the sixth sense

devices. The multi-sensing technique allows the user to interact with the system with more than one finger


at a time. Sixth sense devices also incorporate multi-user functionality. This is typically useful for large

    interaction scenarios such as interactive table tops and walls.

    Cost effective:

The cost incurred for the construction of the sixth sense prototype is quite low. It was

made from parts collected together from common devices. A typical sixth sense device costs up to $300.

The sixth sense devices have not yet been made on a large scale for commercial purposes. Once that happens, it is

almost certain that the device will cost much less than the current price.

    Connectedness between real world and digital world:

    Forming a connection between the real world and the digital world was the main aim

    of the sixth sense technology.

    Data access directly from the machines in real time:

With the help of a sixth sense device, the user can easily access data from any machine in

real time. The user doesn't require any machine-human interface to access the data. Data access

through recognition of hand gestures is easier and more user friendly compared to a text user interface

or graphical user interface which requires a keyboard or mouse.

    Mind map the idea anywhere:

    With the advent of the sixth sense device, requirement of a platform or a screen to

analyze and interpret the data has become obsolete. We can project the information onto any surface and

can work with and manage the data at our convenience.

    Open source software:

The software that is used to interpret and analyse the data collected by the device is made open source. This enables other developers to contribute to the development of the system.

    9. CONCLUSION

    Sixth Sense recognizes the objects around us, displaying information automatically and letting

us access it in any way we need. This prototype implements several applications that demonstrate the

    usefulness, viability and flexibility of the system, allowing us to interact with this information via natural

hand gestures. This could become the ultimate "transparent" user interface for accessing information about

    everything around us.