Embedding Intelligence in Everyday Objects with TJBot

IBM Research

Victor Dibia

Embedding Intelligence in Everyday Objects with TJBot. An open source DIY project powered by Watson Cognitive Services.

Human – Agent Collaboration Lab, IBM Research
dibiavc@us.ibm.com | @vykthur | github.com/victordibia

Feb 20, 2017

TJBot: What and Why?

- Open source DIY project to get you engaged with Watson Services

What is TJBot?

- A cardboard robot
- Simple, approachable
- Open source (design, code)
- Cognitive (IBM Watson services)
- Extensible (prototyping platform)

Components: Raspberry Pi, LED, Camera, Microphone, Speaker, Servo.

[Figure: TJBot body options, 3D printed and laser cut]

ibm.biz/mytjbot

Recipes

Step-by-step instructions plus code (Node.js) to help you prototype capabilities for TJBot powered by Watson services.

http://www.instructables.com/member/TJBot/

Project Goals

How can we make it easier to engage a community of enthusiasts experimenting with embodied cognition, the idea of embedding intelligence in everyday objects within the physical world?

Project Goals

Design principle: Approachable Design

- Use of familiar material (cardboard) that can be altered with ease.

- Simplified part assembly: no soldering or adhesive required.

- Simplified programming model and language interface (JavaScript).

Project Outcome

A prototyping platform to help democratize embodied cognition.

Target communities:

- Makers
- Developers
- Students (education and learning)

How Does Watson Enable TJBot?

Listen: Watson Speech to Text service converts spoken speech to text that can be analyzed.

Speak: Watson Text to Speech service converts text to sound using various voices.

Understand Emotions: Watson Tone Analyzer service can infer the emotion within text, e.g. it can tell whether a message conveys emotions such as happiness, sadness, or anger.

Understand Conversations: Watson Conversation service can respond to users in a way that simulates a conversation between humans.

See: Watson Visual Recognition service can understand the content of an image and describe it.

TJBot Sensors: LED, Speakers, Camera, Servo Motor Arm, Microphone

Example Capabilities: Listen, Speak, Shine, Show emotion, Wave, See

Example Watson Services: Speech to Text, Tone Analyzer, Visual Recognition, Conversation, Text to Speech

Example Use Cases: Sentiment Analysis, Virtual Agents (eldercare, home care), Education (language learning)

Demo

- Watson Services
  - Speech to Text
  - Text to Speech
  - Conversation
  - Visual Recognition

Overview of Watson Services

IBM Watson Cognitive Services

Take your first step into the cognitive era with our variety of smart services.

- Natural interaction
- Semi-structured data processing
- Trained and continuously improved via machine learning and deep learning
- RESTful API services with SDKs for Node.js, Java, and Python

Language

Speech

Vision

Data Insights

Speech to Text

Converts audio voice into written text.

• Transcription
• Voice-controlled applications: allows for custom models

https://speech-to-text-demo.mybluemix.net/

Text to Speech

Converts written text into natural-sounding audio in a variety of languages and voices.

• Customize and control the pronunciation of specific words to deliver a seamless voice interaction that caters to your audience.
• Interactive voice-based applications.

https://text-to-speech-demo.mybluemix.net/

Tone Analyzer

Uses linguistic analysis to detect three types of tones in written text: emotions, social tendencies, and writing style.

• Understand emotional context in conversations or communications
• Tailor interaction based on sentiment.

https://tone-analyzer-demo.mybluemix.net/
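
One way to tailor interaction to detected emotion on TJBot is to map the strongest emotion to an LED color. The sketch below assumes the tone scores have already been retrieved and flattened into a plain object of emotion names to scores between 0 and 1. That input shape, the color choices, and the `colorForTones` helper are illustrative assumptions, not the Tone Analyzer response format.

```javascript
// Hypothetical mapping from emotion name to LED color (illustrative choices).
const EMOTION_COLORS = new Map([
  ['joy', 'yellow'],
  ['sadness', 'blue'],
  ['anger', 'red'],
  ['fear', 'purple'],
  ['disgust', 'green'],
]);

// Pick the highest-scoring known emotion and return its color.
// `tones` is an assumed, simplified shape: { emotionName: score in [0, 1] }.
// Emotions scoring below `threshold` are ignored; `fallback` is returned
// when nothing qualifies.
function colorForTones(tones, threshold = 0.5, fallback = 'white') {
  let best = null;
  for (const [emotion, score] of Object.entries(tones)) {
    if (score < threshold || !EMOTION_COLORS.has(emotion)) continue;
    if (best === null || score > best.score) best = { emotion, score };
  }
  return best ? EMOTION_COLORS.get(best.emotion) : fallback;
}

console.log(colorForTones({ joy: 0.1, anger: 0.8, sadness: 0.6 })); // red
console.log(colorForTones({ joy: 0.2, sadness: 0.3 }));             // white
```

The threshold keeps the LED from changing color on weak or ambiguous signals; the value 0.5 is arbitrary and would need tuning against real scores.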

Visual Recognition

Understands the contents of images: tags an image with visual concepts, finds human faces, approximates age and gender, and finds similar images in a collection.

• Train the service by creating your own custom concepts.

Use Visual Recognition to detect a dress type in retail, identify spoiled fruit in inventory, and more.

https://visual-recognition-demo.mybluemix.net/

AlchemyLanguage

Analyzes text to help you understand its concepts, entities, keywords, sentiment, and more.

• Additionally, you can create a custom model for some APIs to get specific results that are tailored to your domain.

https://alchemy-language-demo.mybluemix.net/

Conversation

Quickly build, test, and deploy a bot or virtual agent across mobile devices, messaging platforms like Slack, or even on a physical robot.

• Visual dialog builder to help you create natural conversations between your apps and users, without any coding experience required.

https://conversation-demo.mybluemix.net/

Programming TJBot

- Getting started
- Tying stuff together

Steps

- Wi-Fi setup
- Raspberry Pi software update
- Hardware setup: LED, servo, microphone, etc.
- Credential setup: Bluemix
- Recipe software setup: clone GitHub repo
- Ready to go.

Instructions: http://www.instructables.com/member/TJBot/

Libraries Used

Depends on several npm packages.

- RGB LED – ws281x library
- Servo – pigpio software PWM library
- Microphone – mic library
- Speaker – aplay library
- Camera – raspistill wrapper

Code Walk through: Control LED on TJBot using voice.

- http://www.instructables.com/id/Use-Your-Voice-to-Control-a-Light-With-Watson/

- Code Walk through
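
The core of a voice-controlled light is turning a transcript into an LED command. The following is a minimal sketch of that step only, not the recipe's actual code: the `parseLightCommand` helper, the list of colors, and the requirement that the utterance contain the word "light" are all assumptions made for illustration.

```javascript
// Colors the sketch recognizes; an assumed list, not taken from the recipe.
const KNOWN_COLORS = ['red', 'green', 'blue', 'yellow', 'purple', 'orange', 'white'];

// Turn a speech transcript into an LED command, or null if none is found.
function parseLightCommand(transcript) {
  // Lowercase and split on anything that is not a letter.
  const words = transcript.toLowerCase().split(/[^a-z]+/).filter(Boolean);
  if (!words.includes('light')) return null; // require the word "light"
  if (words.includes('off')) return { action: 'off' };
  const color = words.find((w) => KNOWN_COLORS.includes(w));
  return color ? { action: 'shine', color } : null;
}

console.log(parseLightCommand('Turn the light blue'));  // { action: 'shine', color: 'blue' }
console.log(parseLightCommand('Turn the light off.'));  // { action: 'off' }
console.log(parseLightCommand('What time is it?'));     // null
```

In a real recipe the transcript would come from the Speech to Text service and the resulting command would drive the LED library.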

TJBot Library [Beta]

- Experimental work to encapsulate basic functions of the bot.
- https://github.com/ibmtjbot/tjbotlib

The TJBot Library

Encapsulates basic functions for TJBot such as listening, speaking, LED color change, waving, and seeing.

The TJBot Library

tj.listen(transcriptCallback)
tj.speak("text")
tj.converse()
tj.see()
tj.shine("red")
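
To show how these calls might fit together, here is a sketch that wires `listen`, `shine`, and `speak` into a voice-controlled light. It does not use the real library: `createFakeBot` is a hypothetical stand-in so the example runs without hardware or credentials, only the three method names are taken from the list above, and the `simulateSpeech` helper and all method bodies are assumptions.

```javascript
// Stand-in for the bot so the sketch runs without hardware or credentials.
// Only the method names (listen, speak, shine) come from the slide; the
// bodies and the `simulateSpeech` helper are hypothetical.
function createFakeBot() {
  const log = [];
  let onTranscript = null;
  return {
    log,
    listen(callback) { onTranscript = callback; },   // remember the handler
    speak(text) { log.push(`speak: ${text}`); },     // record instead of playing audio
    shine(color) { log.push(`shine: ${color}`); },   // record instead of driving the LED
    simulateSpeech(text) { if (onTranscript) onTranscript(text); },
  };
}

const tj = createFakeBot();

// Voice-controlled LED: react to each transcript as it arrives.
tj.listen((transcript) => {
  const match = transcript.toLowerCase().match(/\b(red|green|blue)\b/);
  if (match) {
    tj.shine(match[1]);
    tj.speak(`Turning the light ${match[1]}`);
  } else {
    tj.speak('Sorry, I did not catch a color');
  }
});

tj.simulateSpeech('Please turn the light green');
console.log(tj.log); // [ 'shine: green', 'speak: Turning the light green' ]
```

Keeping the handler free of hardware details is what makes it possible to exercise the logic against a stand-in like this before running it on the bot.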

Code Walk through: Control LED using the TJBot library.

- https://github.com/ibmtjbot/recipes

- Code Walk through

Open Issues

- Improving accuracy
- Bot “Interruptibility”
- Gracefully managing latency

Improving Accuracy

How do we improve interaction (voice) accuracy? Improving Speech-to-Text models may not be enough!

- Customized language models?
- Intent matching?
- Multi-turn conversations?
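
As one illustration of the intent matching idea, the sketch below scores a transcript against keyword lists instead of requiring an exact phrase, so a partly misheard sentence can still resolve to the right action. The intents, keywords, and `matchIntent` helper are hypothetical; a production bot would more likely delegate this to a trained service such as Conversation.

```javascript
// Hypothetical intents, each with example keywords.
const INTENTS = {
  wave: ['wave', 'hello', 'hi', 'arm'],
  see: ['see', 'look', 'looking', 'picture'],
  shine: ['light', 'led', 'color', 'shine'],
};

// Score each intent by how many of its keywords appear in the transcript
// and return the best one, or null when nothing matches.
function matchIntent(transcript) {
  const words = new Set(transcript.toLowerCase().split(/[^a-z]+/).filter(Boolean));
  let best = null;
  let bestScore = 0;
  for (const [intent, keywords] of Object.entries(INTENTS)) {
    const score = keywords.filter((k) => words.has(k)).length;
    if (score > bestScore) { best = intent; bestScore = score; }
  }
  return best;
}

console.log(matchIntent('What are you looking at?'));  // see
console.log(matchIntent('Wave your arm and say hi'));  // wave
console.log(matchIntent('Tell me a joke'));            // null
```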

Bot “Interruptibility”

When and how should the robot be interrupted while performing an activity such as speaking or waving?

- Vision? (monitoring a user’s facial expression, raised hand)

- Hardware button or sensor?
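
Whatever the interrupt source, the activity itself has to offer points at which it can stop. A minimal sketch of one approach, assuming the activity can be broken into small steps and that the interrupt source (a button, a vision cue) can be polled through a function: the `runInterruptible` helper and the simulated button flag are hypothetical.

```javascript
// Run an activity as a series of small steps, checking between steps
// whether an interrupt has been requested. Returns the names of the
// steps that completed.
function runInterruptible(steps, isInterrupted) {
  const completed = [];
  for (const step of steps) {
    if (isInterrupted()) break; // stop before starting the next step
    step();
    completed.push(step.name);
  }
  return completed;
}

// Hypothetical example: speaking three phrases, with a button press
// (simulated by a flag) arriving during the second phrase.
let buttonPressed = false;
const spoken = [];
function sayHello() { spoken.push('Hello.'); }
function sayName() { spoken.push('My name is TJBot.'); buttonPressed = true; }
function sayOffer() { spoken.push('How can I help?'); }

const done = runInterruptible([sayHello, sayName, sayOffer], () => buttonPressed);
console.log(done);   // [ 'sayHello', 'sayName' ]
console.log(spoken); // [ 'Hello.', 'My name is TJBot.' ]
```

The granularity of the steps sets how quickly the bot can respond to an interruption: it always finishes the step in progress before stopping.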

Latency Tolerance

Latency can severely degrade the quality of interaction. How do we minimize its effect?

- Managing and ordering service responses
- Leveraging cues to provide additional information
- Balancing capabilities: cloud vs. local processing.
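
For the first point, ordering service responses, one possible approach is to tag each request with a ticket and release responses strictly in ticket order, so that a fast reply to a later request never overtakes a slow reply to an earlier one. The `ResponseSequencer` class below is a hypothetical sketch of that idea, not something described in the talk.

```javascript
// Deliver responses in the order their requests were issued, even when
// the responses arrive out of order.
class ResponseSequencer {
  constructor(deliver) {
    this.deliver = deliver;   // called with each response, in request order
    this.nextTicket = 0;      // ticket handed to the next request
    this.nextToDeliver = 0;   // ticket whose response is due next
    this.pending = new Map(); // early responses waiting for their turn
  }

  // Call when a request is sent; returns its ticket number.
  issue() { return this.nextTicket++; }

  // Call when the response for `ticket` arrives.
  resolve(ticket, response) {
    this.pending.set(ticket, response);
    // Release every response that is now next in line.
    while (this.pending.has(this.nextToDeliver)) {
      this.deliver(this.pending.get(this.nextToDeliver));
      this.pending.delete(this.nextToDeliver);
      this.nextToDeliver++;
    }
  }
}

const delivered = [];
const seq = new ResponseSequencer((r) => delivered.push(r));
const a = seq.issue();
const b = seq.issue();
seq.resolve(b, 'second reply'); // arrives early, held back
seq.resolve(a, 'first reply');  // releases both, in order
console.log(delivered); // [ 'first reply', 'second reply' ]
```

Strict ordering trades responsiveness for coherence: one slow response holds back everything behind it, so a real bot would likely pair this with a timeout.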

Next Steps

Next Steps

Three-pronged:

- Conduct basic research that addresses open issues.

- Make TJBot simpler and easier to use (TJBot library, visual programming tool).

- Build and sustain the TJBot community.

Learn more?

- ibm.biz/mytjbot
- http://www.instructables.com/member/TJBot/

Thank You!
