OTTO-FRIEDRICH-UNIVERSITY BAMBERG
Cognitive Systems Group

Shall we build a tower together?
A study of human-robot interaction with the humanoid robot NAO

Bachelor Thesis
in the degree course Applied Computer Science
Faculty of Information Systems and Applied Computer Science

Author: Ioulia Kalpakoula
Supervisor: Prof. Dr. Ute Schmid
Abstract

One main aspect of robotics research is the use of robots in the service of humans. Be it a service robot or a robot designed for educational purposes, they all have one main aspect in common: the foundations of artificial intelligence, and thus the research on human-robot interaction in robotics.
The goal of this thesis is to present a possible implementation of a tower-building game with simple blocks, played by a human player and NAO, under consideration of non-verbal communication aspects. It is therefore necessary to explore the possible interaction strategies that achieve a successful communication session between the human player and NAO.
Contents

2.1. Game strategy
2.2. Object recognition
3.1.1. Robotics
3.2. Components of Human-Robot-Interaction
3.2.1. Intelligence and Consciousness
3.2.2. Perception and Expression
3.2.3. Expressions
4.1. Robot Specifications
6.1. Implementation
6.2. Improvements
6.3.1. NAO camera
6.3.2. Joints overheating
List of Figures

2.2. Simple diagram of playing one round
3.1. Interaction of a robot with its environment
3.2. Interaction of a robot with its environment
3.3. Interaction of a robot with its environment
4.1. NAO Parts - NAO H25
4.2. NAO body parts
5.1. NAO's software components
5.2. NAOqi Process
6.2. The Init box including all initial behaviors
6.3. The Main box contains the core modules
6.4. NAO perceives human player's actions
6.5. Landmark Detection triggers Action "Grabbing Block"
6.6. NAO Head joints
1. Introduction
1.1. Motivation
Nowadays there are already various application domains in which humanoid robots are used as service assistants to support humans. The use of robots becomes more and more essential as they provide a variety of skills, enabling them to cover a large field of duties. Humanoids assist people not only in governmental tasks but also as service robots in home environments, e.g. taking care of elderly or handicapped people: as a caretaker for medication intake or as a housekeeper. Further development in society requires more qualified interaction solutions to cover upcoming deficits such as the labor shortage of nurse practitioners; hence humanoids assisting in medical centers.
The challenging task in human-robot interaction is enabling robots to explore and employ interaction strategies. Components of this are the perception of the verbal and non-verbal expressions of the interaction partner, followed by the robot's own expressions within the given context of interaction.
1.2. Objectives
The aim of this thesis is to develop and validate a possible solution for human-robot interaction by implementing a tower-building game with the humanoid robot NAO on the basis of the standard animation software and modules.
The focus is put on NAO's ability to recognize and react to non-verbal cues, as well as its performance in grabbing a block and putting it on another one.
Therefore a recognition module has to be implemented which allows NAO to distinguish its own blocks from the blocks of its game partner, and also from the blocks that have already been placed.
NAO also has to be able to recognize whether it is its turn or not by monitoring the game partner's non-verbal signals and the change of state of the play area.
1http://www.atp.nist.gov/eao/sp950-1/helpmate.htm
1.3. Project structure
The first part gives an overview of the general aspects of human-robot interaction and related work in this field, with the main focus on modes of interaction.
A short introduction to the architecture of the NAO robot will especially focus on those of NAO's components that have made the implementation more difficult. The project has been conducted using solely the software and hardware provided by the NAO distribution.
The basis of the realization is given by an introduction to the game strategy, including its position on human-robot interaction, in particular the planning of the object recognition and turn detection, as well as their possible implementation up to a certain point.
The evaluation part presents the analysis results of the implementation (with reference to efficiency and effectiveness), points out the issues faced, and suggests improvements referring to future work related to this tower-building game.
2.1. Game strategy

Given the focus on communication strategy and also on communication pathways, the rules of the tower-building game are kept deliberately simple.
In this thesis the rules are intended for two participants who alternately build a tower together by stacking blocks one above the other. The images in figure 2.1 illustrate how a successful game shall work.

• Each player gets an equal number of small styrofoam blocks; in this case two blocks per player were intended.
• Landmarks are fixed only on the blocks of the human player, so NAO can detect where the current block is.
• The human player starts the game by placing one of his blocks. Note: to create an equal basis, the starting order may be determined on a rotating basis.
• In the next turn NAO has to grab one of its blocks and place it on top of the block that was placed by the human participant in the previous round.
• Now there shall be a tower of two blocks: the block placed by the human participant and, on top of it, the block placed by NAO.
• The second round begins as the human player places another of his blocks on top of the tower.

The game goes on in turn-based mode until one of the conditions set out below terminates the game:
1. The game terminates successfully if all blocks of both participants have been placed on the tower without the tower falling over.
2. The game terminates unsuccessfully if the tower gets shaky and falls over onto the playground.
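The turn-based flow above can be sketched as a simple loop. The following Python sketch is illustrative only: the callable names (`human_places_block`, `nao_places_block`, `tower_stable`) are hypothetical stand-ins for the actual Choregraphe behaviors, not part of the implementation described in this thesis.

```python
# Illustrative sketch of the turn-based game flow (hypothetical names,
# not the actual Choregraphe implementation).

BLOCKS_PER_PLAYER = 2  # two styrofoam blocks per player, as in the rules above

def play_game(human_places_block, nao_places_block, tower_stable):
    """Alternate turns until all blocks are placed or the tower falls.

    The three arguments are callables standing in for the real
    perception and motion behaviors:
      human_places_block() -> None   (human puts a block on the tower)
      nao_places_block()   -> None   (NAO grabs and places a block)
      tower_stable()       -> bool   (did the tower survive the move?)
    Returns "success" or "failure" per the two terminating conditions.
    """
    for _ in range(BLOCKS_PER_PLAYER):
        # The human player always starts the round.
        human_places_block()
        if not tower_stable():
            return "failure"
        # NAO places one of its blocks on top of the human's block.
        nao_places_block()
        if not tower_stable():
            return "failure"
    # All blocks placed without the tower falling over.
    return "success"
```

With perfectly stable placements the loop runs two full rounds (four blocks) and ends in success; a single unstable placement ends the game immediately.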
2. Building a tower together
(a) Initial game set-up
(b) Human player begins and puts his block on the table
(c) NAO continues, putting its block on top of the other
(d) Human puts his second and last block
(e) NAO finishes the game by putting its last block

Figure 2.1.: Optimal course of the game
2.2. Object recognition

For the realization of the object recognition part, NAO's qualities in grabbing objects also had to be taken into consideration, concerning block attributes like size, material and perhaps the color of the object's surface.
A number of solution approaches came up and were tested:

• Wooden blocks, which children play with, had the perfect size but were not usable due to their flat surface: they could not be held by NAO and slipped out of its grip.
• Blocks of foam material were too soft; the joints clenched the foam block when grabbing it, and the blocks snapped out as NAO was about to open its hand.
• Blocks of styrofoam seem to fit NAO perfectly, as they can be held thanks to their rough surface and low weight.

After the shape and the material were chosen, a solution had to be found for how NAO would be able to recognize and distinguish the blocks.
For this task, landmarks were printed, cut out to the block's size, and one landmark was tacked onto every block of the human player.
It actually does not matter whether the same landmark number is used or not, as NAO only recognizes the current block on top of the tower placed by the human player.
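Since NAO only reacts to the landmark currently visible on top of the tower, the turn-detection logic reduces to noticing that a new landmark has appeared or the visible one has changed. A minimal sketch of this idea follows; it is independent of the NAOqi landmark module, and the event format (a list of detected landmark ids, topmost first) is an assumption for illustration:

```python
def new_block_placed(previous_top, detected_marks):
    """Return True if the human appears to have placed a new marked block.

    previous_top   -- landmark id seen on top of the tower last turn
                      (None before the first move)
    detected_marks -- list of landmark ids currently detected; the first
                      entry is taken as the topmost mark (assumed format).
    Because only the human player's blocks carry landmarks, any newly
    visible mark signals that it is NAO's turn.
    """
    if not detected_marks:
        return False          # nothing detected: the human has not moved yet
    current_top = detected_marks[0]
    # A mark where none was before, or a different mark than last turn,
    # means a new block is on top. (If the human reuses the same landmark
    # number, a real implementation would also compare the mark's position,
    # since the thesis notes the id itself is irrelevant.)
    return previous_top is None or current_top != previous_top
```

This captures only the decision step; subscribing to the detection events and reading them from the robot's memory is left to the implementation chapter.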
2.3. Goal
As has become clear, the game has no competitive part as games usually do; the significant and necessary factor is the communication between NAO and the human co-player.
The game ends successfully if both players were able to build up the tower using all of their blocks.
This in turn depends on how well the interaction strategies have been explored.
The diagram in figure 2.2 gives a simple overview of NAO's events while perceiving the game area.
Figure 2.2.: Simple diagram of playing one round
3.1. Foundations of HRI
The research on autonomous robots and agents is based on the fundamentals of artificial intelligence, which is the core discipline in informatics for the research and creation of intelligent systems. Besides the improvement and development of the technical components of robots, such as sensors for perception and effectors for movement, one should also consider the psychological aspect. Therefore artificial intelligence is the key to understanding human cognition and implementing this knowledge in a robot, thus enabling autonomous human-robot interaction. Russell and Norvig's "Artificial Intelligence" [1] is the standard reference that documents the field of artificial intelligence and correlated subfields such as intelligent agents and robotics.
3.1.1. Robotics
According to the definition of Russell and Norvig:
”Robots are physical agents that perform tasks by manipulating the
physical world.”
[1, 971]
Sensors and Effectors
A robot can be equipped with a variety of sensors that allow it to perceive its environment [1, 973-975]. Cameras evaluate visual stimuli of a certain environment, e.g. detecting objects and movements, or providing coordinate information of the environment for calculating distances to a certain subject.
Microphones scan the robot's environment for acoustic input, such as a speech command which triggers a specified action, e.g. greeting the user as he says "hello".
Through tactile sensors a robot is able to evaluate physical contact. Through them the robot recognizes whether an obstacle like a wall hinders a movement action, or whether a touch by a human initiates an action.
Figure 3.1.: Interaction of a robot with its environment
The effectors of a robot are the equivalent of human limbs and provide flexibility of physical movement, and therefore a wider action space in which the robot can manipulate the environment [1, 975-978]. A robot in industrial production usually does not need any leg effectors, since these robots typically are installed in a fixed position and therefore only use their arm and hand effectors. Robots that are used for exploration tasks clearly need leg and foot effectors in order to move and be able to explore the environment more efficiently. The effectors of a robot are essential for providing a higher level of autonomy, and they increase the complexity of realizable implementations. Figure 3.1 shows a simple interaction sequence of a robot in its environment.
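The sense-decide-act cycle of figure 3.1 can be condensed into a few lines of code. This is a generic agent loop, not NAO-specific; the `sense`, `decide` and `act` callables are placeholders for real sensor and effector bindings:

```python
def run_agent(sense, decide, act, steps):
    """Minimal sense-decide-act loop, as in figure 3.1.

    sense()         -> percept from the environment (camera, microphone,
                       tactile sensor readings)
    decide(percept) -> action chosen from the percept (internal processing)
    act(action)     -> applies the action via the effectors
    Returns the (percept, action) history for inspection.
    """
    history = []
    for _ in range(steps):
        percept = sense()         # sensors perceive the environment
        action = decide(percept)  # internal processing chooses an action
        act(action)               # effectors manipulate the environment
        history.append((percept, action))
    return history

# Toy usage: an "agent" that greets when it hears "hello".
log = run_agent(
    sense=lambda: "hello",
    decide=lambda p: "greet" if p == "hello" else "wait",
    act=lambda a: None,
    steps=1,
)
```

The loop structure is the same whether the percept is a camera frame or a recognized word; only the three callables change.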
3. Human-Robot interaction
Types of robots

The technical construction of robots depends on the field of application and thus on the tasks a robot has to perform. Russell and Norvig refer to this topic and define the following three categories of robot types [1, 971-973]:

• The mobility of manipulators is limited to a small space within the workplace they have been firmly fixed to. Their main focus lies on perceiving an assigned fixed environment and responding to certain conditions which trigger the manipulator to perform a particular processing task on objects.
Due to their simple technical construction in comparison to other robots, manipulators are used in industrial production such as the automobile industry or electronics manufacturing. As manipulators usually execute only a predefined program sequence for a certain task, they are quite simple and therefore cannot really be called intelligent agents.
The reason for this is straightforward: manipulators are incapable of learning behavior. The implemented code specifies the condition that will trigger a certain executable task of the robot, for instance grabbing an object off the conveyor belt for further processing as soon as the manipulator senses the specified object on the conveyor belt inside its defined workspace.
It can be concluded that manipulators also cannot be called autonomous agents, because they lack the capability to independently gain knowledge of the environment, as the knowledge is given by the programmer itself [1, 39].

• The second category of robot types is defined as mobile robots. In contrast to manipulators, this kind of robot has a much larger action space due to mobility features such as legs or wheels. As a result, mobile robots can move and perform actions autonomously, which allows for a much wider application range than that of the manipulators.
Possible application areas for mobile robots are as transport assistants, either in hospitals for food delivery or for containerized cargo. Mobile robots are not only used in the business sector but also as assistants in domestic use, for instance as vacuum cleaners.

• The combination of manipulators and mobile robots leads to the third category: mobile manipulators, of which humanoid robots are part.
The abilities of mobile manipulators extend to the perception of the environment and its situational manipulation, applying their effectors to reach requested goal states of the environment, triggered by a certain action.
The crucial factor here is the higher flexibility of moving the effectors: as there are no fixed workplace settings, the robot's environment gains a higher complexity. That implies the need for more complex algorithms in order to ensure the best possible interaction between the robot and the environment.
Challenging technical factors in humanoid robot research are the development of the best possible degree of motion of the robot's effectors and the perception and production of facial expressions. The tough part is realizing the psychological aspects, which are inevitable for human-robot interaction, since most interaction between humans is based on non-verbal cues.
3.1.2. Application area of robotics/human-robot interaction
Robots as service assistants already fill many roles, and the application area is continually growing with the technical progress in robotics.
Patrick Lin describes the three task attributes dull, dirty and dangerous as the key attributes that determine the application area of a robot.
As the key advantage of robots over humans he names the lack of emotional expressions, which makes some tasks easier for robots to handle than for humans. As examples he mentions the use of robots as volcano explorers, bomb squads or assistants in difficult surgeries (cf. [2, 4]).
Some of the fields in which humans take advantage of the benefits of robot assistants are:

Autism therapy
There have been many studies with humanoid robots in autism therapy. Especially the NAO robot seems, in contrast to other humanoids, to be more suitable as an interaction partner for children, surely also based on its cute and childlike appearance.
The team around Syamimi Shamsuddin has published many studies about human-robot interaction between NAO and children with autism. One of their studies focused on human-robot interaction where NAO teaches emotions to children with autism, as a significant deficit of autistic people is the inability to recognize and express emotions.
The study demonstrates a high acceptance of the robot with reference to NAO's human-lookalike body shape. The acceptance of NAO as an equal communication partner is expressed in the children's high level of attention and highly motivated cooperation towards NAO (cf. [3]).
This study ideally represents the successful integration of humanoid robots, not only as simple assistants for tasks humans can't or don't want to do, but also as assistants in tasks of direct communication with people, or, as in this case, with children in psychological therapy.
Personal Care and Home help
The most common home service robots are the Roomba vacuum cleaning robots by the company iRobot 1, which also produces other varieties of floor cleaners such as floor mopping robots.
Confronted with an ageing population resulting from decreasing fertility rates and increasing life expectancy [4], as well as a lack of workers in the health care sector, the question arises about care services for elderly people.
So there is an increasing tendency to use robots as assistants for elderly people, not only supporting them in their activities but also monitoring and maintaining the household in which the person lives [5].
Such a robot is the nursebot Pearl, developed by Carnegie Mellon University. Pearl is able to move autonomously and provides many interaction features such as speech recognition and facial detection.
As interaction with humans is the key feature of the robot, the developers also considered communication skills. Pearl plans and coordinates activities and schedules, for instance for taking medicine, and is able to intervene if any irregularities occur (cf. [6]).
Military
The perhaps most controversial and challenging task area is the use of robots for military purposes. Basically, military robots cover a wide range of fields of application, all with the objective to take over tasks which are too dangerous for humans or to provide safety functions for them, such as unmanned exploration of dangerous or impassable areas, bomb squad assistance, or monitoring a certain territory for enemies and, if necessary, attacking them.
Despite the obvious advantages of using military robots, heavy failures have shown the weaknesses of such complex constructions [7].
The authors Lin, Bekey and Abney refer in their reference book [2, 7] to an incident that happened in 2007, where a semi-autonomous robot cannon fired at and killed nine fellow soldiers.
This and other similar incidents raise criticism towards malfunctions of robots, which must be reliable, especially when it comes to protecting civilians or being a fighting comrade, as the consequences in these cases are much more fatal.
1http://www.irobot.com/us/learn/home/roomba.aspx
3.1.3. Autonomous agents vs. humanoid robots
As defined by Russell and Norvig:
“An agent is anything that can be viewed as perceiving its
environment through
sensors and acting upon that environment through actuators.” [1,
4]
Both the interaction of humans with a software agent and the interaction with a physical agent, thus a humanoid robot, have different effects depending on how well, or with what kind of reservations, they can be performed.
A significant study about the differences between a software agent and a robot is given by the publication of Shinozawa [8]. The comparison is based on an experiment in which a user had to select a color name that was recommended either by a software agent or a physical robot. To avoid any distortion from the subjects' subjective ratings, the appearance was set up similarly for both.
Taking the robot's three-dimensional appearance and the software agent's two-dimensional space into consideration, the results of their experiment pointed out the conformity of dimension between the robot, the agent and their interaction environment: the three-dimensional robot had greater influence with its recommendations if the experiment was set up in a three-dimensional environment, and vice versa for the experiment with the two-dimensional software agent.
In the study of Powers [9] the comparison was based on health interviews of a test person with a software agent, a robot projected on a computer monitor, and a physically present robot. The emphasis of this study was to research the influence of each of the agents on the behavior and attitude of the user.
The results showed that the social influence of robots on users was higher than that of software agents. The test persons classified the robots as more helpful and spent more time with them than with software agents. On the other hand, the test persons revealed much less information to the robot located in the same room than to the software agents, and could also remember more details about the interview when it was performed with software agents rather than robots.
3.2. Components of Human-Robot-Interaction
3.2.1. Intelligence and Consciousness
The principle of human-robot interaction is based on the ability of a robot to establish a successful interaction with a human. Therefore the robot should be capable of recognizing non-verbal cues correctly to provide the most natural approach in human-robot interaction. Accordingly, isn't it primarily necessary to be aware of one's own existence in order to have an internal knowledge base of one's own cognitive processes as well as of the environment?
For humans the cognitive state of subconsciousness provides significant support in everyday life. The main challenge for the human mind is the coping strategies that reduce the daily information overload down to the important matters. Subconscious perception unburdens the senses and the mind from the collapse that would result from processing all information in a fully conscious state of mind.
In the context of human-human interaction this usually means: subconscious perception of the non-verbal cues that are expressed by the interaction partner. Nevertheless, those non-verbal cues also get processed and affect the behavior towards the interaction partner.
It should be noted, though, that the classification of information into subconsciousness and consciousness is an individual matter and depends on individual experience.
Inevitably this leads to the question of a robot's self-consciousness, and of the role of consciousness in the context of the ability to perceive non-verbal cues.
In the publication "Creation of a Conscious Robot" [10], Junichi Takeno addresses in detail the understanding and development of conscious robots. His work not only covers principles of human consciousness and approaches in the development of conscious robots, but also inspires fundamental questions regarding the psychological aspects of consciousness and its implementation in a robot.
Research on intelligent robots and their effect on human-robot interaction has shown that the development of intelligent robots has become important: humans tend to accept and empathize with robots with social skills rather than those without [11], [12].
3.2.2. Perception and Expression
To be able to interact with its environment, a robot needs to gather and process information from it; thus a robot needs the ability of perception over the application domain. This can be realized using different sensors, whereby for autonomous robots the focus is put on acoustic, visual and tactile perception.

Acoustic perception provides the robot with the ability to filter, process and respond to audio stimuli from its environment, but also to record and play back sounds triggered by a certain task.
Beside the perception of simple stimuli, acoustic cognition must also include the perception of complex voice signals, such as speech, in order to achieve the most authentic possible communication basis between a humanoid robot and a human.
Even in a limited way, spoken words can trigger a certain action in the robot, or help the robot recognize and locate a known interaction partner based on individual voice attributes, such as loudness and tone.

Through the use of visual sensors, such as cameras, a robot is able to visually perceive its application environment. The accuracy of a robot's visual detection not only depends on various attributes of camera quality, such as resolution, color range and lighting compensation, but also on the robot's internal representation, following three rules given by Russell and Norvig:
The robot's internal representation of the environment should not only have a clear structure that enables it to react to environmental changes in a quick, efficient way, but should also include meaningful and sufficient information as a basis for decision-making. Furthermore, the internal representation values should be modeled with regard to ensuring consistency between the modeled state and the values that are represented in the real world [1, 978].
As the natural environment changes continuously and unforeseen events can create problems, the robot has to be able to react within a certain time frame, for instance avoiding previously absent obstacles in time.
The challenging part of robotics surely is the realization of sensors which keep working reliably even if characteristics of an environment change constantly, for example lighting conditions. Siegwart and Nourbakhsh [13, 93-94] deal in detail
with attributes that influence sensor performance:
A sensor's sensitivity specifies the degree to which a change in input values affects the output values, for instance the level of light sensitivity of a robot's cameras.
An error of a sensor produces inconsistency between the real values of an environment and the output values measured by the sensor. There are two kinds of errors, predictable and unpredictable ones. Predictable errors, also called systematic errors, are measurable errors triggered by modeled processes. In contrast, unpredictable or random errors cannot be calculated in advance, as they occur irregularly. These random errors include the color levels of the camera as well as hue errors concerning the level of brightness and contrast.
Precision defines the level of reproducibility of a sensor's input values, whereas accuracy measures the level of agreement between the true and recorded values of a sensor's output. Hence a high degree of accuracy is equivalent to low error rates.
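The distinction between precision (reproducibility) and accuracy (agreement with the true value) can be illustrated numerically. The following is a sketch using the usual spread and mean-error measures; the concrete formulas are an illustrative choice, not taken from [13]:

```python
from statistics import mean, stdev

def sensor_stats(readings, true_value):
    """Summarize repeated sensor readings of one known quantity.

    Precision is expressed here as the sample standard deviation of the
    readings (low spread = high reproducibility); accuracy as the
    absolute difference between the mean reading and the true value
    (low difference = small systematic error).
    """
    return {
        "spread": stdev(readings),                       # precision indicator
        "mean_error": abs(mean(readings) - true_value),  # accuracy indicator
    }

# A sensor with a systematic offset: precise but inaccurate.
offset = sensor_stats([10.9, 11.0, 11.1], true_value=10.0)
# A noisy but unbiased sensor: accurate on average but imprecise.
noisy = sensor_stats([9.0, 10.0, 11.0], true_value=10.0)
```

The first sensor reproduces almost the same value every time yet is off by a full unit; the second scatters widely but centers on the true value, matching the text's distinction between systematic and random errors.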
3.2.3. Expressions
Another essential part of human-robot communication is the perception of mimic, gesture and speech expressions. They support the reinforcement of expressed emotions and thoughts.
When humans communicate, they pass their expressions to their dialogue partner either explicitly or implicitly. The direct path, also called verbal communication, means expressing something by speaking to each other. Non-verbal communication, on the other hand, consists of mimic and gesture expressions and is therefore not always easily visible to the dialogue partner.
The realization of non-verbal cues in humanoid robots is therefore challenging, as there is no explicit information content available that could be predefined right away [14].
3.2.4. Manipulation and Locomotion
Another important aspect of human-robot interaction is the robot's degree of freedom of motion, which determines its physically active participation in human-robot interaction.
Motion is the hypernym for both terms, manipulation and locomotion.
Manipulator robots are a category of robots that are fixed in a certain workplace (see 3.1.1) and are only able to move objects from one point to another, usually through hand joint manipulation. As shown in figure 3.2, the robot's position within the defined application environment is fixed, while objects can be moved to various points, though only within the robot's workspace area.
Locomotion, on the other hand, means the ability of a robot to move itself from one point to any other within a defined environment (figure 3.3).
Movement types are similar to the human ones, for instance walking, running 2 or even swimming 3, though the NAO robot is limited to walking.
Figure 3.2.: Interaction of a robot with its environment
Figure 3.3.: Interaction of a robot with its environment
2http://www.bostondynamics.com/robot_bigdog.html
Beside technical factors, environmental aspects also have to be taken into consideration during development as a possible source of influence on the functionality of manipulation and locomotion.
Environmental aspects are, for example, the composition and structure of the ground, the inclination level, or the range of contact points on the robot's action path (cf. [13, 17]).
There are different types of motion mechanisms; the NASA Mars Rover 4 or four-legged robots such as Sony's AIBO 5 are two examples. However, this thesis will focus on two-legged robots, such as the NAO, and their ability to keep balance.
Especially humanoid service robots in housekeeping must be able to master more difficult motion sequences, such as walking up and down stairs without losing their balance. Honda's ASIMO 6 is an example of successful movement implementation in two-legged robots.
The focus of research on improvements in the human-robot interaction field is targeted at the communication skills of robots, as they are intended to interact with humans on a much more complex level of cognition. As a consequence, a robot should be conscious of its own behavior as well as its interaction partner's behavior. To do so, the robot needs to process and assign non-verbal cues in a logical manner so it can act or react in reasonable ways.
Many studies have investigated techniques in human-robot interaction using non-verbal cues [15], [16].
Bakker and Kuniyoshi [17], as well as the team of Chen Yu and Dana H. Ballard [18], described in their publications approaches for robots learning to recognize human behavior. One method is the explicit implementation of default behavior modes; the other is reinforcement learning. In addition to those methods they introduce a third learning method, based on imitating human behavior: the robot learns, one could say like a child, by imitating behaviors that are shown by humans.
3.4. Goal of HRI
The aim in human-robot-interaction research area certainly is to
achieve a successful
communication between humanoid robots and humans on a much more
complex level
of cognition on the part of the robots.
One objective is to improve or invent solution implementations for
representation of
interaction modes on robots considering robots ability on
perception and expression of
verbal and non-verbal cues.
With the current state-of-the-art in human-robot-interaction
humanoids already used
in therapy tasks as described in 3.1.2. Nevertheless, robots are
not fully accepted so-
ciety members at the present as there is yet the absence of
progress regarding in outer
appearance of robots and their ability to interact with humans
fully autonomously and
consciously.
Creating conscious robots raises new issues regarding the handling of ethical
questions and the securing of robots' rights in society; this should raise awareness
of all resulting social and ethical consequences.
This in turn leads to further questions: How can moral thinking be realized, and
what or who will be the authority that passes on the rules defining right thinking
and correct behavior?
Which rules shall serve as the underlying principles of human-robot interaction,
given cultural and social differences, or shall humanoid robots be built in a
culture-specific manner?
Shall robots exercise their interaction skills fully autonomously, or shall humans
be able to intervene in a robot's autonomy, and if so, to what extent and under
which circumstances are humans allowed to take control of the robot?
In sum, human-robot interaction research should aim not only at successful
communication skills for robots but also deal with the ethical consequences that
result from them.
4.1. Robot Specifications
NAO is a 58 cm tall humanoid robot developed by the French company Aldebaran
Robotics. The model used for this thesis is NAO H25.1
Regarding the impact of NAO's appearance on its human interaction partner, studies
[19] confirm that human attention towards a robot rises as the robot's appearance
becomes more human-like. The most significant feature of NAO's appearance is surely
its cute, round and child-like face design, which makes NAO a likeable interaction
partner.
Figure 4.1.: NAO Parts - NAO H25
1https://community.aldebaran-robotics.com/doc/1-14/
• Motion
NAO's movement can be controlled in various ways: its body is partitioned into
single joints, which allow very specific motion implementations, as well as into
joint groups for each body part, depending on the method used. Figure 4.2 below
gives an overview of the partition of NAO's body.
Figure 4.2.: NAO body parts 2
• Interaction
NAO's interaction modules allow it to interact on a human-like level.
Four microphones and two loudspeakers on its head allow NAO either to recognize
specific words inside a sentence or to recognize and react autonomously to a
complete sentence.
For visual perception NAO is equipped with two VGA cameras that provide a
resolution of 640x480 at over 30 frames per second.
NAO's cameras revealed some issues that complicated the object recognition part of
the implementation; these are described in detail in section 6.3.
Infrared support allows NAO to communicate with any other infrared-supporting
device, which means that it is possible to use NAO as a remote control or to
control NAO via a remote control.
Furthermore, NAO can connect to other NAO robots and communicate with them.
2https://community.aldebaran-robotics.com/doc/1-14/naoqi/motion/index.html
• Sensors and Bumpers
Bumpers and tactile sensors on NAO's head, chest, hands and feet allow perception
and communication via tactile cues.
These components can be associated with a specific predefined behavior that gets
triggered by touching the sensors. Furthermore, NAO is equipped with sonar
rangefinders, so it can estimate the distance to objects placed up to 70 cm away.
5.1. Embedded Software
NAOqi
The main software of NAO that runs on the robot is NAOqi. At start-up it loads a
list of modules with default behavior methods on the robot.1 The figure below gives
an overview of the structure of the NAOqi process, which is called a broker when it
runs on the robot. The broker provides a directory containing all modules and their
bound methods.
1https://community.aldebaran-robotics.com/doc/1-14/getting_started/software_in_
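This broker idea can be illustrated with a small, self-contained sketch. This is not the real NAOqi API: the Broker and FakeTextToSpeech classes and all their method names are invented here purely to show the concept of a directory that maps module names to modules and dispatches calls to their bound methods.

```python
# Illustrative sketch of the broker concept: a directory of modules and
# their bound methods. All names here are invented for the sketch and do
# not belong to the real NAOqi API.

class Broker:
    def __init__(self):
        self._modules = {}  # directory: module name -> module object

    def register(self, name, module):
        self._modules[name] = module

    def call(self, module_name, method_name, *args):
        module = self._modules[module_name]          # directory lookup
        return getattr(module, method_name)(*args)   # bound-method dispatch


class FakeTextToSpeech:
    """Stands in for a module such as ALTextToSpeech."""
    def say(self, text):
        return "saying: " + text


broker = Broker()
broker.register("ALTextToSpeech", FakeTextToSpeech())
print(broker.call("ALTextToSpeech", "say", "Shall we build a tower together?"))
# saying: Shall we build a tower together?
```

On the real robot the broker additionally handles network transport, so that modules can be called both locally and from a remote machine.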
Choregraphe
The desktop software Choregraphe (Figure 5.3) offers an easy way to interact with
and control the real or a simulated robot and to create behaviors in less time than
with NAOqi alone.
It already includes behaviors as predefined template boxes, which can be extended
with one's own Python code; Choregraphe thus makes it possible to create complex
behaviors in a convenient way.
An advantage is certainly the possibility of using a simulated robot, since this
makes it feasible to test movements without the risk that NAO falls or damages its
joints when wrong parameters for movement and rotation have been passed.
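Extending a template box with one's own Python code follows a fixed class layout, sketched below. The method names onInput_onStart and onStopped follow the Choregraphe box convention, but the GeneratedClass stub and the recorded output list are inventions of this sketch so that it runs without Choregraphe.

```python
# Minimal sketch of a Choregraphe script box. In Choregraphe the box class
# derives from a GeneratedClass provided by the tool; a small stub replaces
# it here so the example is self-contained.

class GeneratedClass(object):            # stub for Choregraphe's base class
    def __init__(self):
        self.outputs = []

    def onStopped(self):                 # stand-in for the box output signal
        self.outputs.append("onStopped")


class MyClass(GeneratedClass):
    def __init__(self):
        GeneratedClass.__init__(self)

    def onInput_onStart(self):
        # one's own behavior code goes here; this sketch only records
        # that the behavior ran before signalling the box has finished
        self.outputs.append("behavior executed")
        self.onStopped()


box = MyClass()
box.onInput_onStart()
print(box.outputs)   # ['behavior executed', 'onStopped']
```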
• Choregraphe also includes the application "Monitor", which gives access to further
settings of NAO's memory and of the camera module.
The camera module includes a small widget that displays what NAO currently sees. It
is also possible to record image streams or to take single pictures.
• The Simulation Helper Tool, kindly offered by community members,3 was very helpful
for testing behaviors without the real robot.
It comes with a graphical interface that simulates all NAOqi modules, thus making
it possible to test behaviors without necessarily needing the real robot.
Figure 5.3.: Choregraphe User Interface
The tool provides all modules needed for speech recognition, simulates a fake
module for visual recognition, and provides the sensor modules to simulate tactile
communication options.
5.3. SDK/IDE
The programming with NAO was done in Python, as Python supports access to the robot
and is already used in Choregraphe's implemented behaviors. The advantages of
Python lie in its minimalistic code structure, which results in clear and short
code while still providing flexibility, as it can be combined with software
components of the C++ API.
For structuring, testing and modifying the Python modules, the IDE PyCharm Free
Community Edition4 was used, as it comes with a well-defined code editor and useful
features that allow writing code in a very comfortable and therefore productive way.
4http://www.jetbrains.com/pycharm/
with NAO
6.1. Implementation
The implementation was done partially in Choregraphe and partially directly with
the Python SDK.
For the initialization part Choregraphe was used, as it provides many predefined
template boxes which can be used for standard positions and can easily be modified
by importing one's own Python code modules.
For the part where NAO has to put its block on top of the other one, it was more
convenient to write and test the code directly in the Python IDE described in 5.3.
The main advantage was the clearer structure, especially when it comes to
identifying bugs in the code.
6.1.1. Object structure in Choregraphe
Figure 6.1 below shows the main object structure of the project. For this part two
flow diagram boxes were used. Flow diagram boxes contain a variable number of
script boxes and were useful for grouping the script boxes according to the part of
the game in which they have to be executed.
Init flow diagram box
The initialization flow diagram box includes the basics needed before the game can
start at all. The basic modules for the initialization part are:
The first module in the row sets the stiffness of NAO's joints to on. In this state
the joints can be moved with their full power, which is needed here since the main
focus regarding NAO's hardware is the movement of its arm joints within a small
game area.
The second box triggers NAO to sit down, as it is not necessary for it to stand
during this game.
The third box starts right afterwards and sets NAO into its initial position. It
actually modifies the sitting position so that NAO has a wider field of action
during the game.
Finally, NAO invites the human player to join the game.
Figure 6.2.: The Init box including all initial behaviors
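The four initialization steps above can be sketched as an ordered sequence. The NAOqi proxies are replaced here by a simple call recorder, so all function names are illustrative; on the robot the steps would go through modules such as ALMotion and a text-to-speech module.

```python
# Sketch of the init sequence: stiffness on, sit down, initial position,
# invite the player. The proxies are faked by recording calls, so the
# ordering can be shown without a robot.

calls = []

def set_stiffness_on():
    calls.append("stiffness on")       # joints get full power

def sit_down():
    calls.append("sit down")           # NAO does not need to stand

def go_to_initial_position():
    calls.append("initial position")   # modified sitting pose, wider reach

def invite_player():
    calls.append("invite player")      # NAO asks the human to join

for step in (set_stiffness_on, sit_down, go_to_initial_position, invite_player):
    step()

print(calls)
# ['stiffness on', 'sit down', 'initial position', 'invite player']
```

The order matters: stiffness must be enabled first, otherwise the following posture commands would have no effect on the joints.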
Main flow diagram box
This flow diagram box includes the implementation of the game's core behaviors.
These are split into another two flow diagram boxes: one includes NAO's perception
part and the other one the manipulation method.
Figure 6.3.: The Main box contains the core modules
Human player's Turn
The human player is always considered to start the game. He puts his block, with a
landmark attached to it, on the game field.
This box contains the modules for NAO's perception of its co-player's actions
within the game area. At the beginning, NAO has to switch to the bottom camera as
its visual component, to be able to look for the landmark attached to the block
that has been put on the ground.
It is then NAO's turn to perceive the game area and detect the landmark, which
implies that the human co-player has placed his block and finished his round.
The landmark detection triggers the behavior in which NAO opens its left hand and
asks for the block: as soon as the co-player places the block into its hand, NAO
waits for the signal to close the hand and to start its own turn. The behavior box
Tactile L. Hand passes NAO the signal to close its hand when the back of the left
hand is touched.
Figure 6.4.: NAO perceives human player´s actions
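The perception flow of this box can be sketched as a small state machine. The event strings and function names below are invented for this sketch; on the robot the events would come from the landmark-detection and tactile-sensor modules.

```python
# Sketch of the human player's turn: switch to the bottom camera, wait for
# the landmark, open the left hand, then wait for the tactile signal
# before closing the hand. Sensor input is simulated as a list of events.

def human_turn(events):
    """events: iterable of simulated perception events."""
    state = "watching"                        # looking for the landmark
    actions = ["switch to bottom camera"]
    for event in events:
        if state == "watching" and event == "landmark detected":
            actions.append("open left hand")  # ask for the block
            state = "waiting for block"
        elif state == "waiting for block" and event == "left hand touched":
            actions.append("close left hand") # hold the block
            state = "naos turn"
    return state, actions

state, actions = human_turn(["landmark detected", "left hand touched"])
print(state)     # naos turn
print(actions)   # ['switch to bottom camera', 'open left hand', 'close left hand']
```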
NAO's Turn
In this case, the "NAO's turn" recognition was realized by landmark detection. The
landmark detection could be implemented quite easily, as the modules for this task
are predefined Python modules that were taken from the Aldebaran site1 and modified
to the particular requirements of the task.
Figure 6.5.: Landmark Detection triggers Action “Grabbing
Block”
1https://community.aldebaran-robotics.com/doc/1-14/dev/python/examples.html#
python-examples
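A minimal sketch of the landmark polling follows. On the robot, the detection results are read from the ALMemory key "LandmarkDetected"; here the memory proxy is faked and the returned data layout is only indicated, so the sketch runs without a robot.

```python
# Sketch of the landmark-detection polling used to recognize NAO's turn.
# The memory proxy is faked; the real data written by the landmark module
# is a nested list with timestamp and mark information.

class FakeMemory:
    """Stands in for an ALMemory proxy; reports a mark on the third poll."""
    def __init__(self):
        self.polls = 0

    def getData(self, key):
        assert key == "LandmarkDetected"
        self.polls += 1
        if self.polls < 3:
            return []                    # nothing detected yet
        return ["timestamp", [["mark info"]]]   # placeholder mark data


def wait_for_landmark(memory, max_polls=10):
    for _ in range(max_polls):
        data = memory.getData("LandmarkDetected")
        if data:                         # non-empty value: a mark was seen,
            return True                  # the human has placed his block
    return False


print(wait_for_landmark(FakeMemory()))   # True
```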
6.2. Improvements
My turn recognition
The recognition of when it is NAO's turn should be improved by implementing
recognition based on non-verbal cues such as eye contact or gesture expressions.
Grabbing a block
Currently NAO does not grab the block by itself: the block is placed in its hand,
and by touching the tactile sensors on the hand it closes the hand and holds the
block it has been given.
Placement of a block
This was the part that caused the most difficulties, owing to NAO's gross joint
motion skills.
6.3. Issues with NAO's components
During the work with NAO, two major issues with NAO's hardware were experienced.
The bad camera quality resulted in difficulties with object recognition when the
software provided by Aldebaran was used. The specific problems were the following:
• The low camera resolution of 640x480 renders monitored objects pixelated; thus
objects with many details are difficult to recognize accurately.
• High sensitivity to lighting conditions: incidents such as light cloud cover
coming up or soft shadows covering small areas around the block led to distorted
object recognition and made it necessary to replace the block.
• Objects must be placed within the range of motion of NAO's head (Figure 6.6);
preferably an object should be placed directly in front of the cameras.
Figure 6.6.: NAO Head joints2
6.3.2. Joints overheating
The fast overheating of NAO's hand joints restricted the practical execution of
the implemented motion sequences.
• The testing of grabbing the block and putting it onto another one could not be
carried out for more than half an hour at a time.
• Sometimes NAO issued an overheating warning after 15 minutes and had to be
turned off for a quarter of an hour to cool down before it could be turned on
again.
7. Conclusion
The aim of this bachelor thesis was to implement a round-based tower-building game
with the NAO robot in terms of human-robot interaction. The tasks were to explore
the best recognition and interaction strategies with the aim of communicating on a
non-verbal level.
The implementation of the game under consideration of the above-mentioned aspects
was not successful. Reasons for this were, among other things, problems caused by
NAO's hardware components, such as the bad resolution of the camera as well as the
gross joint functionality. It was not possible for NAO to put its block exactly on
top of the other one. The reason for this is the inaccuracy that arises when code
is tested on the real NAO: the real robot's body makes small movements even while
sitting or just standing, which is an attribute of the motor joints.
Possible improvements would be to implement object recognition via the OpenCV
framework and to calculate an error tolerance for the joint movements. Some tasks
could thus not be completed, but by using other frameworks a successful
implementation can be realized.
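The suggested error tolerance for the joint movements can be sketched as a simple acceptance test: instead of requiring the commanded joint angle to be reached exactly, a placement is accepted when the measured angle lies within a tolerance band. The tolerance value of 0.05 rad is purely illustrative.

```python
# Sketch of an error tolerance for joint movements: accept a joint
# position if it deviates from the target by less than `tolerance` rad.

def within_tolerance(target_angle, measured_angle, tolerance=0.05):
    return abs(target_angle - measured_angle) <= tolerance

# e.g. the arm stopped 0.03 rad short of the target: still acceptable
print(within_tolerance(1.20, 1.17))   # True
print(within_tolerance(1.20, 1.05))   # False
```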
Bibliography
[1] Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern
Approach. Pearson, Boston, 3rd edition, 2010.
[2] Patrick Lin, Keith Abney, and George A. Bekey, editors. Robot Ethics: The
Ethical and Social Implications of Robotics. The MIT Press, 2012.
[3] Syamimi Shamsuddin, Hanafiah Yussof, Mohd Azfar Miskam, A. Che Hamid,
Norjasween Abdul Malik, and Hafizan Hashim. Humanoid robot NAO as HRI mediator to
teach emotions using game-centered approach for children with autism. In HRI 2013
Workshop on Applications for Emotional Robots, Tokyo, Japan, 2013.
[4] Global health and ageing, 2011.
[5] J. Broekens, M. Heerink, and H. Rosendal. Assistive social
robots in elderly care:
a review. Gerontechnology, 8(2), 2009.
[6] Martha E Pollack, Laura Brown, Dirk Colbry, Cheryl Orosz, Bart
Peintner, Sailesh
Ramakrishnan, Sandra Engberg, Judith T Matthews, Jacqueline
Dunbar-Jacob,
Colleen E McCarthy, et al. Pearl: A mobile robotic assistant for
the elderly.
[7] Michael A. Goodrich and Alan C. Schultz. Human-robot interaction: A survey.
Foundations and Trends in Human-Computer Interaction, 1:203–275, 2007.
[8] Kazuhiko Shinozawa, Futoshi Naya, Junji Yamato, and Kiyoshi
Kogure. Differ-
ences in effect of robot and screen agent recommendations on human
decision-
making. Int. J. Hum.-Comput. Stud., 62(2):267–279, 2005.
[9] Aaron Powers, Sara B. Kiesler, Susan R. Fussell, and Cristen
Torrey. Comparing
a computer agent with a humanoid robot. In Cynthia Breazeal, Alan
C. Schultz,
Terry Fong, and Sara B. Kiesler, editors, HRI, pages 145–152. ACM,
2007.
[10] Junichi Takeno. Creation of a Conscious Robot: Mirror Image
Cognition and
Self-Awareness. Pan Stanford Publishing, 1st edition, 2012.
[11] Kerstin Dautenhahn. Socially intelligent robots: dimensions of
human–robot in-
teraction. Philosophical Transactions of the Royal Society B:
Biological Sciences,
362(1480):679–704, 2007.
[12] Allison Bruce, Illah Nourbakhsh, and Reid Simmons. The role of
expressiveness
and attention in human-robot interaction, 2002.
[13] Roland Siegwart and Illah R. Nourbakhsh. Introduction to
Autonomous Mobile
Robots. Bradford Company, Scituate, MA, USA, 2004.
[14] Anthony L. Threatt, Keith Evan Green, Johnell O. Brooks, Jessica Merino,
Ian D. Walker, and Paul Yanik. Design and evaluation of a nonverbal communication
platform between assistive robots and their users. In Norbert Streitz and
Constantine Stephanidis, editors, Distributed, Ambient, and Pervasive
Interactions, volume 8028 of Lecture Notes in Computer Science, pages 505–513.
Springer Berlin Heidelberg, 2013.
[15] Jingguang Han, Nick Campbell, Kristiina Jokinen, and Graham Wilcock.
Investigating the use of non-verbal cues in human-robot interaction with a NAO
robot. In Proceedings of the 3rd IEEE International Conference on Cognitive
Infocommunications (CogInfoCom 2012), Kosice, 2012.
[16] Chrystopher L. Nehaniv. Classifying types of gesture and inferring intent.
In Proceedings of the AISB'05 Symposium on Robot Companions, pages 74–81. AISB,
2005.
[17] Paul Bakker and Yasuo Kuniyoshi. Robot see, robot do: An
overview of robot
imitation. In AISB96 Workshop on Learning in Robots and Animals,
pages 3–11,
1996.
[18] Chen Yu and Dana H. Ballard. Learning to recognize human action sequences.
In Proceedings of the International Conference on Development and Learning, 2002.
[19] Guido Schillaci, Sasa Bodiroza, and Verena Vanessa Hafner.
Evaluating the effect
of saliency detection and attention manipulation in human-robot
interaction. I.
J. Social Robotics, 5(1):139–152, 2013.
Appendix
Folder: Towerbuilding, including the Python files and the Choregraphe behaviors
Disclaimer
I hereby declare, in accordance with §17 (2) APO, that I have written the above
bachelor thesis independently and have not used any sources or aids other than
those indicated.