OTTO-FRIEDRICH-UNIVERSITY BAMBERG
Cognitive Systems Group

Shall we build a tower together?
A study of human-robot interaction with the humanoid robot NAO

Bachelor Thesis
in the degree course Applied Computer Science
Faculty of Information Systems and Applied Computer Science

Author: Ioulia Kalpakoula
Supervisor: Prof. Dr. Ute Schmid
Abstract

One main aspect of robotics research is the use of robots in the service of humans. Be it a service robot or a robot designed for educational purposes, they all have one main aspect in common: the foundations of artificial intelligence, and thus the research on human-robot interaction in robotics.
The goal of this thesis is to present a possible implementation of a tower-building game with simple blocks, played by a human player and NAO, under consideration of non-verbal communication aspects. It is therefore necessary to explore the possible interaction strategies that achieve a successful communication session between the human player and NAO.
Contents

2.1. Game strategy
2.2. Object recognition
3.1.1. Robotics
3.2. Components of Human-Robot-Interaction
3.2.1. Intelligence and Consciousness
3.2.2. Perception and Expression
3.2.3. Expressions
4.1. Robot Specifications
6.1. Implementation
6.2. Improvements
6.3.1. NAO camera
6.3.2. Joints overheating
List of Figures

2.2. Simple diagram of playing one round
3.1. Interaction of a robot with its environment
3.2. Interaction of a robot with its environment
3.3. Interaction of a robot with its environment
4.1. NAO Parts - NAO H25
4.2. NAO body parts
5.1. NAO's software components
5.2. NAOqi Process
6.2. The Init box including all initial behaviors
6.3. The Main box contains the core modules
6.4. NAO perceives human player's actions
6.5. Landmark Detection triggers Action "Grabbing Block"
6.6. NAO Head joints
1. Introduction
1.1. Motivation
Nowadays there are already various application domains in which humanoid robots are used as service assistants to support humans. The use of robots becomes more and more essential as they provide a variety of skills, enabling them to cover a large field of duties. Humanoids assist people not only in governmental tasks but also as service robots in home environments, e.g. taking care of elderly or handicapped people: as a caretaker for medication intake or as a housekeeper. Further development in society requires more qualified interaction solutions to cover upcoming deficits such as the labor shortage of nurse practitioners; hence humanoids assisting in medical centers.
The challenging task in human-robot interaction is enabling robots to explore and employ interaction strategies. Components of this are the perception of the verbal and non-verbal expressions of the interaction partner, followed by the robot's own expressions within the given context of interaction.
1.2. Objectives
The aim of this thesis is to develop and validate a possible solution for human-robot interaction by implementing a tower-building game with the humanoid robot NAO on the basis of the standard animation software and modules.
The focus is put on NAO's ability to recognize and react to non-verbal cues, as well as its performance in grabbing a block and putting it on another one.
Therefore a recognition module has to be implemented which allows NAO to distinguish its own blocks from the blocks of its game partner, and also from the blocks that have already been placed.
NAO also has to be able to recognize whether it is its turn or not by monitoring the game partner's non-verbal signals and the change of state of the play area.
1http://www.atp.nist.gov/eao/sp950-1/helpmate.htm
1.3. Project structure
The first part gives an overview of the general aspects of human-robot interaction and related work in this field, with the main focus on modes of interaction.
A short introduction to the architecture of the NAO robot will especially focus on those of NAO's components that have made the implementation more difficult. The project has been conducted using solely the software and hardware provided by the NAO distribution.
The basis of the realization is given by an introduction to the game strategy, including its position on human-robot interaction, in particular the planning of the object recognition and turn detection, as well as their possible implementation up to a certain point.
The evaluation part presents the analysis results of the implementation (with reference to efficiency and effectiveness), points out the issues faced, and suggests improvements referring to future work related to this tower-building game.
2.1. Game strategy

Given the focus on communication strategy and also on communication pathways, the rules of the tower-building game are kept deliberately simple.
In this thesis the rules are intended for two participants who alternately build a tower together by stacking blocks one above the other. The images in figure 2.1 illustrate how a successful game shall work.

• Each player gets an equal number of small styrofoam blocks; in this case two blocks per player were intended.
• Landmarks are fixed only on the blocks of the human player, so NAO can detect where the current block is.
• The human player starts the game by placing one of his blocks. Note: to create an equal basis, the starting order may be determined on a rotating basis.
• In the next turn NAO has to grab one of its blocks and place it on top of the block that was placed by the human participant in the previous round.
• Now there shall be a tower of two blocks: the block placed by the human participant and, on top of it, the block placed by NAO.
• The second round begins as the human player places another of his blocks on top of the tower.

The game goes on in turn-based mode until one of the conditions set out below terminates the game:
1. The game terminates successfully if all blocks of both participants have been placed on the tower without the tower falling over.
2. The game terminates unsuccessfully if the tower gets shaky and falls over onto the playground.
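The turn-based flow above can be sketched as a simple loop. The following Python sketch is illustrative only: the callable names (`human_places_block`, `nao_places_block`, `tower_stable`) are hypothetical stand-ins for the actual Choregraphe behaviors, not part of the implementation described in this thesis.

```python
# Illustrative sketch of the turn-based game flow (hypothetical names,
# not the actual Choregraphe implementation).

BLOCKS_PER_PLAYER = 2  # two styrofoam blocks per player, as in the rules above

def play_game(human_places_block, nao_places_block, tower_stable):
    """Alternate turns until all blocks are placed or the tower falls.

    The three arguments are callables standing in for the real
    perception and motion behaviors:
      human_places_block() -> None   (human puts a block on the tower)
      nao_places_block()   -> None   (NAO grabs and places a block)
      tower_stable()       -> bool   (did the tower survive the move?)
    Returns "success" or "failure" per the two terminating conditions.
    """
    for _ in range(BLOCKS_PER_PLAYER):
        # The human player always starts the round.
        human_places_block()
        if not tower_stable():
            return "failure"
        # NAO places one of its blocks on top of the human's block.
        nao_places_block()
        if not tower_stable():
            return "failure"
    # All blocks placed without the tower falling over.
    return "success"
```

With perfectly stable placements the loop runs two full rounds (four blocks) and ends in success; a single unstable placement ends the game immediately.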
2. Building a tower together
(a) Initial game set-up
(b) Human player begins and puts his block on the table
(c) NAO continues, putting its block on top of the other
(d) Human puts his second and last block
(e) NAO finishes the game by putting its last block

Figure 2.1.: Optimal course of the game
2.2. Object recognition

For the realization of the object recognition part, NAO's qualities in grabbing objects also had to be taken into consideration, concerning block attributes like size, material and perhaps the color of the object's surface.
A number of solution approaches came up and were tested:

• Wooden blocks, which children play with, had the perfect size but were not usable due to their flat surface: they could not be held by NAO and slipped out of its grip.
• Blocks of foam material were too soft; the joints clenched the foam block when grabbing it, and the blocks snapped out as NAO was about to open its hand.
• Blocks of styrofoam seem to fit NAO perfectly, as they can be held thanks to their rough surface and low weight.

After the shape and the material were chosen, a solution had to be found for how NAO would be able to recognize and distinguish the blocks.
For this task, landmarks were printed, cut out to the block's size, and one landmark was tacked onto every block of the human player.
It actually does not matter whether the same landmark number is used or not, as NAO only recognizes the current block on top of the tower placed by the human player.
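Since NAO only reacts to the landmark currently visible on top of the tower, the turn-detection logic reduces to noticing that a new landmark has appeared or the visible one has changed. A minimal sketch of this idea follows; it is independent of the NAOqi landmark module, and the event format (a list of detected landmark ids, topmost first) is an assumption for illustration:

```python
def new_block_placed(previous_top, detected_marks):
    """Return True if the human appears to have placed a new marked block.

    previous_top   -- landmark id seen on top of the tower last turn
                      (None before the first move)
    detected_marks -- list of landmark ids currently detected; the first
                      entry is taken as the topmost mark (assumed format).
    Because only the human player's blocks carry landmarks, any newly
    visible mark signals that it is NAO's turn.
    """
    if not detected_marks:
        return False          # nothing detected: the human has not moved yet
    current_top = detected_marks[0]
    # A mark where none was before, or a different mark than last turn,
    # means a new block is on top. (If the human reuses the same landmark
    # number, a real implementation would also compare the mark's position,
    # since the thesis notes the id itself is irrelevant.)
    return previous_top is None or current_top != previous_top
```

This captures only the decision step; subscribing to the detection events and reading them from the robot's memory is left to the implementation chapter.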
2.3. Goal
As has become clear, the game has no competitive part as games usually do; the significant and necessary factor is the communication between NAO and the human co-player.
The game ends successfully if both players were able to build up the tower using all of their blocks.
This in turn depends on how well the interaction strategies have been explored.
The diagram in figure 2.2 gives a simple overview of NAO's events while perceiving the game area.
Figure 2.2.: Simple diagram of playing one round
3.1. Foundations of HRI
The research on autonomous robots and agents is based on the fundamentals of artificial intelligence, which is the core discipline in informatics for the research and creation of intelligent systems. Besides the improvement and development of the technical components of robots, such as sensors for perception and effectors for movement, one should also consider the psychological aspect. Therefore artificial intelligence is the key to understanding human cognition and implementing this knowledge in a robot, thus enabling autonomous human-robot interaction. Russell and Norvig's "Artificial Intelligence" [1] is the standard reference that documents the field of artificial intelligence and correlated subfields such as intelligent agents and robotics.
3.1.1. Robotics
According to the definition of Russell and Norvig:
”Robots are physical agents that perform tasks by manipulating the
physical world.”
[1, 971]
Sensors and Effectors
A robot can be equipped with a variety of sensors that allow it to perceive its environment [1, 973-975]. Cameras evaluate visual stimuli of a certain environment, e.g. detecting objects and movements, or providing coordinate information of the environment for calculating distances to a certain subject.
Microphones scan the robot's environment for acoustic input, such as a speech command which triggers a specified action, e.g. greeting the user as he says "hello".
Through tactile sensors a robot is able to evaluate physical contact. Through them the robot recognizes whether an obstacle like a wall hinders a movement action, or whether a touch by a human initiates an action.
Figure 3.1.: Interaction of a robot with its environment
The effectors of a robot are the equivalent of human limbs and provide flexibility of physical movement, and therefore a wider action space in which the robot can manipulate the environment [1, 975-978]. A robot in industrial production usually does not need any leg effectors, since these robots typically are installed in a fixed position and therefore only use their arm and hand effectors. Robots that are used for exploration tasks clearly need leg and foot effectors in order to move and be able to explore the environment more efficiently. The effectors of a robot are essential for providing a higher level of autonomy, and they increase the complexity of realizable implementations. Figure 3.1 shows a simple interaction sequence of a robot in its environment.
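The sense-decide-act cycle of figure 3.1 can be condensed into a few lines of code. This is a generic agent loop, not NAO-specific; the `sense`, `decide` and `act` callables are placeholders for real sensor and effector bindings:

```python
def run_agent(sense, decide, act, steps):
    """Minimal sense-decide-act loop, as in figure 3.1.

    sense()         -> percept from the environment (camera, microphone,
                       tactile sensor readings)
    decide(percept) -> action chosen from the percept (internal processing)
    act(action)     -> applies the action via the effectors
    Returns the (percept, action) history for inspection.
    """
    history = []
    for _ in range(steps):
        percept = sense()         # sensors perceive the environment
        action = decide(percept)  # internal processing chooses an action
        act(action)               # effectors manipulate the environment
        history.append((percept, action))
    return history

# Toy usage: an "agent" that greets when it hears "hello".
log = run_agent(
    sense=lambda: "hello",
    decide=lambda p: "greet" if p == "hello" else "wait",
    act=lambda a: None,
    steps=1,
)
```

The loop structure is the same whether the percept is a camera frame or a recognized word; only the three callables change.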
3. Human-Robot interaction
Types of robots

The technical construction of robots depends on the field of application and thus on the tasks a robot has to perform. Russell and Norvig refer to this topic and define the following three categories of robot types [1, 971-973]:

• The mobility of manipulators is limited to a small space within the workplace they have been firmly fixed to. Their main focus lies on perceiving an assigned fixed environment and responding to certain conditions which trigger the manipulator to perform a particular processing task on objects.
Due to their simple technical construction in comparison to other robots, manipulators are used in industrial production such as the automobile industry or electronics manufacturing. As manipulators usually execute only a predefined program sequence for a certain task, they are quite simple and therefore cannot really be called intelligent agents.
The reason for this is straightforward: manipulators are incapable of learning behavior. The implemented code specifies the condition that will trigger a certain executable task of the robot, for instance grabbing an object off the conveyor belt for further processing as soon as the manipulator senses the specified object on the conveyor belt inside its defined workspace.
It can be concluded that manipulators also cannot be called autonomous agents, because they lack the capability to independently gain knowledge of the environment, as the knowledge is given by the programmer itself [1, 39].

• The second category of robot types is defined as mobile robots. In contrast to manipulators, this kind of robot has a much larger action space due to mobility features such as legs or wheels. As a result, mobile robots can move and perform actions autonomously, which allows for a much wider application range than that of the manipulators.
Possible application areas for mobile robots are as transport assistants, either in hospitals for food delivery or for containerized cargo. Mobile robots are not only used in the business sector but also as assistants in domestic use, for instance as vacuum cleaners.

• The combination of manipulators and mobile robots leads to the third category: mobile manipulators, of which humanoid robots are part.
The abilities of mobile manipulators extend to the perception of the environment and its situational manipulation, applying their effectors to reach requested goal states of the environment, triggered by a certain action.
The crucial factor here is the higher flexibility of moving the effectors: as there are no fixed workplace settings, the robot's environment gains a higher complexity. That implies the need for more complex algorithms in order to ensure the best possible interaction between the robot and the environment.
Challenging technical factors in humanoid robot research are the development of the best possible degree of motion of the robot's effectors and the perception and production of facial expressions. The tough part is realizing the psychological aspects, which are inevitable for human-robot interaction, since most interaction between humans is based on non-verbal cues.
3.1.2. Application area of robotics/human-robot interaction
Robots as service assistants already fill many roles, and the application area is continually growing with the technical progress in robotics.
Patrick Lin describes the three task attributes dull, dirty and dangerous as the key attributes that determine the application area of a robot.
As the key advantage of robots over humans he names the lack of emotional expressions, which makes some tasks easier for robots to handle than for humans. As examples he mentions the use of robots as volcano explorers, bomb squads or assistants in difficult surgeries (cf. [2, 4]).
Some of the fields in which humans take advantage of the benefits of robot assistants are:

Autism therapy
There have been many studies with humanoid robots in autism therapy. Especially the NAO robot seems, in contrast to other humanoids, to be more suitable as an interaction partner for children, surely also based on its cute and childlike appearance.
The team around Syamimi Shamsuddin has published many studies about human-robot interaction between NAO and children with autism. One of their studies focused on human-robot interaction where NAO teaches emotions to children with autism, as a significant deficit of autistic people is the inability to recognize and express emotions.
The study demonstrates a high acceptance of the robot with reference to NAO's human-lookalike body shape. The acceptance of NAO as an equal communication partner is expressed in the children's high level of attention and highly motivated cooperation towards NAO (cf. [3]).
This study ideally represents the successful integration of humanoid robots, not only as simple assistants for tasks humans can't or don't want to do, but also as assistants in tasks of direct communication with people, or, as in this case, with children in psychological therapy.
Personal Care and Home help
The most common home service robots are the Roomba vacuum cleaning robots by the company iRobot 1, which also produces other varieties of floor cleaners such as floor mopping robots.
Confronted with an ageing population resulting from decreasing fertility rates and increasing life expectancy [4], as well as a lack of workers in the health care sector, the question arises about care services for elderly people.
So there is an increasing tendency to use robots as assistants for elderly people, not only supporting them in their activities but also monitoring and maintaining the household in which the person lives [5].
Such a robot is the nursebot Pearl, developed by Carnegie Mellon University. Pearl is able to move autonomously and provides many interaction features such as speech recognition and facial detection.
As interaction with humans is the key feature of the robot, the developers also considered communication skills. Pearl plans and coordinates activities and schedules, for instance for taking medicine, and is able to intervene if any irregularities occur (cf. [6]).
Military
The perhaps most controversial and challenging task area is the use of robots for military purposes. Basically, military robots cover a wide range of fields of application, all with the objective to take over tasks which are too dangerous for humans or to provide safety functions for them, such as unmanned exploration of dangerous or impassable areas, bomb squad assistance, or monitoring a certain territory for enemies and, if necessary, attacking them.
Despite the obvious advantages of using military robots, heavy failures have shown the weaknesses of such complex constructions [7].
The authors Lin, Bekey and Abney refer in their reference book [2, 7] to an incident that happened in 2007, where a semi-autonomous robot cannon fired at and killed nine fellow soldiers.
This and other similar incidents raise criticism towards malfunctions of robots, which must be reliable, especially when it comes to protecting civilians or being a fighting comrade, as the consequences in these cases are much more fatal.
1http://www.irobot.com/us/learn/home/roomba.aspx
3.1.3. Autonomous agents vs. humanoid robots
As defined by Russell and Norvig:
“An agent is anything that can be viewed as perceiving its
environment through
sensors and acting upon that environment through actuators.” [1,
4]
Both the interaction of humans with a software agent and the interaction with a physical agent, thus a humanoid robot, have different effects depending on how well, or with what kind of reservations, they can be performed.
A significant study about the differences between a software agent and a robot is given by the publication of Shinozawa [8]. The comparison is based on an experiment in which a user had to select a color name that was recommended either by a software agent or a physical robot. To avoid any distortion from the subjects' subjective ratings, the appearance was set up similarly for both.
Taking the robot's three-dimensional appearance and the software agent's two-dimensional space into consideration, the results of their experiment pointed out the conformity of dimension between the robot, the agent and their interaction environment: the three-dimensional robot had greater influence with its recommendations if the experiment was set up in a three-dimensional environment, and vice versa for the experiment with the two-dimensional software agent.
In the study of Powers [9] the comparison was based on health interviews of a test person with a software agent, a robot projected on a computer monitor, and a physically present robot. The emphasis of this study was to research the influence of each of the agents on the behavior and attitude of the user.
The results showed that the social influence of robots on users was higher than that of software agents. The test persons classified the robots as more helpful and spent more time with them than with software agents. On the other hand, the test persons revealed much less information to the robot located in the same room than to the software agents, and could also remember more details about the interview when it was performed with software agents rather than robots.
3.2. Components of Human-Robot-Interaction
3.2.1. Intelligence and Consciousness
The principle of human-robot interaction is based on the ability of a robot to establish a successful interaction with a human. Therefore the robot should be capable of recognizing non-verbal cues correctly to provide the most natural approach in human-robot interaction. Accordingly, isn't it primarily necessary to be aware of one's own existence in order to have an internal knowledge base of one's own cognitive processes as well as of the environment?
For humans the cognitive state of subconsciousness provides significant support in everyday life. The main challenge for the human mind is the coping strategies that reduce the daily information overload down to the important matters. Subconscious perception unburdens the senses and the mind from the collapse that would result from processing all information in a fully conscious state of mind.
In the context of human-human interaction this usually means: subconscious perception of the non-verbal cues that are expressed by the interaction partner. Nevertheless, those non-verbal cues also get processed and affect the behavior towards the interaction partner.
It should be noted, though, that the classification of information into subconsciousness and consciousness is an individual matter and depends on individual experience.
Inevitably this leads to the question of a robot's self-consciousness, and of the role of consciousness in the context of the ability to perceive non-verbal cues.
In the publication "Creation of a Conscious Robot" [10], Junichi Takeno addresses in detail the understanding and development of conscious robots. His work not only covers principles of human consciousness and approaches in the development of conscious robots, but also inspires fundamental questions regarding the psychological aspects of consciousness and its implementation in a robot.
Research on intelligent robots and their effect on human-robot interaction has shown that the development of intelligent robots has become important: humans tend to accept and empathize with robots with social skills rather than those without [11], [12].
3.2.2. Perception and Expression
To be able to interact with its environment, a robot needs to gather and process information from it; thus a robot needs the ability of perception over the application domain. This can be realized using different sensors, whereby for autonomous robots the focus is put on acoustic, visual and tactile perception.

Acoustic perception provides the robot with the ability to filter, process and respond to audio stimuli from its environment, but also to record and play back sounds triggered by a certain task.
Beside the perception of simple stimuli, acoustic cognition must also include the perception of complex voice signals, such as speech, in order to achieve the most authentic possible communication basis between a humanoid robot and a human.
Even in a limited way, spoken words can trigger a certain action in the robot, or help the robot recognize and locate a known interaction partner based on individual voice attributes, such as loudness and tone.

Through the use of visual sensors, such as cameras, a robot is able to visually perceive its application environment. The accuracy of a robot's visual detection not only depends on various attributes of camera quality, such as resolution, color range and lighting compensation, but also on the robot's internal representation, following three rules given by Russell and Norvig:
The robot's internal representation of the environment should not only have a clear structure that enables it to react to environmental changes in a quick, efficient way, but should also include meaningful and sufficient information as a basis for decision-making. Furthermore, the internal representation values should be modeled with regard to ensuring consistency between the modeled state and the values that are represented in the real world [1, 978].
As the natural environment changes continuously and unforeseen events can create problems, the robot has to be able to react within a certain time frame, for instance avoiding previously absent obstacles in time.
The challenging part of robotics surely is the realization of sensors which keep working reliably even if characteristics of an environment change constantly, for example lighting conditions. Siegwart and Nourbakhsh [13, 93-94] deal in detail
with attributes that influence sensor performance:
A sensor's sensitivity specifies the degree to which a change in input values affects the output values, for instance the level of light sensitivity of a robot's cameras.
An error of a sensor produces inconsistency between the real values of an environment and the output values measured by the sensor. There are two kinds of errors, predictable and unpredictable ones. Predictable errors, also called systematic errors, are measurable errors triggered by modeled processes. In contrast, unpredictable or random errors cannot be calculated in advance, as they occur irregularly. These random errors include the color levels of the camera as well as hue errors concerning the level of brightness and contrast.
Precision defines the level of reproducibility of a sensor's input values, whereas accuracy measures the level of agreement between the true and recorded values of a sensor's output. Hence a high degree of accuracy is equivalent to low error rates.
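The distinction between precision (reproducibility) and accuracy (agreement with the true value) can be illustrated numerically. The following is a sketch using the usual spread and mean-error measures; the concrete formulas are an illustrative choice, not taken from [13]:

```python
from statistics import mean, stdev

def sensor_stats(readings, true_value):
    """Summarize repeated sensor readings of one known quantity.

    Precision is expressed here as the sample standard deviation of the
    readings (low spread = high reproducibility); accuracy as the
    absolute difference between the mean reading and the true value
    (low difference = small systematic error).
    """
    return {
        "spread": stdev(readings),                       # precision indicator
        "mean_error": abs(mean(readings) - true_value),  # accuracy indicator
    }

# A sensor with a systematic offset: precise but inaccurate.
offset = sensor_stats([10.9, 11.0, 11.1], true_value=10.0)
# A noisy but unbiased sensor: accurate on average but imprecise.
noisy = sensor_stats([9.0, 10.0, 11.0], true_value=10.0)
```

The first sensor reproduces almost the same value every time yet is off by a full unit; the second scatters widely but centers on the true value, matching the text's distinction between systematic and random errors.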
3.2.3. Expressions
Another essential part of human-robot communication is the perception of mimic, gesture and speech expressions. They support the reinforcement of expressed emotions and thoughts.
When humans communicate, they pass their expressions to their dialogue partner either explicitly or implicitly. The direct path, also called verbal communication, means expressing something by speaking to each other. Non-verbal communication, on the other hand, consists of mimic and gesture expressions and is therefore not always easily visible to the dialogue partner.
The realization of non-verbal cues in humanoid robots is therefore challenging, as there is no explicit information content available that could be predefined right away [14].
3.2.4. Manipulation and Locomotion
Another important aspect of human-robot interaction is the robot's degree of freedom of motion, which determines its physically active participation in human-robot interaction.
Motion is the hypernym for both terms, manipulation and locomotion.
Manipulator robots are a category of robots that are fixed in a certain workplace (see 3.1.1) and are only able to move objects from one point to another, usually through hand joint manipulation. As shown in figure 3.2, the robot's position within the defined application environment is fixed, while objects can be moved to various points, though only within the robot's workspace area.
Locomotion, on the other hand, means the ability of a robot to move itself from one point to any other within a defined environment (figure 3.3).
Movement types are similar to the human ones, for instance walking, running 2 or even swimming 3, though the NAO robot is limited to walking.
Figure 3.2.: Interaction of a robot with its environment
Figure 3.3.: Interaction of a robot with its environment
2http://www.bostondynamics.com/robot_bigdog.html
Beside technical factors, environmental aspects also have to be taken into consideration during development as a possible source of influence on the functionality of manipulation and locomotion.
Environmental aspects are, for example, the composition and structure of the ground, the inclination level, or the range of contact points on the robot's action path (cf. [13, 17]).
There are different types of motion mechanisms; the NASA Mars Rover 4 or four-legged robots such as Sony's AIBO 5 are two examples. However, this thesis will focus on two-legged robots, such as the NAO, and their ability to keep balance.
Especially humanoid service robots in housekeeping must be able to master more difficult motion sequences, such as walking up and down stairs without losing their balance. Honda's ASIMO 6 is an example of successful movement implementation in two-legged robots.
The focus of research on improvements in the human-robot interaction field is targeted at the communication skills of robots, as they are intended to interact with humans on a much more complex level of cognition. As a consequence, a robot should be conscious of its own behavior as well as its interaction partner's behavior. To do so, the robot needs to process and assign non-verbal cues in a logical manner so it can act or react in reasonable ways.
Many studies have investigated techniques in human-robot interaction using non-verbal cues [15], [16].
Bakker and Kuniyoshi [17], as well as the team of Chen Yu and Dana H. Ballard [18], described in their publications approaches for robots learning to recognize human behavior. One method is the explicit implementation of default behavior modes; the other is reinforcement learning. In addition to those methods they introduce a third learning method, based on imitating human behavior: the robot learns, one could say like a child, by imitating behaviors that are shown by humans.
3.4. Goal of HRI
The aim in human-robot-interaction research area certainly is to
achieve a successful
communication between humanoid robots and humans on a much more
complex level
of cognition on the part of the robots.
One objective is to improve or invent solution implementations for
representation of
interaction modes on robots considering robots ability on
perception and expression of
verbal and non-verbal cues.
With the current state-of-the-art in human-robot-interaction
humanoids already used
in therapy tasks as described in 3.1.2. Nevertheless, robots are
not fully accepted so-
ciety members at the present as there is yet the absence of
progress regarding in outer
appearance of robots and their ability to interact with humans
fully autonomously and
consciously.
Creating conscious robots raises new issues regarding the handling of ethical
questions and the securing of robots' rights in society; this should raise awareness
of all resulting social and ethical consequences.
This in turn leads to further questions: How can moral thinking be realized, and
what or who will be the authority that passes on the rules defining right thinking
and correct behavior?
Which rules shall serve as the underlying principles of human-robot interaction,
given cultural and social differences, or shall humanoid robots be built in a
culture-specific manner?
Shall robots exercise their interaction skills fully autonomously, or shall humans
be able to intervene in a robot's autonomy, and if so, to what extent and under
which circumstances are humans allowed to take control of the robot?
In sum, human-robot interaction research should aim not only at successful
communication skills for robots but also deal with the ethical consequences that
result from them.
4.1. Robot Specifications
NAO is a 58 cm tall humanoid robot developed by the French company Aldebaran
Robotics. The model used for this thesis is NAO H25.1
Regarding the impact of NAO's appearance on its human interaction partner, studies
[19] confirm that human attention towards a robot rises as the robot's appearance
becomes more human-like. The most significant feature of NAO's appearance is surely
its cute, round and child-like face design, which makes NAO a likeable interaction
partner.
Figure 4.1.: NAO Parts - NAO H25
1https://community.aldebaran-robotics.com/doc/1-14/
• Motion
NAO's movement can be controlled in various ways: its body is partitioned into
single joints, which allow very specific motion implementations, as well as into
joint groups for each body part, depending on the method used. Figure 4.2 below
gives an overview of the partition of NAO's body.
Figure 4.2.: NAO body parts 2
• Interaction
NAO's interaction modules allow it to interact on a human-like level.
Four microphones and two loudspeakers on its head allow NAO either to recognize
specific words inside a sentence or to recognize and react autonomously to a
complete sentence.
For visual perception NAO is equipped with two VGA cameras that provide a
resolution of 640x480 at over 30 frames per second.
NAO's cameras revealed some issues that complicated the object recognition part of
the implementation; these are described in detail in section 6.3.
Infrared support allows NAO to communicate with any other infrared-supporting
device, which means that it is possible to use NAO as a remote control or to
control NAO via a remote control.
Furthermore, NAO can connect to other NAO robots and communicate with them.
2https://community.aldebaran-robotics.com/doc/1-14/naoqi/motion/index.html
• Sensors and Bumpers
Bumpers and tactile sensors on NAO's head, chest, hands and feet allow perception
and communication via tactile cues.
These components can be associated with a specific predefined behavior that gets
triggered by touching the sensors. Furthermore, NAO is equipped with sonar
rangefinders, so it can estimate the distance to objects placed up to 70 cm away.
5.1. Embedded Software
NAOqi
The main software of NAO that runs on the robot is NAOqi. At start-up it loads a
list of modules with default behavior methods on the robot.1 The figure below gives
an overview of the structure of the NAOqi process, which is called a broker when it
runs on the robot. The broker provides a directory containing all modules and their
bound methods.
1https://community.aldebaran-robotics.com/doc/1-14/getting_started/software_in_
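This broker idea can be illustrated with a small, self-contained sketch. This is not the real NAOqi API: the Broker and FakeTextToSpeech classes and all their method names are invented here purely to show the concept of a directory that maps module names to modules and dispatches calls to their bound methods.

```python
# Illustrative sketch of the broker concept: a directory of modules and
# their bound methods. All names here are invented for the sketch and do
# not belong to the real NAOqi API.

class Broker:
    def __init__(self):
        self._modules = {}  # directory: module name -> module object

    def register(self, name, module):
        self._modules[name] = module

    def call(self, module_name, method_name, *args):
        module = self._modules[module_name]          # directory lookup
        return getattr(module, method_name)(*args)   # bound-method dispatch


class FakeTextToSpeech:
    """Stands in for a module such as ALTextToSpeech."""
    def say(self, text):
        return "saying: " + text


broker = Broker()
broker.register("ALTextToSpeech", FakeTextToSpeech())
print(broker.call("ALTextToSpeech", "say", "Shall we build a tower together?"))
# saying: Shall we build a tower together?
```

On the real robot the broker additionally handles network transport, so that modules can be called both locally and from a remote machine.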
Choregraphe
The desktop software Choregraphe (Figure 5.3) offers an easy way to interact with
and control the real or a simulated robot and to create behaviors in less time than
with NAOqi alone.
It already includes behaviors as predefined template boxes, which can be extended
with one's own Python code; Choregraphe thus makes it possible to create complex
behaviors in a convenient way.
An advantage is certainly the possibility of using a simulated robot, since this
makes it feasible to test movements without the risk that NAO falls or damages its
joints when wrong parameters for movement and rotation have been passed.
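Extending a template box with one's own Python code follows a fixed class layout, sketched below. The method names onInput_onStart and onStopped follow the Choregraphe box convention, but the GeneratedClass stub and the recorded output list are inventions of this sketch so that it runs without Choregraphe.

```python
# Minimal sketch of a Choregraphe script box. In Choregraphe the box class
# derives from a GeneratedClass provided by the tool; a small stub replaces
# it here so the example is self-contained.

class GeneratedClass(object):            # stub for Choregraphe's base class
    def __init__(self):
        self.outputs = []

    def onStopped(self):                 # stand-in for the box output signal
        self.outputs.append("onStopped")


class MyClass(GeneratedClass):
    def __init__(self):
        GeneratedClass.__init__(self)

    def onInput_onStart(self):
        # one's own behavior code goes here; this sketch only records
        # that the behavior ran before signalling the box has finished
        self.outputs.append("behavior executed")
        self.onStopped()


box = MyClass()
box.onInput_onStart()
print(box.outputs)   # ['behavior executed', 'onStopped']
```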
• Choregraphe also includes the application "Monitor", which gives access to further
settings of NAO's memory and of the camera module.
The camera module includes a small widget that displays what NAO currently sees. It
is also possible to record image streams or to take single pictures.
• The Simulation Helper Tool, kindly offered by community members,3 was very helpful
for testing behaviors without the real robot.
It comes with a graphical interface that simulates all NAOqi modules, thus making
it possible to test behaviors without necessarily needing the real robot.
Figure 5.3.: Choregraphe User Interface
The tool provides all modules needed for speech recognition, simulates a fake
module for visual recognition, and provides the sensor modules to simulate tactile
communication options.
5.3. SDK/IDE
The programming with NAO was done in Python, as Python supports access to the robot
and is already used in Choregraphe's implemented behaviors. The advantages of
Python lie in its minimalistic code structure, which results in clear and short
code while still providing flexibility, as it can be combined with software
components of the C++ API.
For structuring, testing and modifying the Python modules, the IDE PyCharm Free
Community Edition4 was used, as it comes with a well-defined code editor and useful
features that allow writing code in a very comfortable and therefore productive way.
4http://www.jetbrains.com/pycharm/
with NAO
6.1. Implementation
The implementation was done partially in Choregraphe and partially directly with
the Python SDK.
For the initialization part Choregraphe was used, as it provides many predefined
template boxes which can be used for standard positions and can easily be modified
by importing one's own Python code modules.
For the part where NAO has to put its block on top of the other one, it was more
convenient to write and test the code directly in the Python IDE described in 5.3.
The main advantage was the clearer structure, especially when it comes to
identifying bugs in the code.
6.1.1. Object structure in Choregraphe
Figure 6.1 below shows the main object structure of the project. For this part two
flow diagram boxes were used. Flow diagram boxes contain a variable number of
script boxes and were useful for grouping the script boxes according to the part of
the game in which they have to be executed.
Init flow diagram box
The initialization flow diagram box includes the basics needed before the game can
start at all. The basic modules for the initialization part are:
The first module in the row sets the stiffness of NAO's joints to on. In this state
the joints can be moved with their full power, which is needed here since the main
focus regarding NAO's hardware is the movement of its arm joints within a small
game area.
The second box triggers NAO to sit down, as it is not necessary for it to stand
during this game.
The third box starts right afterwards and sets NAO into its initial position. It
actually modifies the sitting position so that NAO has a wider field of action
during the game.
Finally, NAO invites the human player to join the game.
Figure 6.2.: The Init box including all initial behaviors
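The four initialization steps above can be sketched as an ordered sequence. The NAOqi proxies are replaced here by a simple call recorder, so all function names are illustrative; on the robot the steps would go through modules such as ALMotion and a text-to-speech module.

```python
# Sketch of the init sequence: stiffness on, sit down, initial position,
# invite the player. The proxies are faked by recording calls, so the
# ordering can be shown without a robot.

calls = []

def set_stiffness_on():
    calls.append("stiffness on")       # joints get full power

def sit_down():
    calls.append("sit down")           # NAO does not need to stand

def go_to_initial_position():
    calls.append("initial position")   # modified sitting pose, wider reach

def invite_player():
    calls.append("invite player")      # NAO asks the human to join

for step in (set_stiffness_on, sit_down, go_to_initial_position, invite_player):
    step()

print(calls)
# ['stiffness on', 'sit down', 'initial position', 'invite player']
```

The order matters: stiffness must be enabled first, otherwise the following posture commands would have no effect on the joints.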
Main flow diagram box
This flow diagram box includes the implementation of the game's core behaviors.
These are split into another two flow diagram boxes: one includes NAO's perception
part and the other one the manipulation method.
Figure 6.3.: The Main box contains the core modules
Human player's Turn
The human player is always considered to start the game. He puts his block, with a
landmark attached to it, on the game field.
This box contains the modules for NAO's perception of its co-player's actions
within the game area. At the beginning, NAO has to switch to the bottom camera as
its visual component, to be able to look for the landmark attached to the block
that has been put on the ground.
It is then NAO's turn to perceive the game area and detect the landmark, which
implies that the human co-player has placed his block and finished his round.
The landmark detection triggers the behavior in which NAO opens its left hand and
asks for the block: as soon as the co-player places the block into its hand, NAO
waits for the signal to close the hand and to start its own turn. The behavior box
Tactile L. Hand passes NAO the signal to close its hand when the back of the left
hand is touched.
Figure 6.4.: NAO perceives human player´s actions
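The perception flow of this box can be sketched as a small state machine. The event strings and function names below are invented for this sketch; on the robot the events would come from the landmark-detection and tactile-sensor modules.

```python
# Sketch of the human player's turn: switch to the bottom camera, wait for
# the landmark, open the left hand, then wait for the tactile signal
# before closing the hand. Sensor input is simulated as a list of events.

def human_turn(events):
    """events: iterable of simulated perception events."""
    state = "watching"                        # looking for the landmark
    actions = ["switch to bottom camera"]
    for event in events:
        if state == "watching" and event == "landmark detected":
            actions.append("open left hand")  # ask for the block
            state = "waiting for block"
        elif state == "waiting for block" and event == "left hand touched":
            actions.append("close left hand") # hold the block
            state = "naos turn"
    return state, actions

state, actions = human_turn(["landmark detected", "left hand touched"])
print(state)     # naos turn
print(actions)   # ['switch to bottom camera', 'open left hand', 'close left hand']
```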
NAO's Turn
In this case, the "NAO's turn" recognition was realized by landmark detection. The
landmark detection could be implemented quite easily, as the modules for this task
are predefined Python modules that were taken from the Aldebaran site1 and modified
to the particular requirements of the task.
Figure 6.5.: Landmark Detection triggers Action “Grabbing
Block”
1https://community.aldebaran-robotics.com/doc/1-14/dev/python/examples.html#
python-examples
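A minimal sketch of the landmark polling follows. On the robot, the detection results are read from the ALMemory key "LandmarkDetected"; here the memory proxy is faked and the returned data layout is only indicated, so the sketch runs without a robot.

```python
# Sketch of the landmark-detection polling used to recognize NAO's turn.
# The memory proxy is faked; the real data written by the landmark module
# is a nested list with timestamp and mark information.

class FakeMemory:
    """Stands in for an ALMemory proxy; reports a mark on the third poll."""
    def __init__(self):
        self.polls = 0

    def getData(self, key):
        assert key == "LandmarkDetected"
        self.polls += 1
        if self.polls < 3:
            return []                    # nothing detected yet
        return ["timestamp", [["mark info"]]]   # placeholder mark data


def wait_for_landmark(memory, max_polls=10):
    for _ in range(max_polls):
        data = memory.getData("LandmarkDetected")
        if data:                         # non-empty value: a mark was seen,
            return True                  # the human has placed his block
    return False


print(wait_for_landmark(FakeMemory()))   # True
```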
6.2. Improvements
My turn recognition
The recognition of when it is NAO's turn should be improved by implementing
recognition based on non-verbal cues such as eye contact or gesture expressions.
Grabbing a block
Currently NAO does not grab the block by itself: the block is placed in its hand,
and by touching the tactile sensors on the hand it closes the hand and holds the
block it has been given.
Placement of a block
This was the part that caused the most difficulties, owing to NAO's gross joint
motion skills.
6.3. Issues with NAO's components
During the work with NAO, two major issues with NAO's hardware were experienced.
The bad camera quality resulted in difficulties with object recognition when the
software provided by Aldebaran was used. The specific problems were the following:
• The low camera resolution of 640x480 renders monitored objects pixelated; thus
objects with many details are difficult to recognize accurately.
• High sensitivity to lighting conditions: incidents such as light cloud cover
coming up or soft shadows covering small areas around the block led to distorted
object recognition and made it necessary to replace the block.
• Objects must be placed within the range of motion of NAO's head (Figure 6.6);
preferably an object should be placed directly in front of the cameras.
Figure 6.6.: NAO Head joints2
6.3.2. Joints overheating
The fast overheating of NAO's hand joints restricted the practical execution of
the implemented motion sequences.
• The testing of grabbing the block and putting it onto another one could not be
carried out for more than half an hour at a time.
• Sometimes NAO issued an overheating warning after 15 minutes and had to be
turned off for a quarter of an hour to cool down before it could be turned on
again.
7. Conclusion
The aim of this bachelor thesis was to implement a round-based tower-building game
with the NAO robot in terms of human-robot interaction. The tasks were to explore
the best recognition and interaction strategies with the aim of communicating on a
non-verbal level.
The implementation of the game under consideration of the above-mentioned aspects
was not successful. Reasons for this were, among other things, problems caused by
NAO's hardware components, such as the bad resolution of the camera as well as the
gross joint functionality. It was not possible for NAO to put its block exactly on
top of the other one. The reason for this is the inaccuracy that arises when code
is tested on the real NAO: the real robot's body makes small movements even while
sitting or just standing, which is an attribute of the motor joints.
Possible improvements would be to implement object recognition via the OpenCV
framework and to calculate an error tolerance for the joint movements. Some tasks
could thus not be completed, but by using other frameworks a successful
implementation can be realized.
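The suggested error tolerance for the joint movements can be sketched as a simple acceptance test: instead of requiring the commanded joint angle to be reached exactly, a placement is accepted when the measured angle lies within a tolerance band. The tolerance value of 0.05 rad is purely illustrative.

```python
# Sketch of an error tolerance for joint movements: accept a joint
# position if it deviates from the target by less than `tolerance` rad.

def within_tolerance(target_angle, measured_angle, tolerance=0.05):
    return abs(target_angle - measured_angle) <= tolerance

# e.g. the arm stopped 0.03 rad short of the target: still acceptable
print(within_tolerance(1.20, 1.17))   # True
print(within_tolerance(1.20, 1.05))   # False
```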
Bibliography
[1] Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern
Approach. Pearson, Boston, 3rd edition, 2010.
[2] Patrick Lin, Keith Abney, and George A. Bekey, editors. Robot Ethics: The
Ethical and Social Implications of Robotics. The MIT Press, 2012.
[3] Syamimi Shamsuddin, Hanafiah Yussof, Mohd Azfar Miskam, A. Che Hamid,
Norjasween Abdul Malik, and Hafizan Hashim. Humanoid robot NAO as HRI mediator to
teach emotions using game-centered approach for children with autism. In HRI 2013
Workshop on Applications for Emotional Robots, Tokyo, Japan, 2013.
[4] Global health and ageing, 2011.
[5] J. Broekens, M. Heerink, and H. Rosendal. Assistive social
robots in elderly care:
a review. Gerontechnology, 8(2), 2009.
[6] Martha E Pollack, Laura Brown, Dirk Colbry, Cheryl Orosz, Bart
Peintner, Sailesh
Ramakrishnan, Sandra Engberg, Judith T Matthews, Jacqueline
Dunbar-Jacob,
Colleen E McCarthy, et al. Pearl: A mobile robotic assistant for
the elderly.
[7] Michael A. Goodrich and Alan C. Schultz. Human-robot interaction: A survey.
Foundations and Trends in Human-Computer Interaction, 1:203–275, 2007.
[8] Kazuhiko Shinozawa, Futoshi Naya, Junji Yamato, and Kiyoshi
Kogure. Differ-
ences in effect of robot and screen agent recommendations on human
decision-
making. Int. J. Hum.-Comput. Stud., 62(2):267–279, 2005.
[9] Aaron Powers, Sara B. Kiesler, Susan R. Fussell, and Cristen
Torrey. Comparing
a computer agent with a humanoid robot. In Cynthia Breazeal, Alan
C. Schultz,
Terry Fong, and Sara B. Kiesler, editors, HRI, pages 145–152. ACM,
2007.
[10] Junichi Takeno. Creation of a Conscious Robot: Mirror Image
Cognition and
Self-Awareness. Pan Stanford Publishing, 1st edition, 2012.
[11] Kerstin Dautenhahn. Socially intelligent robots: dimensions of
human–robot in-
teraction. Philosophical Transactions of the Royal Society B:
Biological Sciences,
362(1480):679–704, 2007.
[12] Allison Bruce, Illah Nourbakhsh, and Reid Simmons. The role of
expressiveness
and attention in human-robot interaction, 2002.
[13] Roland Siegwart and Illah R. Nourbakhsh. Introduction to
Autonomous Mobile
Robots. Bradford Company, Scituate, MA, USA, 2004.
[14] Anthony L. Threatt, Keith Evan Green, Johnell O. Brooks, Jessica Merino,
Ian D. Walker, and Paul Yanik. Design and evaluation of a nonverbal communication
platform between assistive robots and their users. In Norbert Streitz and
Constantine Stephanidis, editors, Distributed, Ambient, and Pervasive
Interactions, volume 8028 of Lecture Notes in Computer Science, pages 505–513.
Springer Berlin Heidelberg, 2013.
[15] Jingguang Han, Nick Campbell, Kristiina Jokinen, and Graham Wilcock.
Investigating the use of non-verbal cues in human-robot interaction with a NAO
robot. In Proceedings of the 3rd IEEE International Conference on Cognitive
Infocommunications (CogInfoCom 2012), Kosice, 2012.
[16] Chrystopher L. Nehaniv. Classifying types of gesture and inferring intent.
In Proceedings of the AISB'05 Symposium on Robot Companions, pages 74–81. AISB,
2005.
[17] Paul Bakker and Yasuo Kuniyoshi. Robot see, robot do: An
overview of robot
imitation. In AISB96 Workshop on Learning in Robots and Animals,
pages 3–11,
1996.
[18] Chen Yu and Dana H. Ballard. Learning to recognize human action sequences.
In Proceedings of the International Conference on Development and Learning, 2002.
[19] Guido Schillaci, Sasa Bodiroza, and Verena Vanessa Hafner.
Evaluating the effect
of saliency detection and attention manipulation in human-robot
interaction. I.
J. Social Robotics, 5(1):139–152, 2013.
Appendix
Folder: Towerbuilding, including the Python files and the Choregraphe behaviors
Disclaimer
I hereby declare, in accordance with §17 (2) APO, that I have written the above
bachelor thesis independently and have not used any sources or aids other than
those indicated.