Information Sciences 301 (2015) 305–344

Contents lists available at ScienceDirect

Information Sciences

journal homepage: www.elsevier.com/locate/ins

Extensive assessment and evaluation methodologies on assistive social robots for modelling human–robot interaction – A review

Doreen Ying Ying Sim a,⇑, Chu Kiong Loo b

a Faculty of Cognitive Science and Human Development, Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia
b Faculty of Computer Science and Information Technology, University of Malaya, Wilayah Persekutuan, Kuala Lumpur, Malaysia

⇑ Corresponding author. Tel.: +60 82 244942; fax: +60 82 332641.
E-mail address: [email protected] (D.Y.Y. Sim).

http://dx.doi.org/10.1016/j.ins.2014.12.017
0020-0255/© 2014 Elsevier Inc. All rights reserved.

Article info

Article history:
Received 15 August 2013
Received in revised form 28 November 2014
Accepted 4 December 2014
Available online 24 December 2014

Keywords:
Extensive evaluation
Assessment
Human-Robot Interaction (HRI)
Assistive social robot
A better modelling approach
New vision

Abstract

Assessment and evaluation methodologies, as well as combinations of them, for modelling Human–Robot Interaction (HRI) are reviewed extensively in this paper. Given the types of robots and the kinds of interactions involved in the modelling of HRI, we concentrate only on assistive social robots. We comprehensively review each of the evaluation and assessment methodologies applied to test the usability of assistive social robots, user acceptance of robots, and robot acceptance in terms of behavioural adaptation during HRI. The evaluation methodologies are reviewed on a primary and non-primary basis, while the assessment methodologies are reviewed by type of modelling approach. We then discuss the weaknesses, strengths and uniqueness of each type of past research work on the evaluation and assessment methodologies. Comparison and contrast tables are also provided. Lastly, this paper provides our recommended directions, new vision, inspirations and new insights for future research by highlighting the key areas for enhancing each of the past evaluation and assessment methodologies, so that a better modelling approach for HRI can be achieved. The contributions of this review paper are also discussed.

© 2014 Elsevier Inc. All rights reserved.

1. Introduction

Assessing the acceptance of robots requires a methodology, or a series of methodologies, of the kind often used to measure people's willingness to use a technology. This calls for a type of modelling commonly known as Technology Acceptance Modelling (TAM) [20,21,23,55,57,61,63,135]. A significant increase in the elderly population, a growing shortage of labour, and sharply rising daily living costs now pose extreme challenges to our society [55,61,63,152]. However, how many research projects explore the applicability of technological advances, such as Intelligent Systems, that enable people to live independently? In this paper, we discuss our notion of the concepts ‘social’ and ‘assistive’ within the context of robots in use today. The two main robot types that we review are therefore ‘social’ and ‘assistive’ (see Fig. 1). We then deepen our understanding of these concepts and review examples from past research of the development of these two robot types, mainly those involved in the long-term modelling of Human–Robot Interaction (HRI). Finally, we categorize their respective evaluation and assessment methodologies for HRI, and then compare and contrast these methodologies for long-term modelling of HRI, mainly in terms of their characteristics, strengths and weaknesses.

Fig. 1. A general categorization of robots (Heerink [55]). This paper mainly focuses on the Assistive Social Robots category, under which there are Service Robots and Companion Robots.

Human–Robot Interaction (HRI) is a rapidly advancing area of research, and as such there is a growing need for strong and efficient methods of assessment and evaluation, which bring credibility and validity to the scientific research. According to Kidd and Breazeal in 2005, and some other researchers, two primary issues observed in HRI studies are the lack of sufficiently large participant pools that closely represent the populations being studied, and the lack of multiple methods of assessment used to obtain convergent validity [5,92].
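To make the idea of convergent validity concrete, the following is a minimal sketch, assuming that two different assessment methods (a self-report rating and a psycho-physiological reading) have each produced one value per participant. The data values, variable names and the choice of Pearson correlation are illustrative assumptions, not taken from the studies cited above. A strong correlation between the two measures would be one sign that they converge on the same underlying construct.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sample has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data, one value per participant (not from any cited study).
self_report_stress = [2.0, 3.5, 4.0, 1.5, 5.0, 3.0]    # questionnaire rating
heart_rate_change = [4.0, 9.0, 11.0, 2.0, 15.0, 8.0]   # physiological reading

r = pearson_r(self_report_stress, heart_rate_change)
print(f"correlation between the two measures: {r:.3f}")
```

With only six hypothetical participants the result carries no statistical weight, which mirrors the first issue raised above: small participant pools limit what such a comparison can show.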

Social robots are robots to which people apply a social model in order to interact with them more efficiently, understand them more deeply and build more intimate relationships [22,31,37,40,42]. Focusing on social needs, some studies have demonstrated that robots can provide a kind of ‘pet-like’ companionship [30,40,174–177] (see Fig. 1). AIBO was the first interactive robot to prove successful in the commercial market [55,152], and it behaves like a real pet [42,152]. More interestingly, Kanda et al., in 2009, found that a series of abstraction techniques for people's trajectories, together with a service framework for using these techniques in a social robot, can enable a designer to make social robots proactively approach customers. The designer only needs to provide the robots with information about the target local behaviour [79]. In more recent research from 2011, Moriguchi et al. showed that young children can learn new actions and skills from a non-human agent, such as a robot [111].

All of this research demonstrates how robots can anticipate users' need for a social entity with which to build a good emotional relationship [40,55]. However, as shown by research done by Bickmore in 2004, as well as by Wu and Miller, together with some other researchers, in 2005, the possibility of building an emotional relationship not only responds to social needs, but also increases acceptability by society. This emotional relationship is combined with the ease of use of an interface that is controlled by social interaction [9,11,179,180]. In addition, research done by Kanda et al. in 2004, 2007 and 2009 has shown how recent progress in robotics could affect children's lives [80,88,89]. They showed that the robot significantly affected children's behaviours, feelings, and even their friendships. Their studies provided clues to the process by which children adapt to interactions with robots, and particularly to how they start to treat robots as intelligent beings [88]. All this past research has shown that robots significantly affect people of all ages [121].

In a similar vein, we also review recent related work on evaluation and assessment methodologies for HRI. Before any robot interface can be evaluated, it is necessary to understand the users' relevant skills and mental models in order to develop evaluation criteria with those users in mind [151,182]. In the past, evaluations based on empirically validated sets of heuristics were used on desktop user interfaces and web-based applications [120]. However, recent human–robot interfaces differ widely depending on the platforms and sensors, and existing guidelines are not adequate to support heuristic evaluation [120,182]. Hence, in this paper, we review both the evaluation and the assessment methodologies for HRI. In 2002, Scholtz proposed six evaluation guidelines that can be used as high-level evaluation criteria for Human–Computer Interaction (HCI) or Human–Robot Interaction (HRI) [144,182]. We now examine the different types of assessment and evaluation methodologies for HRI that researchers have studied and proposed so far.

2. Differences between the assessment and evaluation methodologies for HRI

In this paper, we categorize all the methodologies according to the differences between the assessment and evaluation methodologies applied by researchers for modelling HRI.

What are the differences between Assessment and Evaluation Methodologies? We categorize their major characteristics below in order to review how past researchers have used each of them for Human–Robot Interaction (HRI).

Major characteristics of the Assessment Methodologies for assessing HRI include:

(1) Process-oriented.
(2) Diagnostic approach, as it identifies areas for continuous improvement and long-term modelling of the HRI.
(3) Ongoing, as it is formative and improves the learning approaches that promote HRI over time.
(4) Continuous, as this methodology takes considerable time, e.g. learning approaches, monitoring control, etc.

Major characteristics of the Evaluation Methodologies for evaluating HRI include:

(1) Product-oriented.
(2) Judgmental approach, as it arrives at a concluding score, rating or grade for the HRI.
(3) Final, as it is summative and sums up the evaluation scores or ratings for the HRI.
(4) Discrete, as this methodology does not need a significant amount of time, e.g. experimental results, scores from users' questionnaire feedback, evaluation scores from participants and the like.
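The contrast between the two lists can be illustrated with a minimal sketch, assuming that an evaluation reduces a completed questionnaire to one final score, while an assessment tracks how a score changes across repeated sessions. The function names and the data are hypothetical and serve only to show the summative versus formative distinction; they do not reproduce any methodology reviewed here.

```python
from statistics import mean


def summative_score(ratings):
    """Evaluation-style: one concluding score from a finished questionnaire."""
    return mean(ratings)


def formative_changes(session_scores):
    """Assessment-style: change in score between consecutive sessions."""
    return [later - earlier
            for earlier, later in zip(session_scores, session_scores[1:])]


# Hypothetical data on a 1-5 rating scale.
questionnaire_ratings = [4, 5, 3, 4]        # one participant, one session
scores_over_sessions = [2.5, 3.0, 3.0, 4.0]  # same participant, four sessions

final_score = summative_score(questionnaire_ratings)
changes = formative_changes(scores_over_sessions)
print(final_score, changes)
```

The first function yields a single discrete judgement, whereas the second exposes where the interaction improved or stalled between sessions, which is the diagnostic information an ongoing assessment is meant to provide.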

Evaluation methodologies for HRI are reviewed on a primary and non-primary basis. The primary evaluation methodologies evaluate the HRI directly, while the non-primary ones evaluate it indirectly (see Tables 1 and 2 for detailed illustrations). Non-primary evaluation is based on other parameters such as numerical analysis of body movements, Ease of Classification (EOC), and non-verbal behaviours of the robot (see Table 2 for more details). Evaluation methodologies from Human–Computer Interaction (HCI) and Computer-Supported Cooperative Work (CSCW) can be adapted for use in HRI provided they take into account the complex, dynamic, and autonomous nature of robots [155,182].

Assessment Methodologies applied to HRI are mainly based on the types of models and modelling approaches. Assessment Methodologies for (I) humanoid robots and (II) non-humanoid robots, such as embodied robots, are profoundly different [65,68,98,106,112]. We review these assessment methodologies with respect to human–humanoid robot interaction and human–non-humanoid robot interaction, respectively. Different types of models and modelling approaches are reviewed to see how researchers have applied them to assess the HRI and to model it over the long term.

3. Types of modelling and measurement tactics involved in the assessment methodologies for HRI

Before reviewing each of the Assessment and Evaluation Methodologies specifically, we review the modelling and measurement tactics involved in the assessment and long-term modelling of HRI, i.e. in the Assessment Methodologies.

3.1. Social models mainly involved in assessing the emulation of empathy during HRI

In the studies done by Burke et al. in 2004, especially in their research insights into HRI in the larger context of urban search and rescue, the information needs of the operator in HRI fell into several categories. These include (I) information about the status of the robot, (II) information about the robot's environment [15], and (III) information about victims found in the environment [13]. So, how did they assess the HRI in terms of the information given to the operators of the robots? The information about the status of the robot and the robot's environment is necessary for real-time monitoring and control, or supervision, of the search. The operator involved in the HRI uses information about the victim's state and location to ensure coverage [13,182].

The behavioural control architecture presented in the research done by Tapus and Matarić [167] takes different factors into account. These factors include (I) proxemics, (II) verbal and non-verbal communication, and (III) robot activity [166,168]. According to their research, two of those elements are also very useful in emulating empathy: verbal and non-verbal communication. Proximity, or interpersonal distance, is another important element that they explored because it plays a key role in human interactions. Their research notes that people have stronger empathic emotions and reactions when the interaction episodes involve others with whom they have a social relationship (such as friends and family) or a common background (such as a person who lived through a similar experience) [166].

The above research stressed that, in order to use the factors discussed above, humans need to create strong bonds with robots that are similar in nature to those formed with other humans. The authors found that understanding human affect and reacting appropriately to different social situations, so as to avoid misunderstandings and permit natural human–robot interaction, leads to an improved empathic appearance of the robot. As they state, verbal and non-verbal communication provide social cues that make robots appear more intuitive and natural. According to them, there are two ways of mediating empathy: (1) cognitive and (2) affective. In terms of cognitive empathy, the robot should show empathy as if it understands others' emotions; in terms of affective empathy, the robot can behave as if others' emotions affect it. To convey affective empathy, the robot should manifest emotions through its facial expressions, voice, body postures, movements, and gestures so as to fit the situational context [166]. The robot does not feel empathy in any real sense; it projects empathy through the recognized means of expression surveyed in Ekman's research [33–35]. In 2013, Leite et al. argued that a robot's empathy needs to be sufficiently believable, but not so realistic as to provoke expectations that cannot be met in reality [101].

Table 1
Strengths, weaknesses and uniqueness of the primary evaluation methodologies for long-term modelling of HRI. Authors listed are the key authors only, in chronological order; references given for each methodology are the key representative ones.

(1) Self-assessment or self-report methodologies [5,8]
    Author(s): Bethel et al. [8]; Bethel and Murphy [5]
    Strengths: Much easier and faster to measure than objective methods, because most of the work consists of questionnaires that participants fill in for subjective assessment. Some parts are simply observations, and no specialised measurement requiring prior knowledge or prior study is involved.
    Weaknesses: (1) Problems with validity and corroboration, because participants may answer questions unrealistically; (2) this is a subjective method, as questionnaires are used, so bias may be incurred; (3) observers may not be able to directly corroborate the information provided by the participants, so observation results may not be accurate.
    Favourable assessment model(s) or modelling approaches to incorporate: (1) Behavioural Adaptation Model; (2) Human Gaze Model or other gaze models; (3) User-Personality Matching Model; (4) Empathic/Emotional/TAME/Therapeutic/Psychological Model; (5) Human Friendship Estimation Model; (6) Godspeed Key Concepts Modelling; (7) UTAUT Model.

(2) Behavioural measurements through observations [5,8]
    Author(s): Bethel et al. [8]; Bethel and Murphy [5]
    Strengths: Less tedious than subjective measurements, as mostly observation work is involved. Observation usually works well alongside participants' self-assessment responses.
    Weaknesses: This method can be very biased, because the ‘Hawthorne effect’ is a well-known phenomenon in observation studies that can introduce bias.
    Favourable assessment model(s) or modelling approaches to incorporate: same as for methodology (1) in this table.

(3) Psycho-physiological measurements [5,7,8]
    Author(s): Bethel et al. [7,8]; Bethel and Murphy [5]
    Strengths: (1) A very objective method and hence much less biased; (2) a non-invasive method that determines the stress levels and reactions of participants interacting with the technology; (3) video observations are often recorded for visual and auditory information, making this the least biased method.
    Weaknesses: (1) It can complicate the process, as the results may not be straightforward and confounds can lead to data misinterpretation; (2) participants' information needs to be obtained prior to the study; (3) the measurement process can be tedious and/or time-consuming; (4) measurements of participants' autonomic system responses may not be very accurate due to certain confounding factors.
    Favourable assessment model(s) or modelling approaches to incorporate: (1) Behavioural Adaptation Model; (2) User–Robot Personality Matching Model; (3) Empathic/Emotional/TAME/Therapeutic/Psychological Model; (4) Godspeed Key Concepts Modelling; (5) UTAUT Model.

(4) Task performance metrics [5,8,14,119,122,126,158] (may incorporate the sub-evaluation methodologies that compare measurements of body movement interaction between a humanoid robot and humans with subjective evaluation results [25,26,39,82,83])
    Author(s): Kanda et al. [83]; Olsen and Goodrich [122]; Burke et al. [14]; Kanda et al. [80,82]; Steinfeld et al. [158]; Bethel et al. [8]; Bethel and Murphy [5]; Mutlu et al. [119]; Cooney et al. [25,26]; Frank et al. [39]; Pateraki et al. [126]
    Strengths: (1) Very useful when HRI involves more than one person or one robot; it is very good for team scoring, since variables of interest can be pre-set in the selection criteria for task performance; (2) a much less biased approach when the assessment methodologies include comparisons with the subjective evaluation results.
    Weaknesses: (1) Not very suitable for one-to-one HRI; (2) a less flexible method if major subjective assessments are needed from the participants; (3) the metrics designed may not be thorough and/or specific enough to evaluate every kind of HRI; (4) when this method is too HRI oriented, it is not general enough to apply to every kind of HRI; (5) not very suitable for robots that are not humanoid; (6) not very suitable for HRI that involves mainly verbal behaviours, where actions are not important; (7) only well-coordinated behaviours correlate with the subjective evaluation scores, so its application to body movements in general is limited.
    Favourable assessment model(s) or modelling approaches to incorporate: (1) Temporal Awareness Model/Timing Model; (2) Human–Robot Team (HRT) Modelling or Tele-Operated Multiple Robot Model; (3) Robot Awareness Model; (4) Model of Integrated Humans' Shared Intentions, e.g. Haptic Channel, Motion Planning Model through Play Interactions, and the like.

Table 2
Strengths, weaknesses and uniqueness of each of the non-primary evaluation methodologies. Authors listed are the key authors only, in chronological order.

(1) Ease of Classification (EOC) formula scoring method for evaluating societal acceptance of robots
    Author(s): Mutlu et al. [118]; Gockley et al. [45]; Riek and Robinson [131]; Sung et al. [163]; Mutlu et al. [117]
    Strengths: (1) EOC is used as a handy basis for comparison among different user groups, physical interaction spaces, and the like; (2) quick and easy for indirectly evaluating HRI; (3) it is designed to be a flexible metric, as a non-primary method, that can accommodate the needs of different user groups and user types.
    Weaknesses: (1) The EOC method is designed to deal with users' "first impressions" of a robot, and their views may change over time; (2) it is unclear how straightforward it will be to adjust the EOC score to fully accommodate different contexts, user bases and robot types. The method is therefore straightforward but less specific.
    Uniqueness: The EOC formula score is generally very applicable as a non-primary evaluation methodology for most types of HRI, because the scoring formula and method are very straightforward and standardised.

(2) Using a robot's conversation as a passive social medium (Robot Manzai is used as a passive social medium through which indirect evaluation of HRI is conducted)
    Author(s): Hayashi et al. [52]; Hayashi et al. [53]; Hayashi et al. [51]
    Strengths: Development of a multi-robot cooperation system for HRI; an indirect evaluation that uses humanoid robots as passive social media, serving like televisions or computers.
    Weaknesses: Timing and technical adjustments between robots, as passive social media, in the multi-robot cooperation system can sometimes be difficult.
    Uniqueness: (1) The Robot Manzai system shows the potential of robots as passive social media (such as televisions); (2) a robot acting as a passive social medium is the most effective way of attracting people's interest.

(3) See-Puck, the open interactive robot platform for exploring human–robot relationships
    Author(s): Jacobsson et al. [74]; [73]
    Strengths: (1) Users can indirectly evaluate the HRI by influencing the visuals of GlowBots. The evaluation outcome is a slowly and constantly evolving collection of autonomous robotic displays; (2) all hardware and software are open source, so that anyone can revise, extend or improve the displays; this open robot platform encourages non-primary evaluation of HRI.
    Weaknesses: (1) GlowBots were demonstrated for only a few days, so they could not show a truly long-lasting relationship between robots and humans; it may be necessary to sustain interest over weeks and months; (2) as the LEDs were not used as sensors, the LED display could not be made touch sensitive. Users therefore could not directly influence what is seen on the display, which makes indirect evaluation of HRI less reliable or accurate.
    Uniqueness: (1) The See-Puck platform enables users to explore new roles for robots in everyday environments; (2) the platform is not limited to a particular application; (3) in the proof-of-concept application, humans and robots can engage in a playful, open-ended interaction; (4) See-Puck opens up more opportunities in robotics, and the like, during HRI.

(4) Numerical analysis of body movements, measuring the body movement interactions between a humanoid robot and humans
    Author(s): Kanda et al. [83]; Kanda et al. [82]
    Strengths: (1) Widely applicable in embodied communication, and based on a psychological method; (2) estimation of momentary evaluation makes robots more adaptive in interacting with humans; (3) indirect evaluation can relate conversational expressiveness to social presence and acceptance of a robot.
    Weaknesses: (1) As an optical motion-capturing system is used to measure body movements, high resolution in time and space is needed; (2) the system may create technical difficulties and inaccuracies during measurements; (3) analyses of body movements may be tedious or time-consuming.
    Uniqueness: (1) Comparisons of body movements and the entrainment score reveal the importance of well-coordinated behaviours and the performance of the developed interactive behaviours; (2) estimation of the momentary evaluation score is a good analytical approach for HRI.

(5) Non-verbal behaviours of the robot and/or the robot's design(s) have appropriate predictabilities for users to tune its behaviour through manipulation; this is consistent with Proximity Theory in psychology and can combine well with societal acceptance modelling for HRI
    Author(s): Kamasima et al. [76]; Kanda et al. [80]; Kozima et al. [97]; Kanda et al. [86]; Kozima et al. [96]; Mutlu et al. [119]; Satake et al. [142]; Shukla and Tripathi [154]; Liu et al. [104]; Sorbello et al. [156]
    Strengths: (1) The robot's design and/or its non-verbal behaviour is effective in intuitively conveying the robot's expressions of attention and emotion; (2) as the robot's non-verbal behaviours may be affected by users' impressions and other attributions, using Proximity Theory to indirectly evaluate HRI is appropriate.
    Weaknesses: The robot's non-verbal behaviours may take a long time to establish interpersonal coordination and interactional synchrony with the participants before the non-primary evaluation of HRI can be conducted.
    Uniqueness: (1) Synchronized rhythm in establishing engagement during HRI can be evaluated using non-primary evaluations of proximity; (2) observed non-verbal behaviours suggest that the robot's design is effective in motivating the user to share mental states; (3) this evaluation methodology can be used to enrich Societal Acceptance Modelling and Psychological with Therapeutic Modelling.

3.2. Technology Acceptance Model (TAM) and various types of methodologies involved in user acceptance

As shown in Fig. 1, assistive robots can be divided into social robots and non-social robots. Assistive social robots can be further divided into two subcategories, i.e. service robots and companion robots [55]. The non-social type concerns physical assistive technology that is usually developed for rehabilitation purposes but is not in any way socially interactive. The other type, the social assistive robot, is by contrast very socially interactive. Social assistive robots are systems that can be perceived as social entities which communicate with the user, or with which the user communicates, for example through touching and sensing. There is an overlap between these two categories, since there are also projects on social robots intended for rehabilitation purposes [55,169], but in robotics these are generally separate fields [55].

For HRI, especially for robots and screen agents (see Fig. 4a), the methods applied range from heuristic evaluation [24] and other usability-type tests [182], through classification tests [131] and role-based evaluation methodologies [143], to physical response measurements [28,57]. The Technology Acceptance Model (TAM) is used as a methodology that provides insight not only into the probability of acceptance of a specific technology, but also into the influences underlying acceptance tendencies [57,135]. In 1983, Davis proposed the Interpersonal Reactivity Index (IRI) to assess empathy [32] during interaction. Since our main focus is on the evaluation and assessment methodologies for human–robot interaction, this paper mainly reviews the critical evaluation and assessment methodologies that researchers have developed to promote long-term modelling of HRI. Davis' Interpersonal Reactivity Index is reviewed only as one of the assessment methodologies that past researchers, such as Gonsior et al. in 2011 [47], have incorporated to promote HRI.

Venkatesh et al. [172] evaluated eight theoretical models that employ intention and/or usage as the key dependent variable, i.e. (i) Theory of Reasoned Action, (ii) Motivational Model, (iii) Theory of Planned Behaviour (TPB), (iv) TAM, (v) a combined TAM and TPB Model, (vi) Model of PC Utilization, (vii) Innovation Diffusion Theory, and (viii) Social Cognitive Theory [55,172]. The result of this process is the UTAUT (Unified Theory of Acceptance and Use of Technology) model, which has been used to study the acceptance of robots [55,105,135]. It identifies (i) Performance Expectancy, (ii) Effort Expectancy, and (iii) Social Influence as the direct determinants of Intention to Use, or Behavioural Intention (see Fig. 5b) [55,105,135,172].
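The relationship between the three direct determinants and Intention to Use can be illustrated with a minimal sketch, assuming a simple weighted linear combination of questionnaire construct scores. The weights, the class and function names, and the 1–5 scale below are hypothetical and chosen only for illustration; in an actual UTAUT study the weights would be estimated from survey data rather than fixed in advance, and the model also includes moderating variables that are omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UtautResponse:
    """Mean construct scores for one participant (hypothetical 1-5 scale)."""
    performance_expectancy: float
    effort_expectancy: float
    social_influence: float


# Hypothetical weights, NOT estimates from any cited study.
HYPOTHETICAL_WEIGHTS = {
    "performance_expectancy": 0.5,
    "effort_expectancy": 0.3,
    "social_influence": 0.2,
}


def intention_to_use(response, weights=HYPOTHETICAL_WEIGHTS):
    """Weighted linear combination of the three direct determinants."""
    return (
        weights["performance_expectancy"] * response.performance_expectancy
        + weights["effort_expectancy"] * response.effort_expectancy
        + weights["social_influence"] * response.social_influence
    )


participant = UtautResponse(
    performance_expectancy=4.0,
    effort_expectancy=3.0,
    social_influence=5.0,
)
predicted_intention = intention_to_use(participant)
print(f"illustrative Intention to Use score: {predicted_intention:.2f}")
```

Because the hypothetical weights sum to one, the resulting score stays on the same 1–5 scale as the inputs, which keeps the illustration easy to read.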

3.3. Robots and network systems involved in user–robot personality matching and human-friendship modelling

For human friendship estimation, especially with humanoid and communication robots, Kanda et al. in 2008 proposed a model for estimating human friendships in the presence of a humanoid robot, based on the analysis of non-verbal HRI and inter-human interaction [87]. This is because they found that the different appearances of robots did not affect the participants' verbal behaviours but did affect their non-verbal behaviours, such as distance and delay in participants' responses [86,87]. This model extends the earlier estimation models that Kanda et al. developed in 2006 [77] and 2007 [89] (see Table 3 for detailed comparisons).

See Fig. 2a and b for the differences between the two humanoid robots, i.e. Robovie [82,86] and ASIMO [139]. These differences are explained by two factors: impressions and attributions [86]. Fig. 3 shows the environment and the positions of the participant and the two robots in the experiments. Partner robots and screen agents (see Figs. 4a–c) have been used extensively because they act as human peers in everyday life. They provide mental and communicational support for humans as well as physical support [102]. Pet robots (see Figs. 4b and 4c) have been used successfully in mental therapies for the elderly [55,82,161]. The conversational (vocal) ability of robots helps humans to retrieve information through a computer network, and creates friendlier relationships with humans. Partner robots facilitate effective multimodal communication in order to complete an arbitrary set of tasks together with humans [82].

In contrast, Kanda et al. in 2010 created a robot system in a shopping mall that detected a person with floor sensors to initiate interaction, and the robot was partially tele-operated to avoid the difficulty of speech recognition. The HRI effect was shown to be even better [91].

3.4. Modelling approaches and models involved in behavioural adaptation to model HRI

According to Tapus and Mataric's research work in 2008, behavioural adaptation is another recognized challenge in promoting a more efficient HRI. This is because creating robotic systems that are capable of adapting their behaviours to the user's personality, preferences and profile, in order to provide an engaging and motivating customized protocol, is a challenging target. This applies especially when working with vulnerable user populations and with interaction zones or proxemics that include intimate, personal, social and public zones (see Fig. 9) [47,167]. Developments in the last decade in the field of robotics have ushered interactive robots into socially assistive applications [20,21,30,61,63,87,165]. Shibata and Tanie in


Table 3. Strengths, weaknesses and uniqueness of the assessment methodologies to assess human–robot interaction.

Columns: Author(s) (key authors only) | Type(s) of Model(s) used on assessing the HRI | Weaknesses (specific Assessment Methodologies applied) | Year(s) of each research | Characteristics of the methodologies applied for long-term modelling of HRI

Author(s): Heerink [55]; Hennington and Janz [66]; De Ruyter et al. [135]; Heerink et al. [58,64]
Model(s): UTAUT Model
Weaknesses: (1) UTAUT model was applied to robot technology in a limited study. So, model needs adaptation in the sense of extension, modification or both; (2) must take the notion of social acceptance into deep consideration that complements technology acceptance; (3) Evaluation is based on the accuracy in predicting social acceptance; (4) it is a subjective approach
Year(s): 2005 [135]; 2006 [58]; 2006 [64]; 2007 [66]; 2010 [55]
Characteristics: Adapt to the specific requirements of evaluating social assistive robots through the five selected constructs, which are exactly the standard constructs of UTAUT

Author(s): Heerink et al. [57,62]
Model(s): Almere Model
Weaknesses: This model is very subjective and can be even more subjective than UTAUT model as they use participants' self-assessment questionnaires to obtain convergent validity for HRI behavioural measurements
Year(s): 2010 [57,62]
Characteristics: (1) Almere Model is very appropriate for the elderly group's assessments; (2) Participants' self-assessment responses are used for the convergent validity of evaluation models used. Hence, more applicable generally to most types of HRI, mainly for elderly

Author(s): Bartneck et al. [2,3]; Gonsior et al. [47]
Model(s): Godspeed Five Key Concepts Modelling
Weaknesses: (1) Godspeed Key Concepts is not an objective method, this incurs bias; (2) There are certain overlap between some concepts, such as in between anthropomorphism and animacy, and the like; (3) it is extremely difficult to determine the ground truth, e.g. how anthropomorphic a certain robot is. So, this modelling approach may not be robust enough
Year(s): 2008 [3]; 2009 [2]; 2011 [47]
Characteristics: (1) Emphasizes the need for standardized measurement tools for HRI; (2) The five Godspeed questionnaires use 5-point scales to help robots' creators on their development; (3) The five consistent questionnaires that use semantic differential scales, and psychological measures are taken into account

Author(s): Yanco et al. [182]; Hayashi et al. [51–53]; Kanda et al. [84]; Jacobsson et al. [73,74]; Pateraki et al. [126]
Model(s): Incorporating HCI/CSCW into Human–Robot Interaction (HRI) Model; (or) Incorporating HCI alone into HRI Model
Weaknesses: (1) Prior research on the personality issues may be required before applying this model; (2) Adaptation process may take quite a long time as a large number of parameter values and the role of robot's personality in the hands-off or other therapy, research or entertainment processes have to be investigated; (3) focus on the relationship between the extroversion–introversion of the robot and the user and the robot's ability to adapt its behaviour may be tedious
Year(s): 2002 [84]; 2004 [182]; 2005 [52]; 2007 [53]; 2007 [74]; 2008 [51]; 2008 [73]; 2014 [126]
Characteristics: (1) Provides guidelines for developing interfaces for HRI; (2) These assessment guidelines are used as frameworks for future assessments; (3) While robots act as passive-social media, they are assessed just based on their abilities to convey information in the public, no interaction with humans is needed

Author(s): Tapus and Mataric [166–168]; Bickmore and Schulman [10]; Brave et al. [12]; Tamura et al. [165]; Dautenhahn et al. [30]; Groten et al. [49]; Hu and Loo [70]
Model(s): (1) HRI Models; (2) User Personality Matching Model or User–Robot Personality Matching Model; (3) Behavioural Adaptation Model, such as through Haptic channel or the like
Weaknesses: (1) Prior research on the personality issues may be required before applying this model; (2) Adaptation process may take quite a long time as a large number of parameter values and the role of robot's personality in the hands-off therapy or similar therapies have to be investigated; (3) focusing on the relationship between extroversion–introversion of the robot and the user is tedious; (4) clarification on humans' intentions through haptic channel or similar model and the like is needed
Year(s): 2004 [165]; 2005 [12]; 2006 [30]; 2006 [166]; 2006 [168]; 2007 [10]; 2008 [167]; 2013 [49]; 2014 [70]
Characteristics: (1) Behavioural adaptation model is capable of adjusting its social interaction parameters toward customized rehabilitation therapy; (2) Promotes robot behavioural adaptation; (3) Learning approach will adapt robot's behaviour to better model user's personality. In recent year 2013, Haptic channel is used for humans' intention integration, while this year 2014, decision making model for intelligent agent is used

Author(s): Moshkina and Arkin [113]; Arkin et al. [1]; Leite et al. [101]
Model(s): (1) Integrated Model of Personality and Affect (TAME); (2) Ethological Model; (3) Emotional Model
Weaknesses: (1) These models may not be reliable because self-report data from the questionnaires given is subjective; (2) due to technical difficulty of the studies, sample size taken was rather small; (3) the chosen physical platform, designed for entertainment, may be a more decisive factor than differences in the robot behaviour. So, bias
Year(s): 2003 [1]; 2005 [113]; 2013 [101]
Characteristics: (1) This emotion and personality [113] as well as the ethological and emotional models [1] have combined these four areas of affect, i.e. TAME, to influence robotic behaviour; (2) Increases the ease and pleasantness of HRI

Table 3 (continued)

Columns: Author(s) (just key authors) | Methodologies of Assessment for Long-Term Modelling of HRI | Weaknesses (specific Assessment Methodologies applied) | Year(s) of completed research | Uniqueness of the Characteristics of these methodologies applied for long-term modelling of HRI

Author(s): Kanda et al. [77,78,87,89,90]
Methodology: Human Friendship Estimation Model is designed based on three design principles: (1) it calls children by name; (2) it adapts its interactive behaviours for each child based on a pseudo-development mechanism; (3) it confides its personal matters to the children who have long interactions with robot
Weaknesses: (1) The generality of their research findings can be limited, i.e. model may not be appropriate for every HRI; (2) Robot's capability for long-term interaction is limited as information that the robot can provide is of limited resources. So, HRI is restricted mostly to the non-verbal communication types
Year(s): 2004 [90]; 2004 [78]; 2006 [77]; 2007 [89]; 2008 [87]
Uniqueness: (1) Interaction with robots in real-world setting; (2) Analyze interaction among children with robot from their non-verbal interactions; (3) Interactive robots have the fundamental ability to socially communicate with humans; (4) Friendship Estimation Model is a good approach for a social robot to understand human relationships

Author(s): Leite et al. [100,101]; Cramer et al. [27]
Methodology: Empathic or Autonomous HRI Model that includes an affect detector which allows the robot to infer the valence of the feeling experienced by the participants. For Cramer et al.'s [27] research, Likert-type and semantic differential scales are used to measure the robot's perceived empathic ability, etc. For Leite et al.'s [100] research, no affect detector is used, but a series of online surveys is used as the assessment measures
Weaknesses: (1) Empathic or autonomous HRI mobility effect needs to be selected carefully as it is under the risk of having the opposite effect; (2) Bias may take place in the target application scenario; (3) children's specific preferences seem to influence the "degree of empathy" that social robots should be endowed with
Year(s): 2010 [27]; 2012 [100]; 2013 [101]
Uniqueness: (1) This model applies a good ethnographic study in the real-world settings; (2) Solves the doubts of affective interactions with assistive social robots or mobile robots; (3) Presented a first evaluation of an autonomous mobile or social robot capable of recognizing the user's affective states and displaying appropriate empathic behaviour; (4) This modelling is important for designing empathic robot companions or autonomous mobile robots

Author(s): Yanco et al. [182]; Olsen and Wood [123]; Shiomi et al. [150]; Zheng et al. [183–185]; Glas et al. [43,44]; Melin et al. [108]; Melendez and Castillo [107]; Castillo et al. [18]; Wang and Young [178]
Methodology: Human–Robot Team (HRT) Model that involves multiple social robots and a specific design framework where a system is developed in which a single operator can simultaneously control multiple robots in conversational interactions with users. Human-Control of mobile robots model involves type-2 or type-1 fuzzy tracking controllers
Weaknesses: (1) Coordination between autonomy and operation is often difficult in real applications of social and/or mobile robots; (2) Deployment of autonomous mobile and social robots often causes unnecessary time and effort; (3) failure of the robot team can often take place; (4) a lot of efforts are expended on the automation issue; (5) operator must perform very well for timing and simultaneous control
Year(s): 2011 [184]; 2012 [43]; 2012 [44]; 2012 [18]; 2013 [107]; 2013 [108]; 2013 [183]; 2014 [178]; 2014 [185]
Uniqueness: (1) Tele-operation of multiple autonomous mobile and/or social robots – unique challenges posed by remote operation of multiple social robots, conducting multiple interactions at once; (2) This modelling approach describes the general system requirements in four areas and shows the effects of their system through simulations and a laboratory experiment based on real-world interactions; (3) more optimal autonomous control by type-2 and type-1 fuzzy tracking controllers under perturbed torques; (4) more optimization of fuzzy integrator for mobile robots

Author(s): Mutlu et al. [116,117]; Cassell et al. [16]; Staudte and Crocker [157]; Mora et al. [109]; Sakai et al. [140]; Pateraki et al. [126]; Murakami et al. [114]; Morales et al. [110]
Methodology: Human Gaze Model [16,116] and Human-like Gaze Behaviour [116]; Referential Gaze Model with Joint Attention; Conversational Gaze Model [117]; Automatic Gaze Control and 3D Visual Model [109,140]; Visual Estimation of Pointed Targets [110,114,126,140]
Weaknesses: (1) Highly technical mechanisms are required to ensure the coordination of the gaze models; (2) Controlled gaze cues may face difficulties to integrate with the unfolding speech and the gaze movements; (3) Coordination and synchronies may have mismatching problems in the gaze mechanisms applied to HRI
Year(s): 1994 [16]; 2006 [116]; 2011 [157]; 2012 [117]; 2013 [109]; 2013 [140]; 2014 [110]; 2014 [114]; 2014 [126]
Uniqueness: (1) These gaze models are improved by researchers through various types of gaze mechanisms over the years; (2) Referential gaze model suggests that artificial agents are similar to human agents, thus validating joint attention mechanisms; (3) Conversational gaze model affect humans' rapport with robot, feelings of group spirit with their conversational partners, and also their attention on tasks

Author(s): Hall [50]; Kamasima et al. [76]; Kanda et al. [80,82,86]; Kozima et al. [96,97]; Cooney et al. [25,26]; Frank et al. [39]
Methodology: Psychological Model and Therapeutic Model (for research, therapy and entertainment) – based on Proximity Theory in Psychology, and/or related theories or principles
Weaknesses: (1) Limited applications as certain therapeutic models only apply on children with Autism, Asperger's syndrome or the like; (2) May take a long time to establish interpersonal coordination and interactional synchronies with the participants before non-verbal behaviours formulations can be conducted
Year(s): 1990 [50]; 2004 [76]; 2004 [80]; 2004 [82]; 2005 [97]; 2008 [86]; 2009 [96]; 2014 [25]; 2014 [26]; 2014 [39]
Uniqueness: (1) Interaction with robots in real-world settings; (2) Social robot, e.g. Keepon, conducts non-verbal interactions with users, help researchers indirectly assess the HRI based on the psychological or therapeutic models; (3) Robot's design and behaviour is effective in therapeutic purposes and sharing mental states with participants – so, good to be used as learning approaches for HRI

Page 10: Extensive assessment and evaluation methodologies … · Extensive assessment and evaluation methodologies on assistive social robots for modelling human–robot interaction – A

Figs. 2 and 3. (2) Humanoid robots: (a) Robovie (Kanda et al. [82,86]) and (b) ASIMO (Sakagami et al. [139]), both of which are categorized as service robots under assistive social robots. (3) Environment and positions for the experiments done with the humanoid robots, i.e. Robovie and ASIMO (Kanda et al. [86]).

Fig. 4a. In terms of robot's behavioural adaptation, different assessment methodologies have been used to continuously monitor and assess the satisfaction of users toward HRI. For the above robots and screen agents, Huggable is assessed mainly on its companionship, while Homie on its communication companionship. Annie, iCat and ISH software agent are assessed mainly on the monitoring controlling devices providing information, while Care-o-bot on its butler guide physical aid, ISH-Joy on both controlling devices and physical aid butler (Heerink [55]).


Fig. 4b. The PARO companion robot (Riek and Robinson [131]).

Fig. 4c. The huggable robot (Stiehl et al. [160]).

Fig. 5a. Unified Theory of Acceptance and Use of Technology (UTAUT) model used during implementation [55].


2001 developed a seal-like robot, named Paro (see Fig. 4b), for therapeutic purposes and demonstrated its social effect in encouraging communication among inpatients and caregivers [87,147]. In a similar vein, Kozima et al. in 2005 and 2009 placed a remote-controlled robot, Keepon, in a day care centre for children with developmental disorders to encourage their social behaviours [87,96,97]. This paper reviews the assessment and evaluation methodologies for therapy, research and entertainment purposes. Bartneck et al. in 2009 highlighted that the more animated the face of the robot, the more likely it is to attract the attention of a user [2]. As shown in Figs. 4b and 4c, PARO, the robotic seal (refer to Figs. 13a and 13b) [131,173], and the Huggable robot [160] are embodied within stuffed animals and behave as one might expect an animate toy or pet to act. The robot's behavioural adaptation is one of the main assessment methodologies


Fig. 5b. UTAUT model: direct influences and moderating factors are represented [55,172].

Fig. 5c. Humanoid robot, Honda's ASIMO, is telling a Japanese fairy tale to two listeners [116] – modelling is one of the assessment methodologies to monitor, manipulate and assess the robot's human-like gaze behaviour [116] – (see Table 3).

Fig. 5d. Clustering of the four gaze locations used by the storyteller [116] – one of the assessment methodologies to assess the robot's human-like gaze behaviour [116] – (see Table 3).

Fig. 5e. Overview of networked robot system [150].


Fig. 5f. Examples of path information [150].


that researchers should incorporate in their learning approaches to model a better HRI. Fig. 4a shows a series of different types of robots and screen agents categorized by Heerink in 2010, for which different combinations of assessment methodologies have been used under certain evaluation methodologies during long-term modelling of HRI [55].

4. Assessment methodologies that have been carried out to assess and long-term model the HRI

The original TAM model has been extended with more influences that were found to affect Intention to Use or Usage [55] (see Fig. 5a). Venkatesh and Davis in 2000 [171] reintegrated the concept of Subjective Norm, and this model is usually referred to as TAM-2 [55,171]. As mentioned above, there should be multiple methods of assessment within a single method of evaluation. Some of the methods of assessment were adopted and/or modified from existing scales used in Psychology, Social Sciences, and by other HRI researchers [5,6]. In the research work done by Bethel and Murphy in 2009, multiple signals (within a single method of evaluation, i.e. a psycho-physiological study) were used to obtain reliable and accurate results. Correlations were computed between the different signals to determine the validity of participants' responses [5,8]. Bethel et al. in 2009 further supported the statement that 'within a single method of evaluation, there should be multiple assessment measurements to be utilized' with the results of their research work [6]. Multiple assessment measurements are thus conducted within a single method of evaluation for HRI [5,6,8]. In Section 3, we reviewed different types of modelling approaches and measurement tactics for assessment methodologies. In this section, we review the combinations of assessment methodologies that past researchers have used in assessing and long-term modelling the HRI.
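The idea of correlating several signals collected within one evaluation can be sketched as follows. This is only an illustration under stated assumptions: the signal names and all values are invented, and the Pearson coefficient is chosen here as one common correlation measure; the source does not specify which statistic was used in [5,8].

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-participant measurements from a single study session.
signals = {
    "heart_rate": [72, 80, 76, 90, 85],
    "skin_conductance": [2.1, 2.9, 2.5, 3.8, 3.3],
    "self_reported_arousal": [2, 3, 3, 5, 4],
}

# Correlate every pair of signals; strong agreement between physiological
# and self-report measures would support the validity of the responses.
pairwise = {
    (name_a, name_b): pearson(signals[name_a], signals[name_b])
    for name_a, name_b in combinations(signals, 2)
}

for pair, r in pairwise.items():
    print(pair, round(r, 3))
```

A low or negative coefficient between a physiological signal and the matching self-report item would be a cue to inspect that measure more closely before drawing conclusions from it.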

4.1. Human gaze models with human-like gaze behaviour, referential gaze model and walking gaze models for HRI

Fig. 5c shows the research done by Mutlu et al. on Honda's humanoid robot ASIMO: storytelling requires ASIMO to be aware of its audience and to be able to direct its gaze in a natural way. They explored how human gaze can be modelled and implemented on a humanoid robot to create a natural and human-like behaviour for storytelling [116]. This assessment methodology is based on a gaze model that integrates data collected from a human storyteller and a discourse structure model developed by Cassell et al. [16,116].

Mutlu et al. used this model to direct the gaze of ASIMO (see Figs. 5c and 5d) as it recited a Japanese fairy tale using a pre-recorded human voice. They assessed the efficacy of this gaze algorithm by manipulating the frequency of ASIMO's gaze to assess whether the participants evaluated the robot more positively and did better on a recall task when ASIMO looked at them more [116]. In this 2006 work, Mutlu et al. also stressed that there are many commonalities between human–human communication and human–robot communication. The assessment methodology is based on modelling the human gaze and on subjective evaluation of human-like gaze behaviour, so their research further confirms the usefulness of Gaze Models as one of the assessment methodologies for HRI. In addition to the research in 2006 on modelling and evaluation of human-like gaze behaviour [116], Mutlu et al. in 2012 (see Fig. 12b) evaluated the HRI based on conversational gaze mechanisms for human-like robots [117] (please refer to Table 3 for the different gaze models implemented over the years). In 2013, i.e. last year, Mutlu et al. used more specific coordination mechanisms to achieve a much better human–robot collaboration through collaborative manipulation [119].
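The gaze-frequency manipulation described above can be illustrated with a small sketch that builds a sequence of gaze targets in which one listener receives a chosen share of the robot's gaze shifts. This is not the gaze algorithm of [116]: the function name, the two listener labels and the 80%/20% split used in the usage lines are hypothetical choices made only to show what "manipulating the frequency of gaze" could look like in code.

```python
def gaze_schedule(n_shifts, share_a):
    """Return a list of gaze targets ('A' or 'B') of length n_shifts.

    Listener A receives approximately share_a of the gaze shifts, spread
    through the sequence rather than grouped at the start.
    """
    if not 0.0 <= share_a <= 1.0:
        raise ValueError("share_a must lie between 0 and 1")
    schedule = []
    looks_at_a = 0
    for i in range(1, n_shifts + 1):
        # Look at A whenever A's running count is behind its target share.
        if looks_at_a < round(i * share_a):
            schedule.append("A")
            looks_at_a += 1
        else:
            schedule.append("B")
    return schedule


# Hypothetical conditions: listener A is looked at often or rarely.
frequent = gaze_schedule(10, 0.8)
rare = gaze_schedule(10, 0.2)
print(frequent, frequent.count("A"))
print(rare, rare.count("A"))
```

In an experiment of this kind, each listener's evaluation of the robot and recall performance would then be compared across the two conditions.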

In 2014, i.e. this year, in a similar research vein, Pateraki et al. formulated a novel approach which takes into account the prior information about the location of possible pointed targets, based on the fact that in most applications, it is the pointed object, rather than the actual pointing direction, which is important. They addressed an important issue in HRI, that of accurately deriving pointing information from a corresponding gesture [126]. To decide about the proposed object, as being more


Table 4. Combinations of assessment and evaluation methodologies for HRI (in chronological order for key authors).

Columns: Authors/Years of Research (key authors only – ascending chronol. order) | Model(s)/Test-bed(s)/Protocol(s) used | Strengths in terms of Evaluation | Strengths in terms of Assessment | Weaknesses in terms of Evaluation | Weaknesses in terms of Assessment | Uniqueness of the Research Work (Test-bed(s) and/or protocol(s) used) – Contributions of this modelling approach to the Society

Authors: Bartneck et al. [3]; Bartneck et al. [2]; Gonsior et al. [47]
Model(s)/Test-bed(s)/Protocol(s): Godspeed Five Key Concepts Modelling
Strengths (Evaluation): (1) Emphasizes the need for standardised measurement tool; (2) Questionnaires use 5-point scales to help robots' developers or evaluators; (3) Questionnaires are consistent as using differential scales
Strengths (Assessment): (refer to Table 3 – see Page 21, 3rd row, last column)
Weaknesses (Evaluation): Primary evaluation methodology of using questionnaires alone is subjective, and hence bias may take place
Weaknesses (Assessment): (refer to Table 3 – see Page 21, 3rd row, 3rd column)
Uniqueness: (1) Testifies the five key measurement concepts while evaluating and assessing the HRI; (2) These five key concepts are thorough in reaching a good overall evaluation score and assessment model in many kinds of HRI; (3) It is suitable for almost any kind of HRI; (4) Has a good combination of evaluation and assessment methodologies for HRI

Authors: Yanco et al. [182]; Olsen and Wood [123]; Shiomi et al. [150]; Zheng et al. [184,183,185]; Glas et al. [43,44]; Castillo et al. [18]; Melin et al. [108]; Melendez and Castillo [107]
Model(s)/Test-bed(s)/Protocol(s): (1) Human-control of Multiple Robots; (2) Tele-operation of Multiple Social Robots Model; (3) Human–Robot Team (HRT) Modelling
Strengths (Evaluation): The Primary Evaluation Methodologies used are the question-and-answer dialogue questionnaires and task performance metrics works very well with HRT model and teleoperation multiple robots model
Strengths (Assessment): (1) Able to show the effectiveness of their system based on real-world setting or simulations; (2) This assessment model is very suitable for assessing multiple robots system
Weaknesses (Evaluation): (1) Covers only single or limited round of dialogue; (2) Task Performance Metrics are more suitable for humanoid robots; (3) Not very suitable for one-to-one HRI; (4) Simulation results are based only on user studies. So, results are not real-world specific
Weaknesses (Assessment): (1) Parameters are set for a specific context, hence HRI is assessed on very limited topics; (2) Suit mostly for humanoid robots; (3) this methodology does not model random errors of automation. So, not very accurate; (4) Timing is of mission critical
Uniqueness: (1) Tele-operation of multiple social robots is usually very efficient as timing critical; (2) Both needed for human-controlling or tele-operating of the robot and estimating the interaction success; (3) Very suitable in illustrating the dynamics of the system and the effects of varying parameters; (4) Very useful in designing and tuning systems for the real-world deployment; (5) more optimal autonomous control by type-2 and type-1 fuzzy tracking controllers under perturbed torques; (6) more optimization of fuzzy integrator for mobile robots

Authors: De Ruyter et al. [135]; Heerink et al. [58,64]; Bartneck et al. [2]; Heerink et al. [57,62]
Model(s)/Test-bed(s)/Protocol(s): (1) UTAUT Model or Almere Model; (2) iCat robot (as a test-bed for social intelligence)
Strengths (Evaluation): At least three very prominent primary evaluation methodologies are used, i.e. specific questionnaires developed as well as interviewing and observations. All these are to evaluate the perception of HRI social intelligence
Strengths (Assessment): (1) Use UTAUT model as a measure of iCat robot technology acceptance in the workplace – a standardized model that covers all the 5 constructs; (2) Able to show human-like behaviours, so, socially intelligent
Weaknesses (Evaluation): (1) The questionnaires used are specific to certain context only, or questionnaires applied in limited contexts. Hence, not standardised enough; (2) Questionnaires method is subjective, so, may incur bias
Weaknesses (Assessment): (1) UTAUT model applied is a modified version, researchers can only draw tentative conclusions from this measurement; (2) More effective ways of achieving social intelligence for a better perception are still yet to explore
Uniqueness: (1) Demonstrated the relevance of social intelligence in HRI computational characters; (2) Able to show an increase in the perceived social intelligence; (3) Able to greatly substantiate that iCat and its behaviours have a significant effect on the satisfaction level with the embedded systems, technology acceptance, and sociability towards the system; (4) Able to explore the concept of social intelligence; (5) Behavioural observation is an effective primary evaluation

Authors: Hall [50]; Kamasima et al. [76]; Kanda et al. [80,82]; Kozima et al. [97]; Kanda et al. [86]; Kozima et al. [96]; Cooney et al. [25,26]; Frank et al. [39]
Model(s)/Test-bed(s)/Protocol(s): (1) Psychological and Therapeutic Modelling; (2) Keepon (a robot) as a test-bed for Social Intelligent Modelling [54,55]
Strengths (Evaluation): (1) Ethnographic Observation of Keepon's rhythmic interactions is a unique primary evaluation methodology for HRI. (2) Non-primary evaluation on non-verbal interactions is a very suitable way to evaluate HRI for autistics
Strengths (Assessment): (1) Computational model of rhythmic synchrony developed is able to convey robot's attention and emotion; (2) rhythmic synchronies and interactions are very suitable for Psychological and Therapeutic Modelling
Weaknesses (Evaluation): (1) Limited application to autistics or similar mental disorders only; (2) Primary evaluation by ethnographic observations and other non-primary evaluation methodologies can be difficult as too specific models
Weaknesses (Assessment): (refer to Table 3 – see Page 22, last row, 3rd column)
Uniqueness: (1) Appropriately designed robot facilitates dyadic interaction between an autistic child and robot, as well as triadic interaction among autistic children and caregivers; (2) Movement and dance can have therapeutic effects, (3) Behavioural observations, such as ethnographic observations and other evaluation methods, prove that interactive robots, such as Keepon [54,55], can facilitate children's social interactions

Table 4 (continued)

Columns: Authors/Years of Research in chronological order (key authors only) | Model(s) with Test-bed(s)/Protocol(s) used | Strengths of these Evaluation Methodologies | Strengths of these Assessment Methodologies | Weaknesses of these Evaluation Methodologies | Weaknesses of Assessment Methodologies | Uniqueness of the Research Work (Test-bed(s) and/or protocol(s) used) – Contributions of this modelling approach to the Society

Authors: Tamura et al. [165]; Brave et al. [12]; Dautenhahn et al. [30]; Tapus and Mataric [166,168,167]; Bickmore and Schulman [10]; Groten et al. [49]; Hu and Loo [70]
Model(s)/Test-bed(s)/Protocol(s): (1) Behavioural Adaptation Model; (2) HRI Model; (3) User–Robot Personality Matching Model; (4) Haptic channel; (5) Decision Making Model for Intelligent Agent
Strengths (Evaluation): At least three Evaluation Methodologies are used, i.e. (1) Self-Report evaluation, (2) Behavioural observations by the users on the robots, and (3) Psychophysiological Measures – all of these methodologies work very well with all the three Assessment Models stated, especially for the Behavioural Adaptation Model; (4) Haptic channel for HRI Intention Integration
Strengths (Assessment): (refer to Table 3 – see Page 21, 5th row, last column)
Weaknesses (Evaluation): (1) Primary evaluation method of using self-report measures is subjective, bias may be incurred; (2) The HRI and Personality Matching models applied are too specific for certain therapeutic purposes; (3) Models may take a long time to develop due to robot's behavioural adaptation; (4) Although haptic channel enhances shared decision situations, it takes time to realise haptic HRI
Weaknesses (Assessment): (refer to Table 3 – see Page 21, 5th row, 3rd column)
Uniqueness: (1) Models facilitate assistive social robot systems to aid people in daily lives; (2) Have novel multidisciplinary collaboration including cognitive psychology; (3) The role of robot's personality in the hands-off therapy process or etc. focuses on the relationship between the extroversion–introversion of the robot and the user, and the ability of the robot to adapt its behaviour to the user personality and preferences; (4) Models are applicable to both children and adults, who are normal or with mental disorders, e.g. autistics who need therapies

Authors: Kanda et al. [84]; Yanco et al. [182]; Hayashi et al. [52]; Hayashi et al. [53]; Hayashi et al. [51]; Jacobsson et al. [74,73]; Pateraki et al. [126]
Model(s)/Test-bed(s)/Protocol(s): (1) HRI model with HCI and CSCW incorporated; (or) HRI model with HCI alone; (2) Robots as passive-social media
Strengths (Evaluation): (1) HCI/CSCW evaluation techniques, i.e. questionnaires are used for users' perception; (2) Involves observation of robots and design of guidelines for developing HRI interfaces. So, specific enough; (3) Metrics to evaluate task performance are very HRI oriented, i.e. specific enough
Strengths (Assessment): (refer to Table 3 – see Page 21, 4th row, last column)
Weaknesses (Evaluation): Evaluation criteria is narrower scope because it is limited to the user interfaces such that robots acting as passive-social media, or in user interfaces aided by HCI and CSCW, i.e. narrow scope evaluation on interfaces
Weaknesses (Assessment): (refer to Table 3 – see Page 21, 4th row, 3rd column)
Uniqueness: Incorporating CSCW and/or HCI into HRI model – (1) Model shows initial guidelines for designing interfaces for HRI, based on robots acting as passive-social media or in CSCW; (2) This model analyzes pre- and post-evaluation debriefings to develop guidelines for developing interfaces for HRI – these guidelines are used as frameworks for assessing and evaluating robots; (3) HRI Evaluation is beyond usability evaluation – operator–robot pairing and the like is enhanced by HCI and CSCW

320D

.Y.Y.Sim,C.K

.Loo/Inform

ationSciences

301(2015)

305–344

n

o

; (3ng

to

ughce

e

ofse

iveerCIer

Page 17: Extensive assessment and evaluation methodologies … · Extensive assessment and evaluation methodologies on assistive social robots for modelling human–robot interaction – A

Authors/Years: Kanda et al. [83], [82], Shiwa et al. [153], Shiwa et al. [152], Glas et al. [43,44], Castillo et al. [18], Mora et al. [109], Melin et al. [108], Melendez and Castillo [107]
Model(s)/Test-bed(s)/Protocol(s): (1) Temporal Awareness Model or Timing Model; (2) Robovie or Robovie II (humanoid robot is used as a test-bed for embodiment); (3) iCat robot
Strengths (Evaluation): (1) As this model involves human and/or robot task performance metrics, primary evaluation methodology is not subjective and so, much less biased; (2) very suitable for evaluating multi-robot system, i.e. team scoring and group HRI evaluation.
Strengths (Assessment): (1) Usually works well with HRT model or models involving multi-robot systems; (2) Incorporates well with the task performance metrics; (3) This model is good for tele-operating multi-robot system as timing is critically under proper control
Weaknesses (Evaluation): (1) Task performance metrics is more suitable for tele-operated HRI; (2) Task performance metrics may not be very suitable for non-humanoid robots; (3) Improper timing may incur errors.
Weaknesses (Assessment): (1) Technical difficulties may incur problems on the timing control; (2) Need expertise to work on the tele-operation interface design; (3) Controlling a network that maintains timing can be resource consuming and tedious.
Uniqueness/Contributions: (1) This model is of very fair approach; (2) works well with the HRT models and other multi-robot assessment models, i.e. really good for tele-operation control; (3) particularly good when more than one robot or one human are involved in the HRI; (4) more optimal autonomous control by type-2 and type-1 fuzzy tracking controllers under perturbed torques; (5) more optimization of fuzzy integrator for mobile robots

Authors/Years: Riek and Robinson [131], Gockley et al. [45], Sung et al. [163], Mutlu et al. [118,117], Satake et al. [142], Liu et al. [104], Sorbello et al. [156]
Model(s)/Test-bed(s)/Protocol(s): (1) Societal Acceptance Modelling for HRI; (2) Technology Acceptance Modelling (TAM); (3) EOC formula
Strengths (Evaluation): (refer to Table 2 – see Page 20, 1st row, both 3rd and last columns)
Strengths (Assessment): (1) Consistent with Human-Centered Design, i.e. technology acceptance is directly related to users’ mental models; (2) Assessment is straightforward, i.e. measures how easily a user identifies a robot’s type, etc.
Weaknesses (Evaluation): (refer to Table 2 – see Page 20, 1st row, 4th column)
Weaknesses (Assessment): (1) Bias may be incurred from the users’ EOC; (2) Although think aloud method can be used, it is still a subjective method; (3) Users often feel reluctant to classify the robot, i.e. EOC, when they find it hard to tell its type apart
Uniqueness/Contributions: (1) Societal Acceptance Modelling with the Ease of Classification (EOC) allows easy quantifiable metric; (2) EOC formula is very straightforward and easy to be worked out; (3) This type of modelling approach is simple as it does not involve any tedious adaptation for the robot during HRI

Authors/Years: Jacobsson et al. [74], Jacobsson et al. [73], Shihab and Sim [148], Shihab et al. [149], Sim [155]
Model(s)/Test-bed(s)/Protocol(s): (1) HRI model with HCI and CSCW incorporated on interfaces; (2) Open exploring robot platform, i.e. see-Puck and e-Puck; (3) Glowbots as demonstrators – used as testbeds
Strengths (Evaluation): (refer to Table 2 – see Page 20, 3rd row, both 3rd and last columns)
Strengths (Assessment): (1) This method is suitable for any kind of HRI involving user interfaces, both to humanoid and non-humanoid robots; (2) Suitable for HRI involving one or multiple robots
Weaknesses (Evaluation): (refer to Table 2 – see Page 20, 3rd row, 4th column)
Weaknesses (Assessment): (1) Assessment is not thorough or robust enough if compared with TAM or Godspeed Five Key Concepts as HRI is assessed mainly on Glowbots’ patterns; (2) May take a long time to assess HRI with HCI and CSCW platforms
Uniqueness/Contributions: (1) This model is a good learning mechanism involving open platforms and/or user interfaces especially for non-humanoid robots, i.e. Glowbots or etc.; (2) Continuous improvement from the non-primary evaluation on HRI can be done by users directly on the open exploring robot platforms. So, convenient

Authors/Years: Mutlu et al. [116], [117], Cassell et al. [16], Staudte and Crocker [157], Mora et al. [109], Sakai et al. [140], Morales et al. [110], Murakami et al. [114], Pateraki et al. [126]
Model(s)/Test-bed(s)/Protocol(s): (1) Human Gaze Model [16,116]; (2) Referential Gaze Models with Joint Attention [157]; (3) Conversational Gaze Models [117] with 3D Model [109]; (4) Visual Estimation of Pointed Targets [110,114,126,140]
Strengths (Evaluation): At least two primary evaluation methodologies are applied since mainly involves observation, questionnaires and/or gesture imitation, task performance metrics
Strengths (Assessment): (refer to Table 3 – see Page 22, 4th row, last column)
Weaknesses (Evaluation): (1) As gaze models are mainly used for humanoid robots, not very suitable for non-humanoid robots; (2) Evaluation may need time and IT expertise
Weaknesses (Assessment): (refer to Table 3 – see Page 22, 4th row, 3rd column)
Uniqueness/Contributions: (1) Modelling has little bias as not involving any pure subjective evaluation from users; (2) Confirms that by establishing eye-contact or observing and imitating gestures, humans can greatly increase utterances; (3) Confirms that gaze is affecting task performance in learning approaches for HRI


sophisticated research work than what Mutlu et al. did in 2006 [116] and Ono et al. did in 2001 [82,124], Pateraki et al. in 2014 proposed using the Dempster-Shafer theory of evidence to fuse information from two different input streams [126].
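As an illustration of how evidence from two input streams can be fused, the following minimal sketch applies Dempster's rule of combination to two mass functions. It is not the implementation used in [126]; the two streams, the candidate targets and all mass values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1],
    and the masses of each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (set_a, mass_a), (set_b, mass_b) in product(m1.items(), m2.items()):
        common = set_a & set_b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Disjoint sets contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Normalise by the non-conflicting mass.
    return {hyp: mass / (1.0 - conflict) for hyp, mass in combined.items()}


# Hypothetical frame of discernment: which of two targets is indicated.
LEFT = frozenset({"left_target"})
RIGHT = frozenset({"right_target"})
EITHER = LEFT | RIGHT  # mass assigned here expresses ignorance

# Hypothetical evidence from two independent input streams.
stream_a = {LEFT: 0.6, EITHER: 0.4}
stream_b = {LEFT: 0.5, RIGHT: 0.3, EITHER: 0.2}

fused = combine_dempster(stream_a, stream_b)
for hypothesis, mass in sorted(fused.items(), key=lambda kv: -kv[1]):
    print(sorted(hypothesis), round(mass, 3))
```

In this toy case both streams lean towards the same target, so the fused mass for that target is higher than in either stream alone, while the mass left on the undecided set shrinks.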

This is supported by earlier research. Ono et al. in 2001 showed that, by establishing eye contact and by observing and imitating gestures, humans can greatly increase their understanding of others’ utterances during HRI [82,124]. In addition, the studies done by Kidd et al. in 2006 [93] and Taggart et al. in 2005 [164] concluded that, for companion robots (such as the Japanese toy robots Aibo and Paro – see Figs. 4b and 4c), which are assistive social robots, not only functionality but also form and material matter a great deal for their acceptance by, and their effects on, humans [55,93,164]. The results from the assessment methodologies matched the predictions in the literature, i.e. gaze is also shown to affect task performance in learning approaches [41,116,125,146]. Mutlu et al. found that participants performed significantly better in recalling ASIMO’s story when the robot looked at them more. The assessment methodology results also showed significant differences in how men and women evaluated ASIMO based on the frequency of gaze they received from the robot [116]. In 2012, Mora et al. incorporated automatic gaze control and 3D spatial visualization, using a more sophisticated evaluation methodology [109]. In 2013, Sakai et al. created a motion design of an interactive small humanoid robot with visual illusion, which further supports the view that gaze models with human-like gaze behaviours are important [140].

The gaze and gesture algorithm for ASIMO [116] builds on results in the literature for avatar gaze [16]. This is another good combination of assessment and evaluation methodologies applied to HRI, preceding the research done by Kanda et al. in 2008 [86,87]. When dealing with multiple social robots, task performance metrics are used, especially where teams or groups are being evaluated and/or more than one person is interacting with one or more robots [5,14,116,118,158]. To illustrate this, in the experiments done by Shiomi et al. in 2009, a networked robot system (see Figs. 5e and 5f) was developed that coordinates multiple social robots and sensors to provide efficient service to customers in a shopping mall. This networked robot system directs the tasks of robots based on their positions and people’s walking behaviour [150]. In the same vein, in 2014 Morales et al. constructed a side-by-side walking model for an interacting robot [110], and Murakami et al. created a framework for walking side-by-side without knowing the goal, i.e. with the destination unknown, for HRI [114]. These 2014 works [110,114] again confirm that people’s walking behaviour, gaze models and human-like gaze behaviours are important for long-term modelling of the HRI (see Table 4 for more details).

4.2. UTAUT and Almere Models – the learning approaches adopted as one of the Technology Acceptance Modelling (TAM) on HRI

According to Heerink [55], the Unified Theory of Acceptance and Use of Technology (UTAUT) model has been applied to robot technology in a limited study, and not just for elderly users (refer to Figs. 5a and 5b). Many research studies have also shown that the UTAUT model needs adaptation in the sense of extension, modification or both [55,58,60,64,66,172]. This is supported by the research done by Hennington and Janz [66], who showed the effects of applying the UTAUT model in a healthcare context, where the model needed adaptation for physician adoption of electronic medical records [66]. A review finding from Hennington and Janz’s work in 2007 [66], as well as a conclusion of Heerink’s research in 2010 [55], is that we must take the notion of social acceptance into consideration as a concept that complements technology acceptance. This implies that research on robot and agent acceptance can be subdivided into two areas: (1) the acceptance of the robot in terms of its usefulness and ease of use and (2) the acceptance of the robot as a conversational partner with which a human-like or pet-like relationship is possible (social acceptance) [55].

In addition, Heerink also found that the experiments with companion type robots (such as Paro and Aibo) were more focused on social acceptance, while the experiments with service type robots (such as Pearl and iCat) focused more on functional acceptance, i.e. the acceptance of the robot regarding its functionalities. Hence, to obtain a complete view on acceptance of an assistive robot, researchers and robot developers need a model that enables them to explore both the social and functional acceptance. This means that researchers also have to evaluate the UTAUT model based on its accuracy in predicting both [55,172]. In Figs. 5a and 5b, Heerink [55] extended the UTAUT model by several constructs to adapt it to the specific requirements of evaluating social robots [55,172]. Besides the UTAUT model, Heerink et al. in 2010 built the Almere model (see Table 3) for measuring acceptance of assistive social agent technology by older adults [57,62]. Acceptance methodology is traditionally a questionnaire, which relies on a Likert scale [57]. In terms of the Assessment Methodologies, by relating conversational expressiveness to acceptance, TAM adds behavioural analysis to the instrumentation. This enriches acceptance methodology [57].
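To make the questionnaire-based acceptance methodology concrete, the following minimal sketch scores a Likert-scale questionnaire per construct and checks the internal consistency of one construct with Cronbach's alpha. The construct names follow common TAM usage; the item identifiers, the 5-point scale and all responses are hypothetical and are not taken from the UTAUT or Almere instruments.

```python
from statistics import mean, pvariance

# Hypothetical mapping of questionnaire items to acceptance constructs.
CONSTRUCTS = {
    "perceived_usefulness": ["PU1", "PU2", "PU3"],
    "perceived_ease_of_use": ["PEOU1", "PEOU2"],
}

# Hypothetical responses on a 5-point Likert scale, one dict per participant.
RESPONSES = [
    {"PU1": 4, "PU2": 5, "PU3": 4, "PEOU1": 3, "PEOU2": 3},
    {"PU1": 2, "PU2": 2, "PU3": 3, "PEOU1": 4, "PEOU2": 5},
    {"PU1": 5, "PU2": 5, "PU3": 5, "PEOU1": 2, "PEOU2": 2},
    {"PU1": 3, "PU2": 4, "PU3": 3, "PEOU1": 4, "PEOU2": 4},
]


def construct_scores(responses, constructs):
    """Return one dict per participant with the mean item score per construct."""
    return [
        {name: mean(answer[item] for item in items)
         for name, items in constructs.items()}
        for answer in responses
    ]


def cronbach_alpha(responses, items):
    """Cronbach's alpha for the given items (internal consistency)."""
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across participants.
    item_variances = [pvariance([r[item] for r in responses]) for item in items]
    # Variance of each participant's summed score over the items.
    total_variance = pvariance([sum(r[item] for item in items) for r in responses])
    if total_variance == 0:
        raise ValueError("summed scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


scores = construct_scores(RESPONSES, CONSTRUCTS)
alpha_pu = cronbach_alpha(RESPONSES, CONSTRUCTS["perceived_usefulness"])
print(scores[0])
print(round(alpha_pu, 3))
```

Averaging items per construct keeps the scores on the original scale, which makes them easier to compare across constructs with different numbers of items.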

4.3. Human friendship estimation model – learning approaches adopted for assessment methodologies on HRI

Kanda et al. in 2008 proposed a model for estimating human friendships in the presence of a humanoid robot, based on the analysis of non-verbal inter-human interaction. They analyzed video data with an observation method, as a methodology to analyze the interaction among children and the robot [87]. This research builds on the earlier work done by Kanda et al. in 2004 on a friendship estimation model [78,90] for HRI. So, Kanda et al. were able to evaluate the established model for friendship estimation [87] (see Table 3). Past researchers suggested the importance of recognizing the relationships that children have with other children in socially assistive applications [87]. Kanda et al. in 2004 conducted a study where Robovie (see Fig. 2a), their communication humanoid robot, interacted with elementary school children. The assessment methodology used was to analyze the video data obtained in the preliminary study and to establish a model for assistive social robots to recognize friendships among children from non-verbal interaction. The video data were analyzed with an observation method which is very well-established in Psychology [87,90]. Kanda et al. in 2004 [78] reported their novel approach to developing an assistive social robot. Such a robot reads human relationships from their physical behaviours. They developed an interactive humanoid robot that attracts humans to interact with it and, as a result, induces their group behaviours in front of it. In this approach, the robot recognizes friendly relationships among humans by simultaneously identifying each person in the interacting group. They conducted a two-week experiment in an elementary school, in which the humanoid robot, Robovie, demonstrated reasonable performance in identifying friendships among the children [78]. This ability to read human relationships is essential for assistive social robots to behave socially [78,90].
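The idea of reading relationships from who is identified together in front of the robot can be sketched as a simple co-presence count. This is only an illustrative assumption about how such an estimate might be computed, not the estimator reported in [78,87]; the session data, the names and the threshold `min_shared_sessions` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_presence_counts(sessions):
    """Count, for every pair of people, the sessions in which both were identified."""
    counts = Counter()
    for group in sessions:
        # Sort so that each pair has one canonical ordering.
        for pair in combinations(sorted(set(group)), 2):
            counts[pair] += 1
    return counts


def estimate_friendships(sessions, min_shared_sessions=2):
    """Return pairs seen together in at least `min_shared_sessions` sessions."""
    counts = co_presence_counts(sessions)
    return {pair for pair, n in counts.items() if n >= min_shared_sessions}


# Hypothetical log: each entry lists the people identified simultaneously
# in front of the robot during one interaction session.
sessions = [
    ["aiko", "ben", "chie"],
    ["aiko", "ben"],
    ["chie", "dai"],
    ["aiko", "ben", "dai"],
]

friends = estimate_friendships(sessions)
print(friends)
```

A fixed threshold is the simplest possible decision rule; in practice the cut-off would have to be calibrated against ground truth such as questionnaire-reported friendships.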

4.4. HRI Modelling with HCI alone OR with both CSCW and HCI incorporated

Jacobsson et al. in 2008 used a platform, named see-Puck (see Fig. 7a), for exploring human–robot relationships. The see-Puck is a round display module that extends an open robot platform, named e-Puck [73]. Glowbot is the first demonstration of a robot constructed using the see-Puck platform [73,74]. As opposed to the therapist hands-off robot shown in Fig. 6 [166], Figs. 7a and 7b show a group of interacting robots that use their patterns to attract users’ attention and encouragement [73,74]. Since visualization using Human–Computer Interaction (HCI) techniques is important [148,149,155], it is proposed that the modelling, especially of Graphical User Interfaces (GUIs), may analyze the attractiveness of the user interface first in terms of task analysis, then goal analysis and lastly scenario analysis [155]. The assessment methodologies of this visualization research include feedback from users, software developers and the like, as well as the users’ feedback on the HCI and/or HRI [73,74,148,149,155]. For the research done by Heylen et al., because no multiple robots or groups of subjects are involved, the self-assessment method of evaluation is enough to evaluate the HRI [67]. So, in terms of evaluating HRI, a subjective method of primary evaluation, such as self-assessment or self-report, is usually adopted as an enhancement [5,8,67].

Tables 2 and 4 show the uniqueness of Jacobsson et al.’s research in using the Glowbots demonstrators on the see-Puck platform as a learning mechanism for non-humanoid robots to adopt while interacting with humans [73]. Table 2 illustrates the open robot platform implemented to further enhance the modelling of HRI. Hayashi et al. in 2008 constructed a robot system, named Manzai, as a passive-social medium (see Fig. 8a–c) for HRI, in which robots behave as if communicating by speech, while in fact the system exchanges information through a network that maintains communication timing [52]. The evaluation is done on the development of a multi-robot conversation system based on network communication. Based on the developed system, they implemented the Robot Manzai and compared its performance with Manzai performed by humans shown on video [52]. The assessment and evaluation methodologies are based on the passive-social media, acted by the robots, besides Manzai (see Fig. 8d and e). The HRI model can be aided by Human–Computer Interaction (HCI), e.g. an XML-based visualization system, or by both Computer-Supported Cooperative Working (CSCW) and HCI. Convergence of these HCI techniques can further improve the visualization effects on the GUIs [155].

4.5. Human Robot Team (HRT) modelling, human-control of single or multiple robot system(s) for HRI

Kanda et al. in 2004 evaluated interactive humanoid robots, such as Robovie, by comparing their body movements with subjective evaluation, which is based on the psychological method [82]. This is a combination of subjective evaluations with objective types of assessment methodologies. These authors intend to discover knowledge on embodiment that partner robots can utilize to encourage humans to interact with them [82,83]. To achieve this, in 2013 Melin et al. used type-2 and type-1 fuzzy tracking controllers under perturbed torques, using a new optimization paradigm for autonomous control of mobile robots [107,108]. In 2012, Cazarez-Castro et al. designed type-1 and type-2 controllers via fuzzy Lyapunov synthesis to achieve good control of mobile robots through non-smooth mechanical systems [18,19]. So, what has this trend of HRI technology advancement told us?

Figs. 6, 7a and 7b. (6) Therapist Hands-Off Robot, alone (Tapus and Mataric [166]). (7a) A group of interacting robots that use their patterns to attract user encouragement [73]. (7b) GlowBots, in a group, interact among themselves and with users to create interesting patterns [74]. In both (7a) and (7b), the method of assessment used is mainly the ‘Self-Assessments’ evaluation method, which is conducted mainly through observations as well as users’ feedback.

Fig. 8a–e. (a) Passive; (b) Interactive and (c) Passive-social: Robot, as a passive social medium, named Manzai, created by Hayashi et al. [12,53]. This figure tells us why multiple evaluation methods and assessment techniques are needed for this kind of HRI, since the interaction is dynamic and may involve multiple subjects or more than one robot (see Table 2). (d) Robot Manzai “Robovie and Wakamaru” and audiences at Expo 2005 [52] (see Table 2). (e) Application of a passive social medium during the HRI [52] (see Table 2).

Zheng et al. in 2013 and 2014 supported the above by designing and implementing a human–robot team for social interactions, and then creating a supervisory control of multiple social robots for conversation and navigation [183,185]. It is well known that during any conversation or interaction, a human immediately detects correspondences between their own body and the body of their partner. This suggests that, to produce effective communication skills for an interactive robot during the HRI, its body movements should be based on a human’s body movements, and this should be done with subjective evaluation [82,116]. The development of humanoid and interactive robots such as Honda’s [139] and Sony’s [42] is a research direction in Robotics. The concept of the partner robot is rapidly emerging, and the evaluation of human–humanoid robot interaction has been a concern for Kanda et al. in their research work in 2004, 2008 and 2009 [79,82,86,88,121] and since. The previous research on HRI, which is often motivated by Cognitive Science and Psychology [170], has determined various interactive behaviours that the robot’s body should afford (see Tables 3 and 4). Figs. 11a and 11b show the various interactive behaviours of the humanoid robot with the human [82]. Fig. 2b and c show how the assistive social robots Robovie and ASIMO are compared and contrasted for HRI experiments [86].

4.6. User–robot personality matching model, robot behavioural adaptation model and HRI model for the HRI

Fig. 9 shows the HRI model adopted by Tapus and Mataric in 2007. They posit that it is necessary to incorporate personality and empathy [167,168] during evaluation and assessment in order to facilitate the HRI and robot behaviour selection [167]. Before any formal evaluation and assessment of HRI, we need to make sure that a socially embodied robot makes appropriate use of the social space so that users can feel safe, comfortable and in concordance with their personality preferences [167] (see Figs. 9 and 10a). Empathy can have profound positive effects on users’ attitudes towards social robots [12,27,69,94,127]. So, responding to the user’s affective experience in a socially appropriate manner is considered an important issue in achieving the user’s trust and satisfaction, as well as compliance with requests [10,12,27,30,165].

Fig. 9. HRI Model: interaction zones or proxemics proposed by Tapus and Mataric in 2007, i.e. intimate, personal, social, and public [167]. This figure tells us why multiple evaluation methods and assessment methodologies are needed for almost any kind of HRI.
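The four interaction zones can be illustrated with a small distance classifier. The zone names come from the figure; the boundary values in metres are an assumption based on commonly quoted approximations of Hall's proxemic distances and are not taken from [167], so they should be treated as placeholders to be replaced by the thresholds of the study at hand.

```python
# Assumed upper bounds (in metres) for each zone, in increasing order.
# These values are commonly quoted approximations, not figures from [167].
ZONE_UPPER_BOUNDS_M = (
    ("intimate", 0.45),
    ("personal", 1.2),
    ("social", 3.6),
)


def classify_zone(distance_m):
    """Map a human-robot distance in metres to an interaction zone name."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    for zone, upper_bound in ZONE_UPPER_BOUNDS_M:
        if distance_m < upper_bound:
            return zone
    # Anything beyond the last bound falls in the public zone.
    return "public"


for d in (0.3, 1.0, 2.0, 5.0):
    print(d, classify_zone(d))
```

A robot that adapts to user personality could, for example, log the zone in which a user chooses to stand and use it as one input to its behaviour selection.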


Fig. 10a. Therapeutic and Psychological Modelling [97]. Eye-contact (referring to each other’s mental states) of Keepon, and joint attention (sharing the perceptual information), that enable the interactants to exchange intention and emotion toward the target (Kozima et al. [96,97]) – (see Tables 2–4).


The experiments done by Tapus and Mataric to promote the HRI Model mainly address two issues: (1) they investigate user–robot personality matching; (2) using the results of the first experiment, they refine the matching process between the user and the robot using their own adaptation algorithm. Preceding the research of Tapus and Mataric on the robot’s behaviour control and its architecture for therapeutic purposes [166,168], Sterling and Gaertner showed a positive correlation between empathy and physiological indices (such as heart rate acceleration, palm sweating, and eye blinking) [159]. These physiological responses can also be used by the robot as a significant source of sensory information for real-time interaction and empathic response [159,166]. So, using the emulation capabilities of the robot as one of the assessment methodologies is a good approach to assessing and modelling HRI. In contrast, in 2014 Hu and Loo showed an assessment and modelling approach using a generalized quantum-inspired decision making model for an intelligent agent [70]. This is one of the latest HRI models.

4.7. Empathic model for long-term modelling of empathic behaviours during HRI in real-world settings

These assessment methodologies are based on the idea that autonomous social robots capable of assisting us in our daily lives are becoming more real every day [100]. In 2013, Leite et al. supported this research using a specific empathic model [101]. The year before, in 2012 [100], Leite et al. indicated that autonomous social robots should have social capabilities so as to make our daily interactions with robots more natural. Their research findings suggest that the robot’s empathic behaviour positively affects how children perceive the robot. However, the weakness of this modelling is that the empathic behaviours must be selected carefully, since they otherwise risk having the opposite effect. In addition, the target application scenario and the particular preferences of children seem to influence the “degree of empathy” that social robots should be endowed with [100]. Further to this research, there has been a growing interest in studying the interaction with robots, i.e. the HRI, in real-world settings (see Table 3). This includes research studies done at homes [100,162], workplaces [100,115], elderly-care facilities (see Figs. 13a–13c) [100,136,173] or schools.

The goal of the research done by Leite et al. in 2012 and 2013 is to qualitatively evaluate children’s reactions to an empathic robot, again in a real-world setting, by using an affect detector. This affect detector allows the robot to infer the valence of the feeling experienced by the children [100,101]. This is a good assessment methodology. In another emotional mimicry study, done by Riek et al. in 2010 [132], it was found that most participants who interacted with the mimicking robot considered the interaction, i.e. the HRI, to be more satisfactory than participants who interacted with a version of the robot without mimicking capabilities [100,132]. In the same vein, more recently than Riek and Robinson in 2008 [131], Cramer et al. [27] in 2010 studied how empathy affects people’s attitudes towards robots. As evaluation methods incorporated to augment the assessment methodologies of HRI, Cramer et al. used Likert-type and semantic differential scales to measure the robot’s perceived empathic ability, trust (dependability, credibility) and closeness [27]. Unlike in Leite et al.’s research, no affect detector was used [100,101].

4.8. Social acceptance modelling for HRI

Riek and Robinson in 2008 [131] suggested Classification Ease as one of the assessment methodologies to assess the societal acceptance of robots. This is consistent with one of the core ideas in Human-Centred Design: technology acceptance is directly related to consistency with users’ mental models [131]. A variety of tools is available for HRI researchers seeking to assess aspects of the societal acceptance of robots [48,131]. The tools used in past research to assess the societal acceptance of robots include: (1) ethnographic observation [38], (2) system response-time analysis [153], (3) common ground analysis [161], (4) embodiment measurement [29], (5) perceived enjoyment analysis [59], (6) comfort level analysis [95], (7) interaction profile analysis [133], and others [48,131]. For people to accept robots in social contexts, it is important that the robots are easily classifiable, i.e. end users should be able to quickly and easily identify a robot’s type, role, and behavioural function. This is further supported by the societal acceptance research work done by Mutlu et al. [117–119] in 2006, 2012 and 2013.

To support societal acceptance modelling for HRI, i.e. non-verbal behaviours and/or designs of robots and the like, Liu et al. in 2014 showed ways to train robots by teaching service robots to reproduce human social behaviour [104]. Also in 2014, Sorbello et al. focused their research on using the Telenoid android robot as an embodied perceptual social regulation medium to control natural human–humanoid interaction [156]. This research can be used as one of the latest research frameworks in social acceptance modelling for HRI. It can also be used as a channel to further the research done by Satake et al. in 2013, which showed how a robot approaches pedestrians [142] (see Table 2 for the categorization of these 2013 and 2014 research works).

Preceding this research, Riek and Robinson [131] proposed that for people to accept robotic agents socially, it is necessary for the robots to be easily classifiable. They constructed an Ease of Classification (EOC) score formula. Although this EOC formula is grouped under evaluation methodology (see Section 5.2.2) because it is a score, continuous assessment and monitoring are still required for improving it. Riek and Robinson expect that additional research in this social acceptance area will lead to a refinement of the EOC score formula and to the establishment of a reproducible testing methodology. Moreover, this formula has a few gaps that are yet to be verified experimentally. This confirms that the Ease of Classification methodology can be used as an Assessment Methodology (see Table 4), and also as a Non-Primary Evaluation Methodology (see Table 2). When Societal Acceptance Modelling is used as an Assessment Methodology, it is used to assess and long-term model the HRI (refer to Table 4). As shown in Table 2, the advantage of the EOC method is that it provides a quantifiable metric that can be used as a basis of comparison between different user groups, interaction spaces and robots [131]. Subjects have unstructured interactions with the robot and are asked to state when they have determined the robot’s type and role. The subjects are also asked to think aloud during the experiment [129,131].

4.9. Psychological and therapeutic models for HRI, based on Proximity Theory in Psychology and Cognitive Science

By combining knowledge from Proximity Theory in Psychology with Cognitive Science and the like, we can assess human–robot communication and then evaluate and long-term model the HRI. For instance, Kanda et al. utilize the robots’ body properties to facilitate interaction with humans [72,82,86] and cause people to unconsciously behave as if they were communicating with humans [82,85]. (Refer to Figs. 11d and 11e for interactive behaviours of the humanoid robot [82].) Turning to another social model of proximity control, Yamaoka et al. in 2010 established a model for information-presenting robots to appropriately adjust their position. The experimental results verified the effectiveness of the model and showed that an information-presenting robot using their model presents an object better than one using simpler models. The primary evaluation methodology they used is also the subjective self-assessment or self-report [181]. Again, the assessment methodologies under this evaluation methodology include observations and participants’ impressions toward the robot. Kanda et al. in 2008 compared two humanoid robots, ASIMO [139] and Robovie [82], and a human [86]. They found that not only the impressions but also attributions such as humanity affected the participants’ non-verbal behaviours (refer to Tables 2 and 3). No differences were found in their verbal behaviours. In this research, Kanda et al. [86] formulated a statement to model human behaviours towards robots or humans as: Non-verbal behaviours = f (Impressions, Attribution). For instance, the distance kept while talking and walking during the HRI shows tendencies similar to familiarity, which is consistent with Proximity Theory in Psychology, as proposed by Hall in 1990 [50,86]. The attribution includes ‘whether it is respected as the conversation partner or not’ [86].

In terms of Therapeutic and Psychological Modelling, Kozima et al. in 2005 proposed a possible use of interactive robots in remedial practice for children with autism [97]. The assessment methodology used is based on a small creature-like robot, Keepon, which was carefully designed to get autistic and non-autistic children involved in playful interaction (see Fig. 10a) [97]. This research is further supported by their later work in 2009 [96], a similar study investigating the interaction between toddlers and Keepon, a small robot designed to interact with children through non-verbal behaviours [96] (see Figs. 15a and 15b). In 2005, Kozima et al. observed how autistic children (2–4 years old) interacted with Keepon. Each child showed a different style and a different unfolding of interaction over time, which told us a “story” of his or her personality and developmental profile that would not be explained completely by a diagnostic label like “autism” [97]. Hence, their research further confirms the usefulness of Psychological and Therapeutic modelling approaches.

4.10. Integrated model of personality and affect (TAME), ethological and emotional models for HRI

Tapus and Matarić analyse how varying minor characteristics of the robot’s personality affects the user’s efficiency during HRI and whether the robot is able to converge to a set of characteristics that agree with the user’s preferences (see Fig. 10a) [167]. This is because people have stronger empathic emotions and reactions when the interaction episodes are associated with others with whom they have a social relationship (such as friends or family members) or a common background (such as a person who lived through a similar experience) [150,167]. Arkin et al. in 2003 [1], as well as Moshkina and Arkin in 2005 [113], have respectively used Ethological and Emotional Models [1] and the integrated model of personality and affect (TAME, i.e. Traits, Attitudes, Moods and Emotions) [113] to assess and long-term model the HRI (see Table 3 for detailed comparisons and contrasts). These models present a framework design that increases the ease and pleasantness of HRI [1,113], and so are good assessment methodologies for modelling the HRI. As shown in Fig. 10b, the personality of the robot is expressed through the extroversion–introversion personality trait, which is important for user–robot personality matching [167]. The robot’s behaviour usually ranges from non-social to social and from low activity to high activity, in order to express the extroversion (i.e. challenging) or the introversion (i.e. nurturing) therapy styles [82,167]. (Refer to Figs. 11a–11e.) Although various learning approaches for HRI were proposed by past researchers, such as by Tapus and Matarić in 2006 and Breazeal and Scassellati in 2003, none of them includes the user’s profile, preferences and personality trait. The work proposed by Tapus and Matarić [167] is to create assistive social robots capable of monitoring and enhancing physical therapy (see Fig. 6). They proposed a methodology for evaluating a reinforcement-learning-based approach to robot behaviour adaptation. This learning approach would incrementally adapt the robot’s behaviour to better model the user’s personality and needs, and so improve users’ task performance [167].
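The idea of incrementally adapting a robot’s behaviour to a user can be illustrated with a minimal sketch. It is not the method of [167]: a simple epsilon-greedy bandit is substituted here for the reinforcement-learning approach, and the style names, the reward (a task-performance score in [0, 1]), the simulated user and all identifiers are hypothetical assumptions made only for illustration.

```python
import random


class BehaviourAdapter:
    """Epsilon-greedy bandit over candidate robot behaviour styles (illustrative only)."""

    def __init__(self, styles, epsilon=0.1, seed=None):
        self.styles = list(styles)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {s: 0 for s in self.styles}
        self.values = {s: 0.0 for s in self.styles}  # running mean reward per style

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the best estimate so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.styles)
        return max(self.styles, key=lambda s: self.values[s])

    def update(self, style, reward):
        # Incremental update of the mean reward observed for this style.
        self.counts[style] += 1
        self.values[style] += (reward - self.values[style]) / self.counts[style]


def simulate(adapter, preferred_style, sessions=500):
    """Hypothetical user whose task performance is higher under the preferred style."""
    for _ in range(sessions):
        style = adapter.choose()
        base = 0.8 if style == preferred_style else 0.4
        reward = min(1.0, max(0.0, base + adapter.rng.uniform(-0.1, 0.1)))
        adapter.update(style, reward)
    # Return the style the adapter currently rates highest.
    return max(adapter.styles, key=lambda s: adapter.values[s])
```

In this toy setting, `simulate(BehaviourAdapter(["challenging", "nurturing"], seed=0), "nurturing")` lets the adapter settle on the style under which the simulated user performs better; a real study would replace the simulated reward with measured task performance.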

Bartneck et al. in 2007 [4] showed that the perception of robots is culturally dependent, in a study comparing the measured attitudes of participants from different nationalities. Results indicated that the Japanese are concerned about the impact that robots might have on society and are particularly concerned with the emotional aspects of interacting with robots [4,55]. Further to these studies, Bartneck et al. in 2008 emphasized the need for standardized measurement tools for HRI, which make it possible to compare results from different previous studies [2,3]. Thereafter, Bartneck et al. in 2009 investigated the influence of two different embodiments, i.e. the Robovie II robot and the iCat robot (see Figs. 12a and 12b), on how robots are perceived in terms of Animacy and Perceived Intelligence [2]. Animacy and Perceived Intelligence are two of the components of the Godspeed Five Key Concepts modelling [3]. To support this modelling approach, Gonsior et al. in 2011 researched improving aspects of empathy and subjective performance for HRI through mirroring facial expressions [47]. They conducted experiments to evaluate the long-term modelling of the ‘Five Key Concepts in HRI’ proposed by Bartneck et al. in 2008 [3]. User acceptance is evaluated according to the analytical measures of Heerink in 2009 [56]. A measure of empathy and subjective task performance experienced by the user is assessed to reveal any possible correlation. They conducted a study in which the participants were asked to rate the empathy and task performance of the robot [47]. This again supports the Godspeed Five Key Concepts Modelling as the main assessment.
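As a rough illustration of how questionnaire ratings for concepts such as Animacy and Perceived Intelligence might be aggregated, the sketch below averages item ratings per concept. The 1 to 5 rating range, the function name and the example responses are assumptions for illustration; the actual questionnaire items and scoring procedure should be taken from [3].

```python
from statistics import mean


def score_concepts(responses, scale_min=1, scale_max=5):
    """Average item ratings per concept.

    `responses` maps a concept name to the list of item ratings one
    participant gave for that concept (hypothetical data layout).
    """
    scores = {}
    for concept, ratings in responses.items():
        if not ratings:
            raise ValueError(f"no ratings given for concept {concept!r}")
        if any(not scale_min <= r <= scale_max for r in ratings):
            raise ValueError(f"rating out of range for concept {concept!r}")
        scores[concept] = mean(ratings)
    return scores


# Hypothetical ratings from one participant after interacting with a robot.
example = {
    "animacy": [4, 5, 3],
    "perceived_intelligence": [2, 4],
}
participant_scores = score_concepts(example)
```

Per-concept scores computed this way can then be compared across embodiments or participant groups, which is the kind of comparison a standardized measurement tool is meant to enable.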

In their previous experiment on a route-guidance situation [76], Kanda et al. observed that human participants used different words with humans and with Robovie (for example, giving different landmarks to Robovie) [76,82,86]. Kanda et al. compared impressions based on factor scores, following the method they reported in 2001 [81]. They conducted factor analysis on the Semantic Differential ratings and adopted a solution consisting of four factors. These four factors were interpreted by referring to the factor loadings as the Familiarity, Novelty, Safety and Activity factors [86]. However, a limitation of this research is that it does not establish whether the findings apply to humanoid robots other than the two studied, i.e. Robovie [82] and ASIMO [139]. Hence, the general applicability of this research is limited. In addition, the experiment only involves a situation reflecting first-time conversation [86]. It seems that Novelty had a larger effect on the non-verbal behaviours than did the other factors [80,86]. This further supports the research done by Kanda et al. in 2004 on interactive robots based on their non-verbal behaviours [80] (refer to Tables 2 and 3 for further details).

4.11. Temporal awareness modelling and timing modelling for assessing and long-term modelling the HRI

According to Shiwa et al. in 2008 and 2009 [152,153], a robot cannot always respond within as short a time as one or two seconds. So, what should a robot do if it cannot respond quickly enough [153]? Some researchers have tried to design specific modelling approaches, optimization paradigms or HRI models that are of temporal awareness

Fig.10(b). Personality Model of the user and its empathy level [166]

Fig. 10b. Human–Robot Interaction information processing using user’s Personality Model and his empathy level (Tapus and Matarić [167]).

Fig. 11a–c. (a) Robovie – a robot with sufficient physical expression ability (Kanda et al. [82]). (b) A scene of joint attention, i.e. eye contact and pointing to share the attention (Kanda et al. [82]). (c) Development of a situated module with communicative units (Kanda et al. [82]). All the above can be used as the Task Performance Metrics of the HRI, serving as one of the Primary Evaluation Methodologies for HRI.

Figs. 11d and e. (d) Humanoid robot’s interactive behaviours [82]. (e) Attached markers (left) obtained 3-D animated images (right) [82].

Fig. 12a. iCat robot (left) (Bartneck et al. [2,86] and Saerbeck et al. [137]) and the Robovie II (right) robot (Bartneck et al. [2,82,86]).

[17–19,43,44,82,83,107,108,126]. They assessed HRI based on specific task performance metrics as part of the primary evaluation methodologies for HRI. These models and modelling approaches are usually used when timing and the operator’s role in tele-operating the multi-robot system are mission critical [17,18,108,119,126,178]. Mutlu et al. in 2013 used specific coordination mechanisms to enhance HRI; these models worked well with the Human Robot Team (HRT) models involving multi-robot systems [119]. In 2012 and 2013, Mora et al. used a tele-operation approach on mobile social robots, incorporating automatic gaze control and three-dimensional spatial visualization, so that a much more sophisticated evaluation methodology has been explored [18,108,109]. Also in 2012 and 2013, Melin et al. used optimal design of type-2 and type-1 fuzzy tracking controllers for controlling autonomous mobile robots under perturbed torques [17,18,108] (see Table 4). In 2013, Melendez and Castillo stressed the use of evolutionary optimization of the fuzzy integrator in a navigation system for a mobile robot [107]. Most recently, in 2014, Pateraki et al. used visual estimation of pointed

Fig. 12b. Robovie R-1, the robotic platform that Mutlu et al. used in the evaluation of the gaze mechanisms studied in 2012 (Mutlu et al. [117]).

Fig. 12c and d. (c) Scene of eye contact (Kanda et al. [82]). (d) Scene of synchronized body movements (Kanda et al. [82]).

targets for robot guidance, in which temporal awareness is achieved via fusion of face pose and hand orientation [126]. We discuss the trends of this technological advancement when recommending our proposed hybrid of HRI methodologies in Section 6.
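The question raised at the start of this subsection, namely what a robot should do when it cannot respond within one or two seconds, can be made concrete with a small sketch. The strategy shown (uttering a short filler once an assumed silence limit is reached) is only one possible policy, assumed here for illustration; the 2-second limit, the function name and the event labels are hypothetical and are not taken from [152,153].

```python
def plan_response(estimated_latency_s, max_silence_s=2.0):
    """Return a timeline of (time_offset_in_seconds, event) pairs.

    If the answer is expected within `max_silence_s`, the robot simply
    answers. Otherwise it emits a filler at the silence limit and answers
    when the result is ready.
    """
    if estimated_latency_s < 0:
        raise ValueError("latency must be non-negative")
    if estimated_latency_s <= max_silence_s:
        return [(estimated_latency_s, "answer")]
    return [(max_silence_s, "filler"), (estimated_latency_s, "answer")]


fast = plan_response(1.0)
slow = plan_response(5.0)
```

A temporally aware model would estimate the latency online and could tune the silence limit from user ratings, which ties the timing policy back to the evaluation methodologies reviewed in Section 5.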

5. Evaluation methodologies (primary and non-primary) that have been carried out to evaluate and long-term model the HRI

5.1. Convergent validity for the primary evaluation methodologies for long-term modelling of HRI

According to the key research done by Bethel et al. in 2007, and then by Bethel and Murphy in 2009, there are four primary methods of evaluation used for human studies in HRI [5,7,8]. The fourth evaluation methodology (i.e. task performance metrics measurements) can be compared with classical subjective evaluation results. This has been applied by Kanda et al. since 2004 as a method of comparing the movement interactions during HRI [54]. These primary evaluation methodologies include:

(1) Self-Assessments Subjective Evaluation [5,8].
(2) Behavioural measurements [5,8].
(3) Psycho-physiological measures [5,7,8].
(4) Task performance metrics [5,8,14,122,158]; this may include comparisons of the measurements on the movement interactions during humanoid-robot and human interaction [82,83] (see Figs. 11d and 11e).

The most common methods utilized in HRI studies and research so far are the self-assessment and behavioural measures [5]. This is probably because quite limited research has been done on the use of psycho-physiological measures and task performance metrics. Each method has its own advantages and disadvantages. However, according to the research done by Kidd and Breazeal in 2005 and Bethel et al. in 2007, some of these disadvantages can be overcome by using more than one method or methodology of evaluation [5,8].

5.1.1. Self-assessment subjective evaluation methodologies

The first method listed above, i.e. the use of self-assessments, is one of the most commonly used primary evaluation methodologies in HRI studies. Self-assessment measures include paper or computerized psychometric measures, questionnaires and/or surveys. For this evaluation method, participants provide a personal assessment of how they felt, or of their motivations, in relation to an object, situation or interaction [5]. However, the weakness of this method is that, although self-assessments provide valuable information, there are often problems with validity and corroboration. These can be seen in some of the research done in the same year and thereafter [2,150]. To illustrate this weakness further, participants may not answer the questions based on how they feel or perceive at that time, but rather unrealistically, i.e. based on how they feel others would answer the questions or on what they imagine the researcher wants them to answer [5]. So, more than one methodology of evaluation is always required [5,7,8]. Another weakness of self-assessment measures is that observers are not able to immediately and directly corroborate the information provided by the participants. In addition, participants may not be in touch with what they feel about the object, situation and/or interaction, and therefore may not report their true feelings. The participants’ responses could also be influenced by their mood and state of mind on the day of the study [5,36]. All these can cause inaccuracies in this assessment methodology.

According to Bethel and Murphy’s research work in 2009, designing a quality research study, especially for use in HRI applications, is a major challenge when results must be verifiable, reliable and reproducible. This is because a single method of measurement is not sufficient to interpret accurately the responses of participants to a robot with which they are interacting [5]. In addition, Steinfeld et al. (2006) describe the need for the development of common metrics as an open research issue in HRI and discuss an approach to developing such metrics [158]. However, the weakness of this approach is that it is oriented more toward an engineering perspective and does not completely address the social interaction perspective. Both the engineering and social interaction perspectives require further investigation in order to develop common metrics and methods of evaluation [5,36,75]. For these reasons, it is important to perform additional types of measurements, e.g. behavioural, task performance and/or psycho-physiological measures, in order to add another dimension to the understanding of HRI studies [5,8].

5.1.2. Behavioural measurements

The second evaluation methodology, i.e. behavioural measurement, is considered the second most common method of evaluation in HRI studies. This method is sometimes included along with psycho-physiological evaluations and participants’ self-assessment responses to obtain convergent validity [5]. John and Christensen (2004) define observation as ‘the watching of behavioural patterns of people in certain situations to obtain information about the phenomenon of interest.’ The ‘Hawthorne effect’ is a concern with observational studies: it is a phenomenon in which participants know that they are being observed, and this affects their behaviour [36,75]. To further support this measurement, Kanda et al. developed a human friendship estimation model for communication robots in 2008 [87], while Heerink et al. built the Almere model (see Table 3) for measuring the acceptance of assistive social agent technology by older adults in 2010 [62] (see Figs. 13a–13c). They all used the participants’ self-assessment responses, i.e. questionnaires, to obtain convergent validity for the evaluation models used in behavioural measurements for HRI [57,62,87]. Again, this supports the concept that evaluation methodologies should work with assessment methodologies to achieve better HRI measurement and modelling.

5.1.3. Psycho-physiological measurements

Given the above weaknesses, especially those due to the ‘Hawthorne effect’, the third evaluation methodology, i.e. psycho-physiological measures, can help to obtain a better understanding of participants’ underlying responses at the time of observation [5]. The benefit of using behavioural measures is that this method is less biased, because researchers are able to record the actual behaviours of participants and do not have to rely on participants to report their intended behaviours or preferences [5,7,8,36]. In addition, video observations are often recorded so that they can later be coded for visual and auditory information by two or more independent assessors [5,14]. Evaluation using psycho-physiological measures is gaining popularity in HRI studies because, as mentioned above, it is a relatively fair approach: participants cannot consciously manipulate the activities of their autonomic nervous system [5,7,71,92,99,103,128,130].
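When video observations are coded by two independent assessors, their agreement is commonly checked before the codes are analysed. The sketch below computes Cohen’s kappa, a standard chance-corrected agreement statistic; the reviewed studies are not stated to use this particular statistic, and the behaviour labels and function name are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labelled the same video segments."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of segments")
    n = len(coder_a)
    # Proportion of segments on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance from each coder's label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for four video segments.
coder_1 = ["gaze", "gaze", "nod", "nod"]
coder_2 = ["gaze", "gaze", "nod", "gaze"]
kappa = cohens_kappa(coder_1, coder_2)
```

For these toy labels the coders agree on three of four segments, chance agreement is 0.5, and kappa is therefore 0.5.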

Psycho-physiological measures offer a non-invasive method that can be used to determine the stress levels and reactions of participants interacting with the technology [5,71,99,103,128,130]. On the other hand, one weakness of psycho-physiological measurements is that they can complicate the process, because the results are not always straightforward and confounds can lead to misinterpretation of data. There is a tendency to attribute more meaning to results because of the tangible nature of the recordings. To help reduce these confounds, information such as health status and state of mind needs to be obtained from participants before a study begins. Multiple physiological signals should be used to find correlations in the results [5,7,8].

5.1.4. Task performance metrics

The fourth evaluation methodology, i.e. the use of task performance metrics, is evolving and becoming more common in HRI studies, especially where teams or groups are being evaluated and/or more than one person is interacting with one or more robots [5,14,56,118,158]. This method has also been used in a number of studies by Kanda et al., such as the analysis of humanoid appearances in HRI [86]. (A sub-method is usually incorporated in this task performance metrics measurement to augment this Primary Evaluation: comparisons of the movement interactions in humanoid robot–human interactions with traditional subjective evaluations.)

Below is a review of this type of evaluation methodology as implemented by a few past researchers. Kanda et al. in 2004 [82] measured the body movement interaction between a humanoid robot (such as Robovie; see Fig. 11a) and humans, and then compared the results with traditional subjective evaluation results. Through the evaluation experiments, they discuss the performance of the developed interactive humanoid robots and provide perspectives on a new analytical method for HRI. This research is further supported by research done by Kanda et al. in 2008 and 2009 [82,86], as well as by Mutlu et al. in 2012 [117] (see Fig. 12b) and Frank et al. in 2014 [39]. In terms of correlations between body movements and subjective impressions, Kanda et al. employed an optical motion-capturing system to measure the body movements of the robot (see Figs. 12c and 12d). Comparison between the body movements and the subjective evaluations indicates a meaningful correlation. Well-coordinated behaviours, such as eye contact and synchronized arm movements, are really important [82] (see Figs. 12c and 12d).
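The comparison between a body-movement measure and subjective evaluations can be sketched as a simple correlation analysis. The example below computes the Pearson correlation coefficient; it is an assumption that this statistic matches the analysis in [82], and the variable names and data values are invented purely for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Hypothetical per-participant data: seconds of eye contact with the robot
# and the participant's subjective evaluation score of the interaction.
eye_contact_s = [12.0, 30.5, 18.2, 41.0, 25.3]
subjective_score = [3.1, 5.2, 3.8, 6.0, 4.9]
r = pearson_r(eye_contact_s, subjective_score)
```

A coefficient near +1 would indicate that participants who kept more eye contact also rated the interaction more highly; with real data a significance test would be needed before drawing conclusions.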

In 2014, Cooney et al. demonstrated this type of primary evaluation methodology in designing robots for well-being [25,26]. This is based on visual scenes of affectionate play with a small humanoid robot [26], and also on designing very enjoyable motion-based play interactions with a small humanoid robot [25]. In the same vein, also in 2014, Frank et al. emphasized the use of curiosity-driven reinforcement learning for motion planning on humanoids [39]; task performance metrics are important as a primary evaluation basis.

Kanda et al. performed an experiment to evaluate the developed humanoid robot and analyzed the interaction between robots and humans. In the experiment, the humans behaved as if they were interacting with a human. They kept eye contact with the robot and imitated the gestures of the robot (refer to Figs. 11d, 11e and 12c for illustrations). These entrainments of body movements indicate the high performance of the developed robot during the HRI [82]. The Entrainment Scoring techniques are used as the non-primary evaluation methodologies for HRI, while the robot’s task performance metrics during the HRI are the primary evaluation methodologies. Positive correlations between cooperative body movements and subjective evaluations were discovered by Kanda et al. [82,83].

As shown in Figs. 11d and e and 12c and d, comparisons of the measurements of the movement interactions during the humanoid robot-and-human interaction with the subjective evaluation results have been conducted extensively [2,82,117]. These task performance metrics are designed to measure how well a person or a team performs or completes a task or tasks. This is essential in HRI studies and should be included with other methods of evaluation, such as behavioural and/or self-assessment methodologies [5,14]. Bethel and Murphy presented the research outcome in their paper (2009) by utilizing the four primary evaluation methodologies described above. This was done so that convergent validity could be obtained to determine the effectiveness of non-facial and non-verbal affective expressions for naturalistic social HRI. Multiple self-assessments were used during HRI [5,118,145,158]. In the same research vein, Burke et al. [14] took a systems approach to measuring task performance metrics.

As mentioned above, the evaluation methodology is a measurement tool, incorporated together with subjective evaluation results from participants [117,141].

The behaviours of robots do not violate their Type or Role Classifications, and thus are generally well accepted [131,160,173] (see Figs. 13a–c and 14).

5.2. Non-primary evaluation methodologies for efficient and thorough evaluation criteria for HRI

The non-primary evaluation methodologies may sometimes be used as supplements to indirectly evaluate HRI. It is important to use more than one evaluation methodology in a comprehensive study to gain a better understanding of Human–Robot Interaction. Within a single methodology of evaluation, there is usually more than one assessment measurement to be utilized [2,5,82,117,134,138].

Fig. 13a–c. (a) Seal robot, named Paro, which has a behaviour generation system that consists of proactive and reactive processes. These two layers generate three kinds of behaviours, i.e. proactive, reactive, and physiological behaviours (Wada et al. [173]). (b) Interaction between elderly people and Paro (Wada et al. [173]). (c) Results of Average Scores of a Question Item “Vigorous” of Elderly People for 6 weeks’ period [173].

Fig. 14. The Roboceptionist (Riek and Robinson in 2008 [131]). As the Ease of Classification (EOC) measurement, take note of the flowers, business cards, and memorabilia surrounding the desk, as well as the office attire that the robot is dressed in [131]. Using the EOC formula score is a good non-primary evaluation methodology for HRI.

5.2.1. Non-primary evaluation methodologies incorporated under the four primary evaluation methodologies

As mentioned in Section 5.1 above, we investigate how many assessment methodologies need to be incorporated, and to what extent, in each of the four Primary Evaluation Methodologies stated above.

(1) For the first primary evaluation methodology, i.e. self-assessment methods of evaluation, many researchers use more than one assessment method within this single methodology of evaluation. The assessment methodologies include using different types of learning approaches, social models, passive-social media, other similar media, open platforms or the like [5,8,47,51–53] (refer to Fig. 16a).
(2) For the second primary evaluation methodology, i.e. behavioural studies for HRI [173] (see Figs. 13a–13c), observations should be obtained from more than one angle or perspective [5,8].
(3) For the third primary evaluation methodology, i.e. psycho-physiological studies, more than one signal should be obtained for validity and correlation [5,7,8].
(4) For the fourth primary evaluation methodology, i.e. task performance evaluation criteria or task performance metrics, performance should be measured in more than one way [5,8]. This is because task performance metrics involve measurements from many measurement tactics that differ across kinds of HRI. In the sub-method incorporated with this fourth evaluation methodology, comparisons of body movements with classical subjective evaluation results are also conducted so that more efficient and thorough evaluations of HRI can be adopted.

5.2.2. Ease of Classification (EOC) as non-primary evaluation methodologies for the societal acceptance of robots

Riek and Robinson in 2008 [131] use an Ease of Classification (EOC) score as a means of measuring the societal acceptance of robots. As mentioned above, this can be a type of non-primary evaluation methodology for HRI. Using the EOC score formula to measure the societal acceptance of robots has a few notable advantages (see Tables 2 and 4 for the strengths and weaknesses of using EOC and its Societal Acceptance Modelling approaches). This is because, upon first encountering a robot within its intended physical space, it should be immediately apparent to users what general purpose the robot is intended to serve, i.e. its type. In other words, to indirectly evaluate HRI using EOC, the robot’s physical appearance, movement, gait, speech, gesture, gaze or stature should reflect exactly this purpose [131]. Preceding the research conducted by Riek and Robinson in 2008, Gockley et al. [45] in 2005 presented the roboceptionist, which serves as a measurement example for the EOC evaluation methodologies. For instance, when one encounters Tank the Roboceptionist (see Fig. 14) at Carnegie Mellon, it is very easy to classify its type as receptionist. From this figure, we can see that Tank is located inside a wooden booth near the entrance to a building. This robot is unlikely to be mistaken for anything else because its design and physical placement clearly reflect its type and purpose [45,131].

To further support this non-primary evaluation methodology for HRI, Sung et al. [163] in 2008 give examples from daily life, such as people who interact with personal robots in the home often dressing them in costumes. Perhaps this is a means of helping other family members and visitors to the home to readily classify the robot as non-threatening [163]. So, EOC evaluation methodologies can be a good way of having users indirectly evaluate the HRI based on how easily they think the robot can be classified [45,46,131,163]. Riek and Robinson proposed the EOC Score formula to calculate the EOC score, adding Classification Ease to the HRI researcher’s toolset [131].

5.2.3. Psychological human–humanoid robot interactions as the non-primary evaluation methodologies for estimation of momentary evaluation score for HRI

Kanda et al. in 2004 [82] computed the momentary evaluation score, i.e. entrainment score, through comparisons of body movements with traditional subjective evaluations, for indirect evaluation of HRI. This entrainment score serves as a non-primary evaluation methodology, while the different parameters measured can be the task performance metrics serving as a primary evaluation methodology for HRI. So, this research combines primary and non-primary evaluation methodologies (refer to Tables 1 and 2), but the assessment methodologies applied during HRI are not very prominent. The humanoid robot, Robovie (refer to Figs. 11a, 12a, 12c and 12d), is used as a test-bed for studying embodied communication [82] (refer to Table 4). The non-primary evaluation methodologies involve Kanda et al.’s [83] constructive

Fig. 15a and b. (a) Keepon, the creature-like robot, performing eye-contact and joint attention with the human interactant (Kozima et al. in 2005 [97] and in 2009 [96]) – refer to Table 4 for the uniqueness of Keepon. (b) Keepon’s external and internal structure and its deformable body made of silicone rubber (Kozima et al. in 2005 [97]). Keepon’s structure: its simple appearance and marionette-like mechanism (left), which drives the deformable body (right) (Kozima et al. in 2009 [96]) – refer to Table 4 for the uniqueness of Keepon.

Fig. 16a and b. (a) Non-social condition (left) and social condition (right). Non-primary evaluation of HRI by observing and recording the different behaviours of each station user, and correlating each of them with the questionnaires given to the station users (Hayashi et al. in 2007 [53]). (b) Outline of a multi-robot communication system (Hayashi et al. respectively in 2005 [52], 2007 [53] and 2008 [51]).


approach, which is to continue implementing behaviours of the interactive humanoid robots until humans think that the robot has an animated and lifelike existence that is beyond that of a simple automatic machine.
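The comparison between an entrainment score and traditional subjective evaluations described above can be illustrated with a simple correlation check. The sketch below is only an assumption about how such a comparison might be run: it uses a Pearson correlation coefficient, which is not stated in the reviewed work to be the method of [82], and the helper name `pearson_r` and all data values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical data, one value per participant (not taken from any study):
# a body-movement-based entrainment score and a questionnaire rating.
entrainment_scores = [0.2, 0.5, 0.4, 0.8, 0.9]
subjective_ratings = [2, 4, 3, 6, 7]

r = pearson_r(entrainment_scores, subjective_ratings)
print(f"Pearson r between entrainment and subjective rating: {r:.3f}")
```

A coefficient close to 1 would suggest that the movement-based score tracks the subjective evaluation, which is the sense in which such a score can act as an indirect, non-primary evaluation; a rank-based coefficient may be preferable when the questionnaire ratings are ordinal.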

5.2.4. Formulations as non-primary evaluation methodologies for the non-verbal behaviours of robots

Kanda et al. [86], in 2008, analysed humanoid appearances in HRI, also through ‘non-verbal behaviours formulations’ on the robot. This type of non-primary evaluation methodology is supported by other similar research [50,76,80].

Kozima et al. in 2005 [97] and 2009 [96] used Keepon, a creature-like robot in a real-world setting (see Fig. 15a and b above), to interact with autistic children through non-verbal behaviours; long-term HRI is modelled through psychological and therapeutic modelling approaches (refer to Table 4).

5.2.5. Using robots to serve as passive-social media or using open exploring platform for exploring HRI

Research in Human–Computer Interaction (HCI) has highlighted the importance of robots as user interface media [51,52], or as round display modules that can extend on an open exploring robot platform [73,74]. These authors expect humanoid robot(s) to be used as interface media, especially passive-social media, particularly by showing conversation among multiple robots [51,52] (see Fig. 16b) or as Glowbots creating interesting patterns to attract users [73,74] (see Fig. 7a and b, by Jacobsson et al.). To support this, Kanda et al. showed that users can understand a robot’s speech more easily, and respond to it more actively, after observing the conversation between two robots or among multiple robots [53,84] (see Fig. 16a left figure and Fig. 16b right figure). Preceding this research, Hayashi et al. have shown that non-primary evaluation methodologies that correlate observations through hypothesizing methods give significant findings on the most effective way of attracting people’s interest during HRI [51–53] (refer to Table 2). As mentioned in Section 4.4 above, Jacobsson et al. showed that users can indirectly evaluate the HRI through observations of the Glowbots’ demonstrations on the robot platforms [73,74].

Please refer to Tables 1–8.

6. Contributions of this review paper

6.1. Contributions in providing our insights and vision for future HRI assessment and evaluation methodologies

After looking extensively at all the major HRI assessment and evaluation methodologies, mainly from the year 2000 till 2014, and at all 4 summarized tables on the Primary and Non-Primary Evaluation Methodologies as well as the Assessment Methodologies, and considering the trends of technology advancement over the years, Table 6 below clearly states


Table 5. Our recommended questions for future research in Assessment and Evaluation Methodologies for HRI.

• How to further refine the learning approaches applied for the Assessment Methodologies for long-term modelling of Human–Robot Interaction (HRI)?
• How to further improve each of the four major primary Evaluation Methodologies reviewed in this paper for the betterment in long-term modelling of the Human–Robot Interaction (HRI)?
• How to further improve each of the major non-primary Evaluation Methodologies reviewed in this paper for the betterment in long-term modelling of the Human–Robot Interaction (HRI)?
• How to develop a better social model or social assistive model in the HRI so that robots can learn from their past mistakes and keep improving so as to achieve a higher evaluation score for Human–Robot Interaction (HRI)?
• Which combination of Assessment and Evaluation Methodologies for HRI is the most suitable for which type of robot to interact with humans?
• How to develop a better learning approach in the Assessment Methodologies so as to promote Human–Humanoid Robot Interaction?
• Which combination of Assessment and Evaluation Methodologies for HRI is the most suitable to be applied in which type of Robotic Environment?
• How can a Human–Robot Interaction (HRI) be evaluated successfully by a combination of Evaluation Methodologies so that the primary and non-primary evaluations can ‘work’ synergistically?
• For interactive Human–Robot Communication, which kind of learning approaches or monitoring control in the Assessment Methodologies should be adopted by the robot so that a better evaluation score for HRI can be achieved?
• How can the interactive social learning model adopted by the robot be used as an emulator to promote a better long-term modelling of HRI, such as a better vicarious HRI? If so, which combinations of assessment and evaluation methodologies should be used for assessing and evaluating HRI in order to promote a better long-term modelling of HRI?


and illustrates our new insights, our own vision, and inspirations for future HRI assessment and evaluation methodologies to be adopted. The inspiration for future improvements drawn from current research on assessing and evaluating HRI is that we propose possible hybrids or combinations of HRI methodologies to address the limitations of the HRI assessments and evaluations surveyed so far. In Table 6, we summarise the advantages of our new insights and of the HRI methodologies that we propose, so that future improvements can be made.

We have extensively reviewed, discussed and analysed almost all the assessment and evaluation methodologies for modelling HRI, especially for long-term modelling of HRI. Although a large amount of research has been done on assessment and evaluation methodologies for HRI in order to obtain better modelling approaches, many issues still remain open. However, in our vision for the future, our recommended Types I and II combinations of methodologies (as stated in Tables 7 and 8) for assessing and evaluating HRI can help to achieve the following:

(1) Contributions in providing the best fit for ongoing improvement of learning and modelling approaches for HRI: The characteristic feature of each combination of assessment and evaluation methodologies can be achieved by using Recommended Type I as stated in Tables 7 and 8 below. This is because HRI is becoming increasingly complex and advanced. Our recommended Type I combination of HRI methodologies is more appropriate for elderly people as it increases the ease and pleasantness of HRI. It has more social impact on societies by achieving more integrated shared humans’ intentions.

(2) Contributions in certain Robotic environment which is less or least biased for ongoing improvement and modelling, i.e. assessment, as well as for discrete or final judgements, i.e. evaluation: Recommended Type II combination of methodologies (as stated in Tables 7 and 8 below) ensures a good robotic environment so that the facilities, equipment and devices can be fully utilized. The long-term modelling effects of HRI can be enhanced as well as assessed and evaluated properly, even when large numbers of robots and humans take part in the interaction, for better multi-tasking, entertainment and presentation purposes.

(3) Contributions in providing Interactive Social Learning and Modelling Approaches: Recommended Type I combination or hybrid of methodologies (as stated in Tables 7 and 8 below) ensures good intrinsic and extrinsic learning models for the assistive social robots, especially for the humanoid robots. A better learning model or reinforcement-based learning model for assistive social robots can give more assistance to the elderly.

(4) Contributions in rendering HRI Model and other modelling approaches such as the Empathic Model mainly to the Societies and Robotics, AI or IT Industries: Recommended Types I and II combination or hybrid of methodologies (as stated in Tables 7 and 8 below) can let assistive social robots adopt a much better HRI model and HRT Modelling as well as assistive social modelling approaches during the HRI.

In view of all the future research directions stated above, we hope that this review paper has given good insights and a thorough summarized review of all or almost all the HRI assessment and evaluation methodologies used in past research on assistive social robots during the HRI. All the assessment and evaluation methodologies reviewed in this paper are also for long-term modelling of the HRI.

6.2. Contributions of our recommended types in terms of Social and Industrial Impacts

What are the social and industrial impacts of our reviewing work done so far? After reviewing the HRI assessment and evaluation methodologies mainly from the year 2000 till 2014, in this subsection, based on the 6 summarized tables


Table 6. New insights, directives and inspirations for improvements of HRI assessment and evaluation methodologies.

| Existing Assessment and Evaluation Methodologies currently adopted or have been adopted (surveyed mainly from 2000 till 2014) | Difficult issues during HRI assessments and evaluations: In HRI Evaluation | Difficult issues during HRI assessments and evaluations: In HRI Assessment | Our Proposed Solutions to eliminate the Limitations (In HRI Assessments and Evaluations) (Contributions to the Society) | Our Proposed Insights (In HRI Assessments and Evaluations) (Contributions to the Robotics Industry) (for improvements on HRI Assessments and Evaluations) |
|---|---|---|---|---|
| (Assessment Methodologies) (1) Human-control of Multiple Robots; (2) Tele-operation of Multiple Social Robots Model. Plus (Primary Evaluation) Task Performance Metrics | (1) Evaluation is mostly limited on questionnaires; (2) Task Performance Metrics suitable for humanoid robots only; (3) Not suitable for one-to-one HRI | (1) Parameters for the models only in a specific context; (2) Models only cover HRI on limited aspects; (3) Assessment does not model random errors of automation. So, less accurate; (4) Timing is of mission critical | (1) Combine the system with advanced HCI and CSCW techniques and well-operated Human–Robot Team (HRT) modelling so that it is also suitable for non-humanoid robots as well as one-to-one HRI, even if real-world applications. Wider scope of application too. (2) Well-adjusted and autonomous Task Performance Metrics that have excellent timing control and almost error free in real-world deployment | A combination of more advanced and automated HCI or CSCW and/or XML-based visualization system to the HRT modelling with well-adjusted Task Performance Metrics |
| (Assessment Methodology) (1) UTAUT Model; (2) iCat robot (as a test-bed for social intelligence) | (1) Questionnaires used are specific to certain contexts only, not standardised enough; (2) Questionnaire method is subjective, and hence, biased | UTAUT model applied is a modified version, researchers can only draw tentative conclusions from this measurement | Implement more advanced HCI and CSCW to ensure wider social intelligent acceptance and application of the UTAUT model | A combination of more advanced and automated HCI or CSCW and XML-based visualization system to the UTAUT Model |
| (Assessment Methodology) Temporal Awareness Model or Timing Model; plus (Primary Evaluation) Task Performance Metrics | Evaluation methodology is not suitable for non-humanoid robots | (1) Technical difficulties may incur problems during timing control; (2) Problems may occur during the tele-operation of interface design | (1) Combine with well-operated Human–Robot Team (HRT) model or other multi-robot models so that the timing control can be well-adjusted; (2) Incorporate HCI/CSCW into the Human–Robot Interaction (HRI) model so that the evaluation methodologies can be suitable for non-humanoid robots too | A combination of Temporal Awareness Model with well-adjusted Task Performance Metrics (suitable for both humanoid and non-humanoid robots) |
| (Assessment Method) Societal Acceptance or Technology Acceptance Modelling; plus (Non-Primary Evaluation) Ease-Of-Classification (EOC) formula | Users may feel reluctant to classify the robots | Bias may be incurred (as no objective measurements in HRI assessments) | Combine this hybrid of methodologies with psycho-physiological measurements so that objective measurement on HRI is involved. Bias can somehow be eliminated even if users feel reluctant to classify the robots | A combination of Technology Acceptance Modelling (TAM)/UTAUT modelling with EOC and psycho-physiological measurements |
| (Assessment Methods) (1) HRI model with HCI and CSCW incorporated; (2) Open exploring robot platform, i.e. see Puck and e-Puck with Glowbots | Evaluation methodologies are subjective, hence incurs bias | Assessment modelling may not be thorough enough as not all concepts are taken in account | Incorporate Technology Acceptance Modelling (TAM) or Godspeed Five Key Concepts Modelling as well as psycho-physiological measurements into the HRI assessments to eliminate bias and incompleteness in assessments | A combination of TAM or Godspeed Five Key Concepts Modelling with advanced HCI and CSCW networking and psycho-physiological measurements |
| (Assessment Methodology) Different Gaze Model(s) | Evaluating gaze mechanisms is usually tedious and time-consuming | Assessment is not suitable for non-humanoid robots | (1) Incorporate HCI/CSCW into Gaze model so that the evaluation methodologies can be suitable for non-humanoid robots too; (2) HCI/CSCW can be tele-operated, i.e. every control is automated and really fast | A combination of more advanced and automated HCI/CSCW and/or XML-based visualization system to the Gaze Model(s) |
| (Assessment Methodology) Psychological and Therapeutic Modelling; Plus (Primary Evaluation) Ethnographic Observations | Evaluation methodologies are specific for autism and so, limited application | Assessment modelling is not thorough enough as not all concepts are taken in account | Incorporate Technology Acceptance Modelling (TAM) or Godspeed Five Key Concepts Modelling to the Psychological and Therapeutic Modelling plus more standardized behavioural observations | A combination of TAM/UTAUT model or Godspeed Five Key Concepts with Psychological and Therapeutic Modelling can be incorporated with behavioural observations |
| (Assessment Methodologies) (1) Behavioural Adaptation Model; (2) User Personality Matching Model or User–Robot Personality Matching Model | Mostly subjective primary evaluation methodologies are used, and hence, biased | (1) models are subjective, and so, biased; (2) specific only for therapeutic purposes; (3) models may take a long time to develop | (1) Emphasize more on psycho-physiological measurements as primary evaluation method in order to eliminate bias; (2) Incorporate Technology Acceptance Modelling (TAM) or Godspeed Five Key Concepts Modelling to ensure more standardized HRI assessment modelling and evaluation methodologies | A combination of TAM or Godspeed Five Key Concepts Modelling with Behavioural Adaptation or User Personality Matching Model with emphasized psycho-physiological measurements |
| (Assessment Methods) (1) HRI model with HCI and/or CSCW incorporated; (2) robots as passive-social media | Evaluation is of narrower application, i.e. only on the user interfaces | Assessment is limited on robots while acting as passive social media | Implement more advanced HCI and CSCW to ensure wider application on HRI modelling other than just on user interfaces, and ensure the HRI assessment is beyond robots acting as passive-social media | A combination of more advanced HCI and CSCW and/or XML-based visualization system to HRI modelling |
| (Assessment Methodology) Godspeed Five Key Concepts Modelling | Evaluation methodology of using questionnaires is subjective, and so, biased | Modelling is more subjective, less objective | Incorporate psycho-physiological measurements into the Godspeed Five Key Concepts Modelling to introduce objective measurement, and hence, to eliminate bias | A combination of Godspeed Five Key Concepts Modelling with psycho-physiological measurements as emphasis |


Table 7. Our strongly recommended HRI assessment and evaluation methodologies.

| Our Strongly Recommended Hybrids of HRI Assessment and Evaluation Methodologies | Reasons for recommending this combination of HRI Evaluation and Assessment Methodologies: Strengths of each Recommended Type (Contributions to the Societies and Industries) | Contributions to the Societies and to Robotics, AI and IT Industries |
|---|---|---|
| Recommendation Type I. A hybrid of the following: (Assessment Methodologies) (1) UTAUT Model; (2) Godspeed Five Key Concepts Modelling; (3) HRI Model with HCI and CSCW incorporated; (4) Model of integrated intentions, e.g. Haptic channel. (Evaluation Methodologies) (1) Self-Assessments Subjective Evaluation; (2) Psycho-Behavioural measurements | This recommended type has more Social Impact to HRI. This combination of methodologies is thorough, robust, and with advanced Human–Computer Interaction (HCI) and Computer Supported Cooperative Working (CSCW) on user interfaces, it ensures wider social intelligent acceptance, especially in the elderly population. This approach is suitable for general population as well | Used for the wellness, counselling and companionship purposes, especially for the elderly people, autistic people and those who are in need of companionship |
| Recommendation Type II. A hybrid of the following: (Assessment Methodologies) (1) UTAUT Model; (2) Timing Model; (3) Human Robot Team (HRT) Model; (4) Fuzzy Integrated Model. (Evaluation Methodology) Task Performance Metrics | This recommended type has more Industrial Impact to HRI. This combination of methodologies ensures a good timing model for controlling multiple robots in a Human Robot Team (HRT) modelling, and works well with task performance metrics | Used for the highly technical system, especially for controlling multiple-robot team system, for multi-tasking and presentation purposes |


above, which have highlighted the strengths, weaknesses and uniqueness as well as the characteristics of each evaluation (primary and non-primary) and assessment methodology on HRI, another 2 tables (Tables 7 and 8) are presented to illustrate the contributions of our two strongly recommended hybrids of HRI methodologies.

7. Conclusions and discussions

From the trends of HRI technology advancement and evolutionary optimization over the years, Table 8 above summarizes the contributions of this review paper to the societies as well as to the Robotics, AI and IT industries. In Section 6, we have discussed our new insights for future HRI, our inspirations, new vision for future research on modelling HRI and our proposed directives, including newly recommended hybrid approaches. The goal of this review paper is to extensively explore almost all the HRI assessment and evaluation methodologies used for assessing and evaluating the HRI, where their aims are not just to substitute humans’ care with robotic care. The intentions of these past researchers’ work, however, are to provide the much-needed care where it is currently lacking in the HRI, and where the gap in the available care will subsequently increase due to the recognized demographic trends reviewed and discussed so far. Our recommended Type I has more social impact because this hybrid of HRI methodologies focuses on solving the weaknesses of social integrated benefits during the HRI after surveying the past HRI methodologies (i.e. from UTAUT Model to Integrated Humans’ Intention Model). Our recommended Type II hybrid of HRI methodologies focuses on solving the weaknesses of technology advancement such as platform sensors, fuzzy tracking and integrated controllers during the HRI after surveying the past HRI methodologies (i.e. from UTAUT Model to Fuzzy Integrated Model).

As a summary of our reviewing work, creating robots that are capable of estimating friendship, emulating empathy, understanding humans, learning from past experience and continuously improving from past learning approaches is a very important step towards having those robots as parts of our daily lives. This paper has extensively reviewed the HRI assessment and evaluation methodologies so as to analyze whether these approaches or methodologies for assessing and evaluating the HRI have improved over the years. This paper has also presented the important elements needed for assessing and evaluating humans’ acceptance towards robots, and how these methodologies are applied for modelling of HRI, especially on assistive social robot types. Our future review work includes exploring research on developing real-world experimental designs in which the HRI models and modelling approaches discussed in this paper can be further tested and enhanced. Our future research work will also include reviewing the verification of the test-beds and testing protocols, together with the assessment and evaluation methodologies, that have been applied by researchers to the robots discussed so far. The main contribution of this review paper is that we have proposed our new insights and inspirations


Table 8. Contributions of our strongly recommended HRI assessment and evaluation methodologies to the society and industry.

| Our Strongly Recommended Hybrids of HRI Assessment and Evaluation Methodologies | Type(s) of impact from years 2000 to 2014 | Contributions of each of our recommended combinations of methodologies to the Society and Industry: Strengths of our recommended methodologies as compared with those surveyed from the types of commonly adopted methodologies since the year 2000 till 2014 |
|---|---|---|
| Recommendation Type I. (Assessment Methodologies) (1) UTAUT Model (with Empathic Modelling); (2) Godspeed Five Key Concepts Modelling; (3) HRI Model with HCI and CSCW incorporated; (4) Model of integrated humans’ intentions, e.g. Haptic channel or the like. (Evaluation Methodologies) (1) Self-Assessments Subjective Evaluation; (2) Psycho-Behavioural Measurements | More on Social Impact | After reviewing from the trends of Assessment and Evaluation Methodologies used during HRI (2000–2014), our recommended Type I increases the ease and pleasantness of HRI by advanced HCI and CSCW techniques, as well as shared humans’ intentions through integrated model such as Haptic Channel or feedback system. The trends of advancement are as below: 2000–2003: UTAUT Model (with Empathic elements) → 2004–2009: Robot Behavioural Adaptation Model and the like → 2010–2012: HRI model & HCI & CSCW or similar hybrids → 2013 till now: Intention Integration or the like with the hybrid of the 3 stated in 1st column (more robot’s behavioural adaptation; more HCI & CSCW components; more intention integration features; more shared human’s intention). Contributions in solving the weaknesses of the commonly adopted HRI methodologies (from the year 2000 till 2014) |
| Recommendation Type II. (Assessment Methodologies) (1) UTAUT Model (with Tele-operation Modelling); (2) Timing Model; (3) Human Robot Team (HRT) Model; (4) Fuzzy Integrated Model. (Evaluation Methodology) Task Performance Metrics | More on Industrial Impact | After reviewing from the trends of Assessment and Evaluation Methodologies used during HRI (2000–2014), our recommended Type II is used for the highly technical system, especially for controlling multiple-robot team system, such as by using type-2 and type-1 Fuzzy Tracking Controllers or Fuzzy Integrator or the like. The trends of advancement are as below: 2000–2003: UTAUT Model (with Tele-operation) → 2004–2008: HRT model or Human Control → 2009–2012: Temporal Awareness + Gaze Models → 2013 till now: Fuzzy Integrator tracking or Gaze Controllers via fusion or etc., or the like (more multi- & autonomous control; more temporal awareness; more fuzzy tracking & integration). Contributions in solving the weaknesses of the commonly adopted HRI methodologies (from the year 2000 till 2014) |


for current and future HRI methodologies. These have significant social and industrial impacts. We have recommended two major types of hybrids for HRI assessment and evaluation methodologies in order to model HRI in a much easier, more pleasant, attractive and efficient mode, as well as one that better integrates humans’ intentions.

Acknowledgements

This research project is mainly supported by the High Impact Research (HIR) Grant UM.C/625/1/HIR/MOHE/FCSIT/10 from University of Malaya (URL: www.um.edu.my), mainly from the Ministry of Higher Education under the Federal Malaysian Government Funding, Kuala Lumpur, Malaysia. In addition, this research is also supported by funding from the eScienceFund, under the Ministry of Science, Technology and Innovation (MOSTI), for University of Malaya. The grant number for this research project is 01-01-03-SF0661. This funded research is titled ‘AGED WELLNESS AUGMENTED BY EMPATHIC ENABLER (AWARE)’.

References

[1] R.C. Arkin, M. Fujita, T. Takagi, R. Hasegawa, An ethological and emotional basis for human–robot interaction, Robot. Auton. Syst. 42 (3–4) (2003) 191–201.

[2] C. Bartneck, T. Kanda, O. Mubin, A. AlMahmud, Does the design of a robot influence its animacy and perceived intelligence?, Int J. Soc. Robot. 1 (2)(2009) 195–204.

[3] C. Bartneck, D. Kulic, E. Croft, easuring the anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of robots, in:Workshop on Metrics for Human–Robot Interaction, Amsterdam, 2008, pp. 37–44.

[4] C. Bartneck, T. Suzuki, T. Kanda, T. Nomura, The influence of people’s culture and prior experiences with AIBO on their attitude towards robots, Artif.Intell. Soc. 21 (1–2) (2007) 217–230.

[5] C.L. Bethel, R.R. Murphy, Use of large sample sizes and multiple evaluation methods in human–robot interaction experimentation, in: AAAI,Association for the Advancement of Artificial Intelligence, 2009, pp. 1–8.

[6] C.L. Bethel, C. Bringes, R.R. Murphy, Non-facial and non-verbal affective expression in appearance-constrained robots for use in victim management:robots to the rescue! in: The 4th ACM/IEEE International Conference on Human–Robot Interaction (HRI2009), 2009.

Page 36: Extensive assessment and evaluation methodologies … · Extensive assessment and evaluation methodologies on assistive social robots for modelling human–robot interaction – A

340 D.Y.Y. Sim, C.K. Loo / Information Sciences 301 (2015) 305–344

[7] C.L. Bethel, K. Salomon, J.L. Burke, R.R. Murphy, Psycho-physiological experimental design for use in human–robot interaction studies, in: The 2007International Symposium on Collaborative Technologies and Systems (CTS 2007), IEEE, Orlando, FL, 2007.

[8] C.L. Bethel, K. Salomon, R.R. Murphy, J.L. Burke, Survey of psychophysiology measurements applied to human–robot interaction, in: 16th IEEEInternational Symposium on Robot & Human Interactive Communication, 2007.

[9] T. Bickmore, R.W. Picard, Towards caring machines, in: Proceedings of Computer–Human Interaction (CHI’04), Vienna, 2004, pp. 1489–1492.[10] T. Bickmore, D. Schulman, Practical approaches to comforting users with relational agents, in: Proceedings of Human–Computer Interaction 2007

(CHI’07), San Jose, CA, 2007.[11] T.W. Bickmore, L. Casuro, K. Clough-Gorr, T. Heeren, It’s just like you talk to a friend’ relational agents for older adults, Interact. Comput. 17 (6) (2005)

711–735.[12] S. Brave, C. Nass, K. Hutchinson, Computers that care: investigating the effects of orientation of emotion exhibited by an embodied computer agent,

Int. J. Hum. Comput. Stud. 62 (2) (2005) 161–178.[13] J.L. Burke, R.R. Murphy, M.D. Coovert, D.L. Riddle, Moonlight in Miami: a field study of human–robot interaction in the context of an urban search and

rescue disaster response training exercise, Hum.–Comp. Interact. 19 (2004) 85–116.[14] J.L. Burke, R.R. Murphy, D.R. Riddle, T. Fincannon, Task performance metrics in human–robot interaction: taking a systems approach, Perform. Metrics

Intell. Syst. (2004)[15] S. Calinon, Z. Li, T. Alizadeh, N. Tsagarakis, D. Caldwell, Statistical dynamical systems for skills acquisition in humanoids, in: Proceedings of the

International Conference on Humanoid Robots, 2012, pp. 323–329.[16] J. Cassell, C. Pelachaud, N.I. Badler, M. Steedman, B. Achorn, T. Beckett, B. Douville, S. Prevost, M. Stone, Animated conversation: rule-based generation

of facial expression, gesture and spoken intonation for multiple conversational agents, in: Proceedings of the 21st Annual Conference on ComputerGraphics and Interactive Techniques (SIGGRAPH ’94), 1994, pp. 413–420.

[17] O. Castillo, P. Melin, Optimization of type-2 fuzzy systems based on bio-inspired methods: a concise review, Inform. Sci. J. 206 (1) (2012) 1–19.[18] O. Castillo, R. Martinez-Marroquin, P. Melin, F. Valdez, J. Soria, Comparative study of bio-inspired algorithms applied to the optimization of type-1 and

type-2 fuzzy controllers for an autonomous mobile robot, Inform. Sci. J. 192 (2012) 19–38.

[19] N.R. Cázarez-Castro, L.T. Aguilar, O. Castillo, Designing type-1 and type-2 fuzzy logic controllers via fuzzy Lyapunov synthesis for non-smooth mechanical systems, Eng. Appl. Artif. Intell. 25 (5) (2012) 971–979.

[20] A. Cesta, G. Cortellessa, M.V. Giuliani, F. Pecora, M. Scopelliti, L. Tiberio, Psychological implications of domestic assistive technology for the elderly, Psycho-Neurol. J. 5 (3) (2007) 229–252.

[21] A. Cesta, G. Cortellessa, F. Pecora, R. Rasconi, Supporting interaction in the RoboCare intelligent assistive environment, in: Proceedings of the AAAI Spring Symposium on Interaction Challenges for Intelligent Assistants, Stanford, USA, 2007, pp. 18–25.

[22] F. Chersi, Learning through imitation: a biological approach to robotics, IEEE Trans. Auton. Ment. Dev. 4 (3) (2012) 204–214.

[23] T. Chesney, An acceptance model for useful and fun information systems, Hum. Technol. 2 (2) (2006) 225–235.

[24] E. Clarkson, R.C. Arkin, Applying heuristic evaluation to human–robot interaction systems, in: FLAIRS Conference, 2007, pp. 44–49.

[25] M.D. Cooney, T. Kanda, A. Alissandrakis, H. Ishiguro, Designing enjoyable motion-based play interactions with a small humanoid robot, Int. J. Soc. Robot. 6 (2) (2014) 172–193.

[26] M.D. Cooney, S. Nishio, H. Ishiguro, Designing robots for well-being: theoretical background and visual scenes of affectionate play with a small humanoid robot, Lovotics 1 (1) (2014) 1–9.

[27] H. Cramer, J. Goddijin, B. Wielinga, V. Evers, Effects of (in)accurate empathy and situational valence on attitudes towards robots, in: ACM/IEEE International Conference on Human–Robot Interaction, ACM, 2010, pp. 141–142.

[28] K. Dautenhahn, I. Werry, A quantitative technique for analyzing robot–human interactions, in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Lausanne, 2002, pp. 1132–1138.

[29] K. Dautenhahn, B. Ogden, T. Quick, From embodied to socially embedded agents – implications for interaction-aware robots, Cognit. Syst. Res., Spec. Iss. Sit. Embod. Cognit. 3 (3) (2002) 397–428. guest-editor: Tom Ziemke, Elsevier.

[30] K. Dautenhahn, M. Walters, S. Woods, K.L. Koay, C.L. Nehaniv, How may I serve you?: a robot companion approaching a seated person in a helping context, in: Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human–Robot Interaction 2006 (HRI2006), 2006, pp. 172–179.

[31] K. Dautenhahn, M. Walters, S. Woods, K.L. Koay, C.L. Nehaniv, A. Sisbot, R. Alami, T. Simeon, How may I serve you? A robot companion approaching a seated person in a helping context, in: The 1st ACM SIGCHI/SIGART Conference on Human–Robot Interaction (HRI 2006), ACM Press, New York, NY, Salt Lake City, UT, USA, 2006, pp. 172–179.

[32] M.H. Davis, Measuring individual differences in empathy: evidence for a multidimensional approach, J. Pers. Soc. Psychol. 44 (1) (1983) 113–126.

[33] P. Ekman, Facial expressions, Handbook of Cognition and Emotion, John Wiley & Sons Ltd., New York, 1999.

[34] P. Ekman, Facial expressions of emotions, Annu. Rev. Psychol. 20 (1979) 527–554.

[35] P. Ekman, Universals and cultural differences in facial expressions of emotion, in: Nebraska Symposium on Motivation, University of Nebraska Press, 1971.

[36] D.G. Elmes, B.H. Kantowitz, H.L. Roediger III, Research Methods in Psychology, eighth ed., Thomson-Wadsworth, Belmont, CA, 2006.

[37] T. Fong, I. Nourbakhsh, K. Dautenhahn, A survey of socially interactive robots, Robot. Auton. Syst. (2003) 143–166.

[38] J. Forlizzi, How robotic products become social products: an ethnographic study of cleaning in the home, in: Proceedings of the 2nd ACM/IEEE International Conference on Human Robot Interaction (HRI 2007), ACM/IEEE, 2007, pp. 129–136.

[39] M. Frank, J. Leitner, M. Stollenga, A. Förster, J. Schmidhuber, Curiosity driven reinforcement learning for motion planning on humanoids, Front. Neurorobot. 2014 (2014) 7–25.

[40] B. Friedman, P. Kahn, J. Hagman, Hardware companions? what online AIBO discussion forums reveal about the human–robotic relationship, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Fort Lauderdale, USA, 2003.

[41] R. Fry, G.F. Smith, The effects of feedback and eye contact on performance of a digit-encoding task, J. Soc. Psychol. 96 (1975) 145–146.

[42] M. Fujita, AIBO: towards the era of digital creatures, Int. J. Robot. Res. 20 (10) (2001) 781–794.

[43] D.F. Glas, T. Kanda, H. Ishiguro, N. Hagita, Tele-operation of multiple social robots, IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 42 (3) (2012) 530–544.

[44] D.F. Glas, T. Kanda, H. Ishiguro, N. Hagita, Temporal awareness in tele-operation of conversational robots, IEEE Trans. Syst., Man, Cybernet., Part A: Syst. Hum. 42 (4) (2012) 905–919.

[45] R. Gockley, A. Bruce, J. Forlizzi, M. Michalowski, A. Mundell, S. Rosenthal, B. Sellner, R. Simmons, K. Snipes, A.C. Schultz, J. Wang, Designing robots for long-term social interaction, in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2005), IEEE, 2005, pp. 2199–2204.

[46] J. Goetz, S. Kiesler, A. Powers, Matching robot appearance and behavior to tasks to improve human–robot cooperation, in: IEEE Workshop on Robot and Human Interactive Communication (ROMAN’03), 2003.

[47] B. Gonsior, S. Sosnowski, C. Mayer, J. Blume, B. Radig, D. Wollherr, K. Kuhnlenz, Improving aspects of empathy and subjective performance for HRI through mirroring facial expressions, in: Proceedings of the 19th IEEE International Symposium on Robot and Human Interactive Communication, 2011, pp. 1–7.

[48] M.A. Goodrich, A.C. Schultz, Human–robot interaction: a survey, Found. Trends Hum.–Comp. Interact. 1 (3) (2007) 203–275.

[49] R. Groten, D. Feth, R.L. Klatzky, A. Peer, The role of haptic feedback for the integration of intentions in shared task execution, IEEE Trans. Haptics 6 (1) (2013) 94–105.

[50] E.T. Hall, The Hidden Dimension, Anchor Books, 1990.

[51] K. Hayashi, T. Kanda, T. Miyashita, H. Ishiguro, N. Hagita, Robot Manzai – robot conversation as a passive-social medium, Int. J. Humanoid Rob. 5 (1) (2008) 67–86.

[52] K. Hayashi, T. Kanda, T. Miyashita, H. Ishiguro, N. Hagita, Robot Manzai – robots’ conversation as a passive-social medium, in: Proceedings of 2005 at the 5th IEEE – RAS International Conference on Humanoid Robotics, 2005, pp. 456–462.

[53] K. Hayashi, D. Sakamoto, T. Kanda, M. Shiomi, S. Koizumi, H. Ishiguro, T. Ogasawara, N. Hagita, Humanoid robots as a passive social medium – a field experiment at a train station, in: Proceeding of the ACM/IEEE International Conference on Human–Robot Interaction 2007 (HRI2007), 2007, pp. 137–144.

[54] K. Hayashi, M. Shiomi, T. Kanda, N. Hagita, Are robots appropriate for troublesome and communicative tasks in a city environment?, IEEE Trans. Auton. Ment. Dev. 4 (2) (2012) 150–160.

[55] M. Heerink, Assessing Acceptance of Assistive Social Robots by Aging Adults, Thesis completed, University of Amsterdam, Netherlands, 2010, pp. 9–22, 41–51, 87–99.

[56] M. Heerink, B.J.A. Krose, V. Evers, B. Wielinga, Measuring acceptance of an assistive social robot: a suggested toolkit, in: The 18th IEEE International Symposium on Robot and Human Interactive Communication, RO-MAN 2009, September 27–October 2, 2009, pp. 528–533.

[57] M. Heerink, B.J.A. Krose, V. Evers, B.J. Wielinga, Relating conversational expressiveness to social presence and acceptance of an assistive social robot, Virtual Reality 14 (1) (2010) 77–84.

[58] M. Heerink, B.J.A. Krose, V. Evers, B. Wielinga, Studying the acceptance of a robotic agent by elderly users, Int. J. Assist. Robot. Mechatron. 7 (3) (2006) 33–43.

[59] M. Heerink, B.J.A. Krose, B.J. Wielinga, V. Evers, Enjoyment intention to use and actual use of a conversational robot by elderly people, in: Proceedings of the 3rd ACM/IEEE International Conference on Human Robot Interaction (HRI 2008), ACM/IEEE, 2008, pp. 113–120.

[60] M. Heerink, B.J.A. Krose, B.J. Wielinga, V. Evers, Human–robot user studies in eldercare: lessons learned, in: Smart Homes and Beyond, 4th International Conference on Smart Homes and Health Telematics 2006 (ICOST2006), Belfast, UK, 2006, pp. 31–38.

[61] M. Heerink, B.J.A. Krose, B.J. Wielinga, V. Evers, Influence of social presence on acceptance of an assistive social robot and a screen agent by elderly users, Adv. Robot. 23 (14) (2009) 1909–1923.

[62] M. Heerink, B.J.A. Krose, B.J. Wielinga, V. Evers, Measuring acceptance of assistive social agent technology by older adults: the Almere model, Int. J. Soc. Robot. 2 (3) (2010) 1–15.

[63] M. Heerink, B.J.A. Krose, B.J. Wielinga, V. Evers, Measuring the influence of social abilities on acceptance of an interface robot and a screen agent by elderly users, in: Human–Computer Interaction 2009 (HCI 2009), Cambridge, UK, 2009, pp. 430–439.

[64] M. Heerink, B.J.A. Krose, B.J. Wielinga, V. Evers, The influence of a robot’s social abilities on acceptance by elderly users, in: Proceedings of the Romanian-Academy (RO-MAN), Hertfordshire, UK, 2006, pp. 430–439.

[65] F. Hegel, T. Spexard, T. Vogt, G. Horstmann, B. Wrede, Playing a different imitation game: interaction with an empathic android robot, in: Proceedings 2006 IEEE-RAS International Conference on Humanoid Robots (Humanoids06), 2006, pp. 56–61.

[66] A.H. Hennington, B.D. Janz, Information systems and healthcare XVI: physician adoption of electronic medical records: applying the UTAUT model in a healthcare context, Commun. Assoc. Inf. Syst. 19 (1/5) (2007) 60–80.

[67] D. Heylen, A. Nijholt, D. Reidsma, Determining what people feel and think when interacting with humans and machines: notes on corpus collection and annotation, in: Proceedings 1st California Conference on Recent Advances in Engineering Mechanics, Fullerton, USA, 2006, pp. 1–6.

[68] C. Ho, K.F. MacDorman, Z.A. Pramono, Human emotion and the uncanny valley: a GLM, MDS, and Isomap analysis of robot video ratings, in: Proceedings of the 3rd ACM/IEEE International Conference on Human–Robot Interaction (HRI ’08), ACM, 2008.

[69] K. Hone, Empathic agents to reduce user frustration: the effects of varying agent characteristics, Interact. Comput. 18 (2) (2006) 227–245.

[70] Y.H. Hu, C.K. Loo, A generalized quantum-inspired decision making model for intelligent agent, Sci. World J. 2014 (2014) 1–8. ID No. 240983.

[71] K. Itoh, H. Miwa, Y. Nukariya, M. Zecca, H. Takanobu, S. Roccella, M.C. Carrozza, P. Dario, T. Atsuo, Development of a bioinstrumentation system in the interaction between a human and a robot, in: International Conference of Intelligent Robots and Systems, 2006, pp. 2620–2625.

[72] H. Ishiguro, T. Ono, M. Imai, T. Kanda, Development of an interactive humanoid robot “Robovie” – an interdisciplinary approach, Robot. Res. (2003) 179–191.

[73] M. Jacobsson, J. Bodin, L.E. Holmquist, The see-Puck: a platform for exploring human–robot relationships, in: Proceeding of the 26th Annual SIGCHI Conference on Human Factors in Computing Systems, Florence, Italy, 2008, pp. 141–144.

[74] M. Jacobsson, S. Ljungblad, J. Bodin, J. Knurek, L.E. Holmquist, GlowBots: robots that evolve relationships, in: The SIGGRAPH’07 International Conference of Emerging Technologies, San Diego, California, 2007, pp. 1–4.

[75] B. Johnson, L. Christensen, Educational Research Quantitative, Qualitative, and Mixed Approaches, second ed., Pearson Education, Inc., Boston, MA, 2004.

[76] M. Kamasima, T. Kanda, M. Imai, T. Ono, D. Sakamoto, H. Ishiguro, Y. Anzai, Embodied cooperative behaviours by an autonomous humanoid robot, in: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’04), 2004, pp. 2506–2513.

[77] T. Kanda, H. Ishiguro, An approach for a social robot to understand human relationships: friendship estimation through interaction with robots, Interact. Stud. 7 (3) (2006) 369–403.

[78] T. Kanda, H. Ishiguro, Reading human relationships from their interaction with an interactive humanoid robot, in: International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems (IEA/AIE), 2004.

[79] T. Kanda, D.F. Glas, M. Shiomi, N. Hagita, Abstracting people’s trajectories for social robots to proactively approach customers, IEEE Trans. Rob. 25 (6) (2009) 1382–1396.

[80] T. Kanda, T. Hirano, D. Eaton, H. Ishiguro, Interactive robots as social partners and peer tutors for children: a field trial, J. Hum.–Comp. Interact. 19 (1–2) (2004) 61–84.

[81] T. Kanda, H. Ishiguro, T. Ishida, Psychological analysis on human–robot interaction, in: The IEEE International Conference on Robotics and Automation (ICRA’01), 2001, pp. 4166–4173.

[82] T. Kanda, H. Ishiguro, M. Imai, T. Ono, Development and evaluation of interactive humanoid robots, in: Proceedings of the IEEE International Conference, 2004, pp. 1839–1850.

[83] T. Kanda, H. Ishiguro, M. Imai, T. Ono, K. Mase, A constructive approach for developing interactive humanoid robots, in: Proceeding of IEEE/RSJ International Conference of Intelligent Robots and Systems, 2002, pp. 1265–1270.

[84] T. Kanda, H. Ishiguro, T. Ono, M. Imai, K. Mase, Multi-robot cooperation for human–robot communication, in: The IEEE International Workshop on Robot and Human Communication (ROMAN2002), 2002, pp. 271–276.

[85] T. Kanda, H. Ishiguro, T. Ono, M. Imai, R. Nakatsu, Development and evaluation of an interactive humanoid robot Robovie, in: The Proceedings of IEEE International Conference on Robotics and Automation, 2002, pp. 1848–1855.

[86] T. Kanda, T. Miyashita, T. Osada, Y. Haikawa, H. Ishiguro, Analysis of humanoid appearances in human–robot interaction, IEEE Trans. Rob. 24 (3) (2008) 725–735.

[87] T. Kanda, S. Nabe, K. Hiraki, H. Ishiguro, N. Hagita, Human friendship estimation model for communication robots, Auton. Robots 24 (2) (2008) 135–145.

[88] T. Kanda, S. Nishio, H. Ishiguro, N. Hagita, Interactive humanoid robots and androids in children’s lives, Child., Youth Environ. 19 (1) (2009) 12–33.

[89] T. Kanda, R. Sato, N. Saiwaki, H. Ishiguro, A two-month field trial in an elementary school for long-term human–robot interaction, IEEE Trans. Rob. 23 (5) (2007) 962–971.

[90] T. Kanda, R. Sato, N. Saiwaki, H. Ishiguro, Friendly social robot that understands human’s friendly relationships, in: The Proceeding of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2004, pp. 2215–2222.

[91] T. Kanda, M. Shiomi, Z. Miyashita, H. Ishiguro, N. Hagita, A communication robot in a shopping mall, IEEE Trans. Rob. 26 (5) (2010) 897–913.

[92] C.D. Kidd, C. Breazeal, Human–robot interaction experiments: lessons learned, in: Proceedings of the AISB’05 Symposium Robot Companions: Hard Problems and Open Challenges in Robot–Human Interaction, 2005, pp. 141–142.

[93] C.D. Kidd, W. Taggart, S. Turkle, A social robot to encourage social interaction among the elderly, in: Proceedings in 2006 of the IEEE International Conference on Robotics and Automation (ICRA 2006), 2006, pp. 3972–3976.

[94] J. Klein, Y. Moon, R.W. Picard, This computer responds to user frustration: theory, design, and results, Interact. Comput. 14 (2) (2002) 119–140.

[95] K. Koay, M. Walters, K. Dautenhahn, Methodological issues using a comfort level device in human–robot interactions, in: Proceedings of the 14th IEEE International Workshop on Robot and Human Interactive Communication (RO-MAN 2005), IEEE Press, 2005, pp. 359–364.

[96] H. Kozima, M. Michalowski, C. Nakagawa, A playful robot for research, therapy, and entertainment, Int. J. Soc. Robot. 1 (2009) 3–18.

[97] H. Kozima, C. Nakagawa, Y. Yasuda, Interactive robots for communication-care: a case-study in autism therapy, in: IEEE International Workshop on Robot and Human Interactive Communication (ROMAN 2005), 2005, pp. 341–346.

[98] K. Kronander, A. Billard, Online learning of varying stiffness through physical human–robot interaction, in: International Conference on Robotics and Automation, 2012, pp. 1842–1849.

[99] D. Kulic, E. Croft, Physiological and subjective responses to articulated robot motion, Robotica 25 (1) (2007) 13–27.

[100] I. Leite, G. Castellano, A. Pereira, C. Martinho, A. Paiva, Modelling empathic behaviour in a robotic game companion for children: an ethnographic study in real-world settings, in: 7th ACM/IEEE International Conference on Human–Robot Interaction, 2012, pp. 367–374.

[101] I. Leite, A. Pereira, S. Mascarenhas, C. Martinho, R. Prada, A. Paiva, The influence of empathy in human–robot relations, Int. J. Hum. Comput. Stud. 71 (3) (2013) 250–260.

[102] S. Levine, V. Koltun, Continuous inverse optimal control with locally optimal examples, in: The Proceedings of the 29th International Conference on Machine Learning (ICML’12), 2012, pp. 1–8.

[103] C. Liu, P. Rani, N. Sarkar, Affective state recognition and adaptation in human–robot interaction: a design approach, in: International Conference on Intelligent Robots and Systems (IROS 2006), 2006, pp. 3099–3106.

[104] P. Liu, D.F. Glas, T. Kanda, H. Ishiguro, N. Hagita, How to train your robot – teaching service robots to reproduce human social behavior, in: The 23rd International Symposium on Robot and Human Interactive Communication (RO-MAN 2014), UK, 2014, pp. 961–968.

[105] R. Looije, F. Cnossen, M.A. Neerincx, Incorporating guidelines for health assistance into a socially intelligent robot, in: Proceedings of Romanian Academy (RO-MAN), Hatfield, UK, 2006, pp. 515–520.

[106] M. Masashiro, On the uncanny valley, in: Proceedings of the Humanoids-2005 Workshop: Views of the Uncanny Valley, 2005.

[107] A. Melendez, O. Castillo, Evolutionary optimization of the fuzzy integrator in a navigation system for a mobile robot, Recent Adv. Hybrid Intell. Syst. (2013) 21–31.

[108] P. Melin, L. Astudillo, O. Castillo, F. Valdez, M. Garcia, Optimal design of type-2 and type-1 fuzzy tracking controllers for autonomous mobile robots under perturbed torques using a new chemical optimization paradigm, Expert Syst. Appl. 40 (8) (2013) 3185–3195.

[109] A. Mora, D.F. Glas, T. Kanda, N. Hagita, A teleoperation approach for mobile social robots incorporating automatic gaze control and three-dimensional spatial visualization, IEEE Trans. Syst., Man, Cybernet. 43 (3) (2013) 630–642.

[110] Y. Morales, T. Kanda, N. Hagita, Walking together: side by side walking model for an interacting robot, J. Hum.–Robot Interact. 3 (2) (2014) 50–73.

[111] Y. Moriguchi, T. Kanda, H. Ishiguro, Y. Shimada, S. Itakura, Can young children learn words from a robot?, Interact. Stud. 12 (1) (2011) 107–118.

[112] M. Mori, Bukimi no Tani [The Uncanny Valley], Energy 7 (1970) 33–35.

[113] L. Moshkina, R.C. Arkin, Human perspective on affective robotic behavior: a longitudinal study, in: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2005), 2005, pp. 2443–2450.

[114] R. Murakami, Y. Morales, T. Kanda, H. Ishiguro, Destination unknown: walking side-by-side without knowing the goal, in: The Proceedings of the 9th ACM/IEEE International Conference on Human–Robot Interaction (HRI2014), 2014, pp. 1–15.

[115] B. Mutlu, J. Forlizzi, Robots in organizations: the role of workflow, social, and environmental factors in human–robot interaction, in: ACM/IEEE International Conference on Human–Robot Interaction, ACM, 2008, pp. 287–294.

[116] B. Mutlu, J.K. Hodgins, J. Forlizzi, A storytelling robot: modeling and evaluation of human-like gaze behavior, in: The 6th 2006 IEEE – RAS International Conference on Humanoid Robots (HUMANOIDS’06), IEEE, Genova, Italy, 2006, pp. 518–523.

[117] B. Mutlu, T. Kanda, J. Forlizzi, J.K. Hodgins, H. Ishiguro, Conversational gaze mechanisms for humanlike robots, ACM Trans. Interact. Intell. Syst. 1 (2) (2012) 1–33 (Article No. 12).

[118] B. Mutlu, S. Osman, J. Forlizzi, J.K. Hodgins, S. Kiesler, Task structure and user attributes as elements of human–robot interaction design, in: The 15th International Symposium on Robot and Human Interactive Communication (RO-MAN 2006), 2006, pp. 74–79.

[119] B. Mutlu, A. Terrell, C. Huang, Coordination mechanisms in human–robot collaboration, in: The ACM/IEEE International Conference on Human–Robot Interaction (HRI) – on Collaborative Manipulation, 2013, pp. 1–6.

[120] J. Nielsen, Enhancing the explanatory power of usability heuristics, in: Proceedings of the CHI 94 Conference on Human Factors in Computing Systems, ACM, Boston, MA, 1994.

[121] T. Nomura, T. Kanda, T. Suzuki, K. Kato, Age difference and images of robots – social survey in Japan, Interact. Stud. 10 (3) (2009) 374–391.

[122] D.R. Olsen, M.A. Goodrich, Metrics for evaluating human–robot interactions, in: Proceedings of (PerMIS), NIST’s Performance Metrics for Intelligent Systems Workshop, Gaithersburg, MD, 2003.

[123] D.R. Olsen, S.B. Wood, Fan-out: measuring human control of multiple robots, in: Proceedings of the Conference on Human Factors in Computing Systems (CHI), 2004.

[124] T. Ono, M. Imai, H. Ishiguro, A model of embodied communications with gestures between humans and robots, in: Proceedings of 23rd Annual Meeting Cognitive Science Society, 2001, pp. 732–737.

[125] J.P. Otteson, C.R. Otteson, Effects of teacher gaze on children’s story recall, Percept. Mot. Skills 50 (1980) 35–42.

[126] M. Pateraki, H. Baltzakis, P. Trahanias, Visual estimation of pointed targets for robot guidance via fusion of face pose and hand orientation, Comp. Vis. Image Understand., Elsevier 120 (2014) 1–13.

[127] R. Picard, K. Liu, Relative subjective count and assessment of interruptive technologies applied to mobile monitoring of stress, Int. J. Hum. Comput. Stud. 65 (4) (2007) 361–375.

[128] R. Picard, E. Vyzas, J. Healey, Toward machine emotional intelligence: analysis of affective physiological state, IEEE Trans. Pattern Anal. Mach. Intell. 23 (10) (2001) 1175–1191.

[129] J. Preece, Y. Rogers, H. Sharp, D. Benyon, S. Holland, T. Carey, Human–Computer Interaction, Addison-Wesley, Wokingham, United Kingdom, 1994. p. 622.

[130] P. Rani, N. Sarkar, C.A. Smith, L.D. Kirby, Anxiety detecting robotic system – towards implicit human–robot collaboration, Robotica 22 (1) (2004) 85–95.

[131] L.D. Riek, P. Robinson, Robot, rabbit, or red herring? Societal acceptance as a function of classification ease, in: The 17th International IEEE Symposium on Robot and Human Interactive Communication (RO-MAN 2008), Workshop Robots as Social Actors: Evaluating Social Acceptance and Societal Impact of Robotic Agents, IEEE, 2008.

[132] L.D. Riek, P.C. Paul, P. Robinson, When my robot smiles at me: enabling human–robot rapport via real-time head gesture mimicry, J. Multimod. User Interf. 3 (1–2) (2010) 99–108.

[133] B. Robins, K. Dautenhahn, R. te Boekhorst, A. Billard, Robotic assistants in therapy and education of children with autism: can a small humanoid robot help encourage social interaction skills? in: Access In the Information Society (UAIS), vol. 4(4), Springer-Verlag, 2005, pp. 2199–2204.

[134] L. Rozo, P. Jimenez, C. Torras, A robot learning from demonstration framework to perform force-based manipulation tasks, Intel. Serv. Robot. 6 (1) (2013) 33–51.

[135] B. de Ruyter, P. Saini, P. Markopoulos, A.J.N. van Breemen, Assessing the effects of building social intelligence in a robotic interface for the home, Spec. Iss. IwC: Soc. Impact Emerging Technol. 17 (5) (2005) 522–541.

[136] A. Sabelli, T. Kanda, N. Hagita, A conversational robot in an elderly care centre: an ethnographic study, in: ACM/IEEE International Conference on Human–Robot Interaction, ACM, 2011, pp. 37–44.

[137] M. Saerbeck, T. Schut, C. Bartneck, M. Janse, Expressive robots in education: varying the degree of social supportive behaviour of a robotic tutor, in: Proceedings of Computer–Human Interaction 2010 (CHI 2010), ACM, 2010, pp. 1613–1622.

[138] T. Saito, T. Shibata, K. Wada, K. Tanie, Relationship between interaction with the mental commit robot and change of stress reaction of the elderly, in: Proceedings of IEEE CIRA, Kobe, Japan, 2003, pp. 16–20.

[139] Y. Sakagami, R. Watanabe, C. Aoyama, S. Matsunaga, N. Higaki, K. Fujimura, The intelligent ASIMO: System overview and integration, in: Proceeding of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2002, pp. 2478–2483.

[140] K. Sakai, H. Sumioka, T. Minato, S. Nishio, H. Ishiguro, Motion design of interactive small humanoid robot with visual illusion, Int. J. Innov. Comput., Inform. Control, IJICIC 12/2013 9 (12) (2013) 4725–4736.

[141] D. Sakamoto, K. Hayashi, T. Kanda, M. Shiomi, S. Koizumi, H. Ishiguro, T. Ogasawara, N. Hagita, Humanoid robots as a broadcasting communication medium in open public spaces, Int. J. Soc. Robot. 1 (2) (2009) 157–169.

[142] S. Satake, T. Kanda, D.F. Glas, M. Imai, H. Ishiguro, N. Hagita, A robot that approaches pedestrians, IEEE Trans. Rob. 29 (2) (2013) 508–524. issue 01.

[143] J. Scholtz, S. Consolvo, Toward a framework for evaluating ubiquitous computing applications, IEEE Pervasive Comput. 3 (2) (2004) 82–88.

[144] J. Scholtz, Evaluation methods for human–system performance of intelligent systems, in: Proceedings of the 2002 Performance Metrics for Intelligent Systems (PerMIS) Workshop, National Institute of Standards and Technology, Gaithersburg, MD, 2002.

[145] S. Thill, D. Caligiore, A.M. Borghi, T. Ziemke, G. Baldassarre, Theories and computational models of affordance and mirror systems: an integrative review, Neurosci. Bio-behav. Rev. 37 (2013) 491–521.

[146] J. Sherwood, Facilitative effects of gaze upon learning, Percept. Mot. Skills 2/3 (2) (1988) 1275–1278.

[147] T. Shibata, K. Tanie, Physical and affective interaction between human and mental commit robot, in: Proceedings of the IEEE International Conference on Robotics and Automation, 2001, pp. 2572–2577.

[148] K. Shihab, D.Y.Y. Sim, Development of a visualization tool for XML documents, in: International Journal of Computers, vol. 4(04), North Atlantic University Union (NAUN), USA, 2010, pp. 153–160.

[149] K. Shihab, D.Y.Y. Sim, A.M. Shahi, Angur: a visualization system for XML documents, in: Proceeding of the 9th WSEAS International Conference on Telecommunications and Informatics (TELE-INFO’10), Italy, 2010, pp. 159–165.

[150] M. Shiomi, T. Kanda, D.F. Glas, S. Satake, H. Ishiguro, N. Hagita, Field trial of networked social robots in a shopping mall, in: The 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009, pp. 2846–2853.

[151] M. Shiomi, T. Kanda, S. Koizumi, H. Ishiguro, N. Hagita, Group attention control for communication robots, Int. J. Human. Robot. (IJHR) 5 (4) (2008) 587–608.

[152] T. Shiwa, T. Kanda, M. Imai, H. Ishiguro, N. Hagita, How quickly should a communication robot respond?, Int. J. Soc. Robot. 1 (2) (2009) 141–155.

[153] T. Shiwa, T. Kanda, M. Imai, H. Ishiguro, N. Hagita, How quickly should communication robots respond? in: Proceedings of the 3rd ACM/IEEE International Conference on Human Robot Interaction (HRI 2008), ACM/IEEE, 2008, pp. 153–160.

[154] P.K. Shukla, S.P. Tripathi, A new approach for tuning interval type-2 fuzzy knowledge bases using genetic algorithms, J. Uncert. Anal. Appl. 2 (4) (2014) 1–15.

[155] D.Y.Y. Sim, Emerging convergences of HCI techniques for graphical scalable visualization, in: Proceedings of 7th International Conference on IT in Asia 2011 (CITA’11), 2011, pp. 1–8.

[156] R. Sorbello, A. Chella, C. Cali, M. Giardina, S. Nishio, H. Ishiguro, Telenoid android robot as an embodied perceptual social regulation medium engaging natural human-humanoid interaction, Robot. Auton. Syst. 62 (09) (2014) 1329–1341.

[157] M. Staudte, M.W. Crocker, Investigating joint attention mechanisms through spoken human–robot interaction, Cognition 120 (2011) 268–291.

[158] A. Steinfeld, T. Fong, D. Kaber, M. Lewis, J. Scholtz, A. Schultz, M. Goodrich, Common metrics for human–robot interaction, in: Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human–Robot Interaction, 2006, pp. 33–40.

[159] B. Sterling, S.L. Gaertner, The attribution of arousal and emergency helping: a bidirectional process, J. Exp. Soc. Psychol. 20 (1984) 286–296.

[160] W.D. Stiehl, J. Lieberman, C. Breazeal, L. Basel, L. Lalla, M. Wolf, The design of the huggable: a therapeutic robotic companion for relational, affective touch, in: Proceedings of AAAI 2005 Fall Symposium on Caring Machines, AAAI, 2005.

[161] K. Stubbs, P. Hinds, D. Wettergreen, Autonomy and common ground in human–robot interaction: a field study, IEEE Intell. Syst. 22 (2) (2007) 42–50.

[162] J. Sung, H. Christensen, R. Grinter, Robots in the wild: understanding long-term use, in: The ACM/IEEE International Conference on Human–Robot Interaction (HRI), ACM, 2009, pp. 45–52.

[163] J. Sung, R. Grinter, H. Christensen, L. Guo, Housewives or technophiles? Understanding domestic owners, in: Proceedings of the 3rd ACM/IEEE International Conference on Human Robot Interaction (HRI 2008), ACM/IEEE, 2008, pp. 129–136.

[164] W. Taggart, S. Turkle, C. Kidd, An Interactive Robot in a Nursing Home: Preliminary Remarks, Towards Social Mechanisms of Android Science, Cognitive Science Society, Stresa, Italy, 2005. pp. 56–61.

[165] T. Tamura, S. Yonemitsu, A. Itoh, D. Oikawa, A. Kawakami, Y. Higashi, T. Fujimooto, K. Nakajima, Is an entertainment robot useful in the care of elderly people with severe dementia?, J. Gerontol. Ser. A: Biol. Med. Sci. 59 (1) (2004) 83–85.

[166] A. Tapus, M.J. Matarić, Emulating Empathy in Socially Assistive Robotics, American Association for Artificial Intelligence, 2006. pp. 1–4.

[167] A. Tapus, M.J. Matarić, Socially assistive robots: the link between personality, empathy, physiological signals, and task performance, Association for the Advancement of Artificial Intelligence in AAAI Spring 8 (2008) 1–8.

[168] A. Tapus, M.J. Matarić, User personality matching with hands-off robot for post-stroke rehabilitation therapy, in: Proceeding of International Symposium on Experimental Robotics (ISER’06), 2006.

[169] A. Tapus, M.J. Matarić, B. Scassellati, The grand challenges in socially assistive robotics, IEEE Robot. Autom. Magaz. Spec. Iss. Grand Chall. Robot. 14 (1) (2007) 35–42.

[170] D. Veenstra, V. Evers, The development of an online research tool to investigate children’s social bonds with robots, Hum.–Robot Pers. Relation. 59 (2011) 19–26.

[171] V. Venkatesh, F.D. Davis, A theoretical extension of the technology acceptance model: four longitudinal field studies, Manage. Sci. 46 (2) (2000) 186–204.

[172] V. Venkatesh, M.G. Morris, G.B. Davis, F.D. Davis, User acceptance of information technology: toward a unified view, MIS Quart. 27 (3) (2003) 425–478.

[173] K. Wada, T. Shibata, T. Saito, K. Tanie, Analysis of factors that bring mental effects to elderly people in robot assisted activity, in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, vol. 2, 2002, pp. 1152–1157.

[174] K. Wada, T. Shibata, T. Saito, K. Tanie, Effects of robot assisted activity to elderly people who stay at a health service facility for the aged, in: Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003), USA, 2003, pp. 2847–2852.

[175] K. Wada, T. Shibata, T. Saito, K. Tanie, Psychological and social effects of robot assisted activity to elderly people who stay at a health service facility for the aged, in: Proceedings International Conference on Robotics and Automation, USA, 2003, pp. 3996–4001.

[176] K. Wada, T. Shibata, T. Saito, K. Tanie, Psychological, physiological and social effects to elderly people by robot assisted activity at a health service facility for the aged, in: Proceedings of IEEE/RSJ International Conference on Advanced Intelligent Mechatronics (AIM 2003), Kobe, Japan, 2003.

[177] K. Wada, T. Shibata, T. Saito, K. Tanie, Relationship between familiarity with mental commit robot and psychological effects to elderly people by robot assisted activity, in: Proceedings IEEE International Symposium on Computational Intelligence in Robotics and Automation, Kobe, Japan, 2003, pp. 113–118.

[178] Y. Wang, J.E. Young, Beyond ‘Pink’ and ‘Blue’: Gendered attitudes towards robots in society, in: The Proceedings of the ACM SIGCHI Conference on the Significance of Gender for Modern Information Technology (GenderIT2014), 2014, pp. 1–10.

[179] D. Watson, L.A. Clark, A. Tellegen, Development and validation of brief measures of positive and negative affect: the PANAS scales, J. Pers. Soc. Psychol. 54 (6) (1988) 1063–1070.

[180] P. Wu, C. Miller, Results from a field study: the need for an emotional relationship between the elderly and their assistive technologies, Found. Augm. Cognit. 11 (2005) 889–898.

[181] F. Yamaoka, T. Kanda, H. Ishiguro, N. Hagita, A model of proximity control for information-presenting robots, IEEE Trans. Robot. 26 (1) (2010) 187–195 (short paper).

[182] H.A. Yanco, J.L. Drury, J. Scholtz, Beyond usability evaluation: analysis of human–robot interaction at a major robotics competition, Hum. Comp. Interact. 19 (2004) 117–149.

[183] K. Zheng, D.F. Glas, T. Kanda, H. Ishiguro, N. Hagita, Designing and implementing a human–robot team for social interactions, IEEE Trans. Syst., Man Cybernet.: Syst. 43 (4) (2013) 843–859 (issue 01).

[184] K. Zheng, D.F. Glas, T. Kanda, H. Ishiguro, N. Hagita, How many social robots can one operator control? in: Proceedings of ACM/IEEE 6th Annual Conference on Human–Robot Interaction, Lausanne, Switzerland, 2011, pp. 379–386.

[185] K. Zheng, D.F. Glas, T. Kanda, H. Ishiguro, N. Hagita, Supervisory control of multiple social robots for conversation and navigation, Trans. Control Mech. Syst. 3 (2) (2014) 76–92.