MSc Thesis
Ekundayo Olufemi A.
Contactless Measurement in Smart Environment for the Elderly People
Using Kinect v2 Sensor
School of Computer Science
International Master's Degree Programme in Information Technology
February 2018
Foreword
This thesis was done at the School of Computing, University of Eastern Finland during the autumn of 2017.
I want to extend my gratitude to my parents, friends, teachers, and especially my
supervisor Prof. Pekka Toivanen.
List of abbreviations
AAL Ambient Assisted Living
API Application Programming Interface
CMOS Complementary Metal–Oxide–Semiconductor
GDL Gesture Description Language
IR Infrared
IMU Inertial Measurement Unit
LIDAR Light Detection and Ranging
RFID Radio-frequency Identification
RGB Red Green Blue
SDK Software Development Kit
SEAL Smart Environment for Assisted Living
SIFT Scale Invariant Feature Transform
SSIM Structural Similarity Index Measure
TOF Time of Flight
Contents
1 Introduction to Kinect v2 Sensor ............................................................... 1
1.1 Evolution of Kinect Sensor ................................................................. 1
1.2 Technology of Kinect ......................................................................... 2
1.3 Kinect (2.0, 2013) – Designed for Xbox One ..................................... 3
1.4 Non-Commercial Kinect Designed for Microsoft Windows .............. 3
1.5 Kinect Versions from 1.5 to 1.8 .......................................................... 3
1.6 Kinect v2 ............................................................................................. 3
1.7 Significance of Kinect v2 in Assisted Living Facilities ..................... 4
1.8 Potential Use of Kinect v2 in Assisted Living .................................... 4
1.8.1 Different Spheres for Application of Kinect v2 .................. 5
2 Review of related literature ......................................................................... 6
2.1 Introduction ......................................................................................... 6
2.2 Use of Wireless Sensor Networks ...................................................... 6
2.3 Kinect v2 Depth Sensor ...................................................................... 6
2.4 Use in Karate Techniques ................................................................... 6
2.5 Advantages of v2 over v1 ................................................................... 7
2.6 Pose Estimation of Human Body Part Using Multiple Cameras ........ 8
2.7 An Innovative Hearing System Utilizing the Human Body ............... 8
2.8 Accuracy and Reliability of Optimum Distance in Kinect v2 ............ 9
2.9 Integration of Microsoft Kinect with Simulink ................................ 10
2.10 Utility and Usability of Kinect v2 and Leap Motion ........................ 12
2.11 A Depth-Based Fall Detection System Using a Kinect Sensor ........ 13
2.12 Experimental Studies on Human Body ............................................. 15
2.13 Body Movement Analysis and Recognition ..................................... 15
2.14 An Integrated Platform for Live 3-D Human Reconstruction .......... 18
2.15 Automated Training and Maintenance through Kinect .................... 19
2.16 Kinect in the Kitchen and other Practical Home Environments ....... 20
2.17 Kinect Gaming and Physiotherapy ................................................... 21
3 Research Methodology ............................................................................. 23
3.1 Introduction ....................................................................................... 23
3.2 Model of the research ....................................................................... 24
3.3 Research Design ............................................................................... 24
3.4 Primary Data ..................................................................................... 25
3.5 Summary ........................................................................................... 25
4 Data analysis and presentation .................................................................. 26
4.1 Introduction ....................................................................................... 26
4.2 Smart Home environments ............................................................... 27
4.3 Movement detection models ............................................................. 39
4.4 Skeletal Tracking systems ................................................................ 53
5 Findings and conclusion ........................................................................... 57
5.1 Findings ............................................................................................ 57
5.2 Conclusions ....................................................................................... 60
References ......................................................................................................... 62
Appendices
Appendix 1: Checklist (2 pages)
Table of Figures and Illustrations
Figure 1-Xbox 360, Kinect v1. Klesistern (2014) ....................................................... 1
Figure 2- CMOS sensor, Primesense. Journal of Sensors (2014) ................................ 2
Figure 3-Kinect sensor components. Journal of Sensors (2013) ................................. 4
Figure 4- GDL illustration. Teng et al. (2013) ............................................................ 7
Figure 5- Medical application. Lim et al. (2014) ......................................................... 9
Figure 6- Simulink Kinect. Joshua et al. (2014) ........................................................ 11
Figure 7- Leap Motion Sensor. Hughes et al. (2015) ................................................ 12
Figure 8- Motion Sensor illustration. Hughes et al. (2015) ....................................... 13
Figure 9- Fall detection illustration. Samuele et al. (2014) ....................................... 14
Figure 10- Movement analysis Glove. Yang et al. (2012) ......................................... 16
Figure 11- Humanoid robotics illustration. Clingal et al. (2014) .............................. 16
Figure 12- RGDB illustration. Immitrios et al. (2014) .............................................. 19
Figure 13- Smart Home System illustration. Berkley University Journal (2013) ...... 20
Figure 14- Pose Experiments, Kinect tests. (2013) .................................................. 23
Figure 15- Research design ........................................................................................ 25
Figure 16- Gradinaru (2016) graphical representation of system .............................. 26
Figure 17- Conceptual Framework of a smart home environment ............................ 28
Figure 18- Smart home environment layered description ......................................... 28
Figure 19- Smart home environment layout .............................................................. 30
Figure 20- Hondori et al (2013) system set up including inertia sensors and Kinect
sensors ........................................................................................................................ 31
Figure 21- Hondori et al (2013) 3-D trajectories ....................................................... 32
Figure 22- Hondori et al (2013) experimental data on body movements .................. 33
Figure 23- Hondori et al (2013) limb changes in task like drinking and eating ........ 34
Figure 24- Hondori et al (2013) inertia sensor data from individual’s items ............ 34
Figure 25- Mohamed et al (2013) smart house used in the experiment .................... 35
Figure 26- Mohamed et al (2013) Natural User Interface ......................................... 36
Figure 27- Mohamed et al (2013) Waist detection posture ....................................... 37
Figure 28- Mohamed et al (2013) Waist detection posture ....................................... 37
Figure 29- Mohamed et al (2013) Kinect procedure for gesture recognition ............ 38
Figure 30- Mohamed et al (2013) .............................................................................. 38
Figure 31- Mohamed et al (2013) Kinect toolbox recognition of circle gestures ...... 39
Figure 32- Chin et al (2013) Three Kinect sensors, IR light, RGB camera, IR detector ... 40
Figure 33- Chin et al (2013) Depth sensor distance .................................................. 40
Figure 34- Chin et al (2013) Depth frame bit pixel ................................................... 41
Figure 35- Chin et al (2013) Algorithm depth distance ............................................. 41
Figure 36- Chin et al (2013) Average depth distance vs Actual distance .................. 44
Figure 37- Chin et al (2013) Accuracy analysis AMPE vs Distance ......................... 44
Figure 38- Chin et al (2013) Precision analysis std vs Distance ............................... 45
Figure 39- Alexiadis et al (2017) 3-D Camera and sensor setup ................................. 47
Figure 40- Alexiadis et al (2017) Stages for the proposed model ............................. 48
Figure 41- Alexiadis et al (2017) Image quality reconstruction; Kinect data,
waterlight geometry and Poisson ............................................................................... 48
Figure 42- Tahavori et al (2013) Kinect for Xbox vs Windows ................................ 49
Figure 43- Sengupta and Ohya (1996) Two staged pose estimation illustration ....... 51
Figure 44- Sengupta and Ohya (1996) back projection method estimation .............. 51
Figure 45- Sengupta and Ohya (1996) images used for the experiment ................... 52
Figure 46- Sengupta and Ohya (1996) extracted silhouette images .......................... 52
Figure 47- Sengupta and Ohya (1996) rendered images from the parameter set ...... 53
Figure 48- Sengupta and Ohya (1996) rendered images of the transferred model .... 53
Figure 49- Tao et al (2013) constant camera error .................................................... 54
Figure 50- Tao et al (2013) variable camera error ..................................................... 55
Figure 51- Choe et al (2014) invariability of IR and RGB images under different lighting conditions ..................................................................................................... 56
Figure 52- Choe et al (2014) Data capturing system, used to obtain the base mesh . 56
Figure 53- Choe et al (2014) input shading image, projected mesh and depth map . 56
1 Introduction to Kinect v2 Sensor
Kinect, initially code-named Project Natal during development, is a series of motion-sensing input devices developed by Microsoft for its video game consoles, the Xbox 360 and later the Xbox One. The device uses gestures and spoken commands to give users a natural interface for interacting with the console or a computer (Lange, 2011). Kinect was developed to broaden the audience of the Xbox 360, and ahead of its 2010 launch it was rumored that it would ship alongside a new Xbox 360 console [1]. Microsoft dismissed these reports, stating at the time that the Xbox 360 platform would last until 2015. Following the announcement, demonstrations were staged to show the stability of the device: at the 2009 Tokyo Game Show, games including Beautiful Katamari and Space Invaders Extreme were shown running on Kinect (Stowers, 2011). It was initially planned that the sensor unit would be accompanied by a dedicated microprocessor for operations such as skeletal mapping; this was later dropped, and the work was instead assigned to the console's own processor cores. Research by Stowers (2011) further showed that Kinect used only 10–15% of the console's computing resources. In the same timeframe, the development of Kinect-like gadgets became a trend.
Figure 1-Xbox 360, Kinect v1. Klesistern (2014)
1.1 Evolution of Kinect Sensor
Kinect became the official name of the device after the "World Premiere 'Project Natal' for Xbox 360 Experience" event of 2010; the word is a blend of kinetic and connect. The initiative was considered important, and Microsoft initially set the launch date as November 2010 [3], although this slipped as the project faced delays. A redesigned Xbox 360, announced later, shipped from mid-2010 with a dedicated connector port ready for Kinect.
At the time of Kinect's release, many companies were working in collaboration with Microsoft to explore its possibilities, applications, and compatibility with other gadgets. Villaroman (2011) argued that, because of its immense appeal and attention, Microsoft announced it would launch a commercial version along with a Software Development Kit (SDK) for these companies [1]. Microsoft eventually released the Kinect for Windows SDK, the commercial version of Kinect, and different companies went on to build a variety of applications on top of it.
1.2 Technology of Kinect
Kinect v1 was a combination of hardware and software whose depth-sensing technology was developed by the Israeli company PrimeSense. Kinect v1 generated a 3-D view of an object through a combination of gadgetry including an RGB camera, an infrared projector, and a microchip specially designed for the purpose. 3-D reconstruction of the scene was performed by a structured-light scanner system called Light Coding. To capture video data in 3-D regardless of lighting conditions, the depth sensor pairs the infrared projector with a monochrome Complementary Metal–Oxide–Semiconductor (CMOS) sensor. The depth sensor was an innovative addition that fitted well with most applications: the Kinect software can automatically calibrate for the player's physical environment and gameplay, taking into account the presence of furniture or other obstacles, and can also adjust the sensing range of the depth sensor.
Figure 2- CMOS sensor, Primesense. Journal of Sensors (2014)
The developer, PrimeSense, clarified that the number of people the software can track is restricted only by how many fit in the camera's field of view. According to Microsoft, the software can track up to six players simultaneously, with up to 20 joints tracked per skeleton. The key features regarded as the success of Kinect were its voice recognition, facial recognition, and most importantly, gesture recognition.
1.3 Kinect (2.0, 2013) – Designed for Xbox One
Kinect 2.0 was released in November 2013. The old PrimeSense technology was replaced by Microsoft's own time-of-flight sensor. According to analysts such as Azzari (2013), this design uses a time-of-flight camera and can process 2 GB of data per second. It has three times the accuracy of its predecessor, tracks with the help of an infrared (IR) sensor, and can track six skeletons at a time. Kinect v2 also came with improved video communication and applications specifically developed for video analytics, and its accompanying microphone is used for voice commands.
1.4 Non-Commercial Kinect Designed for Microsoft Windows
In February 2012, Microsoft released a new version with Windows 7-compatible PC drivers. This version let developers build applications using C++, C#, and Visual Basic, and gave access to low-level streams from the depth sensor and other sensors. Almost 50 companies worked with Microsoft on this Kinect release (Chang, 2012). The enhanced capabilities covered skeletal tracking and advanced audio: skeletal tracking allowed gesture-driven applications to track people, while the audio capabilities were integrated with the Windows speech recognition application programming interface (API).
1.5 Kinect Versions from 1.5 to 1.8
Kinect for Windows v1.5, including the new Kinect Studio application, was released in 2012 and launched in 19 countries. Kinect Studio allowed developers to record, play back, and debug clips of users interacting with their applications. This version also introduced seated skeletal tracking, which tracks the head, neck, and arms of the person using Kinect. Versions 1.6 to 1.8 brought further minor improvements.
1.6 Kinect v2
Kinect v2 for Windows was released in 2014 and was designed on the same technology as the Kinect for Xbox One.
Figure 3-Kinect sensor components. Journal of Sensors (2013)
1.7 Significance of Kinect v2 in Assisted Living Facilities
According to Biswass (2011), Kinect v2 is an advanced motion sensor capable of measuring 3-D motion of a person. The Microsoft-made Kinect for Windows SDK provides the application programming interface through which software interacts with the Kinect hardware.
An assisted living residence is for people with a disability, or people of advanced age, who cannot or have chosen not to live independently (El-laithy, 2012). With scientific developments in this field, the recent past has seen a transformation from 'care as a service' to 'care as a business'. Assisted living has evolved into a huge industry: a 2012 survey of US facilities counted 22,500 of them. These can be standalone services or part of a multi-level senior living community. The Kinect v2 sensor has emerged as a potential contributor to improving the standards of assisted living. The relevant features of v2 are an enhanced field of view, improved picture resolution, enhanced skeletal tracking, and recognition of joints.
1.8 Potential Use of Kinect v2 in Assisted Living
Most researchers, such as Stowers (2014), agree that Kinect v2 can potentially contribute to many more domains that raise the standard of assisted living. It can be used in building smart home environments, in detecting driver fatigue with multi-sensor signal-based methods, and in modelling human body movement using the twin-cylinder method. Kinect v2 can also provide a platform for live 3-D human reconstruction and motion capture. It can help monitor patients during external beam radiotherapy and assist in the recognition of karate techniques and similar domains.
In rehabilitation, it can perform skeletal tracking in virtual reality rehabilitation systems. Kinect v2 has also been widely used for the geometry refinements required in motion fields, and for human body tracking based on the discrete wavelet transform (DWT). Moreover, it can be used in shadow detection and classification, and in estimating the movements of human body parts and propagation along them (Rowe, 2011).
1.8.1 Different Spheres for Application of Kinect v2 Sensor
This thesis capitalizes on the potential of Kinect v2 with regard to assisted living. Kinect v2 can facilitate life in assisted living environments for elderly people and support the treatment of illnesses. People can perform their routine exercises in view of the Kinect sensor, which can analyze the movements, detect mistakes, and pass on corrective instructions; this can provide much-needed motivation for elderly people to exercise regularly. Another innovation based on the v2 sensor in assisted living is a hearing system that uses the human body as a transmission medium, with the Kinect v2 sensor taking over the role of the conventional sound transmitter and transmission line.
Kinect v2 can also help in the treatment of Parkinson's disease. It can accurately measure clinically relevant movements such as hand clasping and finger tapping, and relative improvement or worsening of these movements over time can likewise be measured with Kinect v2.
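The exercise-monitoring idea above can be sketched numerically. Kinect v2 reports each tracked joint as a 3-D coordinate; a minimal, hypothetical feedback rule might compute the elbow angle from the shoulder–elbow–wrist triple and check that a repetition reaches both the flexed and extended positions. The thresholds and function names below are illustrative, not taken from any Kinect SDK:

```python
import math

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by 3-D points a-b-c,
    e.g. shoulder-elbow-wrist for an elbow-flexion exercise."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    cos_t = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_t))

def check_repetition(angle_deg, lo=40.0, hi=160.0):
    """Hypothetical feedback rule: a full repetition should pass
    below `lo` (flexed) and above `hi` (extended)."""
    if angle_deg < lo:
        return "flexed"
    if angle_deg > hi:
        return "extended"
    return "in between"
```

A monitoring loop would evaluate `joint_angle` on every skeleton frame and prompt the user whenever a repetition stalls between the two thresholds.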
2 Review of related literature
2.1 Introduction
A variety of studies have reviewed the efficacy of the Kinect v2 sensor, and researchers have gone on to recommend uses and applications. However, work applying Kinect v2 to assisted living is rarely found. Selected studies are reviewed below.
2.2 Use of Wireless Sensor Networks
Hemant and Ghayvat (2013) analyzed, proposed, and implemented a Wireless Sensor Network (WSN) based smart home for assisted living. According to them, WSNs today form the backbone of many systems, and smart home systems that provide assisted living to patients already use them. The researchers designed a protocol for providing smart homes for assisted living and described its implementation in an old home built specifically to test a wireless sensor network. The protocol targets event- and communication-based operation and provides smart home solutions. However, sensors alone were not found to be enough: intelligent sampling and control algorithms must be designed according to sensor type and structure.
2.3 Kinect v2 Depth Sensor
Lin and Longyu (2013) extensively described the use of the Kinect depth sensor since its launch. Although Microsoft released a new version with improved hardware, in their view its accuracy still needed testing. They performed experiments to check the Kinect v2 depth sensor and its accuracy, observed some variation in its depth measurements, and proposed a tolerance method to enhance accuracy when evaluating depth. [2]
2.4 Use in Karate Techniques
Marek and Tomasz (2010) compared the effectiveness of Kinect v1 and Kinect v2 for the recognition of Oyama karate techniques. The purpose of the study was to evaluate how well each sensor recognizes the actions of these techniques. Initially, multimedia cameras were popular, and cheap, for personal computers and game consoles; Kinect has since given the concept a far wider array of uses, and its application to human-computer interaction gave it a new dimension.
According to their research, Kinect can be used in medicine, in education, and for controlling robotic arms. Kinect v2 has emerged as one of the best intelligent home solutions and has much potential yet to be explored and fully utilized. Postural segmentation and assessment of postural control capabilities are the most common approaches in use. A classification method makes gesture recognition possible: to perform tracking and generate motion capture data, Kinect sensor data is preprocessed by the Kinect libraries. Kinect v2 thus enhances the capabilities of its predecessor.
2.5 Advantages of v2 over v1
In Kinect v2, the Gesture Description Language (GDL) has been used as a classification algorithm. The data was recorded from two professional belt instructors, collecting 200 movement samples per person. The data was divided into training and evaluation sections and thoroughly assessed. Taking stock of the recognition rates of the GDL classifier and the error classification cases, Kinect v2 proved more reliable than Kinect v1; its major advantage over Kinect v1 was the accurate calculation of leg joint positions. [3]
Figure 4- GDL illustration. Teng et al. (2013)
A different study, conducted at the University of North Carolina at Chapel Hill, illustrated the functions and classification of Kinect shadow detection. The research shows that Kinect depth maps often contain holes, missing data, or similar gaps. The authors advocate a different idea: turning the holes themselves into useful information (Teng and Hui, 2014). They proposed different types of shadows based on the local patterns implied by scene geometry, after which the shadow information can be fully used. [4]
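The hole-as-information idea can be illustrated with a minimal sketch. Kinect depth frames commonly encode pixels with no valid measurement as 0; a first processing step is simply to locate those pixels and measure how much of the frame they cover. This is an illustrative fragment, not the classification method of Teng and Hui (2014):

```python
import numpy as np

def find_holes(depth, invalid=0):
    """Boolean mask of 'hole' pixels (no depth reading).

    Kinect depth frames commonly report 0 where no valid
    measurement exists (IR shadow, absorption, range limits)."""
    return depth == invalid

def hole_fraction(depth, invalid=0):
    """Fraction of the frame with missing depth - a rough proxy
    for how much shadow/occlusion information is available."""
    return float(find_holes(depth, invalid).mean())
```

A shadow classifier in the spirit of the cited work would then examine the local geometry around each hole region rather than discarding it.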
2.6 Pose Estimation of Human Body Part Using Multiple Cameras
There is a large body of existing research on estimating pose from multiple 2-D and 3-D images, using the objects themselves as a starting point (Kuntal and Jun, 2014). In this research, an approximate 3-D volume is obtained by projecting the silhouettes observed in the images. The authors note that existing means of communication, such as video conferencing systems, have limitations when users are far apart; one proposed remedy is to create a feeling of co-location between the participants [2], which they tackle by modelling the shared space in 3-D. The paper gives a worked example of pose estimation for a human body part, with pose parameters explored by random selection. The authors conducted experiments using a CAD model of a human head, captured by four cameras placed at equal distances along a semi-circle. An arbitrary pose estimation algorithm is difficult to extend to this application, and the silhouette edges for the experiments were separated manually. Three randomly chosen points in the volume are taken, together with every fifth point on the edge of the silhouette. The results were initially poor but improved later, and the algorithms developed can readily be reused in future work [3].
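The silhouette-based evaluation step can be sketched as follows: given a binary silhouette rendered from the model under a candidate pose and the observed silhouette, score the candidate by their overlap. The intersection-over-union criterion below is a common generic choice, not necessarily the one used by the authors:

```python
import numpy as np

def silhouette_iou(rendered, observed):
    """Intersection-over-union between two binary silhouette masks.

    A pose-estimation loop would render the 3-D model under each
    candidate pose and keep the pose maximizing this score."""
    r = rendered.astype(bool)
    o = observed.astype(bool)
    union = np.logical_or(r, o).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(r, o).sum() / union)
```

With randomly sampled pose parameters, as in the cited experiments, the loop simply retains the best-scoring candidate seen so far.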
2.7 An Innovative Hearing System Utilizing the Human Body as a
Transmission Medium
Some researchers have proposed an innovative hearing system that uses the human body as the transmission medium (Son and Kwang, 2013). The concept replaces the sound transmitter with the human body itself. Audible sound is generated by self-demodulation: when two ultrasonic waves pass through a non-linear medium, the self-demodulation effect produces an audio signal at the difference of their frequencies. In this scheme a user can hear sound without a conventional transmitter, and without radiating audible noise. The authors thus present a concept for wireless sound transmission; using ultrasound also reduces distortions in the propagation process [19]. The paper successfully establishes the human body as a transmission medium for the proposed system.
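The difference-frequency mechanism can be demonstrated numerically. Passing the sum of two ultrasonic tones through a quadratic non-linearity (a simple stand-in for the medium's non-linear response) produces a component at the difference of their frequencies; the carrier frequencies below are illustrative only:

```python
import numpy as np

fs = 400_000                     # sample rate, Hz (illustrative)
f1, f2 = 40_000.0, 41_000.0      # two ultrasonic carriers, Hz
t = np.arange(20_000) / fs       # 50 ms of signal

s = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)
demod = s ** 2                   # quadratic non-linearity (self-demodulation model)

# Locate the strongest spectral component in the audible band.
spec = np.abs(np.fft.rfft(demod))
freqs = np.fft.rfftfreq(len(demod), 1 / fs)
band = (freqs > 100) & (freqs < 20_000)
audible_peak = freqs[band][np.argmax(spec[band])]
# audible_peak lands at |f1 - f2| = 1 kHz; the carriers themselves
# and the sum/harmonic terms (80-82 kHz) fall outside the audible band.
```

Only the 1 kHz difference component survives in the audible band, which is exactly why a pair of inaudible ultrasonic carriers can deliver audible sound through a non-linear medium such as body tissue.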
Figure 5- Medical application. Lim et al. (2014)
2.8 Accuracy and Reliability of Optimum Distance for High-Performance Kinect Sensor
Lim and Shafriza (2013) analyzed the sensor from a different, performance-oriented perspective. In a depth camera, each pixel represents a distance that corresponds directly to some point in the physical world [20]. Biomedical application is one of the successful uses of the Microsoft Kinect sensor, as it provides the tools needed to measure volume, length, and other quantities. Such range-camera technologies, including time-of-flight (TOF) cameras and the Microsoft Kinect sensor, have become popular over time and are applicable in the biomedical field. The working principle of a TOF camera is to emit modulated light onto the scene [17]; the reflected light is measured against a reference signal, and correlating it with the modulated light yields depth information. The Kinect sensor uses a different technique, combining an infrared structured-light projector with a CMOS camera to compute the depth of the scene. 3-D technologies have now come to market built around depth cameras and the Kinect sensor, and a primary aim of Kinect development is its use in biomedical applications; by its specifications, the Kinect sensor behaves much like a camera [14]. The authors focused on whether the Kinect sensor can provide accurate and reliable depth values that match actual distance, an analysis of great importance for the sensor's accuracy. The depth array computed by the researchers had a precision of up to 11 bits, so the depth measurements of the Kinect sensor are expected to be a non-linear function of distance [18]. The research also covered the default range and near range of distances from the Kinect sensor.
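The TOF principle mentioned above admits a simple closed form: a continuous-wave TOF camera measures the phase shift Δφ between the emitted and reflected modulated light, and since the light covers the distance twice, depth follows as d = c·Δφ/(4π·f_mod). The modulation frequency below is illustrative (Kinect v2 is reported to use several modulation frequencies in the tens of MHz):

```python
import math

C = 299_792_458.0  # speed of light, m/s

def tof_depth(phase_shift_rad, f_mod_hz):
    """Depth (m) from a continuous-wave time-of-flight measurement.

    The light travels to the target and back, so the round trip
    covers 2*d; a phase shift of 2*pi corresponds to one full
    modulation period: 2*d = C * phase / (2*pi*f)."""
    return C * phase_shift_rad / (4 * math.pi * f_mod_hz)

def ambiguity_range(f_mod_hz):
    """Maximum unambiguous depth: beyond C / (2*f) the phase
    wraps around and the measurement aliases."""
    return C / (2 * f_mod_hz)
```

For example, at a 15 MHz modulation frequency a phase shift of π corresponds to roughly 5 m of depth, and the unambiguous range is about 10 m; higher modulation frequencies trade range for finer depth resolution, which is one reason multi-frequency schemes are used.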
The authors then investigated the depth data of the Kinect sensor, carrying out a reliability analysis of the sensor's specification as claimed by Microsoft; this provided insight into the authenticity of the data. Experiments with these sensors have shown that the error in depth measurements grows as the distance to the sensor increases, with variations ranging from a few mm up to 40 mm [15]. The Kuder-Richardson formula was used for the reliability calculations. The study proved very useful, as it provided a methodology for 3-D pose estimation in human motion applications based on accurate, precise, and reliable depth distances.
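The text names the Kuder-Richardson formula without further detail; for binary (0/1) items the usual KR-20 form is k/(k−1)·(1 − Σpq/σ²). The sketch below is a generic implementation of that formula, not necessarily the authors' exact procedure:

```python
import numpy as np

def kr20(items):
    """Kuder-Richardson formula 20 for a persons x items 0/1 matrix.

    KR-20 = k/(k-1) * (1 - sum(p*q) / var(total scores)),
    where p is each item's proportion of 1s and q = 1 - p."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    p = items.mean(axis=0)          # per-item proportion correct
    q = 1.0 - p
    total_var = items.sum(axis=1).var()  # population variance of totals
    return (k / (k - 1)) * (1.0 - (p * q).sum() / total_var)
```

Values near 1 indicate internally consistent (reliable) measurements, which is the sense in which the cited study uses the statistic for the sensor's repeated depth readings.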
2.9 Integration of Microsoft Kinect with Simulink: Real-Time Object Tracking Example
Microsoft Kinect has great potential in system applications because it introduced low-cost, high-resolution 3-D sensing (Joshua and Tyler, 2015). The purpose of their study was to develop a Kinect block for Simulink, providing access to the depth image stream and the sensor cameras. The available Kinect drivers expose only a C-language interface, which is an impediment to Kinect application development; the new block makes it possible to incorporate Kinect without difficulty into Simulink-based image processing. The study also addressed implementation issues, one being sensor calibration and another being a demonstration of the Kinect block's utility through a 3-D object tracking example [9].
Figure 6- Simulink Kinect. Joshua et al. (2014)
Detecting both moving and stationary obstacles depends largely on a system's ability to navigate in uncertain conditions. Among the available sensors, sonar is a low-cost option but is prone to false echoes and reflections because of its poor angular resolution. Infrared and laser range finders are also inexpensive, but their weakness is that they provide measurements from only one point in the scene. Radar and Light Detection and Ranging (LIDAR) systems can provide precise measurements with good angular resolution [14]; however, they too have weaknesses, the most important being high power consumption and high cost. This situation, together with the arrival of low-cost digital cameras, has produced interest in vision-based setups for autonomous vehicles, although estimating distance then requires stereoscopic cameras. The release of the Microsoft Kinect addresses this issue by providing both a camera image and a depth image. The Kinect was aimed primarily at the entertainment market, but its powerful capabilities have made it popular in the sensing and robotics community, with applications in human-robot interaction, 3-D virtual environment construction, medicine, and robot tracking and sensing. Most Kinect applications are coded in C [15]. In industry and academia, image-processing tools are now commonplace, and even inexperienced users can apply them, targeting hardware implementations through automatic code generation. Simulink provides a widely accepted environment for designing image-processing algorithms as well: in the automotive industry, for example, Simulink-based code-generation tools translate a final design into real-time executable code for the target hardware. Such tools are also useful in education, as they let students concentrate on the main concepts rather than low-level details. The contributions of that study fall into three areas: an interface allowing the Kinect to be used in refined Simulink designs, which makes it accessible to more users; Linux-based targets for mobile autonomous robots; and real-time parallel streaming of the Kinect's camera and depth images.
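Since the Kinect supplies a depth image registered with the camera image, even a minimal obstacle check can threshold the depth frame directly. The following Python sketch illustrates the idea on a synthetic frame; the function name, range limit and pixel threshold are assumptions chosen for illustration, not code from the cited work:

```python
import numpy as np

def detect_obstacles(depth_mm, max_range_mm=1500, min_pixels=50):
    """Flag a potential obstacle from a single depth frame.

    depth_mm: 2-D array of per-pixel depth readings in millimetres
    (0 marks invalid pixels, as in raw Kinect depth frames).
    Returns True if enough valid pixels fall inside the danger range.
    """
    valid = depth_mm > 0                       # discard invalid readings
    near = valid & (depth_mm < max_range_mm)   # pixels inside danger zone
    return int(near.sum()) >= min_pixels

# Synthetic 64x64 depth frame: background at 4 m, a 10x10 object at 1 m.
frame = np.full((64, 64), 4000, dtype=np.uint16)
frame[20:30, 20:30] = 1000
print(detect_obstacles(frame))  # True: 100 near pixels >= 50
```

A real system would run this per frame on the live depth stream and combine it with the camera image for classification; the sketch only shows why a depth image makes the range check trivial compared with a single-point range finder.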
2.10 Comparing the utility and usability of the Microsoft Kinect and
Leap Motion sensor devices in the context of their application
for gesture control of biomedical images
A study conducted by Hughes and Nextorov (2014) investigated the interaction with medical images in the operating room, where asepsis must be maintained. This requirement has led to a cumbersome arrangement in which scrubbed clinicians must direct non-scrubbed personnel to operate the mouse and keyboard [16]. The Microsoft Kinect or the Leap Motion could instead give clinicians direct control of medical image navigation and manipulation.
Figure 7- Leap Motion Sensor. Hughes et al. (2015)
The authors noted that several studies had already examined the use of the Leap Motion and the Microsoft Kinect in the operating room, but that no study had compared the two sensors directly. Their study therefore compared the usability, utility, accuracy and acceptance of the two motion sensors. Forty-two people participated; 30 % were diagnostic radiologists and 70 % were surgeons or interventional radiologists. All participants had good computer skills but limited gaming experience. In the utility analysis, 50 % of participants rated the Microsoft Kinect v2 as very useful for their routine practice, compared with 38 % for the Leap Motion. Among surgeons and interventional radiologists, 54 % rated the Kinect as useful [13]. Younger participants found the Leap Motion interface more useful than older participants did, and for 37.5 % of participants the perception of the Leap Motion deteriorated after they had used the Kinect. System acceptability was higher for the Kinect than for the Leap Motion. With respect to utility and usability the Kinect was rated better, whereas the Leap Motion was found to be more accurate. The Kinect was more acceptable to users, although it was physically more tiring. More than half of the surgeons and interventional radiologists rated the Kinect v2 as very useful, and in this study vascular and orthopedic surgeons found the sensors the most useful. The measurement accuracy was not of a good standard, which can be attributed to several factors, including the system's field of view.
With the Leap Motion, the user had to place the cursor at the start or end point of the anatomical structure and keep the hand stable until the selection indicator appeared [5], so more time was needed before a measurement point could be selected. The Kinect proved better in this respect, as it took less time. In some cases a participant moved the hand before selecting the end point, so the measurement command completed prematurely. A few gestures were initially available for both sensors, but these were later disabled and replaced by a discrete input, or click. Because the measurement command required four seconds at both the start and the end point, both sensors were slower than the average time. In terms of time to task completion, prior studies have shown that with adequate practice motion sensors can outperform the mouse. The fastest participant times were 6.38 s for the Leap Motion and 7.54 s for the Microsoft Kinect v2, both lower than the overall average time to indicate and measure [11].
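The dwell-based selection described above, holding the cursor still until a selection indicator appears, can be sketched as a simple timing check over cursor samples. The function, the radius and the sample data are illustrative assumptions; only the four-second hold comes from the description above:

```python
import math

def dwell_select(samples, radius=10.0, dwell_s=4.0):
    """Return the time at which a dwell selection fires, or None.

    samples: list of (t_seconds, x, y) cursor positions.
    A selection fires once the cursor has stayed within `radius`
    pixels of an anchor point for `dwell_s` seconds.
    """
    anchor, anchor_t = None, None
    for t, x, y in samples:
        if anchor is None or math.hypot(x - anchor[0], y - anchor[1]) > radius:
            anchor, anchor_t = (x, y), t      # hand moved: restart the dwell
        elif t - anchor_t >= dwell_s:
            return t                          # held still long enough
    return None

# Hand steadies at t = 1 s, so the selection fires four seconds later.
steady = [(0.0, 50, 50), (1.0, 100, 100)] + [(1.0 + i, 101, 100) for i in range(1, 6)]
print(dwell_select(steady))  # 5.0
```

The sketch also shows why premature hand movement truncates a measurement: any move beyond the radius restarts the dwell timer, exactly the failure mode the participants experienced.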
Figure 8- Motion Sensor illustration. Hughes et al. (2015)
System usability influences how useful surgeons find a system: the relationship between use and utility shows that poor usability leads to poor utility. The study found that the Leap Motion could not match the Kinect v2, although younger doctors were more comfortable with the Leap Motion than with the Kinect [9].
2.11 A Depth-Based Fall Detection System Using a Kinect Sensor
Researchers have also tested Kinect sensors in fall detection systems. Samuele and Enea (2014), for instance, proposed an automatic, privacy-preserving fall detection system based on the Microsoft Kinect. The raw depth data provided by the sensor is analyzed by an ad-hoc algorithm that classifies all the blobs in the scene. Whenever a person is identified, a tracking algorithm follows that person across frames. Using the depth frame makes it possible to extract the human body even when it is interacting with other objects, such as a wall or a tree, and an inter-frame processing algorithm efficiently solves the problem of blob fusion [14]. A fall is detected when the depth blob associated with a person is near the floor.
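The detection rule, a person blob staying near the floor, can be sketched as a simple per-frame check. The threshold, frame count and function below are assumed values for illustration; the cited system's actual algorithm is more elaborate:

```python
def detect_fall(blob_heights_m, floor_threshold_m=0.4, min_frames=5):
    """Detect a fall from per-frame heights of a tracked person blob.

    blob_heights_m: height of the blob's centroid above the floor,
    one value per depth frame. A fall is reported when the centroid
    stays below floor_threshold_m for min_frames consecutive frames,
    which filters out brief dips such as a quick bend.
    """
    run = 0
    for h in blob_heights_m:
        run = run + 1 if h < floor_threshold_m else 0
        if run >= min_frames:
            return True
    return False

# Person standing (~1.7 m), then lying near the floor (~0.2 m).
heights = [1.7] * 10 + [0.2] * 6
print(detect_fall(heights))  # True
```

The consecutive-frame requirement stands in for the temporal tracking the real system performs between frames; a single low reading is never enough to raise an alarm.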
Figure 9- Fall detection illustration. Samuele et al. (2014)
The study proposed a method of automatic fall detection using a Kinect sensor in a top-view configuration. Without relying on wearable sensors, and by exploiting privacy-preserving depth data only, the approach can detect a fall event. With the help of an ad-hoc discrimination algorithm, the system can identify and separate stationary objects from human subjects within the scene, and several human subjects can be tracked and monitored simultaneously. The authors confirmed through experiments the capability of identifying the human body during a fall event, as well as the capability of the recommended algorithm to handle blob fusion in the depth domain.

The system proposed in this research was implemented and tested on a PC running Windows 7 with an i5 processor and 4 GB of RAM. The proposed algorithm can be adapted to different depth sensors, since it needs only depth information as input. Moreover, an embedded real-time implementation was realized on a Cortex-A9 with 2 GB of RAM running Linaro 12.11. The authors foresee that future research will focus on managing several depth sensors simultaneously and on improving the performance of the algorithm, so that the system can keep tracking subjects whenever they cross the areas covered by adjacent sensors.
2.12 Experimental Studies on Human Body Communication Characteristics based upon Capacitive Coupling
Researchers at the Academy of Sciences in Shenzhen, China, studied Human Body Communication and regarded it as a transmission technology for short-range sensor network applications (Wen-cheng and Ze-dong, 2014). Few full-scale measurements have described body-channel propagation based on capacitive coupling [11]. The study focused on experiments with various body parts to investigate the characteristics of the body channel. Using the coupling technique, the channel characteristics can be measured in both the frequency and the time domain. The measurement results showed that the body channel maintained stable characteristics, while the elbow, wrist and knee affected its attenuation [19].
2.13 Body Movement Analysis and Recognition
Different studies have also proposed human-robot interaction based on innovative combinations of sensors. Yang and Hui (2014) studied non-verbal communication between robots and humans through the understanding of human body gestures. The robot can express itself through body movements such as facial expressions and movements of body parts, as well as through verbal expression. For this communication, twelve upper-body gestures are used, and interactions between objects and humans are included among them. The gestures are characterized by head, arm and hand posture information: a CyberGlove II captures the hand posture, while the Microsoft Kinect provides head and arm posture information [12]. This is an up-to-date solution for combining sensors to capture human gestures. Based on the body posture data, the authors proposed a human gesture recognition method that is both real-time and effective, and experiments were conducted to demonstrate the efficacy and effectiveness of the proposed approach.
Figure 10- Movement analysis Glove. Yang et al. (2012)
Human-computer interaction, a field that only emerged in the 1990s, has recently gained the attention of the industrial and academic communities. It draws contributions from mechanical engineering, computer science and mathematics. Unlike earlier forms of interaction, human-robot interaction must incorporate more of the social dynamics of communication: because people want to interact with robots as they do with other humans, the interaction needs to be made more believable, and robots should be able to use verbal and body language as well as facial expressions [10]. Some robots already pursue this goal; the Nao humanoid robot, for instance, can use gestures and body expressions. The main concern of the study was to establish communication between robot and human through body language, and one of its main purposes was to bring non-verbal language into social human-robot interaction. Twelve upper-body gestures, all intuitive and natural, are involved in the recommended system. They are characterized by arm, head and posture information, and human-object interactions are among them.
Figure 11- Humanoid robotics illustration. Clingal et al. (2014)
A human body gesture dataset was constructed to evaluate the recommended recognition method. It was built from 25 subjects of different body sizes, cultural backgrounds and genders, and the experiments demonstrated the efficiency and effectiveness of the recommended system. The major aspects of the study are:
The Kinect and the CyberGlove II are integrated to capture arm, head and hand posture; a combined human gesture-capture sensor is recommended.
A real-time and effective method for recognizing upper-body gestures is recommended.
A gesture understanding and human-robot interaction (GUHRI) system is built to help humans interact with robots.
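The idea of characterizing gestures by arm, head and hand posture can be illustrated with a minimal feature-extraction step: expressing each tracked joint relative to the torso yields a pose descriptor that does not depend on where the user stands. The joint names and coordinates below are invented for illustration and are not taken from the GUHRI system:

```python
# A toy skeleton: joint name -> (x, y, z) position in metres, as a
# Kinect skeletal stream might report them (values are made up).
skeleton = {
    "torso": (0.0, 1.0, 2.0),
    "head": (0.0, 1.6, 2.0),
    "left_hand": (-0.4, 1.2, 1.8),
    "right_hand": (0.5, 1.3, 1.7),
}

def gesture_features(joints, reference="torso"):
    """Build a simple pose feature vector: each joint expressed
    relative to the torso, making the features invariant to the
    user's position in the room."""
    rx, ry, rz = joints[reference]
    feats = []
    for name in sorted(joints):          # fixed joint order
        if name == reference:
            continue
        x, y, z = joints[name]
        feats.extend((x - rx, y - ry, z - rz))
    return feats

features = gesture_features(skeleton)
print(len(features))  # 9: three joints, three coordinates each
```

A gesture classifier would then be trained on such vectors, one per frame; the actual GUHRI system additionally fuses CyberGlove hand-posture data, which this sketch omits.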
A scenario was established in which a user and a robot interact in a classroom, as a case study of the GUHRI system. The user acts as a student and the robot as a lecturer. The robot understands the twelve upper-body gestures and, like a human, can react by combining facial expression, verbal language and body movement. The robot's behavior in class is triggered by the user's body language [7], and all its actions are consistent with the established scenario. The GUHRI system can also handle unexpected situations: if a user suddenly answers a phone call, it reacts appropriately. Regarding the understanding of upper-body gestures, dynamic gestures are important body-language components in daily life; they provide cues that enhance communication. To make human-robot interaction natural, the robot should understand both static and dynamic gestures through movement analysis and gesture recognition. Combined 3-D information about the human body can be obtained in real time with Microsoft's Kinect SDK, and motion information can be derived from the change in body-joint positions along the temporal axis. Activity recognition has already been performed with this joint-motion information, but there remains a possibility of missing hand gestures. Future work is likely to address the recognition of upper-body gestures and body motion together with the information carried by hand gestures. Another direction is recognition from an egocentric point of view. In the recommended GUHRI system, the Kinect serves as the vision sensor. The system is not perfect and has several limitations, such as the inability to change viewpoint because the Kinect is in a fixed position; as a result, the robot cannot always obtain the best viewpoint of the human body's gestures. One way to solve this problem is to obtain gesture information from the robot's egocentric perspective. This allows the viewpoint to change, but it introduces new problems, since it becomes difficult for the robot to distinguish camera motion from real body motion [11]. In the future, further work could also integrate verbal cues into the GUHRI system to enrich human-robot interaction; a robot that is more autonomous in seeing and hearing will be more human-like.
Overall, this paper recommended the GUHRI system, offering an innovative understanding of gestures for human-robot interaction. The robot can comprehend twelve upper-body gestures, and it can express itself through facial expressions, body movements and verbal expression. A sensor combination of the Microsoft Kinect and the CyberGlove is recommended to capture head, arm and hand posture simultaneously [3], yielding an effective, real-time gesture recognition mechanism. A human body gesture dataset was built for the experiments, and the results demonstrated the efficiency of the gesture recognition. So far, the gestures involved are static ones, such as asking a question, appreciating, calling or drinking. The study recommends, as future work, understanding dynamic gestures such as saying no, clapping or waving, and adding speech recognition to make the interaction more realistic.
2.14 An Integrated Platform for Live 3-D Human Reconstruction
and Motion Capturing
There are also experiments and studies showing how Kinect technology can be used for live 3-D human reconstruction and motion capture. In their research, Imitrios and Alexadis (2011) investigated developments in 3-D capturing and processing and provided ways to open pathways for 3-D applications. Their study addresses the tasks of real-time capturing and motion tracking by explaining the main features of an integrated platform targeting future 3-D applications, and it also discusses an innovative sensor-calibration method. Based on a variant of a volumetric Fourier-transform-based method, an innovative method for reconstruction from RGB-D data is recommended in the paper. The paper also proposes a qualitative evaluation of 3-D reconstruction mechanisms, since existing evaluation methods were found largely irrelevant. Overall, an accurate mechanism for real-time human body tracking is recommended, based on a generic, multiple-depth-camera approach. The experiments conducted in the study supported its conclusions.
In this study, applications including multi-Kinect-v2 capture and reconstruction of moving humans, fast reconstruction of humans, and skeleton-based motion tracking with depth cameras are described, and the main elements of the integrated system are presented in detail. Based on these elements, innovative approaches are recommended and existing approaches are discussed. An innovative mechanism for evaluating 3-D reconstruction systems is also recommended, and some limitations of ongoing research are examined. One of the main limitations is the imperfect synchronization of the RGB-D sensors, which may lower the reconstruction quality. In the skeleton-tracking mechanism, shortcomings caused by topology changes are to be overcome by fitting a skeleton scheme [2]; moreover, the limitations can be reduced by splitting the body into upper and lower parts and fusing the data with inertial measurements.
Figure 12- RGB-D illustration. Immitrios et al. (2014)
2.15 Automated Training and Maintenance through Kinect
The availability of the Kinect at a low cost, together with its high-quality sensors, has enabled researchers such as Saket and Jagannath (2011) to study how to reduce the burden on mechanics involved in automobile maintenance in centralized workshops [1]. A system prototype that works with the Kinect is recommended. The system has two modes of operation, speech and gesture. In speech mode it is controlled by various audio commands; in gesture mode, gesture recognition is performed by the Kinect, which, together with its RGB-depth camera, processes skeletal data by keeping track of the body joints. Gestures are recognized by checking the user's movements against predefined patterns. Real-time image data streams are captured by a high-density camera, and a 3-D model is generated and superimposed on the data being received in real time.
The Kinect plays an important role in the recommended system, working as the tracking instrument for the developed augmented reality system [6]. The system utilizes several important Kinect features: speech recognition, joint estimation and skeletal tracking. Skeletal tracking is one of the most important, because it locates the user's position, which is used to guide the user through the assembly procedure, and it is also used for gesture recognition. An assembly brings individual parts together and joins them into a single product; assemblies can further be divided into full and partial assemblies. The basic mode, also called full assembly mode, teaches technicians the procedure for assembling a particular product. In partial assembly mode the role of the Kinect becomes more important, as the technician is guided in detail through the assembly of the parts; when the assembly of one part is completed, the assembly of the next part can begin [12]. The system can work in two different modes, gesture and speech, and the user can select the mode according to his or her acquaintance and experience with the system. If speech mode is selected, the user commands the system by speaking; in gesture mode, the user interacts through gestures while the system guides with voice commands. For example, the word START commands the system to begin.
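The two-mode control flow described above can be sketched as a small state machine. The class, the NEXT command and the step counter below are illustrative assumptions; only the START command and the speech/gesture modes are mentioned in the cited work:

```python
class MaintenanceAssistant:
    """Minimal sketch of a two-mode assembly assistant.

    In speech mode the command string stands for a recognized
    spoken word; in gesture mode it would be the label produced
    by the Kinect's gesture recognition.
    """
    def __init__(self):
        self.mode = None       # "speech" or "gesture"
        self.running = False
        self.step = 0          # current assembly step

    def select_mode(self, mode):
        if mode not in ("speech", "gesture"):
            raise ValueError("unknown mode: " + mode)
        self.mode = mode

    def handle(self, command):
        if command == "START":
            self.running = True
        elif command == "NEXT" and self.running:
            self.step += 1     # one part assembled, move to the next
        return self.step

assistant = MaintenanceAssistant()
assistant.select_mode("speech")
assistant.handle("START")
print(assistant.handle("NEXT"))  # 1
```

Keeping the command handling independent of the input modality is what lets the same procedure be driven by either speech or gestures, as the prototype intends.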
The research discusses in detail the use of the Kinect sensor for tracking and detection. The Kinect is used not only as a tracking device but also as an input device. The study is a step towards automating the repair and maintenance of vehicles. The recommended system will help reduce the workload that routine activities place on skilled experts, since it can be used for small jobs instead. Documentation also becomes simpler: the system checks each step, so step-wise verification is easier, and the supervisor no longer needs to walk around [2]. The system recommended in the study is likely to create many opportunities for engineering companies that use Augmented Reality to simplify their complex tasks. Overall, it can contribute a great deal to improving repair and maintenance processes.
2.16 Kinect in the Kitchen: Testing Depth Camera Interactions in Practical Home Environments
Galen (2013), from the University of California, Berkeley, carried out a study observing that depth cameras are now present in millions of homes thanks to developments around the Microsoft Kinect.
Figure 13- Smart Home System illustration. Berkeley University Journal (2013)
This study took the Kinect into real kitchens. Although touchless gestural controls can be difficult for some, they enable commands to be merged into the movements of cooking. The smart kitchen lets users alter the control scheme and operate it with other limbs when their hands are full. The recommended system was tested with five different people, who cooked in their own kitchens; they found that placing the Kinect was simple, which contributed to its success. An important challenge was accidental commands in the kitchen [12].
The experiment showed that users found the system easy and pleasing, with low levels of frustration. The system also let them load music and recipes, which was helpful because the interaction style was generic. All subjects said that although cooking with it was difficult and messy, they were quite happy with the experience. The researchers' own observations were less favorable: accidental use of the navigation controls caused considerable trouble. Besides accidental button presses, sweeping the hand while changing direction also caused problems, and some errors occurred when subjects pushed buttons while focusing elsewhere. Another problem was that subjects often pushed the wrong buttons, mostly because they pressed too quickly; the authors attributed this to the Kinect SDK's smoothing. The subjects liked the lock buttons on the screens but rarely used them [17]; during the experiment, a few subjects did not realize that the lock was not automatic but the result of an accidental button push. For future use it is recommended that locking be made automatic, especially when the subject turns sideways (so that the joint axis collapses inward) towards the side counters or the counters behind, and that unlocking become a two-step rather than a one-step process. The Kinect proved extremely useful during the experiment, and the ease of positioning it surprised the users. The camera was placed so that the subject generally remained in the frame. One important aspect was the distance the sensor requires; to satisfy it, the cart was generally placed out of the kitchen and out of the way.
2.17 Kinect Gaming and Physiotherapy
Research conducted by Sachin and Singh (2014) from the University of Pune recommended a system that joins two applications of the Kinect: gaming and physiotherapy. The recommended system performs its tasks using critical features such as depth recognition, skeletal tracking and gesture recognition. According to the study, the Kinect camera is the key instrument through which all the operations are implemented [2]. The subject's body movement is tracked by skeletal tracking, identifying key points on the human skeleton. Depth recognition, another important feature of the system, segments the foreground and background of the image and can also separate a person from the background based on pixel color. The Kinect is required for these operations, mainly because it can produce RGB and depth streams at a lower cost than the sensors in common use. With its time-of-flight camera, the Kinect can measure the distance of any given point from the sensor; an open Kinect driver framework capable of generating depth images is implemented for this purpose. Normally the Kinect is used together with a console device [12], but since consoles are quite costly, this study attempts to do away with the console and instead tackles human skeletal tracking using the Kinect alone. The aim is to make the most of the hardware: by eliminating the console, the procedures are carried out by combining the Kinect with developed and refined system programs that perform the particular set of operations [15]. The study panel recommended the final project implementation, which can be used for further development of applications.
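The depth-based separation of a person from the background can be illustrated by keeping only the pixels inside a plausible depth band. The band limits and function below are assumed values for illustration; the cited system's actual segmentation also exploits pixel color, which this sketch omits:

```python
import numpy as np

def segment_person(depth_mm, near_mm=500, far_mm=2500):
    """Separate a subject from the background using depth alone.

    Keeps pixels whose depth lies inside a band plausible for a
    person standing in front of the sensor; everything nearer or
    farther is treated as background. Returns a boolean mask.
    """
    return (depth_mm > near_mm) & (depth_mm < far_mm)

# Synthetic frame: wall at 3 m, a person-sized region at 1.5 m.
frame = np.full((48, 64), 3000, dtype=np.uint16)
frame[8:40, 20:44] = 1500
mask = segment_person(frame)
print(int(mask.sum()))  # 32 * 24 = 768 foreground pixels
```

The resulting mask can then be applied to the RGB stream to isolate the subject, which is the step a color-based refinement would build on.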
3 Research Methodology
3.1 Introduction
This section lays out the procedures and methods employed in this research, which primarily uses documentary analysis. The section outlines results and facts from previous research, covering methods such as sampling, research design and data analysis. Additionally, concerns have been raised about the applicability of the different Kinect innovations and discoveries (Bevilacqua, 2014), and this research addresses those concerns. An experimental analysis of the effectiveness of the Kinect in assisted living environments is crucial, as it helps Ambient Assisted Living (AAL) organizations benchmark against best standards and practices. In his research, Konstantinidis (2015) expressed the need for AAL organizations to adapt to external environments and patient needs as a strategy for improving both the technical and the practical application of the Kinect. This is particularly important as most smart home environments are shifting towards a service culture and staff-reduction strategy with a more demanding clientele. This research analyzes results from clinical experiments with Kinect devices such as camera tracking; Anastatiou (2011), for example, analyzed the efficacy of the Kinect camera in tracking hand, elbow and trunk movements.
In addition, a glimpse of the available literature shows that Kinect devices have been extensively researched and documented. Experimental research has been done on improvements in 3-D mapping technology and on body tracking. In this context, this research analyzes consequential advances in related technologies, such as GPU systems and sensors, that enable technological improvements and new Kinect applications. Technologies such as motion capture (mo-cap), Kinect v1 and Kinect v2 have been used to perform experiments in assisted living environments; the tests for these systems involve sitting, walking and standing.
Figure 14- Pose Experiments, Kinect tests. (2013)
3.2 Model of the research
This research employs a documentary analysis strategy and primarily uses experimental and clinical studies. Experimental results are used to determine the impact of the Kinect and its different applications in assisted living environments. The main advantage of documentary analysis is that it is cost-effective and relies on scientifically approved approaches (Clembers, 2001). Documentary analysis also tends to work with an unlimited scope, making the research simple and logistically easier than other research methods. Results from clinical tests and applications were also used to answer the research objectives.

The Statistical Package for the Social Sciences (SPSS) was used to analyze all the collected data, after which descriptive metrics such as means, percentages and frequencies were used for further analysis. Data interpretation was conducted with respect to the frame of reference of the research problem and objectives.
According to researchers such as Robinson (2003), the validity and reliability of the data collection methods directly determine the accuracy of the collected data. Reliability ensures that the instruments used yield consistent results. To ensure the objectivity and accuracy of the research, a separate department was tasked with auditing and inspecting the documents used. Cronbach's alpha was used to check the consistency of the obtained results. The alpha, which ranges from 0 to 1, measures reliability on an increasing scale; according to Dristern (1990), the minimum acceptable reliability for a research study is 0.6.

The research team also corrected inconsistencies and errors and modified the formulas used to increase accuracy.
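Cronbach's alpha can be computed directly from the item scores. The sketch below uses invented example data; the 0.6 minimum follows the threshold cited above:

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha for internal-consistency reliability.

    items: a list of columns, one per questionnaire item, each
    holding the scores of all respondents.
    alpha = k/(k-1) * (1 - sum(item variances) / total variance)
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # per-respondent totals
    item_var = sum(statistics.variance(col) for col in items)
    total_var = statistics.variance(totals)
    return k / (k - 1) * (1 - item_var / total_var)

# Three items, four respondents; scores are fairly consistent.
scores = [
    [4, 3, 5, 2],
    [4, 3, 4, 2],
    [5, 3, 5, 1],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 2), alpha >= 0.6)  # 0.95 True
```

A value this close to 1 indicates that the items measure the same underlying construct; values below 0.6 would, per the criterion above, call the instrument's reliability into question.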
3.3 Research Design
The research design employed in this study outlines the blueprint and plan for answering the research questions and fulfilling the research objectives. According to Blumberg (2005), a research design shows the plan that guides researchers in answering the research questions.

Although researchers concur that performing research through documentary analysis can be technically demanding, they agree that it is an important approach that can give researchers deeper insights, especially when combined with other methodologies (Flinter, 2009).
Figure 15- Research design
3.4 Primary Data
In the collection of data, more emphasis was placed on data that could be analyzed.

Quantitative: numerical data collected from questionnaires, interviews and surveys. Quantitative data are easy to analyze and can be used to show patterns and trends; graphs, pie charts and tables can illustrate them further and support inferences. An email survey was used because of its easy administration and its potential to reach a large number of respondents.

Qualitative: non-numerical data collected through methods such as one-on-one interviews and observations. Qualitative data can help correct any bias that may result from quantitative data collection methods; questions are asked directly of the interviewee or respondent.
3.5 Summary
Results from various journals, books and other literature are used to form an opinion on the use of the Kinect and its application in smart living environments. Importantly, this research seeks to outline future trends in Kinect applications and use in AAL environments. Although researchers in this field, such as Webster (2014), believe that the application of the Kinect to AAL is still in its infancy, this research delves into the future of such applications and their relationship with other technologies, such as the Internet of Things (IoT) and the Olympus camera.
4 Data analysis and presentation
4.1 Introduction
For comprehensive analysis, the following sections of the paper are organized in a documentary-analysis manner. Here, documentary analysis is used as a tool to gather evidence centering on the use of Microsoft Kinect applications, their weaknesses and their use in assisted living environments. This section analyzes laboratory results from conducted experiments, surveys and studies on the Kinect and its components.

The most important reason documentary analysis was used for this research is its efficiency: documented research papers and journals are easily accessible and their documented results verifiable. In this section, different research papers are analyzed to form an opinion on the future and the applications of the Kinect in assisted living environments. General research data were used to design the final data analysis technique. This section analyzes existing protocols used in assisted living environments and proposes new protocols and areas of research. A key approach for this research is to build on the work of previous researchers such as Yang et al. (2015) and Gradinaru (2016), both of whom proposed new technologies for 3-D representation using sensors. In his research, Gradinaru (2016) designed new systems and software for capturing and displaying animated information.

Other related technologies involved in the development of Kinect applications, such as 3-D sensing tools for video and still cameras, were also analyzed.
Figure 16- Gradinaru (2016) graphical representation of system
Some of the key areas targeted for analysis include:
Smart Home environments
Movement Detection Models
Internet of things and its impact on Kinect
Skeletal Tracking systems
4.2 Smart Home environments
Smart home systems play a critical role in the creation and continuity of Kinect op-
erations in assisted living environments. According to Kawatsu (2014), a smart home
environment is one that creates interconnections between a physical environment and
the devices within it. In a smart home environment, people expect that the technolo-
gies can be used to improve their everyday life. Applications of smart home systems
can be in communication, safety, welfare and appliances. The devices used in home
system environments consist of communication modules, cameras, sensors and ac-
tuators. Overall, a server is used to manage all the operations of the smart home
environment.
In their research, Baeg et al (2007) constructed from scratch a smart home environ-
ment in the research building of KITECH (Korea Institute of Industrial Technology).
This research aimed to demonstrate the efficacy and practicability of a robot-assisted
home environment. The research featured custom-made sensors, actuators, a robot
and a database.
The researchers made use of RFID (radio-frequency identification) technology to
identify, track and follow objects within the home system. RFID uses radio frequen-
cy to track objects, and RFID tags were used to identify objects in the environment.
Basically, objects with a tag were considered smart appliances. Apart from the smart
environment, the conceptual framework consisted of servers and a robot. Smart ob-
jects were assigned sensor capabilities, which meant they could communicate with
both the server and the robots.
The figure below shows the conceptual environment:
Figure 17- Conceptual Framework of a smart home environment
The smart environment was divided into layers. The first layer consisted of the real
home environment, with its scattered arrangement of objects and appliances. The
second layer consisted of actuators and wireless sensors, including additional sen-
sors such as temperature sensors, RFID readers, smart lights, and humidity and secu-
rity sensors. Level three contained devices such as tables, chairs and shelves, all fit-
ted with RFID sensors for ease of identification. In the fourth level there was a com-
munication protocol which ensured reliable and accurate communication between
the home server and other devices in the vicinity. The server which managed the
relationship between the devices and the sensors was in level five.
Figure 18- Smart home environment layered description
In this experiment, the main use of the robot was to provide several key functions:
mapping, localization, object recognition, and interaction. To that end, the robot was
equipped with ultraviolet sensors, cameras, ultrasound, sufficient processing speed
and adequate memory.
For this experiment, specific home services were selected that replicated real home
services. The objective of the smart home environment was to give users close-to-
real-life services. Some of the functions to be performed in the smart environment
included object cleaning, running home errands and executing home security func-
tions.
Object cleaning: in this scenario, the service robot is tasked with tidying up the room
or environment. The robot does this by arranging objects in a required or preset way.
RFID tags installed in the ceiling of the home direct the robot's navigation and indi-
cate which objects to handle. The purpose of this part of the experiment was to in-
vestigate the potential use of robots in tasks such as laundry, home arrangement and
doing dishes.
Performing errands: in this case, robots are tasked with identifying and fetching spe-
cific objects or smart items around the smart home. Fetched objects carry RFID tags,
which means they are easily identifiable within the network. The fetch function
works after receiving a command from a person: the robot requests the position of
the object to be fetched and, after receiving that information, moves to the object,
grabs it and brings it back.
In this research, the researchers used two key modules: RFID interfaces and commu-
nication modules. The protocol used to operate the communication module was
ZigBee, an open standard protocol based on IEEE 802.15.4b. The ZigBee protocol
provides low-power wireless interconnection for different applications, and it was
used for all the devices. For the RFID modules, EPCglobal Gen2 was used;
EPCglobal Gen2 defines a standard for the operation and application of RFID mod-
ules.
The team used the physical layout below for the research:
Figure 19- Smart home environment layout
This paper outlines innovative ways which can help improve assisted living envi-
ronments. The architecture employed and the use of RFID systems show that smart
home systems can be created from available materials and technology. Scenarios
performed by robots, such as cleaning and arranging, can be employed in assisted
living environments. According to the researchers, the goal was to create an envi-
ronment where people are served by robots, with the robots keeping the environment
in its required state. The robots employed in this research can be used to help indi-
viduals in assisted living environments perform basic functions like cleaning, wash-
ing or house arranging.
With such developments in robotics and the creation of smart homes, Kinect v2 can
be employed both for navigation and for dense map creation. The Kinect v2, as op-
posed to v1, is built on the time-of-flight principle, which means that it can even be
used outside homes. The RFID sensors employed in this research can be particularly
useful for mobile robot movement.
For robotic applications, the Kinect v2 sensor has been used by researchers to pro-
vide much better results, primarily because of the ToF technology employed. By us-
ing ToF, accurate measurements of objects can be obtained. Also, due to the high-
resolution cameras, a lot of information is captured. The result is that home envi-
ronments are accurately mapped with fine detail and minimal errors. With Kinect
v2's active illumination, surrounding images are captured even in dark environ-
ments.
Research conducted by Hondori et al (2013) gave important insights into the applica-
tion of Microsoft Kinect v2 in a smart home setting. The research focused on ges-
tures and made use of sensor fusion between Kinect and inertial sensors. The goal of
the research was to assess the significance of smart home systems in helping post-
stroke patients complete day-to-day activities. To achieve this, Microsoft Kinect was
used to monitor quantities such as spoon acceleration, wrist position, elbow position,
shoulder joints and angular positions. The purpose was to distinguish between
healthy and paralyzed individuals, which is a complex problem in assisted living
environments. Microsoft Kinect and inertial sensors were successfully tested in these
environments. The use of smart home systems to assist stroke patients was driven by
the high cost associated with visiting rehab facilities. The convenience of having
smart home systems would allow doctors and therapists to remotely assist clients,
and would help therapists monitor patients and analyze improvements and progress.
As opposed to the smart home systems developed by Zheng et al (2013), the systems
developed and tested by Hondori et al (2013) did not rely on numerical integration of
inertial measurement unit (IMU) data. This research made use of inertial and Kinect
sensors simultaneously. The main activities used to record movements were intake
gestures: critical body functions like eating and drinking were selected. The setup
combined a Microsoft Kinect sensor with inertial sensors through sensor fusion. In-
ertial sensors were placed on items such as utensils, recording the movements of
both the subject and the items they were using. A Kinect sensor was also placed on
the table to monitor individual movements while eating and drinking.
Figure 20- Hondori et al (2013) system set up including inertia sensors and
Kinect sensors
Individuals were asked to perform different tasks in order to record the experimental
data.
Eating and drinking task: activities such as eating, cutting steak and drinking water
were performed and repeated several times. These movements were then analysed as
3-D trajectories, as seen below.
Figure 21- Hondori et al (2013) 3-D trajectories
The body movements are measured in degrees:
Right elbow: changes in the range of 50°-110°
Left elbow: changes in the range of 65°-115°
Kinect sensor data analysis: the figure above shows body movements and changes.
The movements of the wrist and joints illustrate the individual's limb movements
while the head remains still.
Figure 22- Hondori et al (2013) experimental data on body movements
Figure 23- Hondori et al (2013) limb changes in task like drinking and eating
Figure 24- Hondori et al (2013) inertia sensor data from individual’s items
Data measured from the inertia sensors is illustrated by figure 23. The bias on the
signal is approximately 9.81 m/s² due to gravity; this is adjusted for and factored into
each of the three measurement axes. It was found that during cutting of the steak the
recorded frequency was highest while the magnitude remained steady, and the fre-
quency during drinking was constant.
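The gravity adjustment described above can be sketched as follows. This is a minimal illustrative example, not the authors' implementation; it assumes the constant 9.81 m/s² bias can be estimated from an initial at-rest window and subtracted per axis, and the function name is hypothetical.

```python
import numpy as np

def remove_gravity_bias(accel, rest_samples=100):
    """Subtract the constant gravity component from a 3-axis
    accelerometer stream (shape: n x 3). The bias is estimated as
    the mean reading over an initial at-rest window, which at rest
    is approximately the gravity vector (magnitude ~9.81 m/s^2)."""
    accel = np.asarray(accel, dtype=float)
    bias = accel[:rest_samples].mean(axis=0)  # at-rest mean ~= gravity vector
    return accel - bias
```

In practice the at-rest window must be chosen so the sensor is genuinely stationary; any motion in that window leaks into the bias estimate.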
This research showed that smart home environments could lessen the burden in-
curred by post-stroke patients. The systems could also give vital data to physicians
for proper monitoring and study of patients. Microsoft Kinect and inertial sensors are
vital for the system. The researchers demonstrated that it is possible to capture
movements and positions such as angular displacement and limb gestures. While
other researchers have performed similar research using on-body sensing techniques,
this research relied solely on Kinect and inertial sensors.
A different research study conducted by Mohamed et al (2013) assessed how smart
home systems making use of Microsoft Kinect can be used to assist individuals with
disabilities. The proposed systems aimed at monitoring elderly individuals: they rec-
ognized gestures and body actions and gave feedback through a network. The key
goal of the experiment was to monitor elderly individuals in their natural environ-
ment. To this end, two projects were initiated, DOMUS and GUARDIAN ANGEL.
The objective of the GUARDIAN ANGEL project was to produce sensors that could
be integrated into any media type. Monitoring of all the various object parameters
was a key objective of the experiment. According to the researchers, Microsoft
Kinect was used because of its advantages over other sensors, which include an RGB
camera, a depth camera and an infrared transmitter.
Figure 25- Mohamed et al (2013) smart house used in the experiment
Some favorable characteristics of the Microsoft Kinect are shown in the table below:
Property  Specification
Field of view (horizontal, vertical, diagonal)  58° H, 45° V, 70° D
Depth image size  VGA (640×480)
Spatial x/y resolution  3 mm
Depth (z) resolution at 2 m from the sensor  1 cm
Maximum image throughput (frame rate)  60 fps
Color image size  UXGA (1600×1200)
Data interface / power supply  USB 2.0
Power consumption  2.25 W
Operating environment  Indoors
Table 1-Characteristics of Microsoft Kinect Components
Processing of the recorded data was done via three data streams generated by IR light
reflected from the scene; below is an illustration of the natural user interface. The
data is transmitted in three streams: image, depth and audio. The Kinect system was
relied upon to give accurate 3-D information.
[Figure content, layered stack: Application; Processed data (Natural Human Interaction Library); Data streams (image stream, depth stream, audio stream); Kinect sensor]
Figure 26- Mohamed et al (2013) Natural User Interface
Tested activities included gestures made using hand positions. The Kinect sensor
assessed position using 20 joints, and the X and Y coordinates of each joint were
calculated. Below are images of the gestures and postures the application could de-
tect. Two methods were used to recognize gestures: algorithm-based and template-
based. Because flexibility was needed, the 1 dollar and N dollar algorithms were
used; these algorithms can be implemented in different environments, even in a pro-
totyping context. In this case, an act performed by an individual is recognized and
compared to previously recorded sets of points. The 1 dollar algorithm handles sin-
gle continuous strokes ("unistroke"), while the N dollar algorithm extends this to
gestures composed of several strokes ("multistroke"). In the 1 dollar unistroke rec-
ognizer, four steps are used to normalize candidate gestures before they are com-
pared to stored templates:
Resampling
Rotation based on indicative angle
Scaling and translation
Score calculation
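The four steps above can be sketched as a minimal single-stroke recognizer in the spirit of the 1 dollar algorithm. This is an illustrative simplification (the full recognizer additionally searches over rotations, for example with golden-section search), and all function names are hypothetical:

```python
import math

def resample(pts, n=32):
    """Step 1: resample the stroke to n equally spaced points."""
    pts = [tuple(p) for p in pts]
    total = sum(math.dist(pts[i - 1], pts[i]) for i in range(1, len(pts)))
    step, acc, out = total / (n - 1), 0.0, [pts[0]]
    i = 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if d > 0 and acc + d >= step:
            t = (step - acc) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)   # continue measuring from the new point
            acc = 0.0
        else:
            acc += d
        i += 1
    while len(out) < n:        # guard against floating-point shortfall
        out.append(pts[-1])
    return out[:n]

def normalize(pts, n=32, size=250.0):
    """Steps 2-3: rotate by the indicative angle, scale to a
    reference square and translate the centroid to the origin."""
    pts = resample(pts, n)
    cx = sum(p[0] for p in pts) / n
    cy = sum(p[1] for p in pts) / n
    theta = math.atan2(pts[0][1] - cy, pts[0][0] - cx)  # indicative angle
    c, s = math.cos(-theta), math.sin(-theta)
    pts = [((p[0] - cx) * c - (p[1] - cy) * s,
            (p[0] - cx) * s + (p[1] - cy) * c) for p in pts]
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    w, h = max(xs) - min(xs), max(ys) - min(ys)
    w, h = (w if w > 1e-9 else 1.0), (h if h > 1e-9 else 1.0)
    pts = [(p[0] * size / w, p[1] * size / h) for p in pts]
    cx = sum(p[0] for p in pts) / n
    cy = sum(p[1] for p in pts) / n
    return [(p[0] - cx, p[1] - cy) for p in pts]

def score(candidate, template):
    """Step 4: average point-to-point distance; lower means a
    closer match to the stored template."""
    a, b = normalize(candidate), normalize(template)
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)
```

A gesture is classified by computing this score against each stored template and taking the minimum.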
Figure 27- Mohamed et al (2013) Waist detection posture
Figure 28- Mohamed et al (2013) Waist detection posture
The recognition scenario is performed as shown in figure 29 below.
Figure 29- Mohamed et al (2013) Kinect procedure for gesture recognition
A toolbox with the Kinect SDK was used for the experiment. The toolbox utilizes
both golden-section search and the 1 dollar method. Together, the two methods fa-
cilitate recognition of gestures such as a circle, as illustrated below.
Figure 30- Mohamed et al (2013)
Kinect toolbox recognition of circle gestures
The researchers concluded that ultimately many Kinect sensors may be needed to
properly monitor a complete home environment such as a large hospital building.
The researchers made use of a WiFi network in mesh topology. When a gesture was
detected, an alert was sent via text or a simple notification; continuous transmission
of all frames was therefore not required. The system worked such that in case of an
emergency a text alert was transmitted.
[Figure content, recognition flow: waiting for push/pull gesture; waiting for skeleton movement; recognizer algorithm; unistroke gesture detection; network communication; device action]
The figure below shows the communication process.
Figure 31- Mohamed et al (2013) Communication process
In general, the program can detect gestures and communicate them via text transmis-
sion. Unlike other smart home systems, in this research the sensors were non-
intrusive to the users. The researchers successfully found an appropriate algorithm
for gesture commands. For future experiments, the researchers aimed to use an
EIB/KNX Ethernet gateway to accommodate many actuators.
4.3 Movement detection models
There exist several researches that have devolved into movement detection models
and its application to Kinect environments. Some of the research like Chin et al
(2015), focuses on optimum desistance for Kinect model detection. The researchers
focused on accuracy and reliability of Kinect cameras and sensors. Apart from giving
insights on the quality of the pictures. Calculations were conducted to analyze abso-
lute error percentages at varying distance.
The researchers studied the Kinect camera as a research based camera. There exists
little research that illustrate the reliability and accuracy of the Kinect camera as a
research camera.
The Kinect camera hardware components are as shown below.
Figure 32- Chin et al (2013) Three Kinect sensors, IR light, RGB camera, IR
detector
According to the product specifications, the Kinect sensor has a dual depth range:
default and near range. In both ranges the depth sensor returns 3-D images with x, y
and z coordinates. In default range there is a blind spot at approximately 0-0.8 m,
where the camera cannot return accurate depth data, and no data can be generated
beyond 4 m. In near range the blind spot is at 0-0.4 m, and the camera cannot gener-
ate raw depth data beyond 3 m.
The distance analysis is as seen below.
Figure 33- Chin et al (2013) Depth sensor distance
The C programming language is used to program against the SDK of the Kinect sen-
sor. The developer kit gives access to the source code and other technical resources
such as Kinect Studio; these tools enable easier development of applications.
The sensor calculates the distance along a straight line between the sensor and the
object, obtained as a perpendicular from the sensor plane. When an image frame is
captured, the Kinect sensor returns the maximum and minimum depth ranges in mm.
The diagram below shows the 16-bit raw depth frame returned by the Kinect sensor;
Figure 34- Chin et al (2013) Depth frame bit pixel
Technically, each bit has a specific function: the first three bits are used as player
identifiers while the following 13 give the distance in mm. The following program-
ming operations are used to extract these fields.
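Given that bit layout, extracting the two fields is a pair of shift-and-mask operations. The following is a sketch with hypothetical helper names, assuming the layout described above (low 3 bits player index, upper 13 bits depth in mm):

```python
def unpack_depth_pixel(raw16):
    """Split a 16-bit depth pixel into its player index (low 3 bits)
    and its depth distance in millimetres (upper 13 bits)."""
    player = raw16 & 0b111     # bits 0-2: player identifier (0 = no player)
    depth_mm = raw16 >> 3      # bits 3-15: distance from sensor in mm
    return player, depth_mm

def pack_depth_pixel(player, depth_mm):
    """Inverse operation, useful for testing the unpacking logic."""
    return (depth_mm << 3) | (player & 0b111)
```
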
Figure 35- Chin et al (2013) Algorithm depth distance
To calculate depth distance, five tests were done for each of the two ranges, default
and near. The distance ranged from 200 mm to 4000 mm in increments of 100 mm.
The objective of the five tests was to approximate the average distance: equation (1)
gives the average distance for the experiment. Additionally, the AMPE (absolute
mean percentage error) was calculated to establish the accuracy of the estimates, and
the standard deviation was calculated to ascertain the precision of the provided depth
data. To analyze consistency, the Kuder-Richardson formula was used.
Summary of the equations used for the experiment is as shown below.
Average: x̄ (mm) = (Σᵢ xᵢ) / n (1)
AMPE (%) = |(x̄ − X) / X| × 100 (2)
Standard deviation: s = √( Σᵢ (xᵢ − x̄)² / (n − 1) ) (3)
where
i is the index of each test, i = 1, 2, 3, 4, 5
xᵢ is the depth reading in test i and x̄ is the average at each distance
X is the actual distance
Σ indicates a sum
n is the total number of tests taken, n = 5
rKR20 = (k / (k − 1)) · (1 − Σ p·q / σ²) (4)
where
rKR20 is the Kuder-Richardson formula 20
k is the total number of test items
p is the proportion of tests that pass (within ±5 mm)
q = 1 − p is the proportion of tests that fail
σ² is the variance of the entire test
Equation 1- Chin et al (2013) Experiment equations
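Equations (1)-(3) can be computed directly from the five readings at each distance. The sketch below uses Python's statistics module and hypothetical names; the KR-20 consistency check is omitted because it needs per-item pass/fail data:

```python
import statistics

def depth_stats(measurements, actual_mm):
    """Summary statistics for repeated depth readings at one actual
    distance: the mean (eq. 1), the absolute mean percentage error
    against the true distance (eq. 2) and the sample standard
    deviation (eq. 3, n-1 denominator)."""
    mean = statistics.fmean(measurements)               # eq. (1)
    ampe = abs((mean - actual_mm) / actual_mm) * 100.0  # eq. (2)
    sd = statistics.stdev(measurements)                 # eq. (3)
    return mean, ampe, sd
```

For example, five readings of a target at 1000 mm would be summarized as `depth_stats([995, 1000, 1005, 1000, 1000], 1000)`.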
To conduct the experiment, the researchers placed a cardboard box in the field of
view of the sensor, within the application range of 200 mm to 4000 mm. The exper-
iment was designed to focus primarily on the centre of view. Cardboard bodies were
used instead of human bodies in order to channel the focus to depth distance as op-
posed to detection of a human frame; human bodies may introduce errors due to
their curved outline surfaces.
The experiment used the Microsoft SDK as the software framework for the pro-
gramming part. The SDK, available for Windows, allows for depth-based skeleton
tracking which creates animated avatar images. The SDK package indicates that
depth values of up to 4000 mm are supported, with an upper limit depth of 800 mm
and a lower limit of 500 mm. Below is an experimental comparison between meas-
ured depth distance and actual distance.
Equation 2- Chin et al (2013) Default range vs near range
Figure 36- Chin et al (2013) Average depth distance vs Actual distance
With the default range, the Kinect sensor was able to show images of objects as far
as 4000 mm in front of the camera and as close as 500 mm. At these distances, the
sensor was capable of assuring accuracy, reliability and precision. Further, less than
1% error (AMPE) was recorded between 600 mm and 2900 mm. The graph above
shows a similar quadratic shape that plots near the actual range; the depth error is at
1.5%.
The experiment concluded that the two ranges produce different depth image quali-
ty. The default range returns all 20 human joints, while the near range returns 10
joints. In near range, the sensor tends to focus on the user's head, hands and torso,
because at near range the sensor has a limited view due to the close distance. The
default range can be used for many applications such as facial recognition, human
pose estimation and robotics.
Figure 37- Chin et al (2013) Accuracy analysis AMPE vs Distance
Figure 38- Chin et al (2013) Precision analysis std vs Distance
In general, the researchers concluded that the Kinect sensor provides object infor-
mation with a high level of precision and accuracy, and that the Kinect can be relied
upon to provide accurate distances. Additionally, the following conclusions were
made by the researchers:
The Kinect sensor has low errors in measurement of depth distance; the error
only becomes more pronounced below 600 mm and above 2900 mm.
The random error of the sensor increases quadratically with distance, up to a
maximum of 40 mm.
The Kinect sensor shows consistency across different distance ranges.
The researchers recommend about 600 mm to 3000 mm for biomedical applications.
Other researchers, such as Alexiadis et al (2017), have proposed alternative methods
for motion and 3-D body detection using RGB-D streams. The method is based on a
volumetric Fourier transform approach. The researchers also proposed a qualitative
evaluation framework for real-time 3-D reconstruction systems. In their paper, they
propose elements and methods for capturing and reconstructing human 3-D appear-
ance.
For the system setup, the devices were placed on a circle of radius 2-4 m, pointing
towards the location of the object being captured. Because of the limitation present-
ed by Kinect v2 (one sensor per computer), a networked architecture was set up. For
image storage, RGB JPEG and LZ4 compression were used. The models allowed for
online construction of 3-D images and higher quality results.
The mapping calibration was approximated using a fixed KRT matrix, with the ap-
proximations based on a dense 3-D rigid registration. The Kinect SDK package was
used for the programming part. The figure below shows the calibration setup.
The researchers achieved external calibration through the use of a "novel registration
model". The model uses an easy-to-build structure that works as an anchor to which
all registration is referenced, and is built on the Scale Invariant Feature Transform
(SIFT). The advantage of the calibration is that once set up, no further human input
was required. For the calibration object, the researchers chose an easily obtainable
material with unique patterns that could support SIFT features. A standard IKEA
package box was used as the calibration structure; the image of the box is seen in
figure-40, and the size of the box used was 56×33×41 cm.
The calibration procedure involved placing the calibration structure at the center of
the room where it could be properly captured by all the sensors. For each viewpoint,
a color image and a depth image are captured. Since more than one Kinect sensor
covered the space, the researchers concluded it was better not to operate the sensors
simultaneously, to avoid interference. Additionally, the researchers performed a
quick post-synchronization procedure to align the data obtained; the data was syn-
chronized to within 16 ms.
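A post-synchronization step of this kind can be sketched as nearest-timestamp matching. The helper below is hypothetical (not the authors' procedure): it pairs each frame of one stream with the closest-in-time frame of another, keeping only pairs within a tolerance such as the 16 ms mentioned above.

```python
import bisect

def nearest_frames(ts_a, ts_b, tolerance_ms=16):
    """Pair each frame index in sorted timestamp list ts_a with the
    closest-in-time frame index in sorted list ts_b, dropping pairs
    further apart than tolerance_ms."""
    pairs = []
    for i, t in enumerate(ts_a):
        j = bisect.bisect_left(ts_b, t)
        # the nearest neighbour is either the insertion point or its predecessor
        candidates = [k for k in (j - 1, j) if 0 <= k < len(ts_b)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(ts_b[k] - t))
        if abs(ts_b[best] - t) <= tolerance_ms:
            pairs.append((i, best))
    return pairs
```

Frames with no partner within the tolerance are simply dropped, which matches the idea of keeping only data that can be aligned across sensors.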
As for the 3-D texture, the vertices were clearly visible from multiple RGB cameras,
and the colors from the different RGB cameras were combined to produce every
reconstructed vertex. The researchers found that the color quality of the image de-
pended significantly on the angle of view; to this end, they assigned a smaller
weight to color information at the object boundaries.
Evaluation: the researchers used a capturing system fitted with calibrated RGB-D
sensors for performance evaluation. The Kinect sensors are also used in the recon-
struction procedure and serve as checks to ensure accuracy. The capturing system is
as shown below.
Figure 39-Alexiadis et al (2017)3-D Camera and sensor setup
In terms of volume, the researchers observed that the image was distorted: the 3-D
image suffered from cut limbs, holes and other distortions. The appearance quality,
defined by the image quality, was measured using the Structural Similarity Index-
based Measure (SSIM).
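The SSIM formula can be illustrated with a single-window (global) variant for grayscale images. The standard measure averages this statistic over a sliding window, so the sketch below is an illustrative simplification with a hypothetical function name:

```python
import numpy as np

def global_ssim(x, y, L=255.0):
    """Single-window Structural Similarity index between two
    grayscale images with dynamic range L. Returns 1.0 for
    identical images and smaller values as similarity drops."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    C1, C2 = (0.01 * L) ** 2, (0.03 * L) ** 2   # standard stabilizing constants
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))
```
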
After evaluating and determining the algorithm to be used in volume-based tracking,
the researchers proposed the following stages for the model.
Figure 40- Alexiadis et al (2017) Stages for the proposed model
Results of the experiments show a reliably high-quality reconstruction technique.
The results were mostly obtained through Kinect v2 sensors in different configura-
tions and spatial setups. Even though the Kinect v2 provided good-quality pictures
on its own, the reconstruction done by the researchers presents image quality with
less distortion. The reconstruction method employed resembles Poisson reconstruc-
tion, as it renders images that are properly blended in color and texture, and the
quality of the reconstruction is better than TSDF-based reconstruction.
Figure 41- Alexiadis et al (2017) Image quality reconstruction; Kinect data,
watertight geometry and Poisson
Generally, the researchers set out to describe the key elements of a system that
tracks and captures real-time 3-D images, including skeletal motion. The researchers
propose a novel system for 3-D reconstruction that is replicable, since the elements
used are widely available. Limitations such as the imperfect synchronization of the
RGB cameras are discussed and expounded. Further, the researchers recommended
areas of improvement such as visual quality and frame rates; for instance, a pre-
scanned user's face can be used to reconstruct the face, since it is one of the most
important parts of the body reconstruction.
For the experiment, the Kinect sensor was the most important component, since it
was able to correctly recreate and provide high-quality images.
In the field of motion tracking, researchers such as Tahavori et al (2013) have also
contributed substantially to testing the technical capabilities of the Kinect sensor.
The researchers used both Kinect for Windows and Kinect for Xbox to determine the
better device for detecting respiratory motion in patients. The result was that Kinect
for Windows gave more accurate detection, with errors of less than 2 mm. The goal
of the experiment was to use Kinect to measure the depth distribution over the pa-
tient's body, which then allows monitoring of the patient's motion. The researchers
wanted to know the potential of using Kinect for measuring and detecting respirato-
ry motion.
To investigate the technical capabilities of the Kinect, the researchers used a planar
object mounted on an optical rail; the rail ensured the precision of the measure-
ments. The researchers also made use of the Gail motion controller to investigate
respiratory displacement, and volunteers took part in the experiment.
To compare the technical capabilities of Kinect for Xbox and Kinect for Windows,
both devices were mounted on the rail.
The graph below shows the performance of the two devices.
Figure 42- Tahavori et al (2013) Kinect for Xbox vs Windows
The above data was analyzed using Matlab and the Kinect SDK. To reduce noise,
the data was averaged over 1000 depth frames. For the experiment, the distance was
varied over a range of 40-140 cm, and data was recorded for both devices in normal
and near mode. As seen from figure-43, both devices have a lower limit of 50 cm.
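Averaging depth frames to suppress per-frame sensor noise can be sketched as follows. This minimal example is not the authors' code; it assumes frames arrive as equally shaped arrays and that a zero value encodes an invalid reading, as in raw Kinect depth data:

```python
import numpy as np

def average_depth_frames(frames):
    """Average a stack of depth frames pixel-wise to suppress
    per-frame sensor noise, ignoring zero (invalid) readings.
    Pixels that are invalid in every frame average to 0."""
    stack = np.asarray(frames, dtype=float)
    valid = stack > 0
    counts = valid.sum(axis=0)
    summed = np.where(valid, stack, 0.0).sum(axis=0)
    return np.where(counts > 0, summed / np.maximum(counts, 1), 0.0)
```

Averaging N frames reduces independent random noise by roughly a factor of √N, which is why a long window such as 1000 frames gives a much smoother depth estimate.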
The results showed that Kinect for Windows had higher accuracy and precision
compared to the Xbox Kinect. To further check the performance of Kinect for Win-
dows, the researchers conducted further tests using a rectangular box measuring
20 cm × 20 cm × 5 cm. The box was placed on the rail and the distance was varied
around the range of 80-100 cm. The box was then moved away from the sensor in
steps of 2 mm, 5 mm and 10 mm, both in normal and near mode. It was concluded
that the Kinect sensor for Windows in near mode showed an error of less than 1 mm.
The researchers also analyzed the rotational accuracy of the Kinect sensor. This was
done using a rectangular object placed in front of the sensor: the known rotational
difference between the Kinect sensor and the test object was compared with the
measured values. The results obtained were as below:
Ground Truth  Normal Mode  Near Mode
3°  1.4°  2.5°
4°  2.4°  4.8°
7°  3.6°  6.8°
Table 2- Tahavori et al (2013) Normal mode vs near mode rotational results
Research also exists on other movement detection aspects, such as pose estimation.
A study conducted by Sengupta and Ohya (1996) showed how multiple cameras can
be used effectively to analyze a person's pose. The aim of the paper was to introduce
a method for easily obtaining an approximation of the pose of a 3-D or 2-D image.
The researchers make use of a 3-D CAD model on which they hypothesize a set of
models using a spatial extent function; the hypothesized points are then used to de-
rive the pose parameters.
In the experimental setting, the researchers assume everything in the space is 3-D
and modelled in advance. To estimate the pose of a human body image, a new two-
stage edge-based approach is proposed, as shown in the image below.
Figure 43- Sengupta and Ohya (1996) Two staged pose estimation illustration
Figure 44- Sengupta and Ohya (1996) back projection method estimation
The process of approximating the pose first involves processing the images obtained
from the multiple cameras, as seen in figure-46. The images are obtained by back-
ground subtraction and thresholding. To obtain a pose estimate, the image is treated
as a 3-D image and not as a 2-D CAD image. The 4×3 camera calibration matrix is
obtained and then used to calculate the back-projected ray.
When finding the approximate pose estimate, the researchers obtain a 3×3 rotation
matrix and a 3×1 translation vector which, when calibrated, map a 3-D point with
coordinate values X in the CAD model. The exact mapping between the three non-
collinear points is projected within volume V.
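The back-projection step can be sketched as follows. This minimal example assumes the common 3×4 projection-matrix convention P = K[R|t] (the paper writes the calibration matrix as 4×3, which may simply be the transposed convention), and the function name is hypothetical:

```python
import numpy as np

def back_project(P, u, v):
    """Back-project pixel (u, v) through a 3x4 projection matrix
    P = K[R|t] into a world-space ray. Returns the camera centre
    (the ray origin) and a unit direction vector; every 3-D point
    on that ray projects to the same pixel."""
    M, p4 = P[:, :3], P[:, 3]
    center = -np.linalg.solve(M, p4)                    # camera centre: -M^-1 p4
    direction = np.linalg.solve(M, np.array([u, v, 1.0]))
    return center, direction / np.linalg.norm(direction)
```

Intersecting such rays from several calibrated cameras is what lets the pose be solved in 3-D rather than per-image.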
Experimental results showed that the cameras could successfully extract the edges of
the model and transform them to the appropriate pose parameter set defined by the
researchers. Results from this experiment provided the basis upon which several
later experiments were conducted in regard to Kinect development.
To illustrate the pose estimation technique, the researchers conducted the experiment
in a controlled environment with a CAD model of a human head positioned at equal
intervals from cameras arranged in a semicircle. The edges of the model were found
using a zero-crossing edge detector, and the silhouettes were separated manually.
Transformation parameters were calculated for each rigidity constraint. The figures
obtained from the experiments are seen below.
Figure 45- Sengupta and Ohya (1996) images used for the experiment
Figure 46- Sengupta and Ohya (1996) extracted silhouette images
Figure 47- Sengupta and Ohya (1996) rendered images from the parameter set
Figure 48- Sengupta and Ohya (1996) rendered images of the transferred model
Through this experiment, the researchers presented a theoretical technique of pose
estimation. The designed algorithm could extract and estimate the edges of the silhou-
ettes through the use of a spatial extent function. To verify the pose parameters, each
image was projected onto the model images, which led to a better refinement of the
pose parameters. Finally, a stable value is obtained by repeating this process until a
reasonable pose is reached.
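The repeat-until-stable refinement loop can be sketched generically. This is a minimal greedy random-search sketch of my own (not the paper's actual optimizer): perturb the pose parameters, keep any candidate that lowers the reprojection error, and repeat until the pose is stable.

```python
import random

def refine_pose(pose, reprojection_error, step=0.1, iters=500):
    """Greedy random refinement: perturb the pose parameters and keep any
    candidate that lowers the reprojection error, repeating until stable."""
    best = list(pose)
    best_err = reprojection_error(best)
    for _ in range(iters):
        cand = [p + random.uniform(-step, step) for p in best]
        err = reprojection_error(cand)
        if err < best_err:
            best, best_err = cand, err
    return best, best_err
```

In practice the error function would measure the distance between the projected model silhouette and the extracted image silhouette; here any callable over the pose vector will do.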
4.4 Skeletal Tracking systems
Several studies have investigated how Kinect can be applied to skeletal tracking.
Tao et al. (2013) researched the kinematic validity of Microsoft Kinect in skeletal
tracking for application in virtual limb rehabilitation. The research investigated the
extent to which Kinect can be used to track hand position, limbs, ankles and body
trunk. For the experiment, cameras were positioned between 1.45 m and 1.75 m from
the user. The goal of the experiment was to determine the extent to which the Kinect
sensor can be used for limb rehabilitation through preset and repetitive tasks.
Additionally, the precision of the Kinect sensor was determined and analyzed.
For the experiment, the researchers used an Optotrak 3-D motion capture system
placed at different locations. The participating individuals then performed different
movements such as leaning backwards, elbow flexing and trunk leaning. All of this
was captured by a Kinect sensor placed at a height of 135 cm.
The results obtained from the experiment showed a simultaneous comparison between
the sensor result and the motion capture system, from which the mean squared
difference error was obtained. For hand movements, the constant error was 6.3 cm
and the variable error 2.4 cm; the constant error in all positions was found to be less
than 9.8 cm. For trunk movements, the mean error was 3.9 cm with a variable error
of 2.5 cm.
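The two error figures reported in such comparisons can be computed directly from paired samples. The sketch below is my own illustration (assuming "constant error" means the mean signed difference to the reference system and "variable error" its standard deviation, the usual definitions):

```python
from statistics import mean, stdev

def tracking_errors(kinect_cm, reference_cm):
    """Constant error = mean signed difference to the reference system;
    variable error = standard deviation of those differences;
    also returns the root-mean-square difference."""
    diffs = [k - r for k, r in zip(kinect_cm, reference_cm)]
    rms = mean(d * d for d in diffs) ** 0.5
    return mean(diffs), stdev(diffs), rms
```

Given per-frame positions from the Kinect and from the reference tracker, one call yields all three summary statistics.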
The figure below shows the constant camera error with respect to distance:
Figure 49- Tao et al (2013) constant camera error
Overall, the data obtained from the sensor closely matched that from the Optotrak
motion tracker, except for the elbow. The elbow tracking showed varying results
because of the limitations of the Kinect sensor in modeling the elbow.
The research concluded that for Kinect skeletal tracking the camera ought to be
placed within a 30 × 30 square, 1.45 m to 1.75 m from the user; the camera can also
be offset by up to 0.15 m to the left or to the right.
Figure 50- Tao et al (2013) variable camera error
In regard to geometric refinement and pose estimation, research was conducted by
Choe et al. (2014), whose aim was to improve the accuracy of the Kinect camera for
depth recognition and image reconstruction. The researchers used a 3-D mesh to
optimize the geometric refinement process. The approach the researchers took does
not require an additional Kinect camera or a complex setup. Effectively, the
researchers were able to utilize shading information to refine the geometry. They
used different lighting conditions to verify the invariability of Kinect IR images.
Figure 51- Choe et al (2014) invariability of IR images and RGB under different
lighting conditions
The data capturing process the researchers used consisted of discrete IR shading
image acquisition. Kinect Fusion was used to obtain the first mesh, as shown in
Figure 53 below. The Kinect SDK is used to register the depth map with a
reconstructed surface.
Figure 52- Choe et al (2014) Data capturing system, used to obtain the base
mesh
Figure 53- Choe et al (2014) input shading image, projected mesh and depth
map
The research demonstrated that the captured IR images do not overlap with the
visible spectrum. The researchers also described a method of radiometrically
calibrating the Kinect IR camera. The research assumed a Lambertian BRDF, which
made the results erratic in some cases.
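The Lambertian BRDF assumption mentioned above reduces to a simple shading model: the observed intensity is the albedo scaled by the cosine between the surface normal and the light direction, clamped at zero. A minimal sketch of that model (my own illustration, not the authors' code):

```python
def lambertian_shading(albedo, normal, light_dir):
    """Shading intensity under a Lambertian BRDF: I = albedo * max(0, n.l),
    with n and l normalized to unit length."""
    def unit(v):
        n = sum(c * c for c in v) ** 0.5
        return [c / n for c in v]
    n, l = unit(normal), unit(light_dir)
    return albedo * max(0.0, sum(a * b for a, b in zip(n, l)))
```

Real surfaces such as skin are not perfectly Lambertian, which is one way the assumption can make refinement results erratic.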
5 Findings and conclusion
5.1 Findings
Microsoft Kinect presents an important technology that can be used in a wide array
of applications, including assisting patients in assisted living environments. A great
deal of research and experimentation has been conducted to test the clinical and
technical capability of the Kinect in physical therapy and body part rehabilitation.
The documentary analysis conducted in chapter 4 covered some of the largest and
most important experiments conducted to assess the technical capabilities of Kinect
components such as the IR camera. The increase in research interest in the Microsoft
Kinect signifies its relevance in applications such as assisted living environments.
For ease of research, studies of the application of Kinect in assisted living
environments can be classified into:
1. Experiments that evaluated accuracy, reliability and precision of Microsoft
Kinect
2. Experiments that evaluated the application of Kinect in Clinical settings and
Smart Home environments
3. Experiments that investigated use of Kinect for Movement Detection Models
4. Experiments that investigated use of Kinect in Skeletal tracking systems
Normally, in assisted living environments assessment is done by people, who may be
doctors, nurses or hospital volunteers. The assessment therefore relies heavily on a
human touch, which means higher labor costs and low scalability. For instance, an
activity like therapy requires a specialised, trained physical therapist (PT) or
occupational therapist (OT). Because these kinds of clinical assessments are
performed by people, they are subject to errors and inaccuracies.
To solve this problem, researchers are testing motion sensors. Notably, motion
sensors have received significant interest over the past few years because of their
affordability and practicality. The technologies commonly used for motion sensing
are optoelectronic and non-optoelectronic sensors. While optoelectronic sensors use
markers, non-optoelectronic sensors do not. In instances where markers are used,
they are placed on the bodies of the individuals and are then tracked by a camera
sensor. Where markers are not used, the sensors apply inertial, mechanical and
magnetic techniques to track motion.
Our findings show that Kinect can be used in both optoelectronic and non-
optoelectronic experiments. For inertial systems, as seen in chapter 4, researchers
use sensor fusion algorithms and human skeletal algorithms. Magnetic systems, on
the other hand, use motion capture technologies to transmit and receive signals from
which the position, orientation and pose of the receiver can be derived. In the studied
experiments, the sensors provide six degrees of freedom (6 DoF) per receiver, which
enables 3-D positioning.
The findings also show that Kinect can be used in combination with wearable
technologies such as wearable sensors, smart suits and music gloves. Together with
Kinect sensors, these devices are able to follow the user's motion passively or
actively.
A review of vision-based motion trackers shows that they use either contrast-based
or depth-based imaging. Contrast-based systems work by tracking different colour
markers attached to the bodies or hands being tracked. Depth-sensing systems use
depth imagery segmentation and vision algorithms to detect and track human motion.
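At its simplest, the depth-imagery segmentation step is a per-pixel range test on the depth map. The sketch below is an illustrative minimal version (hypothetical function name; real systems add connected-component analysis and noise filtering on top):

```python
def segment_depth(depth_mm, near_mm=800, far_mm=2500):
    """Binary foreground mask: keep pixels whose depth (in mm) lies in
    [near_mm, far_mm]; a reading of 0 marks invalid depth."""
    return [[1 if (d != 0 and near_mm <= d <= far_mm) else 0 for d in row]
            for row in depth_mm]
```

The resulting mask isolates objects in the working volume, after which skeleton fitting or blob tracking can be applied.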
Importance of the Microsoft Kinect compared to previous cameras
Compared to other cameras, we found that Kinect has a lot of advantages and fea-
tures that make it ideal for motion tracking. For instance, Microsoft Kinect provides
a Software Development Kit that gives developers important access to body joint
positions.
Specifications of the Kinect that make it ideal for motion tracking include its RGB
camera, multi-array microphone, infrared projector and CMOS sensor. According to
the experiments analysed in chapter 4, the Kinect sensor can handle both depth and
infrared streams at 640×480 pixels, which can be increased when needed to
1280×1024. The stream supports 8-bit resolution and can accommodate the VGA or
UYVY colour format.
The sensor can be adjusted to near range or default range. At near range, people
within 0.4–3 m are visible, while in default range visibility is 0.8–2.5 m. The
microphone is capable of processing four channels of 16-bit audio at a rate of
16 kHz. The sensor can detect six people but is only capable of tracking two people
at a time.
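The two range modes above amount to a simple visibility test. A minimal sketch using the figures quoted in this section (the function name is mine, not from the SDK):

```python
NEAR_RANGE_M = (0.4, 3.0)     # near-range bounds quoted above, in metres
DEFAULT_RANGE_M = (0.8, 2.5)  # default-range bounds quoted above

def is_visible(distance_m, near_mode=False):
    """True when a subject at distance_m falls inside the active range."""
    lo, hi = NEAR_RANGE_M if near_mode else DEFAULT_RANGE_M
    return lo <= distance_m <= hi
```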
Reliability and Accuracy of Kinect
From the analysed papers, it is evident that many researchers have tried to ascertain
the reliability and accuracy of the Kinect sensor. Generally, most researchers agree
that Kinect is a good motion capture device, mainly because it is easily available and
affordable. However, researchers point out that the technology suffers from
occlusion. It has been observed that, at times, the Kinect sensor recognises chair legs
as if they were human legs. This means that for successful tracking, the problems
brought about by occlusion need to be effectively addressed.
It is important to note that accuracy tests of the Kinect show that its sensors are
accurate enough for use in smart living or assisted living environments. In a trial
testing Kinect's application in assisted living environments, Dutta (2014) compared
Kinect to Vicon for motion tracking. The results showed that Kinect was accurate
enough to be used for monitoring falls of the elderly. In separate research on
accuracy, Kurillo et al. (2013) found that Microsoft Kinect provided greater
reliability than a MoCap system. In terms of range-of-motion measurements, Kinect
proved to be the more accurate measure compared to MoCap, against a backdrop of
research in areas such as hip abduction, elbow flexion, knee flexion and shoulder
abduction. Other researchers, such as Hawi et al. (2014), showed that Kinect had
exemplary test-retest reliability but low accuracy compared to goniometers.
The most important finding from all the literature studied in chapter 4 was that
Kinect can be reliably used as a depth sensor. However, developers should factor in
occlusion issues and the noise usually experienced in skeletal tracking. Researchers
also agree that, to solve most of the challenges presented by Kinect, Kalman filters,
sensor fusion and calibration should be used.
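To make the Kalman filtering suggestion concrete, a scalar filter can smooth one noisy joint coordinate from the skeletal stream. This is a minimal sketch with assumed noise variances (a real deployment would tune q and r and typically filter full 3-D positions):

```python
class Kalman1D:
    """Scalar Kalman filter for smoothing one noisy joint coordinate."""
    def __init__(self, q=1e-3, r=1e-1):
        self.q = q        # process-noise variance (how fast the joint moves)
        self.r = r        # measurement-noise variance (sensor jitter)
        self.x = 0.0      # state estimate
        self.p = 1.0      # estimate variance

    def update(self, z):
        self.p += self.q                # predict: uncertainty grows
        k = self.p / (self.p + self.r)  # Kalman gain
        self.x += k * (z - self.x)      # correct toward the measurement
        self.p *= (1.0 - k)
        return self.x
```

Feeding each new joint reading through `update` yields a smoothed estimate whose variance shrinks as evidence accumulates.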
Findings on Application of Kinect to patients with Neurological Disorders
A key application highlighted in this research is the use of Kinect in assisted living
environments. Assisted living environments usually host patients with different
needs, such as those with chronic diseases that require specialised care. Researchers
such as Llorens et al. (2013) have pioneered research in this area with encouraging
findings. The researchers created a game that promotes rehabilitation activities in
patients with neurological disorders. Clinical tests of the game showed significant
improvements in the body balance and mobility of the patients.
Research conducted by Exell et al. (2014) showed that electrical stimulation could be
used to rehabilitate a patient's arm. In conclusion, the researchers held that Kinect is
accurate enough, although further research is needed.
5.2 Conclusions
This thesis reviewed different literature and experimental results on Microsoft Kinect
in the field of assisted living. First, similar experiments using other technologies that
aimed to provide solutions in assisted living environments were reviewed. The
limitations and errors presented by these systems were analyzed and discussed.
Previous systems used in motion tracking were not as effective as Kinect: they were
only able to track specific body parts such as the palm, hand and face, they were not
interactive, and they did not offer the ease of programming that the Kinect does
through its Software Development Kit (SDK), which provides programmable access
to skeletal tracking.
The arrival of the Kinect has ushered in a new age for motion sensors. Today, there
exists a large body of research on the application of Kinect in motion sensing and in
smart environments. Kinect has proven to be more accurate, precise and reliable than
RGB systems. However, Kinect is not without limitations and faults. Issues such as
occlusion and noise still exist and require improvement. In assisted living
environments, these problems can be significantly reduced by Kalman filtering,
calibration and sensor fusion.
Further, this research discussed the evaluation and performance of Kinect in assisted
living environments with patients requiring different levels of attention. Studies in
this area targeted different monitoring architectures and infrastructure, in both home
environments and specialized hospital monitoring environments. The experiments
utilized different body movements, games, cognitive therapy and exercises. Some
experiments resulted in successful assessment of falls, movements and even postures.
However, other studies lacked clinical evaluation of the results, which raises
questions about the effectiveness of those experiments.
In addition, this research compared Kinect with other sensor technologies, both as a
whole and component by component. Examples of the other analyzed devices
included the Leap Motion, Asus Xtion and Intel Creative cameras. Although the
different cameras were suited to particular narrow functions, Kinect proved to be the
better option for full-body tracking.
The rapid growth in the field of smart environments and assisted living, and the
continuous advancement in the field of artificial intelligence, have opened up many
options for further work on this study. A viable option is testing the Microsoft Kinect
sensor with Smart Environment for Assisted Living (SEAL) applications. In this
case, Kinect could be the eyes and brains of the SEAL app, helping to monitor the
patient in real time. It would also send real-time updates about the patient to the
SEAL app and activate the SEAL app alarm in cases where the patient seems to be in
danger. The alarm could call for help whenever the patient falls and is not able to get
up, and it could also send messages to the doctors or nurses when no activity
(movement, breathing, etc.) is recorded from the patient for a period of time. The
Kinect sensor could also be used for other machine learning related studies.
References
Hemant Ghayvat, Jie Liu, Subhas Chandra Mukhopadhyay, and Xiang Gui, "Well-
ness Sensor Networks: A Proposal and Implementation for Smart Home for Assisted
Living," IEEE Sensors Journal, vol. 15, no. 12, pp. 17341-17344, December 2015.
Lin Yang, Longyu Zhang, Haiwei Dong, Abdulhameed Alelaiwi, and Abdulmotaleb
El Saddik, "Evaluating and Improving the Depth Accuracy of Kinect for Windows
v2," IEEE Sensors Journal, vol. 15, no. 8, pp. 4275-4277, August 2015.
Marek R. Ogiela, Tomasz Hachaj and Katarzyna Koptyra “Effectiveness comparison
of Kinect and Kinect 2 for recognition of Oyama karate techniques” 18th Interna-
tional Conference on Network-Based Information Systems, 2015.
Teng Deng, Hui Li, Jianfei Cai, Tat-Jen Cham, and Henry Fuchs, "Kinect Shadow
Detection and Classification," 2013 IEEE International Conference on Computer
Vision Workshops.
Joshua Fabian, Tyler Young, and James C. Peyton Jones, "Integration of Microsoft
Kinect With Simulink: Real-Time Object Tracking Example," IEEE/ASME Transac-
tions on Mechatronics, vol. 19, no. 1, February 2014.
Zhang, Z., 2012. Microsoft kinect sensor and its effect. IEEE multimedia, 19(2),
pp.4-10.
Han, J., Shao, L., Xu, D. and Shotton, J., 2013. Enhanced computer vision with mi-
crosoft kinect sensor: A review. IEEE transactions on cybernetics, 43(5), pp.1318-
1334.
Lange, B., Chang, C.Y., Suma, E., Newman, B., Rizzo, A.S. and Bolas, M., 2011,
August. Development and evaluation of low cost game-based balance rehabilitation
tool using the Microsoft Kinect sensor. In Engineering in medicine and biology soci-
ety, EMBC, 2011 annual international conference of the IEEE (pp. 1831-1834).
IEEE.
Stowers, J., Hayes, M. and Bainbridge-Smith, A., 2011, April. Altitude control of a
quadrotor helicopter using depth map from Microsoft Kinect sensor. In Mechatronics
(ICM), 2011 IEEE International Conference on (pp. 358-362). IEEE.
Galna, B., Barry, G., Jackson, D., Mhiripiri, D., Olivier, P. and Rochester, L., 2014.
Accuracy of the Microsoft Kinect sensor for measuring movement in people with
Parkinson's disease. Gait & posture, 39(4), pp.1062-1068.
Villaroman, N., Rowe, D. and Swan, B., 2011, October. Teaching natural user inter-
action using OpenNI and the Microsoft Kinect sensor. In Proceedings of the 2011
conference on Information technology education (pp. 227-232). ACM.
Azzari, G., Goulden, M.L. and Rusu, R.B., 2013. Rapid characterization of vegeta-
tion structure with a microsoft kinect sensor. Sensors, 13(2), pp.2384-2398.
Chang, C.Y., Lange, B., Zhang, M., Koenig, S., Requejo, P., Somboon, N., Sawchuk,
A.A. and Rizzo, A.A., 2012, May. Towards pervasive physical rehabilitation using
Microsoft Kinect. In Pervasive Computing Technologies for Healthcare
(PervasiveHealth), 2012 6th International Conference on (pp. 159-162). IEEE.
Biswas, K.K. and Basu, S.K., 2011, December. Gesture recognition using microsoft
kinect®. In Automation, Robotics and Applications (ICARA), 2011 5th International
Conference on (pp. 100-103). IEEE.
El-laithy, R.A., Huang, J. and Yeh, M., 2012, April. Study on the use of Microsoft
Kinect for robotics applications. In Position Location and Navigation Symposium
(PLANS), 2012 IEEE/ION (pp. 1280-1288). IEEE.
Gonzalez-Jorge, H., Riveiro, B., Vazquez-Fernandez, E., Martínez-Sánchez, J. and
Arias, P., 2013. Metrological evaluation of microsoft kinect and asus xtion sen-
sors. Measurement, 46(6), pp.1800-1806.
Moazzam, I., Kamal, K., Mathavan, S., Usman, S. and Rahman, M., 2013, October.
Metrology and visualization of potholes using the microsoft kinect sensor.
In Intelligent Transportation Systems-(ITSC), 2013 16th International IEEE Confer-
ence on (pp. 1284-1291). IEEE.
Kawatsu, C., Li, J. and Chung, C.J., 2013. Development of a fall detection system
with Microsoft Kinect. In Robot Intelligence Technology and Applications 2012 (pp.
623-630). Springer Berlin Heidelberg.
Weerasinghe, I.T., Ruwanpura, J.Y., Boyd, J.E. and Habib, A.F., 2012. Application
of Microsoft Kinect sensor for tracking construction workers. In Construction Re-
search Congress 2012: Construction Challenges in a Flat World (pp. 858-867).
Araujo, R.M., Graña, G. and Andersson, V., 2013, March. Towards skeleton bio-
metric identification using the microsoft kinect sensor. In Proceedings of the 28th
Annual ACM Symposium on Applied Computing (pp. 21-26). ACM.
Bevilacqua, V., Nuzzolese, N., Barone, D., Pantaleo, M., Suma, M., D'Ambruoso,
D., ... & Stroppa, F. (2014, June). Fall detection in indoor environment with kinect
sensor. In Innovations in Intelligent Systems and Applications (INISTA) Proceedings,
2014 IEEE International Symposium on (pp. 319-324). IEEE.
Konstantinidis, E. I., Antoniou, P. E., Bamparopoulos, G., & Bamidis, P. D. (2015).
A lightweight framework for transparent cross platform communication of controller
data in ambient assisted living environments. Information Sciences, 300, 124-139.
Anastasiou, D. (2011, May). Gestures in assisted living environments.
In International Gesture Workshop (pp. 1-12). Springer, Berlin, Heidelberg.
Baeg, S. H., Park, J. H., Koh, J., Park, K. W., & Baeg, M. H. (2007, October). Build-
ing a smart home environment for service robots based on RFID and sensor net-
works. In Control, Automation and Systems, 2007. ICCAS'07. International Confer-
ence on (pp. 1078-1082). IEEE.
Hondori, H.M., Khademi, M., Dodakian, L., Cramer, S.C. and Lopes, C.V., 2013. A
spatial augmented reality rehab system for post-stroke hand rehabilitation.
In MMVR (pp. 279-285).
Mohamed, A.B.H., Val, T., Andrieux, L. and Kachouri, A., 2013, January. Assisting
people with disabilities through Kinect sensors into a smart house. In Computer Med-
ical Applications (ICCMA), 2013 International Conference on (pp. 1-5). IEEE.
Chin, L.C., Basah, S.N., Yaacob, S., Din, M.Y. and Juan, Y.E., 2015, March. Accu-
racy and reliability of optimum distance for high performance Kinect Sensor.
In Biomedical Engineering (ICoBE), 2015 2nd International Conference on (pp. 1-
7). IEEE.
Alexiadis, D.S., Chatzitofis, A., Zioulis, N., Zoidi, O., Louizis, G., Zarpalas, D. and
Daras, P., 2017. An Integrated Platform for Live 3D Human Reconstruction and Mo-
tion Capturing. IEEE Transactions on Circuits and Systems for Video Technolo-
gy, 27(4), pp.798-813.
Tahavori, F., Alnowami, M., Jones, J., Elangovan, P., Donovan, E. and Wells, K.,
2013, October. Assessment of Microsoft Kinect technology (Kinect for Xbox and
Kinect for Windows) for patient monitoring during external beam radiotherapy.
In Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), 2013
IEEE (pp. 1-5). IEEE.
Sengupta, K. and Ohya, J., 1996, November. Pose estimation of human body part
using multiple cameras. In Robot and Human Communication, 1996., 5th IEEE In-
ternational Workshop on (pp. 146-151). IEEE.
Tao, G., Archambault, P.S. and Levin, M.F., 2013, August. Evaluation of Kinect
skeletal tracking in a virtual reality rehabilitation system for upper limb hemiparesis.
In Virtual Rehabilitation (ICVR), 2013 International Conference on (pp. 164-165).
IEEE.
Choe, G., Park, J., Tai, Y.W. and So Kweon, I., 2014. Exploiting shading cues in
kinect ir images for geometry refinement. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition (pp. 3922-3929).