human motion analysis final

8/8/2019 human motion analysis final

1/20

UNIVERSAL COLLEGE OF ENGINEERING & TECHNOLOGYDOKIPARRU(P), MEDIKONDURU (M), GUNTUR (Dt.) 5422438.

Affiliated to JNTU-Kakinada & Approved by AICTE New Delhi

DEPARTMENT OF ELECTRONIC AND COMMUNICATION ENGINEERING

TECHNICAL PAPER PRESENTATION

TOPIC:

REAL TIME HUMAN MOTION ANALYSIS BY IMAGE SKELETONIZATIONREAL TIME HUMAN MOTION ANALYSIS BY IMAGE SKELETONIZATION

DOMAIN:

DIGITAL IMAGE PROSSINGDIGITAL IMAGE PROSSING

PRESENTED BY

M.SESHA SAI RAM KUMAR

Dept: E.C.E, Redg.No:O8NFIA0429

E-Mail I.D: [email protected]

Ph: 91 33 77 333 5.

REAL TIME HUMAN MOTION ANAYSIS BY IMAGE SKELETONIZATION Page 1
mailto:[email protected]:[email protected]


2/20


3/20

1. ABSTRACT

In this paper a process is described for analyzing that motion of human targets in a video

stream. Moving targets are detected and their bounders extracted. From these a star" skeleton is

produced. Two motion cues are determined from this skeletonization: Body posture & Cyclic motion of skeleton segments. These cues are used to determine human activities such as walking or running and

even potentially, the target's gait. Unlike other methods, this does not require an a priori human model,

or a large number of pixels on target". Furthermore, it is computationally inexpensive and thus ideal for

real -world video applications such as "outdoor video surveillance".

2. INTRODUCTION

Using video in machine understanding has recently become a significant research topic.

One of the more active areas is activity understanding from video imagery. Understanding activities

involves bring able to detect and classify targets of interest and analyze what they are doing. Human

motion analysis is one such research area. There have been several good human detection schemes,

such as "Pedestrian detection using wavelet templates", which use static imagery.

But detecting and analyzing human motion in real time from video imagery has only recently

become viable with algorithms like Pfinder and w 4. These algorithms represent a good first step to the

problem of recognizing and analyzing humans. But they still have some drawbacks. In general, they

work by detecting features (such as hands feet and head), tracking them, and fitting them to some a

priori human models such as the cardboard models of ju etal.

There are two main drawbacks of these systems in their present forms: they are completely

human specific, and they require a great deal of image-based information in order to work effectively.

For general video applications it may be necessary to derive motion analysis tools which are not

constrained to human models, but are applicable to other types of targets, or even to classifying targets

into different types. In some real video applications, such as outdoor surveillance, it is unlikely thatthere will be enough "pixels on target" to adequately apply these methods. What is required is a fast,

robust system which can make broad assumptions about target motion from small amounts of image

data.



4/20

This paper proposes the use of the "star" skeletonization procedure for analyzing the motion

of the targets-- particularly, human targets. The notion is that a simple form of skeletonization which

only extracts the broad internal motion features of a target can be employed to analyze its motion.

Once a skeleton is extracted, motion cues can be determined from it. The cues dealt with in

this paper are: cyclic motion of "leg" segments, and the posture of the "torso" segment. These cues,

when taken together can be used to classify the motion of an erect human as "walking or "running".

3. SKELETONIZATION

Skeletonization is the process of peeling off of a pattern as many pixels as possible without

affecting the general shape of the pattern. In other words, after pixels have been peeled off, the pattern

should still be recognized. The skeleton hence obtained must have the following properties:

as thin as possible

connected

centered

3.1. SKELETONIZATION TECHNIQUES:

Skeletonization (i.e., skeleton extraction from a digital binary picture) provides region-based

shape features. It is a common preprocessing operation in raster-to-vector conversion or in pattern

recognition.

There are three major skeletonization techniques:

detecting ridges in distance map of the boundary points,

calculating the Voronoi diagram generated by the boundary points, and

the layer by layer erosion called thinning.

In digital spaces, only an approximation to the "true skeleton" can be extracted. There are

two requirements to be complied with:

topological (to retain the topology of the original object),

geometrical (forcing the "skeleton" being in the middle of the object and invariance under

the most important geometrical transformation including translation, rotation, and scaling)



5/20

3.1.1. DISTANCE TRANSFORMATION:

Skeletonization based on distance transformation requires the following 3-step process:

The original (binary) image is converted into feature and non-feature elements. Thefeature elements belong to the boundary of the object.

The distance map is generated where each element gives the distance to the nearestfeature element.

The ridges (local extremes) are detected as skeletal points.

The distance map resulted by the distance transformation depends on the chosen distance .

Figure.1.Extracted feature points are marked by pink squares (left) and distance map using city block (or 4-neighbour) distance (right).

Figure.2: Example of distance transformation:



6/20

3.1.2. VORONOI DIAGRAM:

The Voronoi diagram of a discrete set of points (called generating points) is the partition of

the given space into cells so that each cell contains exactly one generating point and the locus of all

points which are nearer to this generating point than to other generating points.

Example of Voronoi diagrams:

Figure.3: The 10 generating points (left) and their Voronoi diagram (right).

Example of Voronoi-skeleton:

Figure.4: Some border points of a rectangle form the set of generating points.



7/20

Figure.5 : The skeleton (marked by red lines) is approximated by a sub graph of the Voronoi diagram

Both requirements (i.e., the topological and the geometrical) can be fulfilled by the

skeletonization based on Voronoi diagrams, but it is an expensive process, especially for large and

complex objects.

3.1.3. THINNING :

It is an iterative object reduction technique for modeling fire propagation in digital spaces.

Figure.6:



8/20

The thinning has some beneficial properties:

It preserves the topology (retains the topology of the original object),

It preserves the shape (significant feature suitable for object recognition or classification is extracted),

It forces the "skeleton" being in the middle of the object, and

It produces one pixel/voxel width "skeleton".

4. Current algorithms for real-time skeletonizationSeveral human action skeletonization methods were proposed in the past few years.

However, there are still very few skeletonization methods that work in real time for human motion

analysis. One of the most relevant in this field is the method proposed by Fujyoshi et al. They propose

a method for analyzing the motion of a human target in a video stream. Moving targets are detected

and their boundaries extracted. A star skeleton is produced to form these boundaries. Two motion

cues are determined from this skeletonization: body posture, and cyclic motion activities such as

walking or running. From these, the targets gait can be computed. Skeletonization does not require an

a priori human model or a large number of pixels in the target. Furthermore, it is computationally

inexpensive, and thus ideal for real-time video analysis applications such as outdoor video surveillance.

The work of Chen et al follows a similar approach. They present a Hidden Markov Model

(HMM) based methodology for action-recognition using a star skeleton as a representative descriptor of

human posture. They implement a system to automatically recognize ten different types of actions,

which has been tested on real human action videos with a great deal of success. They also designed a

posture codebook, which contains representative star skeletons of each action type and defined a star

distance to measure the similarity between actions.

5. STAR" SKELETONIZATION

An important cue in determining the internal motion of a moving target is the change in its

boundary shape over time and a good way to quantify this is to use skeletonization. There are many

standard techniques for skeletonization such as thinning and distance transformation. However, thesetechniques are computationally expensive and moreover, are highly susceptible to noise in the target

boundary. The method proposed here provides a simple, real-time, robust way of detecting extremal

points on the boundary of the target to produce a star" skeleton. The star" skeleton consists of only

the gross extremities of the target joined to its centroid in a star" fashion. This procedure for

producing star skeletons is illustrated in figure 8.



9/20

Figure 7: Target pre-processing. A moving target region is morphologically dilated (twice) then eroded. Then its

border is extracted.

Figure 8: The boundary is \unwrapped" as a distance function from the centroid. This function is then smoothed

and extremal points are extracted



10/20

1. The centroid of the target image boundary (x c; y c) is determined.

1

1 b N

c i

ib

x x N =

=

1

1 b N c i

ib

y y N =

=

Where (x c; y c) is the average boundary pixel position, N b is the number of boundary pixels,

and (x i; y i) is a pixel on the boundary of the target.

2. The distances d i from the centroid (x c; yc) to each border point (x i; y i) are calculated

2 2( ) ( )i i c i cd x x y y= +

These are expressed as a one dimensional discrete function d(i) = d i. Note that this functionis periodic with period N b.

3. The signal d (i) is then smoothed for noise reduction, becoming d i .This can be done

using a linear smoothing filter or low pass filtering in the Fourier domain.

4. Local maxima of d i are taken as extremal points and the star" skeleton is constructed by

connecting them to the target centroid (x c; y c). Local maxima are detected by finding zero- crossings of

the difference function.

( )i = ( ) ( 1)d i d i

6. OUR APPROACH

Our approach uses the star skeleton method as a point of departure with some

modifications. We propose a more accurate skeleton representation for providing improved rhythmic

control from bodily action in real time.

The goal is to have a skeleton representation as close as possible to a real human skeleton

so that one can better measure the limbs articulation and trajectory in relation to their point of origin

(e.g. shoulders and neck) instead of the center of the star skeleton .



11/20

6.1 OBJECTS AND ALGORITHM DESCRIPTION:

We developed two objects for real time full-body skeletonization kin.skel and skel. draw.

Object kin.skel is responsible for the output of several important elements of the skeleton:

mass center coordinates, hands and feet coordinates, arms and legs angle.

Object skel.draw is responsible for the drawing of the skeleton in an output video matrix.

They are inherently connected since skel.draw provides the visual output to kin.skel.

6.2. KIN.SKELThis object was developed based with some modifications. The most relevant is the fact that

Fujiyoshi et al. determine the extremities of the skeleton by finding the maximum differences from the

edges to the mass center. In our case, we calculate them by analyzing the whole edge frame shot and

finding the rightmost and leftmost active pixels on the edge above and below the mass center, as well

as the uppermost pixel in the edge.

The real-time extraction of the skeleton is articulated in three stages: 1.edge detection,

2.mass center and extremities calculation, 3.outputs. Below we give a brief description of these stages.

6.2.1. EDGE DETECTIONFor the edge detection we use the algorithm developed by Canny. It takes a binary image as

input and outputs the silhouette of the human body. It is possible to change the edge detection

threshold at run time.

6.2.2. MASS CENTER AND EXTREMITIES CALCULATION

Once the edge of the subject is obtained, we use it for Calculating the mass center

coordinates by simply adding all the pixels coordinates and dividing them by the number of pixels. The

extremities calculations and the skeleton drawing are based on the correlations of i human proportions

with geometry described by the ancient Roman architect Vitruvius in Book III of his treatise De

Architectura .The Vitruvian human proportions laws were made famous by Leonardo Da Vincisdrawing Vitruvian Man - named in honor of the architect. From Vitruviuss description of the human

bodys proportions, we extrapolate the shoulders positions and use them to connect the arms, thus

obtaining a more accurate representation of the human body than the one given by (Figure 9).



12/20

Figure 9 The output of our algorithm based onVitruviuss description of the human bodys proportions.

6.2.3. OUTPUTS FROM THE OBJECT

There are two main outputs from the kin.skel object. One output is the set of points

coordinates that will feed the skel.draw object for the skeleton visualization. The other output sends all

the characteristics obtained by kin.skel (mass center, right arm angle, left arm angle, right leg angle, left

leg angle) packed as a list for further analysis in the Max environment.

Figure 10 Output of kin.skel in list format.

6.3. SKEL.DRAW

Once the major points coordinates are obtained, they are passed to a routine responsible

for the skeleton drawing. The several point coordinates that will constitute the line that unites the major

points are calculated by successively finding the middle point between extremities. To retain real-time

output, each line is drawn with only 30 pixels, which produces an adequate representation of the

skeleton. Figure 4 depicts all image processing stages from video input to skeleton drawing.



13/20

Figure 11 The video input on top left the binary image extraction below skeleton processing and draw byobjects kin.skel and skel.draw respectively

7. ADVANTAGES OF STAR SKELETONIZATION

There are three main advantages of this type of skeletonization process. It is not iterative

and is, therefore, computationally cheap. It also explicitly provides a mechanism for controlling scale

sensitivity. Finally, it relies on no a priori human model. The scale of features which can be detected is

directly configurable by changing the cutoff frequency c of the low-pass filter. Figure 12 shows two

smoothed versions of d(i) for different values of c: c = 0:01xN b and c = 0:025xN b. For the higher value

of c, more detail is included in the star" skeleton because more of the smaller boundary features are

retained in d i .So the method can be scaled for different levels of target complexity.

An interesting application of this scalability is the ability to measure the complexity of a

target by examining the number of extremal points extracted as a function of smoothing.



14/20

Figure 12: Effect of cut-off value c. When c is small only gross features are extracted, but larger values of c detect

more extremal points .

Other analysis techniques require a priori models of humans --such as the cardboard model

in order to analyze human activities. Using the skeletonization approach, no such models are required,so the method can be applied to other objects like animals and vehicles (see Figure 13). It is clear that

the structure and rigidity of the skeleton are important cues in analyzing different types of targets.



15/20

However, in this implementation, only human motion is considered. Also, unlike other

methods which require the tracking of specific features, this method uses only the object's boundary so

there is no requirement for a large number of pixels on target".

Figure 13: Skeletonization of different moving targets. It is clear the structure and rigidity of the skeleton is

significant in analyzing target motion .



16/20

8. Human motion analysis

One technique often used to analyze the motion or gait of an individual target is the cyclic

motion of skeletal components. However, in this implementation, the knowledge of individual joint

positions cannot be determined in real-time. So a more fundamental cyclic analysis must be performed.Another cue to the gait of the target is its posture. Using only a metric based on the star" skeleton, it is

possible to determine the posture of a moving human .

8.1. SIGNIFICANT FEATURES OF THE STAR" SKELETON

For the cases in which a human is moving in an upright position, it can be assumed that the

lower extremal points are legs, so choosing these as points to analyze cyclic motion seems a

reasonable approach. In particular, the left-most lower extremal point (l x; ly) is used as the cyclic point.

Note that this choice does not guarantee that the analysis is being performed on the same physical leg

at all times, but the cyclic structure of the motion will still be evident from this point's motion. If

f(xs i ; y s i )g is the set of extremal points, (l x; ly) is chosen according to the following condition: Then, the

angle (lx; ly) makes with the vertical is calculated as

(lx; l y) = (x s i; y s i): x s i = min si c y y

Documents

human motion analysis final