Posture-based and action-based graphs for boxing skill ... · Posture-based and action-based graphs for boxing skill visualization Q1 Yijun Shen a, He Wang b, Edmond S.L. Ho a, Longzhi

Citation: Shen, Yijun, Wang, He, Ho, Edmond, Yang, Longzhi and Shum, Hubert P. H. (2017) Posture-based and Action-based Graphs for Boxing Skill Visualization. Computers and Graphics. ISSN 0097-8493 (In Press)

Published by: Elsevier

URL: https://doi.org/10.1016/j.cag.2017.09.007 <https://doi.org/10.1016/j.cag.2017.09.007>

This version was downloaded from Northumbria Research Link: http://nrl.northumbria.ac.uk/32266/

Northumbria University has developed Northumbria Research Link (NRL) to enable users to access the University’s research output. Copyright © and moral rights for items on NRL are retained by the individual author(s) and/or other copyright owners. Single copies of full items can be reproduced, displayed or performed, and given to third parties in any format or medium for personal research or study, educational, or not-for-profit purposes without prior permission or charge, provided the authors, title and full bibliographic details are given, as well as a hyperlink and/or URL to the original metadata page. The content must not be changed in any way. Full items must not be sold commercially in any format or medium without formal permission of the copyright holder. The full policy is available online: http://nrl.northumbria.ac.uk/policies.html

This document may differ from the final, published version of the research and has been made available online in accordance with publisher policies. To read and/or cite from the published version of the research, please visit the publisher’s website (a subscription may be required.)

http://nrl.northumbria.ac.uk/policies.html

ARTICLE IN PRESS

JID: CAG [m5G; October 16, 2017;22:13 ]

Computers & Graphics xxx (2017) xxx–xxx

Contents lists available at ScienceDirect

Computers & Graphics

journal homepage: www.elsevier.com/locate/cag

Special Section on Motion in Games 2016

Posture-based and action-based graphs for boxing skill visualization

Yijun Shen

a , He Wang

b , Edmond S.L. Ho

a , Longzhi Yang

a , Hubert P.H. Shum

a , ∗Q1

a Northumbria University, Newcastle-Upon-Tyne, United Kingdom

Q2

b University of Leeds, Leeds, United Kingdom

a r t i c l e i n f o

Article history:

Received 29 July 2017

Revised 25 September 2017

Accepted 26 September 2017

Available online xxx

Keywords:

Motion graph

Hidden Markov model

Information visualization

Dimensionality reduction

Human motion analysis

Boxing

a b s t r a c t

Automatic evaluation of sports skills has been an active research area. However, most of the existing re-

search focuses on low-level features such as movement speed and strength. In this work, we propose a

framework for automatic motion analysis and visualization, which allows us to evaluate high-level skills

such as the richness of actions, the flexibility of transitions and the unpredictability of action patterns.

The core of our framework is the construction and visualization of the posture-based graph that focuses

on the standard postures for launching and ending actions, as well as the action-based graph that focuses

on the preference of actions and their transition probability. We further propose two numerical indices,

the Connectivity Index and the Action Strategy Index, to assess skill level according to the graph. We

demonstrate our framework with motions captured from different boxers. Experimental results demon-

strate that our system can effectively visualize the strengths and weaknesses of the boxers.

© 2017 The Author(s). Published by Elsevier Ltd.

This is an open access article under the CC BY license. ( http://creativecommons.org/licenses/by/4.0/ )

1. Introduction 1

Computer technologies have taken on a crucial role in modern 2

sports and health sciences, in revolutionizing the way to observe, 3

analyze, and improve the performance of both amateur and profes- 4

sional athletes. Computer-managed weight lifting machines, tread- 5

mills and many other training equipment provide energy consump- 6

tion or repetition and weight management in many sport clubs. 7

Virtual reality technology has been applied in various training sys- 8

tems in baseball [1] , handball [2] and tennis [3] to assist more pro- 9

fessional sport activities. Nevertheless, these technologies are only 10

able to analyze motions at a low level, i.e. recording the timing or 11

repetitions of basic motions and comparing movement trajectories 12

with those performed by better players. More advanced technolo- 13

gies are needed for personalized and higher-level analysis compa- 14

rable to that from human experts. 15

In addition to the instantaneous movement features of the 16

sports players, Experienced sport coaches consider high-level fea- 17

tures such as the variety of actions and quality of transitions from 18

one action to another. Taking boxing as an example, professional 19

boxers have in basic actions such as defence, stepping and attack, 20

threading through which the transitions are carried out based on 21

∗ Corresponding author.

E-mail addresses: [email protected] (Y. Shen), [email protected]

(H. Wang), [email protected] (E.S.L. Ho), [email protected] (L.

Yang), [email protected] (H.P.H. Shum).

the strategy and the opponent’s reactions. The action transitions 22

of a good boxer need to be flexible and contain great variety to 23

achieve the optimal outcome. Such information often serves as an 24

important indicator in assessing the skill level of a player, and the 25

same principle applies to many other sports such as basketball 26

[4] and fencing [5] . Unfortunately, automatic systems for analyzing 27

and evaluating sports motions at such a high level is very limited. 28

In this paper, we propose a robust visualization system to ad- 29

dress the above limitations, by represent motions as an interactive 30

graph of high-level features, including the flexibility and richness 31

of the actions as well as the transitions of actions. Although we 32

use boxing as a demonstration in this paper, our method is generic 33

and can be applied to different sports. Our approach starts with 34

capturing the shadow boxing training motion of a boxer, in which 35

the boxer performs boxing with an imaginary opponent. An experi- 36

enced coach can effectively assess the boxer’s skill by watching the 37

shadowing boxing motions. As a positive side effect, this method of 38

motion analysis greatly reduces the complexity of motion capture 39

due to occlusion and collision and has shown to be very effective 40

in our system. The motion data is then processed and visualized 41

in two different graphs: the posture-based graph and the action- 42

based graph, for performance analysis. 43

In the posture-based graph, the semantic actions segmented 44

from the captured motion are grouped into clusters based on a 45

customized distance function that considers action specific fea- 46

tures. Our system then automatically generates a motion graph 47

structure known as Fat Graph [6] , which uses nodes to represent 48

https://doi.org/10.1016/j.cag.2017.09.007

0097-8493/© 2017 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license. ( http://creativecommons.org/licenses/by/4.0/ )

Please cite this article as: Y. Shen et al., Posture-based and Action-based Graphs for Boxing Skill Visualization, Computers & Graphics

(2017), https://doi.org/10.1016/j.cag.2017.09.007


http://www.ScienceDirect.com

http://www.elsevier.com/locate/cag

http://creativecommons.org/licenses/by/4.0/

mailto:[email protected]






http://creativecommons.org/licenses/by/4.0/


2 Y. Shen et al. / Computers & Graphics xxx (2017) xxx–xxx

ARTICLE IN PRESS


groups of similar postures to start and end actions, and edges 49

to represent groups of action. By applying dimensional reduction 50

techniques, this graph can be visualized in a 3D space for per- 51

formance analysis and evaluation. The transition capability of the 52

boxer are visualized by the connectivity of the nodes, where the 53

richness and preference of the actions are visualized by the edges 54

in the graph. We further propose a skill evaluation metric known 55

as the Connectivity Index which evaluates the richness of actions 56

and the flexibility of transitions according to the graph. 57

Whilst the posture-based graph focuses on the variety of basic 58

postures and the transition flexibility between actions, the action- 59

based graph mainly considers the richness of actions and the tran- 60

sition probability among them. The action-based graph is con- 61

structed as a customized Hidden Markov Model (HMM) [7] , in 62

which similar actions are grouped into clusters that formulate the 63

nodes. The transition probability among actions is calculated and 64

is expressed as edges between nodes. The graph is visualized in a 65

3D space, and the positions of the nodes and edges are optimized 66

for better visualization. With such a graph, the pattern of action 67

launching can be easily identified in order to assess the boxing 68

strategy of the boxer. We further propose the Action Strategy In- 69

dex to evaluate the unpredictability of action patterns according to 70

the graph. 71

We conducted experiments on the motions captured from mul- 72

tiple boxers and evaluate their skills. The corresponding posture- 73

based and action-based graphs were generated. As shown in 74

Fig. 10 , we can easily evaluate the skills of different boxers with 75

our visualization system. 76

There are three main contributions of this work: 77

• We propose a framework for high-level skill analysis through 78

automatic motion analysis and visualization. Given a captured 79

motion from a sports player, our system automatically seg- 80

ments the motion into semantic action units and constructs 81

two graph structures. 82

• We propose the posture-based graph, which is a variant of the 83

Fat Graph, to visualize the skills according to different standard 84

postures for launching and ending actions. It allows the user to 85

identify the correctness of standard postures and the diversity 86

of actions. We further propose the Connectivity Index that eval- 87

uates the richness of actions and the flexibility of transitions. 88

• We propose the action-based graph, which is a variant of the 89

Hidden Markov Model (HMM), to visualize the skill according 90

to different groups of action. It allows the user to identify the 91

preference of actions and their transition probability. We fur- 92

ther propose the Action Strategy Index to evaluate the unpre- 93

dictability of action patterns. 94

The preliminary results of this work were published in a con- 95

ference paper [8] , which proposed only the posture-based graph. 96

In this paper, we extend the work by introducing the new action- 97

based graph. We perform analysis and experimental evaluation of 98

such a graph, and compare its performance with the posture-based 99

graph. We have also updated the paper thoroughly such that the 100

two graphs are presented in an organized and effective manner. 101

The rest of this paper is organized as follows. Related works 102

are reviewed in Section 2 . The details of motion capture and or- 103

ganization are given in Section 3 . In Sections 4 and 5 , we explain 104

the design and implementation of the posture-based graph and the 105

action-based graph respectively. Related experiments can be found 106

in Section 6 . The paper is concluded in Section 7 with future re- 107

search directions discussed. 108

2. Related work 109

2.1. Sports visualization 110

Helping athletes on skill improving via the visualization of 111

sport motions is a field that has not been fully explored in the 112

field of sports science. Existing research [9,10] mainly focuses 113

on the appearance changes of motions when body and motion 114

parameters are changed. For example, Yeadon [9,10] has done 115

research on how diving and somersault motions change when 116

the motions are launched at different timings by using physical 117

simulation. Although such tools are useful for the athletes to 118

interactively visualize possible results under different parameters, 119

they can only evaluate the performance of sports that do not 120

require complex maneuvers and strategies, such as jumping, high 121

jumping, sky jumping, or somersaults. In many sports games, 122

the performance depends not only on physical factors such as 123

velocity, power and strength, but also on flexibility to switch from 124

one motion to another and richness of the player’s motions. This 125

high-level information has not been used to visualize the skills 126

of the athlete in previous research and it is the major difference 127

between our work and the afore-mentioned ones. In this research, 128

we combine the approaches of motion graph [11–13] and dimen- 129

sionality reduction [14,15] to visualize high-level skills information 130

of the athletes for the skill assessments. 131

2.2. Motion graphs for motion modeling 132

The Motion Graph approach [11–13,16–19] is a method to inter- 133

actively reproduce continuous motions based on a graph generated 134

from captured motion data. Reitsma and Pollard [20] compared 135

different motion graph techniques comprehensively. Heck et al. 136

[21] further parametrized the motion space to control how the 137

motions are generated by blending samples in the motion graph. 138

Such an approach can be used for interactive character control 139

such as that in computer games. When it comes to graph con- 140

struction, [16,17] are the ones most similar to our method. Min 141

et al. [16] grouped similar postures and transitions into nodes and 142

edges. Their focus was the motion variety of synthesized motions 143

so they used generative models to fit the posture and motion data. 144

Our focus is on skill visualization through the analysis of postures 145

and motions so we can afford simpler and faster methods of analy- 146

sis. Beaudoin et al. [17] cluster postures first then find motion mo- 147

tifs by converting the motion matching task into a string matching 148

problem. Their priority was to find motifs that were representa- 149

tive while our focus is to visualize motion details and statistics 150

to help people assess the skills. Xia et al. [22] constructed a se- 151

ries of local mixtures of autoregressive models (MAR) for model- 152

ing the style variations among different motions for real-time style 153

transfer. They demonstrated style-rich motions can be generated 154

by combining their method and motion graph. 155

Since the Motion Graph produces a lot of edges and nodes 156

without any context, it becomes difficult to control generated mo- 157

tion as the user wishes. Safonova and Hodgins [23] optimized 158

the graph structure by combining motion graph and interpolation 159

techniques to improve performance. On the other hand, works to 160

resolve this problem by introducing a hierarchical structure were 161

proposed [6] . These approaches add topological structures into the 162

continuous unstructured data so that the motion synthesis can be 163

done at a higher level. In a sport like boxing, it is possible to cre- 164

ate a motion graph of semantic actions such as attack and defence, 165

which is known as the action-level motion graph [24,25] . A re- 166

cent work by Hyun et al. [4] proposed Motion Grammars to spec- 167

ify how character animations are generated by high-level symbolic 168

description. Such an approach can be used with existing animation 169

systems which are built based on motion graphs. Ho and Komura 170




Y. Shen et al. / Computers & Graphics xxx (2017) xxx–xxx 3

ARTICLE IN PRESS


[26] built a finite state machine (FSM) based on Topology Coor- 171

dinates [27] for synthesizing two-character close interactions. The 172

sparse graph structure can be used for controlling the movement 173

of virtual wrestlers in computer games. The purpose of these ap- 174

proaches, however, is motion generation rather than the visualiza- 175

tion of the player’s skill. 176

In our research, we adapted a hierarchical motion graph struc- 177

ture called the Fat Graph [6] on the action level to analyze the con- 178

nectivity and the variety of a captured motion set. In a fat graph, 179

similar nodes are grouped together as fat nodes, and similar edges 180

are grouped as fat edges, allowing better organization of motion 181

data. The filtered motion graph is a variation of the Fat Graph, 182

in which the temporal relationship between poses are considered 183

[28] . Such a structure, however, is targeted for motion reconstruc- 184

tion and analysis rather than visualization [29] . 185

2.3. Statistical motion modeling 186

Dimensionality reduction methods have been proposed to vi- 187

sualize the overall structure of captured motions. Grochow et al 188

[14] proposed a method to project the 3D motions of a human 189

onto a 2D plane, and further reconstruct 3D motions by mapping 190

arbitrary points from the 2D plane back onto 3D joint space. PCA 191

[15] and ISOMAP [30] are proposed to map the motions onto 2D 192

planes. Due to the high variation of human motion, local PCA that 193

considers only a relevant subset of the whole motion database in 194

order to generate a locally linear space is proposed [31,32] . One can 195

generate motions from arbitrary points on the plane by interpo- 196

lating the postures of the original motion. Meanwhile, non-linear 197

methods [33,34] and Deep Learning [35] have also been used to re- 198

duce the dimensionality of motions. The Gaussian Process [36] and 199

the mixture of Gaussian Processes [36] can be used to represent 200

a set of human postures with a small number of Gaussian param- 201

eters. However, such methodologies do not take into the account 202

the connectivity structure of the motions. We apply dimensional- 203

ity reduction to our graph structure to visualize the connectivity 204

structure of captured motions on a 2D plane. 205

Other researchers have focused on the connectivities of mo- 206

tion/actions by methods such as Markov models. Hidden Markov 207

Model (HMM) [7] has been widely used in analyzing and synthe- 208

sizing human motion. Typically, the hidden states of the HMM are 209

the distribution of body poses and the dynamics of the motions 210

are represented by the transitions between the hidden states. The 211

parameters of the HMM can then be learned from training data us- 212

ing the Expectation-Maximization (EM) algorithm. Hara et al. [37] 213

proposed to model daily activities using HMM in intelligent house. 214

Françoise et al. [38] proposed to use HMM models for analyzing 215

Tai Chi motion sequences. An early work proposed by Brand and 216

Hertzmann [39] proposed to learn the dynamics of human mo- 217

tion using HMM in their motion style synthesis model. Tango and 218

Hilton [40] proposed to learn a HMM model from captured hu- 219

man motion for synthesizing in-between frames in keyframe an- 220

imation. Ren et al. [41] presented a data-driven approach for quan- 221

tifying naturalness of human motion including those synthesized 222

by HMM. While existing work focuses on finding statistical distri- 223

butions of motions, our focus is on visualizing the motion richness 224

and the transition dynamics for skill assessments. 225

3. Motion capture and organization 226

We first capture the motion required for analysis using motion 227

capture systems. Then, we propose an automatic system to seg- 228

ment long sequences of captured motion into meaningful actions, 229

which are used as building blocks of our posture-based and action- 230

based graphs. 231

Fig. 1. The shadow boxing motions of several boxers were captured using an optical

motion capture system.

Here, we follow the definition from [25] , in which a motion is 232

considered to be a raw sequence of captured human movement, 233

and an action is considered to be a short, meaningful segment of 234

movement within a motion. In the field of boxing, an action can 235

be an attack (such as a “left straight”, “jab” or a “right kick”), a 236

defense (such as “parries”, “blocking” or “ducking”) , a transition 237

(such as “stepping to the left”, “stepping forward” or “back step”), 238

or any combination of them. 239

Postures and actions are good entities for skill visualization, as 240

sports players typically plan their strategies and evaluate their per- 241

formances with such terms. For example, a boxer typically thinks 242

about what sort of attack/defense/transition should be launched 243

during a match. A coach typically evaluates the overall strategy in 244

the action level, as well as how well individual postures and ac- 245

tions are performed. 246

3.1. Motion capture 247

Although it would be best to capture the motions of all players 248

in multi-player sports because the data would reflect the features 249

of the motions, capturing multiple players remains difficult due 250

to the occlusions and collisions among players. Fortunately, it is 251

possible to only capture individual motions for our purposes with- 252

out compromising the true motion characteristics. In boxing or any 253

other martial arts, there is a training practice called “shadow box- 254

ing”. The boxer imagines a boxing session with another boxer, and 255

launches boxing actions to interact with such an imaginary oppo- 256

nent. The boxer launches not only offensive actions such as punch- 257

ing, but also defence, stepping, and the consecutive combination of 258

all such actions. There are similar practice methods in basketball 259

and soccer as well, in which players use the ball to conduct var- 260

ious techniques in the court, imagining that their opponents are 261

trying to take the ball away from them. The players thus perform 262

various actions to keep the ball and trick an imaginary opponent. 263

This technique has also been used by coaches for skill assessment 264

hence is suitable for our analysis. We employed an optical motion 265

capture system to acquire the performed motion as shown in Fig. 1 266

as it was less intrusive and highly accurate. Also, we preferred to 267

capture long and continuous clips of motion, such that the player 268

could perform the motion in a natural manner. 269

3.2. Motion analysis 270

After data capture, the system automatically segments mean- 271

ingful actions from the raw captured motion, and identifies the ef- 272





ARTICLE IN PRESS


Fig. 2. Upper: The movement segment is defined as the period between two double support supporting phases. Middle: The activity segment is defined as the period with

high acceleration. Lower: The action is the combination of movement segment and activity segment.

fective joints that contribute the most to the semantic meaning of 273

the actions. 274

For boxing motions, we observed that actions normally start 275

and end in a double supporting state (i.e. both feet touching the 276

floor), as the state is usually dynamically stable. We detect such a 277

state by monitoring the feet height and velocity and setting corre- 278

sponding thresholds. This allows us to segment the raw captured 279

motion into a set of movement segments , which are the periods be- 280

tween every two successive double supporting states, as visualized 281

in Fig. 2 Upper. 282

We also observed that actions normally require a relatively 283

larger force to be performed, such as a punch or a step. We de- 284

fine periods with a high-level of force exertion as the activity seg- 285

me286

can287

is a288

is s289

290

me291

act292

sho293

rela294

the295

me296

me297

is o298

thi299

ity.300

ces301

con302

act303

ing304

me305

sho306

sup307

308

an 309

spe310

In 311

the312

ity 313

pun314

tip315

eva316

4. Posture-based graph 317

The posture-based graph focuses on evaluating the common 318

postures that are used to start and end actions. In such a graph, 319

the nodes represent similar postures and the edges represent sim- 320

ilar actions. It allows us to evaluate the consistency of common 321

postures and the diversity of actions. 322

4.1. Graph construction 323

We adopt a Fat Graph structure [6] in the action level [25] to 324

generate the posture-based graph, as it can effectively simplify the 325

graph representation by grouping similar postures and actions to- 326

get 27

and 28

alg 29

pur 30

4.1. 31

32

are 33

des 34

ing 35

avo 36

ica 37

two 38

D (P

wh 39

in 40

join 41

ma 42

Pl

(2

nts . Since force is proportional to acceleration, these segments

be found when the sum of squares of acceleration of all joints

bove a threshold, as visualized in Fig. 2 Middle. The threshold

tatistically obtained from the acceleration profile of the motion.

Finally, the actions are composed by using the movement seg-

nts as the building blocks. The timing and the duration of the

ivity segments are used to determine if the movement segments

uld be merged together to form longer segments. Regarding the

tionship of the movement segments and the activity segments,

re could be three possible cases: (1) There is no activity seg-

nt inside a movement segment. In this case, the movement seg-

nt becomes a single action of pure body transition. (2) There

ne activity segment inside a movement segment. In this case,

s movement segment becomes an action with a special activ-

(3) There are one or more activity segments lying across suc-

sive movement segments. In this case, the movement segments

taining activity segments at the border are merged to form an

ion as visualized in Fig. 2 Lower. Note that due to this merg-

process, the resulting action may contain multiple activity seg-

nts. In our system, we implement an optional step to filter very

rt actions that are likely to be generated due to the noise of the

porting feet.

We define the effective joints to be the set of joints to represent

activity segment. In case (1) above, since the actions contain no

cial activities, the pelvis is considered to be the effective joint.

case (2) and (3), the effective joint is the joint that contributes

most to the sum of squares of the acceleration in the activ-

segment. In more complicated actions such as left-right combo

ches, there may be multiple effective joints as there are mul-

le activity segments. Such joints are used in later processes to

luate the similarity of actions.

ple 43

int 44

we 45

per 46

init 47

firs

clu

cor

of s

the

ease cite this article as: Y. Shen et al., Posture-based and Action-based

017), https://doi.org/10.1016/j.cag.2017.09.007

her. The Fat Graph was originally proposed for motion synthesis, 3

thus it is not optimized for skill visualization. We redesign the 3

orithms to generate nodes and edges in the Fat Graph for our 3

pose. 3

1. Fat Nodes 3

In our system, the nodes of the Fat Graph, known as Fat Nodes, 3

the common starting or ending postures of the actions. We 3

ign an unsupervised clustering scheme for grouping all start- 3

/ending postures into a finite set of posture groups, which 3

ids additional labour for posture labelling and grouping. Specif- 3

lly, we used k-means to cluster postures. The distance between 3

postures P 0 and P 1 is defined as: 3

0 , P 1 ) =

i = i total ∑

i =0

| θ0 (i ) − θ1 (i ) | (1)

ere θ0 ( i ) and θ1 ( i ) represent the 3D joint angle of the joint i 3

posture P 0 and P 1 respectively, and i total is the total number of 3

ts. Regarding the cluster number k , a large k would result in 3

ny clusters (Fat Nodes), which unnecessarily increases the com- 3

xity of the graph. A small k will cluster very different postures 3

o the same node, defeating the purpose of the graph. Therefore, 3

set up a posture difference threshold empirically based on ex- 3

ts’ suggestions. Then, we iteratively search for a proper k by 3

ially setting k = 1 and incrementing k by 1 until we find the 3

t value of k that does not violate the distance threshold. After 348

stering, we use the mean posture of a group to represent the 349

responding Fat Node. The nodes in the graph represent the set 350

tandard postures which the player starts various action from. In 351

case of boxing, they are usually the fighting postures that the 352

Graphs for Boxing Skill Visualization, Computers & Graphics



ARTICLE IN PRESS


th 353

354

a 355

n- 356

ce 357

a 358

er 359

or 360

as 361

to 362

363

364

i- 365

ge 366

at 367

368

s- 369

se 370

t- 371

is- 372

ed 373

n- 374

ut 375

376

ed 377

378

2)

ve 379

is 380

re 381

ve 382

be 383

he 384

385

c- 386

ok 387

t- 388

to 389

up 390

to 391

392

es 393

r- 394

ve 395

en 396

u- 397

ut 398

o- 399

ip 400

401

402

ss 403

ed 404

ng 405

406

Fat

a- 407

he 408

si- 409

r- 410

he 411

at 412

as. 413

ch 414

n- 415

ns 416

417

In 418

r- 419

ity 420

ot 421

th 422

423

3)

le- 424

lly 425

s- 426

he 427

e- 428

ly 429

ng 430

r- 431

an 432

433

434

vi- 435

so 436

th 437

438

es 439

at 440

n), 441

n- 442

es 443

h. 4 4 4

al 445

i- 446

at 447

on 448

449

boxer uses to guard his/her face against the opponent, with bo

feet landing on the ground and keeping shoulder width apart.

By evaluating the Fat Nodes alone, we can already tell if

boxer has multiple unnecessary standard postures, or if any sta

dard postures contain potential weakness. In general, experien

players have fewer Fat Nodes, such that they can start actions in

standard posture effectively without the needs of shifting to oth

ones. Novice players sometimes may have a particular Fat Node f

some particular actions. This is discouraged in boxing training

such postures hint the opponent as to what actions are going

be launched.

4.1.2. Fat Edges

We design the edges of a Fat Graph, known as Fat Edges, as d

rectional edges that represent groups of similar actions. Each ed

points from the Fat Node representing the starting posture to th

representing the ending posture.

Similar to the Fat Nodes, we implement an unsupervised clu

tering algorithm to group similar actions into Fat Edges. We u

k-means to cluster the actions and search for the smallest accep

able k for a given distance threshold. We define the actions d

tance according to the trajectory of the effective joints as explain

in Section 3.2 . This allows accurate clustering of actions and e

sures that the effects of the effective joints are not smoothed o

by other joints.

Formally, the distance between two actions A 0 and A 1 is defin

as

D (A 0 , A 1 ) =

⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨

⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩

∞

if A 0 and A 1 have different sequences of effective joints

j total ∑

j=0

f end ∑

f= f start

[ A 0 ( j)( f ) − A 1 ( j)( f )]

otherwise

(

where A 0 ( j )( f ) and A 1 ( j )( f ) represent the 3D positions of effecti

joint j in frame f in the action A − 0 and A 1 respectively, j total

the total number of effective joints in the actions, f start and f end a

the starting frame and ending frame of the considering effecti

joint. In case two effective joints with different duration are to

compared, the shorter one is linearly scaled to the duration of t

longer one.

In the field of boxing, a Fat Edge typically contains a set of a

tions with basic attacks or defences such as “straight punch”, “ho

punch”, “parry”, or a set of complex actions combining several a

tacks and defences. Since member actions in a Fat Edges have

share the same starting and ending Fat Nodes, if an action gro

contains multiple starting or ending poses, it is sub-divided in

multiple Fat Edges.

Again, by only looking at Fat Edges, one can tell the differenc

between experienced and novice players. Experienced players no

mally have Fat Edges with similar numbers of actions, as they ha

mastered a large variety of boxing actions and can switch betwe

them effectively using a small number of stable transition mane

vers. Novice boxers tend to have a larger number of Fat Edges b

each with a small number of actions, due to the inability to repr

duce boxing actions consistently. Figure 3 shows the relationsh

of Fat Nodes and Fat Edges.

4.2. The connectivity index

It requires deep knowledge and years of experience to asse

one’s skills in sports. Here, we make use of the posture-bas

graph and define an index representing the skill level, allowi

more objective and efficient skill assessment.

Please cite this article as: Y. Shen et al., Posture-based and Action-


Fig. 3. The Fat Node represents the standard fighting pose. The three outgoing

Edges represent different action groups.

In many types of sports, there are two important skill indic

tors. The first one is the richness of the actions that indicates t

resourcefulness of a player. The other is the flexibility of tran

tions between states so that the player can switch between diffe

ent states at will. Our posture-based graph captures both of t

indicators. The richness can be represented by the number of F

Edges, indicating how many kinds of maneuvers the player h

The flexibility is indicated by the connectivity of the graph, whi

is inversely proportional to the number of Fat Nodes. A fully co

nected graph shows great flexibility because there are transitio

between any two nodes.

Notice that these two factors are somehow contradicting.

general, the richer the actions are, the greater the number of diffe

ent starting and ending poses is hence the poorer the connectiv

of actions is. Independently considering either of them would n

suffice. We therefore define a Connectivity Index that evaluates bo

the action richness and the action flexibility of a player

CI =

Number of Fat Edges

Number of Fat Nodes (

To accurately reflect the skill level of a player, in our imp

mentation, we do not consider Fat Nodes that are not intentiona

created. For example, one of our boxers tripped over during a se

sion. While it is good that our system can objectively pick up t

posture generated by the accident, we do not include the corr

sponding Fat Nodes when calculating the Skill Index. Also, we on

consider Fat Edges that are consistently performed, as those havi

only a small number of member actions could be randomly pe

formed actions. Empirically, we consider edges having more th

2 member actions.

4.3. Visualization system

Here, we describe the design of our visualization system to

sualize the posture-based graph in an effective manner. We al

introduce interactive features for the user to view the graph wi

different levels of details.

The posture-based graph consists of high dimensional Fat Nod

(groups of similar postures of many degrees of freedom) and F

Edges (groups of similar actions in the spatial-temporal domai

which presents a challenge for visualization. To reduce the dime

sionality for better visualization, we propose two different schem

for nodes and edges due to their different nature in this grap

Specifically, we project the Fat Nodes on a 2D space using Princip

Component Analysis (PCA) as it creates a more consistent low d

mensional space compared with other methods. We represent F

Edges with 2D curves and augment the curves with a combinati

of geometric primitives to visualize the action features.

based Graphs for Boxing Skill Visualization, Computers & Graphics



ARTICLE IN PRESS


es larger as the size of the nodes increases.

4.3450

451

hig452

dep453

res454

are455

far 456

bet457

458

tat459

ysi460

cul461

of 462

a c463

eig464

465

[14466

we467

wo468

469

Z p470

pos471

box472

pri473

num474

allo475

ally476

4.3477

478

info479

480

on 481

wo482

Edg483

pla484

occ485

ing486

the487

cur488

edg489

sig490

wo491

492

som493

ene494

usi495

ene496

Fig.

repr

with

sho

tur 97

nor 98

val 99

lan 00

per 01

obt 02

03

edg 04

tify 05

hea 06

wil 07

rela 08

tac 09

10

Rig 11

vis 12

itiv 13

com 14

4.3 15

16

pla 17

lec 18

sho 19

20

be 21

ber 22

ber 23

Edg 24

Pl

(2

Fig. 4. From left to right, the character becom

.1. Visualizing Fat Nodes

Although the degree of freedom (DOF) of human postures are in

h dimensionality (45 DOF in our system), they are intrinsically

endent on each other [14] . In fact, the Fat Nodes can be rep-

ented effectively in a 2D space where nodes of similar postures

located together while those of different postures are located

apart. This allows viewers to easily understand the relationship

ween postures.

For each Fat Node, we obtain the mean posture as its represen-

ion. Given a set of postures, we apply principal component anal-

s (PCA) to reduce the dimensionality to 2. Essentially, we cal-

ate the covariance matrix to evaluate the intrinsic dependency

the dimensions. We then calculate the eigenvectors from such

ovariance matrix, and use the two eigenvectors with largest

envalues to form a feature vector.

PCA is used as it has shown to be effective on human postures

] . However, since we only have a small number of postures,

believe other dimensionality reduction techniques would also

rk.

We render the mean posture of each Fat Node onto a 2D X-

lan. This allows the user to identify inappropriately performed

tures. In boxing, novice boxers sometimes lose track of their

ing rhythm, and hence start or end an action with an inappro-

ate posture. We use the fatness of the character to represent the

ber of member postures in the node, as shown in Fig. 4 . This

ws the user to easily observe the postures that the player usu-

uses to start actions.

.2. Visualizing Fat Edges

Here, we explain how to visualize the Fat Edges, which contain

rmation of groups of similar actions.

We do not apply dimensionality reduction techniques directly

the action data itself because the low dimensional projection

uld be very complex. Instead, we propose to visualize each Fat

e by a 2D curve that represents its mean action on the X –Z

ne. We optimize the angle and sign of these curves to minimize

lusion. For edges with a starting node different from the end-

node, the edge angle is fixed. The only adjustable variable is

bending side of the curves, which is essentially the sign of the

ves. For those with the same starting and ending node, both

e angle and bending side can be controlled. We optimize the

ns and angles of the edges in a greedy manner such that they

uld blend towards a less dense region of the graph.

To visually distinguish between different Fat Edges, we add

e geometric patterns to the 2D curves. We collect the high-

rgy frames of all actions and project them onto a 1D space

ng the PCA system explained in Section 4.3.1 . Since the high-

rgy frames of different actions are typically distinguishing pos-



5. The geometric patterns for landmark values between -1 and 1. Each pattern

esents a landmark posture in an action. (Lower) Comparison of visualization

out/with the patterns. Each curve represents a group of action. The right image

ws the uses of landmark patterns to identify different types of action.

es, the projection essentially maps all action features onto a 4

malized 1D space in the range of [ −1 . 0 , 1 . 0] . To visualize the 4

ue in this 1D space, we design some geometric patterns for 4

dmark values -1.0, -0.5, 0.0, 0.5 and 1.0 as shown in Fig. 5 Up- 5

. The patterns to represent values between two landmarks are 5

ained by linear interpolation between nearby landmarks. 5

We further represent the number of member actions in the 5

e by the thickness of the curve. This allows the user to iden- 5

the player’s preferred actions. For instance, if a boxer relies 5

vily on single straight punches, the Fat Edge for such action 5

l be unreasonably thick, while edges for other attacks will be 5

tively thin, which demonstrates a potential lack of diversity at- 5

king strategies. 5

Through the comparison between Fig. 5 Lower Left and Lower 5

ht, it shows that adding the geometric patterns gives a better 5

ualization of actions in the edges. This strategy presents an intu- 5

e way to show the players preferences over actions of different 5

plexity. 5

.3. Interactive features 5

We integrate some interactive features in our system to dis- 5

y relevant information based on user input. When the user se- 5

ts any specific entities in the graph, related information will be 5

wn. 5

When a Fat Node is selected, its corresponding Fat Edges will 5

highlighted for easier observation. Information about the num- 5

of members in that node, number of outgoing edges, and num- 5

of incoming edges are displayed in a sub window. When a Fat 5

e is selected or highlighted (because of a Fat node selection), 5




ARTICLE IN PRESS


Fig. 6. The posture-based graph of the Boxer S. 1, 2 and 3 are Fat Nodes. 4 and 5

are two Fat Edges. 4 connects Node 2 and Node 3. 5 connects Node1 to itself.

we render the member actions included, such that the user can 525

understand the content of the edge. 526

As an example, in Fig. 6 , there are three Fat Nodes indicated by 527

red arrows and numbered as 1, 2 and 3, each visualized as a char- 528

acter with a mean posture in the node. The sizes of the nodes are 529

indicated by the body fatness. Node 1 is represented by the most 530

muscular character, which indicates the largest node size. Nodes 2 531

and 3 are far thinner. Fat Edges are rendered as curves between 532

nodes such as the ones shown by 4 and 5. The thicknesses of the 533

er 534

n. 535

th 536

nt 537

of 538

r- 539

1. 540

a 541

542

543

on 544

he 545

ts. 546

on 547

ns 548

549

550

p- 551

u- 552

en 553

e 554

ze 555

556

e 557

c- 558

at 559

rt 560

es 561

of 562

ch 563

nd 564

565

m 566

ty, 567

nd 568

up 569

ed 570

571

Fig. 7. The three HMM nodes represent action groups. The HMM edges represent

transitional probability between them.

is defined as 572

4)

on 573

se, 574

575

In 576

ed 577

re 578

he 579

an 580

nd 581

b- 582

o- 583

en 584

585

586

- 587

eir 588

xt 589

on 590

ed 591

ce 592

is 593

es 594

in 595

a- 596

a- 597

ex , 598

in 599

ng 600

tal 601

es 602

es. 603

of 604

605

5)

SI 606

e- 607

edges indicate the frequency of the actions taken. Edge 5 is thick

than edge 4, suggesting that this boxer takes action 5 more ofte

In addition, an edge can be smooth like a circle or bumpy wi

geometric patterns. A single pattern means one activity segme

such as a single punch, while multiple patterns indicate a series

activities such as a combo attack. Our system also supports inte

active features. Fig. 6 is the result when the user selects Node

All the edges starting from this node are highlighted, each with

small character performing the action on it.

5. Action-based graph

The action-based graph focuses on evaluating the transiti

probability from one action class to another. In such a graph, t

nodes represent groups of action with similar activity segmen

The edges represent the transition probability between two acti

groups. It allows us to evaluate the pattern of launching actio

and extract the strategy of the boxer.

5.1. Graph construction

We use the hidden Markov model (HMM) to organize the ca

tured motion, as it has been shown effective in modelling h

man motion. In the domain of character animation, HMM has be

mostly used in the posture level to create motion graphs [12] . W

adapt the graph into the action level such that we can visuali

the transition probability among actions.

The nodes of the graph represent different action groups. W

apply Eq. (2) to group the captured actions into a number of a

tion groups with k-means clustering. The process is similar to th

in Section 4.1 , in which we define a threshold based on expe

knowledge, and then incrementally increase the number of class

until the threshold is met. We denote k ′ as the total number

groups, | G i | as the number of actions in the i th action group (whi

is used in the visualization system for visualizing the fatness a

the placement of the node and will be described later).

The edges of the graph represent transitional probability fro

one action group to another. To obtain the transitional probabili

we go through the sequence of actions in the captured motion a

count the number of occurrences for an action belonging to gro

i to be followed by another belonging to group j , which is denot

as c ij . The transition probability of action group i to action group j 608



T i j =

c i j ∑ k ′ m =1

∑ k ′ n =1 c mn

(

where the denominator represents the total number of transiti

in the whole motion. Notice that i may be equal to j . In such a ca

two actions of the same action group are launched successively.

The concept of the action-based graph is shown in Fig. 7 .

general, experienced boxers tend to have a more evenly distribut

transitional probability across all actions, which means that the

should be edges connecting all the nodes. This indicates that t

boxer’s pattern is dynamic and cannot be easily predicted by

opponent. Conversely, novice boxers may have limited edges a

some thick edges connecting two nodes, which means a high pro

ability to launch those two groups action consecutively. An opp

nent may discover such a pattern and counter-act in advance wh

the first action is observed.

5.2. The action strategy index

In many sports, the unpredictability of action patterns is an im

portant skill indicator. Experienced players would diversify th

action patterns such that their opponents cannot predict the ne

action. However, novice players tend to perform actions based

predictable patterns (i.e. the sequence of actions to be launch

continuously), which can be easily identified. For example, a novi

boxer usually perform two straight punches successively. This

because the boxer is not able to link different types of punch

fluently, and therefore would perform the simplest punches aga

and again. The proposed action-level graph allows easy observ

tion of boxing patterns, as we can visualize the transitional prob

bility among actions. We further propose the Action Strategy Ind

which evaluates the unpredictability of action pattern. We obta

the number of outgoing HMM edges for each HMM node, formi

a set that is denoted as e = { e i } ∀ i ∈ [1 , k ′ ] , where k ′ is the to

number of HMM nodes. Skillful players would have similar valu

in the e set, while novice players would have very different valu

We therefore define the Action Strategy Index as the precision

e , that is, the reciprocal of its standard deviation

ASI =

1

σ (e ) (

where σ represents the standard deviation operator. A high A

value indicate that the player’s action patterns are more unpr

dictable, which indicates a higher skill level.




ARTICLE IN PRESS


Fig. shold as (a) 0, (b) 1 and (c) 2. The red shade indicates the inner circle covering nodes

of t e rare class.

5.3609

610

gra611

tion612

eva613

5.3614

615

act616

the617

usi618

act619

usi620

nod621

622

esp623

pea624

wh625

ten626

int627

wh628

in 629

res630

gen631

632

circ633

tha634

on.635

res636

sm637

pla638

is t639

imp640

opt641

con642

exi643

to 644

645

cla646

sem647

tra648

5.3649

650

the651

cul652

the653

als654

see655

Fig.

freq

belo

is 56

edg 57

lau 58

are 59

to 60

ticu 61

5.3 62

63

can 64

act 65

an 66

HM 67

dis 68

Thi 69

wit 70

of 71

ber 72

73

the 74

sua 75

per 76

ing 77

wit 78

wh 79

con 80

the 81

fre 82

and 83

the 84

dia 85

6. 86

87

the 88

Pl

(2

8. Action-based graphs of the same boxer generated by setting the frequency thre

he frequent class, and the blue shade indicate the outer circle covering nodes of th

. Visualization system

Here, we explain the visualization system for the action-level

ph. The system allows easy observation of the preference of ac-

and the boxing pattern. Both are very important aspects to

luate the high-level strategy of a boxer.

.1. Visualizing HMM Nodes

Each action group is represented by its corresponding median

ion, which is the action that is the closest to the mean value of

action group during k-means clustering. We render the nodes

ng human characters with the starting posture of the median

ion. The number of actions in each action group is visualized

ng the fatness of the corresponding character. The color of the

es are randomized.

As mentioned in Section 4.1.2 , we observe that some boxers,

ecially novices, may produce random actions that are not re-

table. Such actions may generate a large number of thin nodes,

ich distract the user from evaluating the actions that are of-

launched. Therefore, we classify the action groups with | G i | > a

o the frequent class , and groups with | G i | ≤ a into the rare class ,

ere G i is the number of member actions in a node as defined

Section 5.1 , a is a preset frequency threshold. Fig. 8 shows the

ult of setting different values of a . We find that setting a = 2

erates the best results.

We place the nodes belonging to the frequent class at an inner

le, and those belonging to the rare class at an outer circle, such

t the user can identify them easily and decide what to focus

For the inner circle, nodes are ordered according to the cor-

ponding value of | G i |, and are placed evenly at a circle with a

aller radius. For the outer circle, to minimize edge crossing, we

ce the nodes at a position on a circle with a larger radius that

he closest to the nodes with incoming and outgoing edges. To

lement this, we develop a simple optimization algorithm that

imizes the position of the nodes. During the optimization, we

strain the position to be at the circle and not overlapping with

sting nodes. We then minimize the sum of distance with respect

the nodes connecting to the current one.

By default, we render the HMM node belonging to the frequent

ss with solid colors, and those belonging to the rare class in

i-transparent colors. This further avoids the user being dis-

cted by the rarely performed actions.

.2. Visualizing HMM Edges

We visualize the edges using 2D curves. While we can render

edges with straight lines, the resultant group would be diffi-

t to observe as the lines overlap significantly. We augmented

edges with a small random curvature to solve the problem. We

o render the edges as semi-transparent such that the users can

through partially overlapped edges. The thickness of the edge



9. The action-based graph of the Boxer S. 1, 2 are HMM nodes belonging to the

uent class. 3, 4 are outgoing HMM edges from the node 1. 5, 6 are HMM nodes

nging to the rare class.

proportional to T ij calculated in Eq. (4) . As a result, a thicker 6

e connecting node i to node j indicates that the boxer often 6

nches action group j after action group i . The color of the edges 6

decided based on that of the source node. This helps the user 6

identify which action groups the boxer may launch after a par- 6

lar one. 6

.3. Interactive features 6

We also implement some interactive features such that the user 6

select what to view. The most important component of the 6

ion-based graph is the action itself. Therefore, we implement 6

interactive system such that when a user clicks on a particular 6

M node, the median action of the corresponding action group is 6

played. We also highlight the outgoing edges from such a node. 6

s allows the user to examine individual action group together 6

h the transition probability to the next groups. The information 6

the node, such as the number of member actions and the num- 6

of out-going HMM edges, are displayed on a separate window. 6

As an example, in Fig. 9 , there are 5 HMM nodes belonging to 6

frequent class including node 1 and 2. These nodes are vi- 6

lized with more muscular characters, meaning that the boxer 6

forms them more frequently. There are 3 HMM nodes belong- 6

to the rare class including node 5 and 6, which are visualized 6

h thinner characters. Node 1 has 5 outgoing HMM edges, in 6

ich edge 3 point towards another node, while edge 4 is a self- 6

necting edge. Edge 4 is thicker than the others, indicating that 6

boxer performs successive actions belonging to node 1 very 6

quently. The screen is captured when the user selects node 1, 6

as a result, all outgoing edges of node 1 are highlighted, and 6

character representing node 1 performs the corresponding me- 6

n action. 6

Experimental results 6

In this section, we present experimental results. We captured 6

motions of four boxers with varying skill levels. We first give 6




ARTICLE IN PRESS


n-bas

s, 689

es. 690

o- 691

lt 692

p- 693

694

nd 695

ch 696

Fig. 10. (Left) The posture-based graphs and (Right) the actio

detailed motion analysis and visualization of individual motion

and then compare them side by side using the proposed index

This demonstrates that our system is an effective tool for m

tion analysis, skill assessment and comparisons. As it is difficu



ed graphs for boxer N, M1, M2 and S (top to bottom), respectively.

to show the motions in pictures, we refer the readers to the su

plementary video for more details.

The four boxers chosen have different skill levels. As a grou

truth, their skills were evaluated by a professional boxing coa




ARTICLE IN PRESS


as skillful, medium, medium and novice respectively, and were de- 697

noted as S, M1, M2 and N. 698

6.1. Boxer evaluation 699

The boxers’ posture-based and action-based graphs are shown 700

in Fig. 10 , in which letter annotations are given to help explain the 701

graphs. These graphs allow the users to assess boxing skills even if 702

they are not familiar with boxing. 703

6.2. Boxer S 704

The first row of images in Fig. 10 shows the graphs of boxer 705

S. T706

sta707

box708

sta709

fro710

arm711

tur712

the713

and714

715

act716

(b)717

con718

gro719

the720

me721

and722

fre723

fre724

a s725

att726

ma727

the728

6.3729

730

box731

tion732

som733

bot734

the735

mo736

foo737

the738

mo739

per740

rien741

742

few743

par744

M1745

act746

als747

fre748

act749

att750

of 751

pre752

6.4753

754

pos755

Table 1

Statistics of the boxing motions. SL: Skill Level evaluated

by a professional boxing coach. PN: Posture Number (for

starting and ending actions). AN: Action Number.

Boxer S Boxer M1 Boxer M2 Boxer N

SL Skillful Medium Medium Novice

PN 138 160 112 176

AN 69 80 56 88

ever, a number of secondary postures (b) –(d). These postures are 756

all performed sub-optimally with his arms not guarding the head, 757

and should be avoided. Looking closely to the edges (f) going to 758

posture (c), we can find that the posture is performed as a subtle 759

movement to prepare various left punches. This should be avoid 760

as the opponent can tell the moves whenever seeing such a pos- 761

ture. Postures (g) and (h) are very different from the rest, and are 762

geometrically far from the other postures. These two postures are 763

performed because the boxer unintentionally raises the arms dur- 764

ing the capture. Our system can pick up the mistake and visualize 765

it i 66

67

the 68

larg 69

Thi 70

tion 71

Dif 72

nod 73

ble 74

Thi 75

pun 76

pun 77

ing 78

and 79

6.5 80

81

ma 82

larg 83

tur 84

tha 85

tur 86

tur 87

to 88

wit 89

due 90

wid 91

92

the 93

cla 94

con 95

com 96

ma 97

nec 98

tion 99

the 00

the 01

mo 02

and 03

an 04

6.6 05

06

Pl

(2

he posture-based graphs shows a main standard posture (a) to

rt and end actions, which is good for boxing as it allows the

er to transit from one action to another effectively through the

ndard posture. A large variety of actions (b) can be produced

m such a posture. There is a secondary posture in which the

s are further apart (c). This should be avoided as such a pos-

e is weak in blocking attacks. Posture (d) is generated because

boxer trips over during the training. Our system can pick up

visualize such a mistake accurately.

The action-based graph of boxer S shows there there are many

ions in the frequent group (a) and only a few in the rare group

. This shows that the boxer is experienced and his actions are

sistent. There is a major movement action (c) in the frequent

up (a), and such an action has good connections to many of

others. This is good as experienced boxers typically use move-

nt actions to adjust their position relative to their opponent,

launch attacks when the time is right. Other actions in the

quent group (a) are variations of attacks. For example, the more

quently used action (d) is a right-left combo and action (e) is

ingle right punch, which show that the boxer tends to start an

ack with the right punch. It is good to see that attacking actions

y connect to each other, which enhance the unpredictability of

boxer.

. Boxer M1

Next, we evaluate the posture-based graph of boxer M1. The

er has a main standard posture (a) to launch most of the ac-

s (b). However, he has a secondary posture (c) for launching

e attacks, and another (d) for launching a turning action. In

h postures, the arms are in a low position and cannot guard

boxer well from the opponent. More importantly, the relatively

re frequently used secondary posture (c) is performed with the

t distance much wider than the shoulder width. This means

boxer has limited mobility in this posture, as the legs must

ve towards each other before another stepping action can be

formed. These observations show that the boxer is not as expe-

ced and consistent as boxer S.

The corresponding action-based graph shows that there are

er frequent class actions (a) but more rare class ones (b) com-

ed to boxer S. This means that that the boxing action of boxer

is less consistent. The boxer has a large number of movement

ions (c) that are connected to all the rest of the action nodes. He

o has a variety of attack actions as shown in other actions in the

quent class (a). In particular, action (d) is a left-right combo and

ion (e) is a left punch, showing that the boxer tends to start an

ack with the left punch. Overall, there is an acceptable number

connections among attacks, demonstrating the acceptable un-

dictability of the boxer.

. Boxer M2

For boxer M2’s posture-based graph, there is a main standard

ture (a) launching the majority of actions (b). There are, how-



n the graph. 7

From boxer M2’s action-based graph, it can be observed that 7

re are relatively fewer actions in the frequent class (a), but a 7

e number of actions in the rare class combining (b) and (c). 7

s shows that the boxer is quite inconsistent in the boxing ac- 7

s, and could be because of the lack of training and experience. 7

ferent from the boxers discussed, boxer M2 has a largest action 7

e (d) of left punch. The second largest action node (e) is a dou- 7

left punch. The movement action node (f) is relatively small. 7

s shows that boxer M2 has a different boxing style to use left 7

ch as a major action to connect to other actions and his left 7

ch is dominant. Such a boxing style is not advised as a punch- 7

action, comparing to a movement one, consume more energy 7

expose a larger risk of being attacked. 7

. Boxer N 7

In the posture-based graph of the novice boxer N, there are two 7

jor standard postures (a) and (b) instead of one. There are a 7

e number of self-connecting actions (c) and (d) for both pos- 7

es, as well as a lot of actions (e) connecting the two. This shows 7

t the boxer is highly inconsistent in the boxing postures. Pos- 7

e (a), the more relatively frequently used one, is inferior to pos- 7

e (b), due to its wider foot distance. It does not allow the boxer 7

step freely. Posture (f), (g) and (h) are all secondary postures 7

h different posture variations. They are all not well performed 7

to the low arm positions limiting blocking capability, and the 7

e foot width limiting movement capability. 7

The corresponding action-based graph shows some actions in 7

frequent class (a) but a large number of actions in the rare 7

ss (b). This means that the novice boxer cannot perform actions 7

sistently. The action in the rare class (b) are mainly very long 7

bo that are randomly combined and cannot be reproduced. The 7

in action (c) is a movement action. Such an action cannot con- 7

t to a number of others in the rare class (b), and many ac- 7

s in the rare class (b) are not well connected. This means that 7

boxer’s action is more predictable, which is bad in a match as 8

opponent can guess what the boxer may launch next. The two 8

re frequently used attack action (d) and (e) are left-right combo 8

left punch respectively, showing that the boxer tends to start 8

attack with a left punch. 8

. Statistical analysis 8

Here, we give some statistics about the proposed system. 8




ARTICLE IN PRESS


Table 2

Statistics of the boxing motions. FNN: Fat Node Number

(brackets show numbers after removing accidentally cre-

ated nodes). FEN: Fat Edge Number (brackets show num-

bers of consistently performed edges). CI: Connectivity In-

dex.


FNN 3 (2) 6 (4) 3 (3) 5 (5)

FEN 20 (10) 36 (12) 16 (7) 57 (8)

CI 5.0 3.0 2.3 1.6

Table 3

Statistics of the boxing motions in the Aciton Graphs. NN:

Node Number. NNFC: Node Number for Frequent Class.

NNRC: Node Number for Rare Class. EN: Edge Number. ASI:

Action Strategy Index.


ng 807

of 808

809

h, 810

he 811

he 812

th 813

c- 814

815

h, 816

ed 817

c- 818

al- 819

er, 820

in, 821

ve 822

823

s- 824

of 825

a- 826

- 827

ri- 828

ns 829

k 830

nd 831

e 832

in 833

834

835

el 836

a- 837

ed 838

s- 839

is 840

on 841

ity 842

s 843

si- 844

m 845

il- 846

p- 847

tured from 4 boxers with varying skill levels. The evaluations from 848

our system aligns with that of a professional boxing coach. 849

Although we use boxing as our target sport in the experimenta- 850

tion section, the underpinning theoretical development can be ap- 851

plied to most sports that require swiftness, flexibility and creativ- 852

ity, such as tennis, fencing and basketball. The adaptation of the 853

proposed system to these sports and the comparison of the system 854

performances on different sports remain as future work. 855

We focus on analyzing the skill level of the boxers in terms of 856

high-level motion behaviour such as the richness of the action, the 857

transition of action and the unpredictability of boxing patterns. We 858

do not evaluate the lower-level parameters such as the speed of 859

the punches, which has been explored in previous works. It is an 860

interesting future direction to combine both high-level and low- 861

level evaluation in order to have a full assessment of the boxers. 862

There are limitations to our method. First, our method is based 863

on the assumption that the sports skills mainly consist of a finite 864

number of key postures and key actions. Admittedly, not all sports 865

follow this pattern. Second, the visualization and skill assessment 866

is based on an individual athlete, not considering skills related to 867

collaborations such as those in group sports, in which the assess- 868

ment might need to employ different criteria. 869

We argue that novice boxers tend to have different posture- 870

based graphs, while experienced boxers tend to have graphs of 871

a similar topology. This is because unlike experience boxers who 872

have only 1 to 2 main postures nodes, novice boxers tend to have 873

more nodes, resulting in a much larger variation on the graph 874

topology. As a future work, we would like to utilize the stem to 875

evaluate a large number of boxers in different skill levels to verify 876

this argument. 877

he 878

a- 879

ell 880

it 881

et, 882

w 883

as 884

In 885

ke 886

x- 887

ill 888

is 889

e 890

891

892

si- 893

he 894

895

896

be 897

898

899

all 900 m- 901 02. 902

903 eal 904 ual 905

906 ny- 907 10. 908

909

NN 7 11 9 16

NNFC 4 3 4 5

NNRC 3 8 5 11

EN 16 27 20 38

ASI 0.572 0.448 0.426 0.378

Table 1 shows the skill level assessed by a professional boxi

coach, as well as the number of postures and actions, for each

the boxers considered.

Table 2 shows the statistics related to the posture-based grap

including the number of fat nodes and fat edges, as well as t

Connectivity Index calculated with Eq. (3) . The index evaluates t

richness of actions and the flexibility of transitions. It aligns wi

the boxers’ skill level and more skillful boxers have higher Conne

tivity Indexes.

Table 3 shows the statistics related to the action-based grap

including the number of HMM nodes (which is further separat

into the number for the frequent class and the rare class respe

tively) and HMM edges, as well as the Action Strategy Index c

culated with Eq. (5) . It indicates the unpredictability of a box

and more skillful boxers are generally more unpredictable. Aga

it aligns with the boxer’ skill level and more skillful boxers ha

higher Action Strategy Indexes.

In terms of the computational cost, we run the proposed sy

tem on a laptop computer with a Core i7-6820HQ CPU, 16GB

RAM and a NVIDIA Quadro M10 0 0M graphic card. The comput

tional time to analyze the captured motion ( Section 3.2 ) and com

puting the graphs ( Sections 4 and 5 ) ranges from 6 to 9 s. The va

ation of computational time is mainly due to the iterative k-mea

clustering algorithm for both postures and actions, as a larger

requires longer computational time. The run-time cost is low a

we achieve frame rate higher than real-time (i.e. 60Hz). The fram

rate tends to be lower when there are more characters shown

the graphs.

7. Conclusion and discussions

In this paper, we proposed a method to visualize the high-lev

skills of boxers using an automatic motion analysis and visualiz

tion framework. The proposed posture-based graph is a customiz

Fat Graph that allows us to evaluate the quality of standard po

tures for launching and finishing actions. The action-based graph

a customized Hidden Markov Model that visualizes the transiti

probability among actions. We further introduce the Connectiv

Index that is deduced from the posture-based graph and allow

evaluation of the richness of actions and the flexibility of tran

tions, as well as the Action Strategy Index that is deduced fro

the action-based graph and allows evaluation of the unpredictab

ity of action patterns. The system is applied on the motion ca



In the future, we wish to extend the proposed algorithm to t

field of computer animation. Currently, when synthesizing anim

tions by motion graphs, experienced animators are required to t

what motions are missed or badly captured. With our system,

is possible to analyze the connectivity and variety of a motion s

which are two critical factors in motion synthesis. However, ho

to generalize these findings to give high-level suggestion, such

proposing the motions to capture, remains an open problem.

addition, we would like to develop a visualization system to ta

the adversarial nature of sports. For instance, although two bo

ers might have roughly the same skill level, in a match, one’s sk

composition might give him/her advantages over the other. Th

kind of analysis would be very useful in preparation for a gam

or predicting the result.

Acknowledg ment

This work was supported in part by the Engineering and Phy

cal Sciences Research Council (EPSRC) (Ref: EP/M002632/1 ) and t

Royal Society (Ref: IE160609).

Supplementary material

Supplementary material associated with this article can

found, in the online version, at 10.1016/j.cag.2017.09.007

References

[1] Komura T, Kuroda A, Shinagawa Y. Nicemeetvr: facing professional basebpitchers in the virtual batting cage. In: Proceedings of the 2002 ACM sy

posium on applied computing. In: SAC ’02. New York, NY, USA: ACM; 20p. 1060–5. ISBN 1-58113-445-2. doi: 10.1145/508791.5090 0 0 .

[2] Bideau B, Kulpa R, Menardais S, Fradet L, Multon F, Delamarche P, et al. R

handball goalkeeper vs. virtual handball thrower. Presence: Teleoper VirtEnviron 2003;12(4):411–21.

[3] Molet T, Aubel A, Capin T, Carion S, Lee E, Magnenat-Thalmann N, et al. Aone for tennis? Presence: Teleoper Virtual Environ 1999;8(2):140–56. doi:

1162/105474699566134 .


https://doi.org/10.13039/501100000266


https://doi.org/10.1145/508791.509000

https://doi.org/10.1162/105474699566134



ARTICLE IN PRESS


[4] Hyun K, Lee K, Lee J. Motion grammars for character animation. Comput Graph 910 Forum 2016;35(2):103–13. doi: 10.1111/cgf.12815 . 911

[5] Wampler K, Andersen E, Herbst E, Lee Y, Popovi ́c Z. Character animation 912 in two-player adversarial games. ACM Trans Graph 2010;29(3):26-1–26-13. 913 doi: 10.1145/1805964.1805970 . 914

[6] Shin HJ, Oh HS. Fat graphs: constructing an interactive character with contin- Q3 915 uous controls. Proceedings of ACM SIGGRAPH / Eurographics Symposium on 916 computer animation; 2006. p. 291–8. 917

[7] Rabiner LR. A tutorial on hidden Markov models and selected applications in 918 speech recognition. Proc IEEE 1989;77(2):257–86. doi: 10.1109/5.18626 . 919

[8] Shum HPH, Wang H, Ho ESL, Komura T. Skillvis: a visualization tool for box- 920 ing skill assessment. In: Proceedings of the 2016 international conference on 921 motion in games. In: MIG ’16. New York, NY, USA: ACM; 2016. p. 145–53. ISBN 922 978-1-4503-4592-7. doi: 10.1145/2994258.2994266 . 923

[9] Yeadon M. The simulation of aerial movement- iv. a computer simulation 924 model. J Biomech 1990;23(1):85–9. 925

[10] Yeadon M. The biomechanics of the human in flight. Am J Sports Med 926 1997;25(4):575–80. 927

[11] Arikan O, Forsyth D. Motion generation from examples. ACM Trans Graph 928 2002;21(3):483–90. 929

[12] Lee J, Chai J, Reitsma PSA, Hodgins JK, Pollard NS. Interactive control of avatars 930 animated with human motion data. ACM Trans Graph 2002;21(3):491–500. 931

[13] Kovar L, Gleicher M, Pighin F. Motion graphs. ACM Trans Graph 932 2002;21(3):473–82. 933

[14] Grochow K, Martin SL, Hertzmann A, Popovic’ Z. Style-based inverse kinemat- 934 ics. ACM Trans Graph (TOG) 2004;22(3). 935

[15] Shin HJ, Lee J. Motion synthesis and editing in low-dimensional spaces. Com- 936 put Animat Virtual Worlds (Special Issue: CASA 2006) 2006;17(3–4):219–27. 937

[16] Min J, Chai J. Motion graphs++: a compact generative model for semantic mo- 938 tion analysis and synthesis. ACM Trans Graph 2012;31(6):153. 939

[17] Beaudoin P, Coros S, van de Panne M, Poulin P. Motion-motif graphs. In: Pro- 940 ceedings of the 2008 ACM SIGGRAPH/Eurographics symposium on computer 941 animation. In: SCA ’08. Aire-la-Ville, Switzerland, Switzerland: Eurographics 942 Association; 2008. p. 117–26. ISBN 978-3-905674-10-1 943

[18] Arikan O, Forsyth DA, O’Brien JF. Motion synthesis from annotations. ACM 944 Trans Graph 2003;22(3):402–8. doi: 10.1145/882262.882284 . 945

[19] Li Y, Wang T, Shum H-Y. Motion texture: a two-level statistical model for 946 character motion synthesis. ACM Trans Graph 2002;21(3):465–72. doi: 10.1145/ 947 566654.566604 . 948

[20] Reitsma PSA, Pollard NS. Evaluating motion graphs for character animation. 949 ACM Trans Graph 2007;26(4). doi: 10.1145/1289603.1289609 . 950

[21] Heck R, Gleicher M. Parametric motion graphs. In: Proceedings of the 2007 951 symposium on interactive 3D graphics and games. In: I3D ’07. New York, NY, 952 USA: ACM; 2007. p. 129–36. ISBN 978-1-59593-628-8. doi: 10.1145/1230100. 953 1230123 . 954

[22] Xia S, Wang C, Chai J, Hodgins J. Realtime style transfer for unlabeled heteroge- 955 neous human motion. ACM Trans Graph 2015;34(4):119-1–119-10. doi: 10.1145/ 956 2766999 . 957

[23] Safonova A, Hodgins JK. Construction and optimal search of interpolated mo- 958 tion graphs. ACM Trans Graph 2007;26(3). doi: 10.1145/1276377.1276510 . 959

[24] Shum HPH, Komura T, Shiraishi M, Yamazaki S. Interaction patches for multi- 960 character animation. ACM Trans Graph 2008;27(5):114-1–114-8. doi: 10.1145/ 961 1409060.1409067 . 962

[25] Shum HPH, Komura T, Yamazaki S. Simulating multiple character interac- 963 tions with collaborative and adversarial goals. IEEE Trans Vis Comput Graph 964 2012;18(5):741–52. doi: 10.1109/TVCG.2010.257 . 965

[26] Ho ESL, Komura T. A finite state machine based on topology coordinates for 966 wrestling games. Comput Animat Virtual Worlds 2011;22(5):435–43. doi: 10. 967 1002/cav.376 . 968

[27] Ho ES, Komura T. Character motion synthesis by topology coordinates. Comput 969 Graph Forum 2009. doi: 10.1111/j.1467-8659.2009.01369.x . 970

[28] Plantard P, Shum HPH, Multon F. Filtered pose graph for efficient kinect 971 pose reconstruction. J Multimed Tools Appl 2017;76(3):4291–312. doi: 10.1007/ 972 s11042- 016- 3546- 4 . 973

[29] Plantard P, Shum HPH, Pierres A-S L, Multon F. Validation of an ergonomic 974 assessment method using kinect data in real workplace conditions. Appl Ergon 975 2016. doi: 10.1016/j.apergo.2016.10.015 . 976

[30] Shum HPH, Komura T, Shiratori T, Takagi S. Physically-based character control 977 in low dimensional space. In: Proceedings of the third international conference 978 on motion in games. In: MIG ’10, vol. 6459. Berlin, Heidelberg: Springer-Verlag; 979 2010. p. 23–34. ISBN 978-3-642-16957-1. doi: 10.1007/978- 3- 642- 16958- 8 _ 3 . 980

[31] Shum HPH, Ho ESL, Jiang Y, Takagi S. Real-time posture reconstruction for mi- 981 crosoft kinect. IEEE Trans Cybern 2013;43(5):1357–69. doi: 10.1109/TCYB.2013. 982 2275945 . 983

[32] Ho ESL, Shum HPH, Cheung Y-m, Yuen PC. Topology aware data-driven inverse 984 kinematics. Comput Graph Forum 2013;32(7):61–70. doi: 10.1111/cgf.12212 . 985

[33] Lawrence ND. Gaussian process latent variable models for visualisation of high 986 dimensional data. In: Proceedings of the 2003 neural information processing 987 systems; 2003. 988

[34] Wang JM, Fleet DJ, Hertzmann A. Gaussian process dynamical models for hu- 989 man motion. IEEE Trans Pattern Anal Mach Intell 2008;30(2):283–98. doi: 10. 990 1109/TPAMI.2007.1167 . 991

[35] Holden D, Saito J, Komura T. A deep learning framework for character motion 992 synthesis and editing. ACM Trans Graph 2016;35(4):138-1–138-11. doi: 10.1145/ 993 2897824.2925975 . 994

[36] Liu Z, Zhou L, Leung H, Shum HPH. Kinect posture reconstruction based on a 995 local mixture of gaussian process models. IEEE Trans Vis Comput Graph 2016. 996 doi: 10.1109/TVCG.2015.2510 0 0 0 . 997

[37] Hara K, Omori T, Ueno R. Detection of unusual human behavior in intelligent 998 house. In: Proceedings of the 12th IEEE workshop on neural networks for sig- 999 nal processing; 2002. p. 697–706. doi: 10.1109/NNSP.2002.1030081 . 10 0 0

[38] Françoise J, Roby-Brami A, Riboud N, Bevilacqua F. Movement sequence analy- 1001 sis using hidden Markov models: a case study in tai chi performance. In: Pro- 1002 ceedings of the 2nd international workshop on movement and computing. In: 1003 MOCO ’15. New York, NY, USA: ACM; 2015. p. 29–36. ISBN 978-1-4503-3457-0. 1004 doi: 10.1145/2790994.2791006 . 1005

[39] Brand M, Hertzmann A. Style machines. In: Proceedings of the 27th annual 1006 conference on computer graphics and interactive techniques. In: SIGGRAPH 1007 ’00. New York, NY, USA: ACM Press/Addison-Wesley Publishing Co.; 20 0 0. 1008 p. 183–92. ISBN 1-58113-208-5. doi: 10.1145/344779.344865 . 1009

[40] Tanco LM, Hilton A. Realistic synthesis of novel human movements from a 1010 database of motion capture examples. In: Proceedings of workshop on human 1011 motion; 20 0 0. p. 137–42. doi: 10.1109/HUMO.20 0 0.897383 . 1012

[41] Ren L, Patrick A, Efros AA, Hodgins JK, Rehg JM. A data-driven approach 1013 to quantifying natural human motion. ACM Trans Graph 2005;24(3):1090–7. 1014 doi: 10.1145/1073204.1073316 . 1015

Pl

(2




https://doi.org/10.1111/cgf.12815

https://doi.org/10.1145/1805964.1805970

https://doi.org/10.1109/5.18626

https://doi.org/10.1145/2994258.2994266

https://doi.org/10.1145/882262.882284

https://doi.org/10.1145/566654.566604

https://doi.org/10.1145/1289603.1289609

https://doi.org/10.1145/1230100.1230123

https://doi.org/10.1145/2766999

https://doi.org/10.1145/1276377.1276510

https://doi.org/10.1145/1409060.1409067

https://doi.org/10.1109/TVCG.2010.257

https://doi.org/10.1002/cav.376

https://doi.org/10.1111/j.1467-8659.2009.01369.x

https://doi.org/10.1007/s11042-016-3546-4

https://doi.org/10.1016/j.apergo.2016.10.015

https://doi.org/10.1007/978-3-642-16958-8_3

https://doi.org/10.1109/TCYB.2013.2275945

https://doi.org/10.1111/cgf.12212

https://doi.org/10.1109/TPAMI.2007.1167

https://doi.org/10.1145/2897824.2925975

https://doi.org/10.1109/TVCG.2015.2510000

https://doi.org/10.1109/NNSP.2002.1030081

https://doi.org/10.1145/2790994.2791006

https://doi.org/10.1145/344779.344865

https://doi.org/10.1109/HUMO.2000.897383

https://doi.org/10.1145/1073204.1073316


Documents

Posture-based and action-based graphs for boxing skill ... · Posture-based and action-based graphs for boxing skill visualization Q1 Yijun Shen a, He Wang b, Edmond S.L. Ho a, Longzhi