Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
Citation: Shen, Yijun, Wang, He, Ho, Edmond, Yang, Longzhi and Shum, Hubert P. H. (2017) Posture-based and Action-based Graphs for Boxing Skill Visualization. Computers and Graphics. ISSN 0097-8493 (In Press)
Published by: Elsevier
URL: https://doi.org/10.1016/j.cag.2017.09.007 <https://doi.org/10.1016/j.cag.2017.09.007>
This version was downloaded from Northumbria Research Link: http://nrl.northumbria.ac.uk/32266/
Northumbria University has developed Northumbria Research Link (NRL) to enable users to access the University’s research output. Copyright © and moral rights for items on NRL are retained by the individual author(s) and/or other copyright owners. Single copies of full items can be reproduced, displayed or performed, and given to third parties in any format or medium for personal research or study, educational, or not-for-profit purposes without prior permission or charge, provided the authors, title and full bibliographic details are given, as well as a hyperlink and/or URL to the original metadata page. The content must not be changed in any way. Full items must not be sold commercially in any format or medium without formal permission of the copyright holder. The full policy is available online: http://nrl.northumbria.ac.uk/policies.html
This document may differ from the final, published version of the research and has been made available online in accordance with publisher policies. To read and/or cite from the published version of the research, please visit the publisher’s website (a subscription may be required.)
ARTICLE IN PRESS
JID: CAG [m5G; October 16, 2017;22:13 ]
Computers & Graphics xxx (2017) xxx–xxx
Contents lists available at ScienceDirect
Computers & Graphics
journal homepage: www.elsevier.com/locate/cag
Special Section on Motion in Games 2016
Posture-based and action-based graphs for boxing skill visualization
Yijun Shen
a , He Wang
b , Edmond S.L. Ho
a , Longzhi Yang
a , Hubert P.H. Shum
a , ∗Q1
a Northumbria University, Newcastle-Upon-Tyne, United Kingdom
Q2
b University of Leeds, Leeds, United Kingdom
a r t i c l e i n f o
Article history:
Received 29 July 2017
Revised 25 September 2017
Accepted 26 September 2017
Available online xxx
Keywords:
Motion graph
Hidden Markov model
Information visualization
Dimensionality reduction
Human motion analysis
Boxing
a b s t r a c t
Automatic evaluation of sports skills has been an active research area. However, most of the existing re-
search focuses on low-level features such as movement speed and strength. In this work, we propose a
framework for automatic motion analysis and visualization, which allows us to evaluate high-level skills
such as the richness of actions, the flexibility of transitions and the unpredictability of action patterns.
The core of our framework is the construction and visualization of the posture-based graph that focuses
on the standard postures for launching and ending actions, as well as the action-based graph that focuses
on the preference of actions and their transition probability. We further propose two numerical indices,
the Connectivity Index and the Action Strategy Index, to assess skill level according to the graph. We
demonstrate our framework with motions captured from different boxers. Experimental results demon-
strate that our system can effectively visualize the strengths and weaknesses of the boxers.
© 2017 The Author(s). Published by Elsevier Ltd.
This is an open access article under the CC BY license. ( http://creativecommons.org/licenses/by/4.0/ )
1. Introduction 1
Computer technologies have taken on a crucial role in modern 2
sports and health sciences, in revolutionizing the way to observe, 3
analyze, and improve the performance of both amateur and profes- 4
sional athletes. Computer-managed weight lifting machines, tread- 5
mills and many other training equipment provide energy consump- 6
tion or repetition and weight management in many sport clubs. 7
Virtual reality technology has been applied in various training sys- 8
tems in baseball [1] , handball [2] and tennis [3] to assist more pro- 9
fessional sport activities. Nevertheless, these technologies are only 10
able to analyze motions at a low level, i.e. recording the timing or 11
repetitions of basic motions and comparing movement trajectories 12
with those performed by better players. More advanced technolo- 13
gies are needed for personalized and higher-level analysis compa- 14
rable to that from human experts. 15
In addition to the instantaneous movement features of the 16
sports players, Experienced sport coaches consider high-level fea- 17
tures such as the variety of actions and quality of transitions from 18
one action to another. Taking boxing as an example, professional 19
boxers have in basic actions such as defence, stepping and attack, 20
threading through which the transitions are carried out based on 21
∗ Corresponding author.
E-mail addresses: [email protected] (Y. Shen), [email protected]
(H. Wang), [email protected] (E.S.L. Ho), [email protected] (L.
Yang), [email protected] (H.P.H. Shum).
the strategy and the opponent’s reactions. The action transitions 22
of a good boxer need to be flexible and contain great variety to 23
achieve the optimal outcome. Such information often serves as an 24
important indicator in assessing the skill level of a player, and the 25
same principle applies to many other sports such as basketball 26
[4] and fencing [5] . Unfortunately, automatic systems for analyzing 27
and evaluating sports motions at such a high level is very limited. 28
In this paper, we propose a robust visualization system to ad- 29
dress the above limitations, by represent motions as an interactive 30
graph of high-level features, including the flexibility and richness 31
of the actions as well as the transitions of actions. Although we 32
use boxing as a demonstration in this paper, our method is generic 33
and can be applied to different sports. Our approach starts with 34
capturing the shadow boxing training motion of a boxer, in which 35
the boxer performs boxing with an imaginary opponent. An experi- 36
enced coach can effectively assess the boxer’s skill by watching the 37
shadowing boxing motions. As a positive side effect, this method of 38
motion analysis greatly reduces the complexity of motion capture 39
due to occlusion and collision and has shown to be very effective 40
in our system. The motion data is then processed and visualized 41
in two different graphs: the posture-based graph and the action- 42
based graph, for performance analysis. 43
In the posture-based graph, the semantic actions segmented 44
from the captured motion are grouped into clusters based on a 45
customized distance function that considers action specific fea- 46
tures. Our system then automatically generates a motion graph 47
structure known as Fat Graph [6] , which uses nodes to represent 48
https://doi.org/10.1016/j.cag.2017.09.007
0097-8493/© 2017 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license. ( http://creativecommons.org/licenses/by/4.0/ )
Please cite this article as: Y. Shen et al., Posture-based and Action-based Graphs for Boxing Skill Visualization, Computers & Graphics
(2017), https://doi.org/10.1016/j.cag.2017.09.007
2 Y. Shen et al. / Computers & Graphics xxx (2017) xxx–xxx
ARTICLE IN PRESS
JID: CAG [m5G; October 16, 2017;22:13 ]
groups of similar postures to start and end actions, and edges 49
to represent groups of action. By applying dimensional reduction 50
techniques, this graph can be visualized in a 3D space for per- 51
formance analysis and evaluation. The transition capability of the 52
boxer are visualized by the connectivity of the nodes, where the 53
richness and preference of the actions are visualized by the edges 54
in the graph. We further propose a skill evaluation metric known 55
as the Connectivity Index which evaluates the richness of actions 56
and the flexibility of transitions according to the graph. 57
Whilst the posture-based graph focuses on the variety of basic 58
postures and the transition flexibility between actions, the action- 59
based graph mainly considers the richness of actions and the tran- 60
sition probability among them. The action-based graph is con- 61
structed as a customized Hidden Markov Model (HMM) [7] , in 62
which similar actions are grouped into clusters that formulate the 63
nodes. The transition probability among actions is calculated and 64
is expressed as edges between nodes. The graph is visualized in a 65
3D space, and the positions of the nodes and edges are optimized 66
for better visualization. With such a graph, the pattern of action 67
launching can be easily identified in order to assess the boxing 68
strategy of the boxer. We further propose the Action Strategy In- 69
dex to evaluate the unpredictability of action patterns according to 70
the graph. 71
We conducted experiments on the motions captured from mul- 72
tiple boxers and evaluate their skills. The corresponding posture- 73
based and action-based graphs were generated. As shown in 74
Fig. 10 , we can easily evaluate the skills of different boxers with 75
our visualization system. 76
There are three main contributions of this work: 77
• We propose a framework for high-level skill analysis through 78
automatic motion analysis and visualization. Given a captured 79
motion from a sports player, our system automatically seg- 80
ments the motion into semantic action units and constructs 81
two graph structures. 82
• We propose the posture-based graph, which is a variant of the 83
Fat Graph, to visualize the skills according to different standard 84
postures for launching and ending actions. It allows the user to 85
identify the correctness of standard postures and the diversity 86
of actions. We further propose the Connectivity Index that eval- 87
uates the richness of actions and the flexibility of transitions. 88
• We propose the action-based graph, which is a variant of the 89
Hidden Markov Model (HMM), to visualize the skill according 90
to different groups of action. It allows the user to identify the 91
preference of actions and their transition probability. We fur- 92
ther propose the Action Strategy Index to evaluate the unpre- 93
dictability of action patterns. 94
The preliminary results of this work were published in a con- 95
ference paper [8] , which proposed only the posture-based graph. 96
In this paper, we extend the work by introducing the new action- 97
based graph. We perform analysis and experimental evaluation of 98
such a graph, and compare its performance with the posture-based 99
graph. We have also updated the paper thoroughly such that the 100
two graphs are presented in an organized and effective manner. 101
The rest of this paper is organized as follows. Related works 102
are reviewed in Section 2 . The details of motion capture and or- 103
ganization are given in Section 3 . In Sections 4 and 5 , we explain 104
the design and implementation of the posture-based graph and the 105
action-based graph respectively. Related experiments can be found 106
in Section 6 . The paper is concluded in Section 7 with future re- 107
search directions discussed. 108
2. Related work 109
2.1. Sports visualization 110
Helping athletes on skill improving via the visualization of 111
sport motions is a field that has not been fully explored in the 112
field of sports science. Existing research [9,10] mainly focuses 113
on the appearance changes of motions when body and motion 114
parameters are changed. For example, Yeadon [9,10] has done 115
research on how diving and somersault motions change when 116
the motions are launched at different timings by using physical 117
simulation. Although such tools are useful for the athletes to 118
interactively visualize possible results under different parameters, 119
they can only evaluate the performance of sports that do not 120
require complex maneuvers and strategies, such as jumping, high 121
jumping, sky jumping, or somersaults. In many sports games, 122
the performance depends not only on physical factors such as 123
velocity, power and strength, but also on flexibility to switch from 124
one motion to another and richness of the player’s motions. This 125
high-level information has not been used to visualize the skills 126
of the athlete in previous research and it is the major difference 127
between our work and the afore-mentioned ones. In this research, 128
we combine the approaches of motion graph [11–13] and dimen- 129
sionality reduction [14,15] to visualize high-level skills information 130
of the athletes for the skill assessments. 131
2.2. Motion graphs for motion modeling 132
The Motion Graph approach [11–13,16–19] is a method to inter- 133
actively reproduce continuous motions based on a graph generated 134
from captured motion data. Reitsma and Pollard [20] compared 135
different motion graph techniques comprehensively. Heck et al. 136
[21] further parametrized the motion space to control how the 137
motions are generated by blending samples in the motion graph. 138
Such an approach can be used for interactive character control 139
such as that in computer games. When it comes to graph con- 140
struction, [16,17] are the ones most similar to our method. Min 141
et al. [16] grouped similar postures and transitions into nodes and 142
edges. Their focus was the motion variety of synthesized motions 143
so they used generative models to fit the posture and motion data. 144
Our focus is on skill visualization through the analysis of postures 145
and motions so we can afford simpler and faster methods of analy- 146
sis. Beaudoin et al. [17] cluster postures first then find motion mo- 147
tifs by converting the motion matching task into a string matching 148
problem. Their priority was to find motifs that were representa- 149
tive while our focus is to visualize motion details and statistics 150
to help people assess the skills. Xia et al. [22] constructed a se- 151
ries of local mixtures of autoregressive models (MAR) for model- 152
ing the style variations among different motions for real-time style 153
transfer. They demonstrated style-rich motions can be generated 154
by combining their method and motion graph. 155
Since the Motion Graph produces a lot of edges and nodes 156
without any context, it becomes difficult to control generated mo- 157
tion as the user wishes. Safonova and Hodgins [23] optimized 158
the graph structure by combining motion graph and interpolation 159
techniques to improve performance. On the other hand, works to 160
resolve this problem by introducing a hierarchical structure were 161
proposed [6] . These approaches add topological structures into the 162
continuous unstructured data so that the motion synthesis can be 163
done at a higher level. In a sport like boxing, it is possible to cre- 164
ate a motion graph of semantic actions such as attack and defence, 165
which is known as the action-level motion graph [24,25] . A re- 166
cent work by Hyun et al. [4] proposed Motion Grammars to spec- 167
ify how character animations are generated by high-level symbolic 168
description. Such an approach can be used with existing animation 169
systems which are built based on motion graphs. Ho and Komura 170
Please cite this article as: Y. Shen et al., Posture-based and Action-based Graphs for Boxing Skill Visualization, Computers & Graphics
(2017), https://doi.org/10.1016/j.cag.2017.09.007
Y. Shen et al. / Computers & Graphics xxx (2017) xxx–xxx 3
ARTICLE IN PRESS
JID: CAG [m5G; October 16, 2017;22:13 ]
[26] built a finite state machine (FSM) based on Topology Coor- 171
dinates [27] for synthesizing two-character close interactions. The 172
sparse graph structure can be used for controlling the movement 173
of virtual wrestlers in computer games. The purpose of these ap- 174
proaches, however, is motion generation rather than the visualiza- 175
tion of the player’s skill. 176
In our research, we adapted a hierarchical motion graph struc- 177
ture called the Fat Graph [6] on the action level to analyze the con- 178
nectivity and the variety of a captured motion set. In a fat graph, 179
similar nodes are grouped together as fat nodes, and similar edges 180
are grouped as fat edges, allowing better organization of motion 181
data. The filtered motion graph is a variation of the Fat Graph, 182
in which the temporal relationship between poses are considered 183
[28] . Such a structure, however, is targeted for motion reconstruc- 184
tion and analysis rather than visualization [29] . 185
2.3. Statistical motion modeling 186
Dimensionality reduction methods have been proposed to vi- 187
sualize the overall structure of captured motions. Grochow et al 188
[14] proposed a method to project the 3D motions of a human 189
onto a 2D plane, and further reconstruct 3D motions by mapping 190
arbitrary points from the 2D plane back onto 3D joint space. PCA 191
[15] and ISOMAP [30] are proposed to map the motions onto 2D 192
planes. Due to the high variation of human motion, local PCA that 193
considers only a relevant subset of the whole motion database in 194
order to generate a locally linear space is proposed [31,32] . One can 195
generate motions from arbitrary points on the plane by interpo- 196
lating the postures of the original motion. Meanwhile, non-linear 197
methods [33,34] and Deep Learning [35] have also been used to re- 198
duce the dimensionality of motions. The Gaussian Process [36] and 199
the mixture of Gaussian Processes [36] can be used to represent 200
a set of human postures with a small number of Gaussian param- 201
eters. However, such methodologies do not take into the account 202
the connectivity structure of the motions. We apply dimensional- 203
ity reduction to our graph structure to visualize the connectivity 204
structure of captured motions on a 2D plane. 205
Other researchers have focused on the connectivities of mo- 206
tion/actions by methods such as Markov models. Hidden Markov 207
Model (HMM) [7] has been widely used in analyzing and synthe- 208
sizing human motion. Typically, the hidden states of the HMM are 209
the distribution of body poses and the dynamics of the motions 210
are represented by the transitions between the hidden states. The 211
parameters of the HMM can then be learned from training data us- 212
ing the Expectation-Maximization (EM) algorithm. Hara et al. [37] 213
proposed to model daily activities using HMM in intelligent house. 214
Françoise et al. [38] proposed to use HMM models for analyzing 215
Tai Chi motion sequences. An early work proposed by Brand and 216
Hertzmann [39] proposed to learn the dynamics of human mo- 217
tion using HMM in their motion style synthesis model. Tango and 218
Hilton [40] proposed to learn a HMM model from captured hu- 219
man motion for synthesizing in-between frames in keyframe an- 220
imation. Ren et al. [41] presented a data-driven approach for quan- 221
tifying naturalness of human motion including those synthesized 222
by HMM. While existing work focuses on finding statistical distri- 223
butions of motions, our focus is on visualizing the motion richness 224
and the transition dynamics for skill assessments. 225
3. Motion capture and organization 226
We first capture the motion required for analysis using motion 227
capture systems. Then, we propose an automatic system to seg- 228
ment long sequences of captured motion into meaningful actions, 229
which are used as building blocks of our posture-based and action- 230
based graphs. 231
Fig. 1. The shadow boxing motions of several boxers were captured using an optical
motion capture system.
Here, we follow the definition from [25] , in which a motion is 232
considered to be a raw sequence of captured human movement, 233
and an action is considered to be a short, meaningful segment of 234
movement within a motion. In the field of boxing, an action can 235
be an attack (such as a “left straight”, “jab” or a “right kick”), a 236
defense (such as “parries”, “blocking” or “ducking”) , a transition 237
(such as “stepping to the left”, “stepping forward” or “back step”), 238
or any combination of them. 239
Postures and actions are good entities for skill visualization, as 240
sports players typically plan their strategies and evaluate their per- 241
formances with such terms. For example, a boxer typically thinks 242
about what sort of attack/defense/transition should be launched 243
during a match. A coach typically evaluates the overall strategy in 244
the action level, as well as how well individual postures and ac- 245
tions are performed. 246
3.1. Motion capture 247
Although it would be best to capture the motions of all players 248
in multi-player sports because the data would reflect the features 249
of the motions, capturing multiple players remains difficult due 250
to the occlusions and collisions among players. Fortunately, it is 251
possible to only capture individual motions for our purposes with- 252
out compromising the true motion characteristics. In boxing or any 253
other martial arts, there is a training practice called “shadow box- 254
ing”. The boxer imagines a boxing session with another boxer, and 255
launches boxing actions to interact with such an imaginary oppo- 256
nent. The boxer launches not only offensive actions such as punch- 257
ing, but also defence, stepping, and the consecutive combination of 258
all such actions. There are similar practice methods in basketball 259
and soccer as well, in which players use the ball to conduct var- 260
ious techniques in the court, imagining that their opponents are 261
trying to take the ball away from them. The players thus perform 262
various actions to keep the ball and trick an imaginary opponent. 263
This technique has also been used by coaches for skill assessment 264
hence is suitable for our analysis. We employed an optical motion 265
capture system to acquire the performed motion as shown in Fig. 1 266
as it was less intrusive and highly accurate. Also, we preferred to 267
capture long and continuous clips of motion, such that the player 268
could perform the motion in a natural manner. 269
3.2. Motion analysis 270
After data capture, the system automatically segments mean- 271
ingful actions from the raw captured motion, and identifies the ef- 272
Please cite this article as: Y. Shen et al., Posture-based and Action-based Graphs for Boxing Skill Visualization, Computers & Graphics
(2017), https://doi.org/10.1016/j.cag.2017.09.007
4 Y. Shen et al. / Computers & Graphics xxx (2017) xxx–xxx
ARTICLE IN PRESS
JID: CAG [m5G; October 16, 2017;22:13 ]
Fig. 2. Upper: The movement segment is defined as the period between two double support supporting phases. Middle: The activity segment is defined as the period with
high acceleration. Lower: The action is the combination of movement segment and activity segment.
fective joints that contribute the most to the semantic meaning of 273
the actions. 274
For boxing motions, we observed that actions normally start 275
and end in a double supporting state (i.e. both feet touching the 276
floor), as the state is usually dynamically stable. We detect such a 277
state by monitoring the feet height and velocity and setting corre- 278
sponding thresholds. This allows us to segment the raw captured 279
motion into a set of movement segments , which are the periods be- 280
tween every two successive double supporting states, as visualized 281
in Fig. 2 Upper. 282
We also observed that actions normally require a relatively 283
larger force to be performed, such as a punch or a step. We de- 284
fine periods with a high-level of force exertion as the activity seg- 285
me286
can287
is a288
is s289
290
me291
act292
sho293
rela294
the295
me296
me297
is o298
thi299
ity.300
ces301
con302
act303
ing304
me305
sho306
sup307
308
an 309
spe310
In 311
the312
ity 313
pun314
tip315
eva316
4. Posture-based graph 317
The posture-based graph focuses on evaluating the common 318
postures that are used to start and end actions. In such a graph, 319
the nodes represent similar postures and the edges represent sim- 320
ilar actions. It allows us to evaluate the consistency of common 321
postures and the diversity of actions. 322
4.1. Graph construction 323
We adopt a Fat Graph structure [6] in the action level [25] to 324
generate the posture-based graph, as it can effectively simplify the 325
graph representation by grouping similar postures and actions to- 326
get 27
and 28
alg 29
pur 30
4.1. 31
32
are 33
des 34
ing 35
avo 36
ica 37
two 38
D (P
wh 39
in 40
join 41
ma 42
Pl
(2
nts . Since force is proportional to acceleration, these segments
be found when the sum of squares of acceleration of all joints
bove a threshold, as visualized in Fig. 2 Middle. The threshold
tatistically obtained from the acceleration profile of the motion.
Finally, the actions are composed by using the movement seg-
nts as the building blocks. The timing and the duration of the
ivity segments are used to determine if the movement segments
uld be merged together to form longer segments. Regarding the
tionship of the movement segments and the activity segments,
re could be three possible cases: (1) There is no activity seg-
nt inside a movement segment. In this case, the movement seg-
nt becomes a single action of pure body transition. (2) There
ne activity segment inside a movement segment. In this case,
s movement segment becomes an action with a special activ-
(3) There are one or more activity segments lying across suc-
sive movement segments. In this case, the movement segments
taining activity segments at the border are merged to form an
ion as visualized in Fig. 2 Lower. Note that due to this merg-
process, the resulting action may contain multiple activity seg-
nts. In our system, we implement an optional step to filter very
rt actions that are likely to be generated due to the noise of the
porting feet.
We define the effective joints to be the set of joints to represent
activity segment. In case (1) above, since the actions contain no
cial activities, the pelvis is considered to be the effective joint.
case (2) and (3), the effective joint is the joint that contributes
most to the sum of squares of the acceleration in the activ-
segment. In more complicated actions such as left-right combo
ches, there may be multiple effective joints as there are mul-
le activity segments. Such joints are used in later processes to
luate the similarity of actions.
ple 43
int 44
we 45
per 46
init 47
firs
clu
cor
of s
the
ease cite this article as: Y. Shen et al., Posture-based and Action-based
017), https://doi.org/10.1016/j.cag.2017.09.007
her. The Fat Graph was originally proposed for motion synthesis, 3
thus it is not optimized for skill visualization. We redesign the 3
orithms to generate nodes and edges in the Fat Graph for our 3
pose. 3
1. Fat Nodes 3
In our system, the nodes of the Fat Graph, known as Fat Nodes, 3
the common starting or ending postures of the actions. We 3
ign an unsupervised clustering scheme for grouping all start- 3
/ending postures into a finite set of posture groups, which 3
ids additional labour for posture labelling and grouping. Specif- 3
lly, we used k-means to cluster postures. The distance between 3
postures P 0 and P 1 is defined as: 3
0 , P 1 ) =
i = i total ∑
i =0
| θ0 (i ) − θ1 (i ) | (1)
ere θ0 ( i ) and θ1 ( i ) represent the 3D joint angle of the joint i 3
posture P 0 and P 1 respectively, and i total is the total number of 3
ts. Regarding the cluster number k , a large k would result in 3
ny clusters (Fat Nodes), which unnecessarily increases the com- 3
xity of the graph. A small k will cluster very different postures 3
o the same node, defeating the purpose of the graph. Therefore, 3
set up a posture difference threshold empirically based on ex- 3
ts’ suggestions. Then, we iteratively search for a proper k by 3
ially setting k = 1 and incrementing k by 1 until we find the 3
t value of k that does not violate the distance threshold. After 348
stering, we use the mean posture of a group to represent the 349
responding Fat Node. The nodes in the graph represent the set 350
tandard postures which the player starts various action from. In 351
case of boxing, they are usually the fighting postures that the 352
Graphs for Boxing Skill Visualization, Computers & Graphics
Y. Shen et al. / Computers & Graphics xxx (2017) xxx–xxx 5
ARTICLE IN PRESS
JID: CAG [m5G; October 16, 2017;22:13 ]
th 353
354
a 355
n- 356
ce 357
a 358
er 359
or 360
as 361
to 362
363
364
i- 365
ge 366
at 367
368
s- 369
se 370
t- 371
is- 372
ed 373
n- 374
ut 375
376
ed 377
378
2)
ve 379
is 380
re 381
ve 382
be 383
he 384
385
c- 386
ok 387
t- 388
to 389
up 390
to 391
392
es 393
r- 394
ve 395
en 396
u- 397
ut 398
o- 399
ip 400
401
402
ss 403
ed 404
ng 405
406
Fat
a- 407
he 408
si- 409
r- 410
he 411
at 412
as. 413
ch 414
n- 415
ns 416
417
In 418
r- 419
ity 420
ot 421
th 422
423
3)
le- 424
lly 425
s- 426
he 427
e- 428
ly 429
ng 430
r- 431
an 432
433
434
vi- 435
so 436
th 437
438
es 439
at 440
n), 441
n- 442
es 443
h. 4 4 4
al 445
i- 446
at 447
on 448
449
boxer uses to guard his/her face against the opponent, with bo
feet landing on the ground and keeping shoulder width apart.
By evaluating the Fat Nodes alone, we can already tell if
boxer has multiple unnecessary standard postures, or if any sta
dard postures contain potential weakness. In general, experien
players have fewer Fat Nodes, such that they can start actions in
standard posture effectively without the needs of shifting to oth
ones. Novice players sometimes may have a particular Fat Node f
some particular actions. This is discouraged in boxing training
such postures hint the opponent as to what actions are going
be launched.
4.1.2. Fat Edges
We design the edges of a Fat Graph, known as Fat Edges, as d
rectional edges that represent groups of similar actions. Each ed
points from the Fat Node representing the starting posture to th
representing the ending posture.
Similar to the Fat Nodes, we implement an unsupervised clu
tering algorithm to group similar actions into Fat Edges. We u
k-means to cluster the actions and search for the smallest accep
able k for a given distance threshold. We define the actions d
tance according to the trajectory of the effective joints as explain
in Section 3.2 . This allows accurate clustering of actions and e
sures that the effects of the effective joints are not smoothed o
by other joints.
Formally, the distance between two actions A 0 and A 1 is defin
as
D (A 0 , A 1 ) =
⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨
⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩
∞
if A 0 and A 1 have different sequences of effective joints
j total ∑
j=0
f end ∑
f= f start
[ A 0 ( j)( f ) − A 1 ( j)( f )]
otherwise
(
where A 0 ( j )( f ) and A 1 ( j )( f ) represent the 3D positions of effecti
joint j in frame f in the action A − 0 and A 1 respectively, j total
the total number of effective joints in the actions, f start and f end a
the starting frame and ending frame of the considering effecti
joint. In case two effective joints with different duration are to
compared, the shorter one is linearly scaled to the duration of t
longer one.
In the field of boxing, a Fat Edge typically contains a set of a
tions with basic attacks or defences such as “straight punch”, “ho
punch”, “parry”, or a set of complex actions combining several a
tacks and defences. Since member actions in a Fat Edges have
share the same starting and ending Fat Nodes, if an action gro
contains multiple starting or ending poses, it is sub-divided in
multiple Fat Edges.
Again, by only looking at Fat Edges, one can tell the differenc
between experienced and novice players. Experienced players no
mally have Fat Edges with similar numbers of actions, as they ha
mastered a large variety of boxing actions and can switch betwe
them effectively using a small number of stable transition mane
vers. Novice boxers tend to have a larger number of Fat Edges b
each with a small number of actions, due to the inability to repr
duce boxing actions consistently. Figure 3 shows the relationsh
of Fat Nodes and Fat Edges.
4.2. The connectivity index
It requires deep knowledge and years of experience to asse
one’s skills in sports. Here, we make use of the posture-bas
graph and define an index representing the skill level, allowi
more objective and efficient skill assessment.
Please cite this article as: Y. Shen et al., Posture-based and Action-
(2017), https://doi.org/10.1016/j.cag.2017.09.007
Fig. 3. The Fat Node represents the standard fighting pose. The three outgoing
Edges represent different action groups.
In many types of sports, there are two important skill indic
tors. The first one is the richness of the actions that indicates t
resourcefulness of a player. The other is the flexibility of tran
tions between states so that the player can switch between diffe
ent states at will. Our posture-based graph captures both of t
indicators. The richness can be represented by the number of F
Edges, indicating how many kinds of maneuvers the player h
The flexibility is indicated by the connectivity of the graph, whi
is inversely proportional to the number of Fat Nodes. A fully co
nected graph shows great flexibility because there are transitio
between any two nodes.
Notice that these two factors are somehow contradicting.
general, the richer the actions are, the greater the number of diffe
ent starting and ending poses is hence the poorer the connectiv
of actions is. Independently considering either of them would n
suffice. We therefore define a Connectivity Index that evaluates bo
the action richness and the action flexibility of a player
CI =
Number of Fat Edges
Number of Fat Nodes (
To accurately reflect the skill level of a player, in our imp
mentation, we do not consider Fat Nodes that are not intentiona
created. For example, one of our boxers tripped over during a se
sion. While it is good that our system can objectively pick up t
posture generated by the accident, we do not include the corr
sponding Fat Nodes when calculating the Skill Index. Also, we on
consider Fat Edges that are consistently performed, as those havi
only a small number of member actions could be randomly pe
formed actions. Empirically, we consider edges having more th
2 member actions.
4.3. Visualization system
Here, we describe the design of our visualization system to
sualize the posture-based graph in an effective manner. We al
introduce interactive features for the user to view the graph wi
different levels of details.
The posture-based graph consists of high dimensional Fat Nod
(groups of similar postures of many degrees of freedom) and F
Edges (groups of similar actions in the spatial-temporal domai
which presents a challenge for visualization. To reduce the dime
sionality for better visualization, we propose two different schem
for nodes and edges due to their different nature in this grap
Specifically, we project the Fat Nodes on a 2D space using Princip
Component Analysis (PCA) as it creates a more consistent low d
mensional space compared with other methods. We represent F
Edges with 2D curves and augment the curves with a combinati
of geometric primitives to visualize the action features.
based Graphs for Boxing Skill Visualization, Computers & Graphics
6 Y. Shen et al. / Computers & Graphics xxx (2017) xxx–xxx
ARTICLE IN PRESS
JID: CAG [m5G; October 16, 2017;22:13 ]
es larger as the size of the nodes increases.
4.3450
451
hig452
dep453
res454
are455
far 456
bet457
458
tat459
ysi460
cul461
of 462
a c463
eig464
465
[14466
we467
wo468
469
Z p470
pos471
box472
pri473
num474
allo475
ally476
4.3477
478
info479
480
on 481
wo482
Edg483
pla484
occ485
ing486
the487
cur488
edg489
sig490
wo491
492
som493
ene494
usi495
ene496
Fig.
repr
with
sho
tur 97
nor 98
val 99
lan 00
per 01
obt 02
03
edg 04
tify 05
hea 06
wil 07
rela 08
tac 09
10
Rig 11
vis 12
itiv 13
com 14
4.3 15
16
pla 17
lec 18
sho 19
20
be 21
ber 22
ber 23
Edg 24
Pl
(2
Fig. 4. From left to right, the character becom
.1. Visualizing Fat Nodes
Although the degree of freedom (DOF) of human postures are in
h dimensionality (45 DOF in our system), they are intrinsically
endent on each other [14] . In fact, the Fat Nodes can be rep-
ented effectively in a 2D space where nodes of similar postures
located together while those of different postures are located
apart. This allows viewers to easily understand the relationship
ween postures.
For each Fat Node, we obtain the mean posture as its represen-
ion. Given a set of postures, we apply principal component anal-
s (PCA) to reduce the dimensionality to 2. Essentially, we cal-
ate the covariance matrix to evaluate the intrinsic dependency
the dimensions. We then calculate the eigenvectors from such
ovariance matrix, and use the two eigenvectors with largest
envalues to form a feature vector.
PCA is used as it has shown to be effective on human postures
] . However, since we only have a small number of postures,
believe other dimensionality reduction techniques would also
rk.
We render the mean posture of each Fat Node onto a 2D X-
lan. This allows the user to identify inappropriately performed
tures. In boxing, novice boxers sometimes lose track of their
ing rhythm, and hence start or end an action with an inappro-
ate posture. We use the fatness of the character to represent the
ber of member postures in the node, as shown in Fig. 4 . This
ws the user to easily observe the postures that the player usu-
uses to start actions.
.2. Visualizing Fat Edges
Here, we explain how to visualize the Fat Edges, which contain
rmation of groups of similar actions.
We do not apply dimensionality reduction techniques directly
the action data itself because the low dimensional projection
uld be very complex. Instead, we propose to visualize each Fat
e by a 2D curve that represents its mean action on the X –Z
ne. We optimize the angle and sign of these curves to minimize
lusion. For edges with a starting node different from the end-
node, the edge angle is fixed. The only adjustable variable is
bending side of the curves, which is essentially the sign of the
ves. For those with the same starting and ending node, both
e angle and bending side can be controlled. We optimize the
ns and angles of the edges in a greedy manner such that they
uld blend towards a less dense region of the graph.
To visually distinguish between different Fat Edges, we add
e geometric patterns to the 2D curves. We collect the high-
rgy frames of all actions and project them onto a 1D space
ng the PCA system explained in Section 4.3.1 . Since the high-
rgy frames of different actions are typically distinguishing pos-
ease cite this article as: Y. Shen et al., Posture-based and Action-based
017), https://doi.org/10.1016/j.cag.2017.09.007
5. The geometric patterns for landmark values between -1 and 1. Each pattern
esents a landmark posture in an action. (Lower) Comparison of visualization
out/with the patterns. Each curve represents a group of action. The right image
ws the uses of landmark patterns to identify different types of action.
es, the projection essentially maps all action features onto a 4
malized 1D space in the range of [ −1 . 0 , 1 . 0] . To visualize the 4
ue in this 1D space, we design some geometric patterns for 4
dmark values -1.0, -0.5, 0.0, 0.5 and 1.0 as shown in Fig. 5 Up- 5
. The patterns to represent values between two landmarks are 5
ained by linear interpolation between nearby landmarks. 5
We further represent the number of member actions in the 5
e by the thickness of the curve. This allows the user to iden- 5
the player’s preferred actions. For instance, if a boxer relies 5
vily on single straight punches, the Fat Edge for such action 5
l be unreasonably thick, while edges for other attacks will be 5
tively thin, which demonstrates a potential lack of diversity at- 5
king strategies. 5
Through the comparison between Fig. 5 Lower Left and Lower 5
ht, it shows that adding the geometric patterns gives a better 5
ualization of actions in the edges. This strategy presents an intu- 5
e way to show the players preferences over actions of different 5
plexity. 5
.3. Interactive features 5
We integrate some interactive features in our system to dis- 5
y relevant information based on user input. When the user se- 5
ts any specific entities in the graph, related information will be 5
wn. 5
When a Fat Node is selected, its corresponding Fat Edges will 5
highlighted for easier observation. Information about the num- 5
of members in that node, number of outgoing edges, and num- 5
of incoming edges are displayed in a sub window. When a Fat 5
e is selected or highlighted (because of a Fat node selection), 5
Graphs for Boxing Skill Visualization, Computers & Graphics
Y. Shen et al. / Computers & Graphics xxx (2017) xxx–xxx 7
ARTICLE IN PRESS
JID: CAG [m5G; October 16, 2017;22:13 ]
Fig. 6. The posture-based graph of the Boxer S. 1, 2 and 3 are Fat Nodes. 4 and 5
are two Fat Edges. 4 connects Node 2 and Node 3. 5 connects Node1 to itself.
we render the member actions included, such that the user can 525
understand the content of the edge. 526
As an example, in Fig. 6 , there are three Fat Nodes indicated by 527
red arrows and numbered as 1, 2 and 3, each visualized as a char- 528
acter with a mean posture in the node. The sizes of the nodes are 529
indicated by the body fatness. Node 1 is represented by the most 530
muscular character, which indicates the largest node size. Nodes 2 531
and 3 are far thinner. Fat Edges are rendered as curves between 532
nodes such as the ones shown by 4 and 5. The thicknesses of the 533
er 534
n. 535
th 536
nt 537
of 538
r- 539
1. 540
a 541
542
543
on 544
he 545
ts. 546
on 547
ns 548
549
550
p- 551
u- 552
en 553
e 554
ze 555
556
e 557
c- 558
at 559
rt 560
es 561
of 562
ch 563
nd 564
565
m 566
ty, 567
nd 568
up 569
ed 570
571
Fig. 7. The three HMM nodes represent action groups. The HMM edges represent
transitional probability between them.
is defined as 572
4)
on 573
se, 574
575
In 576
ed 577
re 578
he 579
an 580
nd 581
b- 582
o- 583
en 584
585
586
- 587
eir 588
xt 589
on 590
ed 591
ce 592
is 593
es 594
in 595
a- 596
a- 597
ex , 598
in 599
ng 600
tal 601
es 602
es. 603
of 604
605
5)
SI 606
e- 607
edges indicate the frequency of the actions taken. Edge 5 is thick
than edge 4, suggesting that this boxer takes action 5 more ofte
In addition, an edge can be smooth like a circle or bumpy wi
geometric patterns. A single pattern means one activity segme
such as a single punch, while multiple patterns indicate a series
activities such as a combo attack. Our system also supports inte
active features. Fig. 6 is the result when the user selects Node
All the edges starting from this node are highlighted, each with
small character performing the action on it.
5. Action-based graph
The action-based graph focuses on evaluating the transiti
probability from one action class to another. In such a graph, t
nodes represent groups of action with similar activity segmen
The edges represent the transition probability between two acti
groups. It allows us to evaluate the pattern of launching actio
and extract the strategy of the boxer.
5.1. Graph construction
We use the hidden Markov model (HMM) to organize the ca
tured motion, as it has been shown effective in modelling h
man motion. In the domain of character animation, HMM has be
mostly used in the posture level to create motion graphs [12] . W
adapt the graph into the action level such that we can visuali
the transition probability among actions.
The nodes of the graph represent different action groups. W
apply Eq. (2) to group the captured actions into a number of a
tion groups with k-means clustering. The process is similar to th
in Section 4.1 , in which we define a threshold based on expe
knowledge, and then incrementally increase the number of class
until the threshold is met. We denote k ′ as the total number
groups, | G i | as the number of actions in the i th action group (whi
is used in the visualization system for visualizing the fatness a
the placement of the node and will be described later).
The edges of the graph represent transitional probability fro
one action group to another. To obtain the transitional probabili
we go through the sequence of actions in the captured motion a
count the number of occurrences for an action belonging to gro
i to be followed by another belonging to group j , which is denot
as c ij . The transition probability of action group i to action group j 608
Please cite this article as: Y. Shen et al., Posture-based and Action-
(2017), https://doi.org/10.1016/j.cag.2017.09.007
T i j =
c i j ∑ k ′ m =1
∑ k ′ n =1 c mn
(
where the denominator represents the total number of transiti
in the whole motion. Notice that i may be equal to j . In such a ca
two actions of the same action group are launched successively.
The concept of the action-based graph is shown in Fig. 7 .
general, experienced boxers tend to have a more evenly distribut
transitional probability across all actions, which means that the
should be edges connecting all the nodes. This indicates that t
boxer’s pattern is dynamic and cannot be easily predicted by
opponent. Conversely, novice boxers may have limited edges a
some thick edges connecting two nodes, which means a high pro
ability to launch those two groups action consecutively. An opp
nent may discover such a pattern and counter-act in advance wh
the first action is observed.
5.2. The action strategy index
In many sports, the unpredictability of action patterns is an im
portant skill indicator. Experienced players would diversify th
action patterns such that their opponents cannot predict the ne
action. However, novice players tend to perform actions based
predictable patterns (i.e. the sequence of actions to be launch
continuously), which can be easily identified. For example, a novi
boxer usually perform two straight punches successively. This
because the boxer is not able to link different types of punch
fluently, and therefore would perform the simplest punches aga
and again. The proposed action-level graph allows easy observ
tion of boxing patterns, as we can visualize the transitional prob
bility among actions. We further propose the Action Strategy Ind
which evaluates the unpredictability of action pattern. We obta
the number of outgoing HMM edges for each HMM node, formi
a set that is denoted as e = { e i } ∀ i ∈ [1 , k ′ ] , where k ′ is the to
number of HMM nodes. Skillful players would have similar valu
in the e set, while novice players would have very different valu
We therefore define the Action Strategy Index as the precision
e , that is, the reciprocal of its standard deviation
ASI =
1
σ (e ) (
where σ represents the standard deviation operator. A high A
value indicate that the player’s action patterns are more unpr
dictable, which indicates a higher skill level.
based Graphs for Boxing Skill Visualization, Computers & Graphics
8 Y. Shen et al. / Computers & Graphics xxx (2017) xxx–xxx
ARTICLE IN PRESS
JID: CAG [m5G; October 16, 2017;22:13 ]
Fig. shold as (a) 0, (b) 1 and (c) 2. The red shade indicates the inner circle covering nodes
of t e rare class.
5.3609
610
gra611
tion612
eva613
5.3614
615
act616
the617
usi618
act619
usi620
nod621
622
esp623
pea624
wh625
ten626
int627
wh628
in 629
res630
gen631
632
circ633
tha634
on.635
res636
sm637
pla638
is t639
imp640
opt641
con642
exi643
to 644
645
cla646
sem647
tra648
5.3649
650
the651
cul652
the653
als654
see655
Fig.
freq
belo
is 56
edg 57
lau 58
are 59
to 60
ticu 61
5.3 62
63
can 64
act 65
an 66
HM 67
dis 68
Thi 69
wit 70
of 71
ber 72
73
the 74
sua 75
per 76
ing 77
wit 78
wh 79
con 80
the 81
fre 82
and 83
the 84
dia 85
6. 86
87
the 88
Pl
(2
8. Action-based graphs of the same boxer generated by setting the frequency thre
he frequent class, and the blue shade indicate the outer circle covering nodes of th
. Visualization system
Here, we explain the visualization system for the action-level
ph. The system allows easy observation of the preference of ac-
and the boxing pattern. Both are very important aspects to
luate the high-level strategy of a boxer.
.1. Visualizing HMM Nodes
Each action group is represented by its corresponding median
ion, which is the action that is the closest to the mean value of
action group during k-means clustering. We render the nodes
ng human characters with the starting posture of the median
ion. The number of actions in each action group is visualized
ng the fatness of the corresponding character. The color of the
es are randomized.
As mentioned in Section 4.1.2 , we observe that some boxers,
ecially novices, may produce random actions that are not re-
table. Such actions may generate a large number of thin nodes,
ich distract the user from evaluating the actions that are of-
launched. Therefore, we classify the action groups with | G i | > a
o the frequent class , and groups with | G i | ≤ a into the rare class ,
ere G i is the number of member actions in a node as defined
Section 5.1 , a is a preset frequency threshold. Fig. 8 shows the
ult of setting different values of a . We find that setting a = 2
erates the best results.
We place the nodes belonging to the frequent class at an inner
le, and those belonging to the rare class at an outer circle, such
t the user can identify them easily and decide what to focus
For the inner circle, nodes are ordered according to the cor-
ponding value of | G i |, and are placed evenly at a circle with a
aller radius. For the outer circle, to minimize edge crossing, we
ce the nodes at a position on a circle with a larger radius that
he closest to the nodes with incoming and outgoing edges. To
lement this, we develop a simple optimization algorithm that
imizes the position of the nodes. During the optimization, we
strain the position to be at the circle and not overlapping with
sting nodes. We then minimize the sum of distance with respect
the nodes connecting to the current one.
By default, we render the HMM node belonging to the frequent
ss with solid colors, and those belonging to the rare class in
i-transparent colors. This further avoids the user being dis-
cted by the rarely performed actions.
.2. Visualizing HMM Edges
We visualize the edges using 2D curves. While we can render
edges with straight lines, the resultant group would be diffi-
t to observe as the lines overlap significantly. We augmented
edges with a small random curvature to solve the problem. We
o render the edges as semi-transparent such that the users can
through partially overlapped edges. The thickness of the edge
ease cite this article as: Y. Shen et al., Posture-based and Action-based
017), https://doi.org/10.1016/j.cag.2017.09.007
9. The action-based graph of the Boxer S. 1, 2 are HMM nodes belonging to the
uent class. 3, 4 are outgoing HMM edges from the node 1. 5, 6 are HMM nodes
nging to the rare class.
proportional to T ij calculated in Eq. (4) . As a result, a thicker 6
e connecting node i to node j indicates that the boxer often 6
nches action group j after action group i . The color of the edges 6
decided based on that of the source node. This helps the user 6
identify which action groups the boxer may launch after a par- 6
lar one. 6
.3. Interactive features 6
We also implement some interactive features such that the user 6
select what to view. The most important component of the 6
ion-based graph is the action itself. Therefore, we implement 6
interactive system such that when a user clicks on a particular 6
M node, the median action of the corresponding action group is 6
played. We also highlight the outgoing edges from such a node. 6
s allows the user to examine individual action group together 6
h the transition probability to the next groups. The information 6
the node, such as the number of member actions and the num- 6
of out-going HMM edges, are displayed on a separate window. 6
As an example, in Fig. 9 , there are 5 HMM nodes belonging to 6
frequent class including node 1 and 2. These nodes are vi- 6
lized with more muscular characters, meaning that the boxer 6
forms them more frequently. There are 3 HMM nodes belong- 6
to the rare class including node 5 and 6, which are visualized 6
h thinner characters. Node 1 has 5 outgoing HMM edges, in 6
ich edge 3 point towards another node, while edge 4 is a self- 6
necting edge. Edge 4 is thicker than the others, indicating that 6
boxer performs successive actions belonging to node 1 very 6
quently. The screen is captured when the user selects node 1, 6
as a result, all outgoing edges of node 1 are highlighted, and 6
character representing node 1 performs the corresponding me- 6
n action. 6
Experimental results 6
In this section, we present experimental results. We captured 6
motions of four boxers with varying skill levels. We first give 6
Graphs for Boxing Skill Visualization, Computers & Graphics
Y. Shen et al. / Computers & Graphics xxx (2017) xxx–xxx 9
ARTICLE IN PRESS
JID: CAG [m5G; October 16, 2017;22:13 ]
n-bas
s, 689
es. 690
o- 691
lt 692
p- 693
694
nd 695
ch 696
Fig. 10. (Left) The posture-based graphs and (Right) the actio
detailed motion analysis and visualization of individual motion
and then compare them side by side using the proposed index
This demonstrates that our system is an effective tool for m
tion analysis, skill assessment and comparisons. As it is difficu
Please cite this article as: Y. Shen et al., Posture-based and Action-
(2017), https://doi.org/10.1016/j.cag.2017.09.007
ed graphs for boxer N, M1, M2 and S (top to bottom), respectively.
to show the motions in pictures, we refer the readers to the su
plementary video for more details.
The four boxers chosen have different skill levels. As a grou
truth, their skills were evaluated by a professional boxing coa
based Graphs for Boxing Skill Visualization, Computers & Graphics
10 Y. Shen et al. / Computers & Graphics xxx (2017) xxx–xxx
ARTICLE IN PRESS
JID: CAG [m5G; October 16, 2017;22:13 ]
as skillful, medium, medium and novice respectively, and were de- 697
noted as S, M1, M2 and N. 698
6.1. Boxer evaluation 699
The boxers’ posture-based and action-based graphs are shown 700
in Fig. 10 , in which letter annotations are given to help explain the 701
graphs. These graphs allow the users to assess boxing skills even if 702
they are not familiar with boxing. 703
6.2. Boxer S 704
The first row of images in Fig. 10 shows the graphs of boxer 705
S. T706
sta707
box708
sta709
fro710
arm711
tur712
the713
and714
715
act716
(b)717
con718
gro719
the720
me721
and722
fre723
fre724
a s725
att726
ma727
the728
6.3729
730
box731
tion732
som733
bot734
the735
mo736
foo737
the738
mo739
per740
rien741
742
few743
par744
M1745
act746
als747
fre748
act749
att750
of 751
pre752
6.4753
754
pos755
Table 1
Statistics of the boxing motions. SL: Skill Level evaluated
by a professional boxing coach. PN: Posture Number (for
starting and ending actions). AN: Action Number.
Boxer S Boxer M1 Boxer M2 Boxer N
SL Skillful Medium Medium Novice
PN 138 160 112 176
AN 69 80 56 88
ever, a number of secondary postures (b) –(d). These postures are 756
all performed sub-optimally with his arms not guarding the head, 757
and should be avoided. Looking closely to the edges (f) going to 758
posture (c), we can find that the posture is performed as a subtle 759
movement to prepare various left punches. This should be avoid 760
as the opponent can tell the moves whenever seeing such a pos- 761
ture. Postures (g) and (h) are very different from the rest, and are 762
geometrically far from the other postures. These two postures are 763
performed because the boxer unintentionally raises the arms dur- 764
ing the capture. Our system can pick up the mistake and visualize 765
it i 66
67
the 68
larg 69
Thi 70
tion 71
Dif 72
nod 73
ble 74
Thi 75
pun 76
pun 77
ing 78
and 79
6.5 80
81
ma 82
larg 83
tur 84
tha 85
tur 86
tur 87
to 88
wit 89
due 90
wid 91
92
the 93
cla 94
con 95
com 96
ma 97
nec 98
tion 99
the 00
the 01
mo 02
and 03
an 04
6.6 05
06
Pl
(2
he posture-based graphs shows a main standard posture (a) to
rt and end actions, which is good for boxing as it allows the
er to transit from one action to another effectively through the
ndard posture. A large variety of actions (b) can be produced
m such a posture. There is a secondary posture in which the
s are further apart (c). This should be avoided as such a pos-
e is weak in blocking attacks. Posture (d) is generated because
boxer trips over during the training. Our system can pick up
visualize such a mistake accurately.
The action-based graph of boxer S shows there there are many
ions in the frequent group (a) and only a few in the rare group
. This shows that the boxer is experienced and his actions are
sistent. There is a major movement action (c) in the frequent
up (a), and such an action has good connections to many of
others. This is good as experienced boxers typically use move-
nt actions to adjust their position relative to their opponent,
launch attacks when the time is right. Other actions in the
quent group (a) are variations of attacks. For example, the more
quently used action (d) is a right-left combo and action (e) is
ingle right punch, which show that the boxer tends to start an
ack with the right punch. It is good to see that attacking actions
y connect to each other, which enhance the unpredictability of
boxer.
. Boxer M1
Next, we evaluate the posture-based graph of boxer M1. The
er has a main standard posture (a) to launch most of the ac-
s (b). However, he has a secondary posture (c) for launching
e attacks, and another (d) for launching a turning action. In
h postures, the arms are in a low position and cannot guard
boxer well from the opponent. More importantly, the relatively
re frequently used secondary posture (c) is performed with the
t distance much wider than the shoulder width. This means
boxer has limited mobility in this posture, as the legs must
ve towards each other before another stepping action can be
formed. These observations show that the boxer is not as expe-
ced and consistent as boxer S.
The corresponding action-based graph shows that there are
er frequent class actions (a) but more rare class ones (b) com-
ed to boxer S. This means that that the boxing action of boxer
is less consistent. The boxer has a large number of movement
ions (c) that are connected to all the rest of the action nodes. He
o has a variety of attack actions as shown in other actions in the
quent class (a). In particular, action (d) is a left-right combo and
ion (e) is a left punch, showing that the boxer tends to start an
ack with the left punch. Overall, there is an acceptable number
connections among attacks, demonstrating the acceptable un-
dictability of the boxer.
. Boxer M2
For boxer M2’s posture-based graph, there is a main standard
ture (a) launching the majority of actions (b). There are, how-
ease cite this article as: Y. Shen et al., Posture-based and Action-based
017), https://doi.org/10.1016/j.cag.2017.09.007
n the graph. 7
From boxer M2’s action-based graph, it can be observed that 7
re are relatively fewer actions in the frequent class (a), but a 7
e number of actions in the rare class combining (b) and (c). 7
s shows that the boxer is quite inconsistent in the boxing ac- 7
s, and could be because of the lack of training and experience. 7
ferent from the boxers discussed, boxer M2 has a largest action 7
e (d) of left punch. The second largest action node (e) is a dou- 7
left punch. The movement action node (f) is relatively small. 7
s shows that boxer M2 has a different boxing style to use left 7
ch as a major action to connect to other actions and his left 7
ch is dominant. Such a boxing style is not advised as a punch- 7
action, comparing to a movement one, consume more energy 7
expose a larger risk of being attacked. 7
. Boxer N 7
In the posture-based graph of the novice boxer N, there are two 7
jor standard postures (a) and (b) instead of one. There are a 7
e number of self-connecting actions (c) and (d) for both pos- 7
es, as well as a lot of actions (e) connecting the two. This shows 7
t the boxer is highly inconsistent in the boxing postures. Pos- 7
e (a), the more relatively frequently used one, is inferior to pos- 7
e (b), due to its wider foot distance. It does not allow the boxer 7
step freely. Posture (f), (g) and (h) are all secondary postures 7
h different posture variations. They are all not well performed 7
to the low arm positions limiting blocking capability, and the 7
e foot width limiting movement capability. 7
The corresponding action-based graph shows some actions in 7
frequent class (a) but a large number of actions in the rare 7
ss (b). This means that the novice boxer cannot perform actions 7
sistently. The action in the rare class (b) are mainly very long 7
bo that are randomly combined and cannot be reproduced. The 7
in action (c) is a movement action. Such an action cannot con- 7
t to a number of others in the rare class (b), and many ac- 7
s in the rare class (b) are not well connected. This means that 7
boxer’s action is more predictable, which is bad in a match as 8
opponent can guess what the boxer may launch next. The two 8
re frequently used attack action (d) and (e) are left-right combo 8
left punch respectively, showing that the boxer tends to start 8
attack with a left punch. 8
. Statistical analysis 8
Here, we give some statistics about the proposed system. 8
Graphs for Boxing Skill Visualization, Computers & Graphics
Y. Shen et al. / Computers & Graphics xxx (2017) xxx–xxx 11
ARTICLE IN PRESS
JID: CAG [m5G; October 16, 2017;22:13 ]
Table 2
Statistics of the boxing motions. FNN: Fat Node Number
(brackets show numbers after removing accidentally cre-
ated nodes). FEN: Fat Edge Number (brackets show num-
bers of consistently performed edges). CI: Connectivity In-
dex.
Boxer S Boxer M1 Boxer M2 Boxer N
FNN 3 (2) 6 (4) 3 (3) 5 (5)
FEN 20 (10) 36 (12) 16 (7) 57 (8)
CI 5.0 3.0 2.3 1.6
Table 3
Statistics of the boxing motions in the Aciton Graphs. NN:
Node Number. NNFC: Node Number for Frequent Class.
NNRC: Node Number for Rare Class. EN: Edge Number. ASI:
Action Strategy Index.
Boxer S Boxer M1 Boxer M2 Boxer N
ng 807
of 808
809
h, 810
he 811
he 812
th 813
c- 814
815
h, 816
ed 817
c- 818
al- 819
er, 820
in, 821
ve 822
823
s- 824
of 825
a- 826
- 827
ri- 828
ns 829
k 830
nd 831
e 832
in 833
834
835
el 836
a- 837
ed 838
s- 839
is 840
on 841
ity 842
s 843
si- 844
m 845
il- 846
p- 847
tured from 4 boxers with varying skill levels. The evaluations from 848
our system aligns with that of a professional boxing coach. 849
Although we use boxing as our target sport in the experimenta- 850
tion section, the underpinning theoretical development can be ap- 851
plied to most sports that require swiftness, flexibility and creativ- 852
ity, such as tennis, fencing and basketball. The adaptation of the 853
proposed system to these sports and the comparison of the system 854
performances on different sports remain as future work. 855
We focus on analyzing the skill level of the boxers in terms of 856
high-level motion behaviour such as the richness of the action, the 857
transition of action and the unpredictability of boxing patterns. We 858
do not evaluate the lower-level parameters such as the speed of 859
the punches, which has been explored in previous works. It is an 860
interesting future direction to combine both high-level and low- 861
level evaluation in order to have a full assessment of the boxers. 862
There are limitations to our method. First, our method is based 863
on the assumption that the sports skills mainly consist of a finite 864
number of key postures and key actions. Admittedly, not all sports 865
follow this pattern. Second, the visualization and skill assessment 866
is based on an individual athlete, not considering skills related to 867
collaborations such as those in group sports, in which the assess- 868
ment might need to employ different criteria. 869
We argue that novice boxers tend to have different posture- 870
based graphs, while experienced boxers tend to have graphs of 871
a similar topology. This is because unlike experience boxers who 872
have only 1 to 2 main postures nodes, novice boxers tend to have 873
more nodes, resulting in a much larger variation on the graph 874
topology. As a future work, we would like to utilize the stem to 875
evaluate a large number of boxers in different skill levels to verify 876
this argument. 877
he 878
a- 879
ell 880
it 881
et, 882
w 883
as 884
In 885
ke 886
x- 887
ill 888
is 889
e 890
891
892
si- 893
he 894
895
896
be 897
898
899
all 900 m- 901 02. 902
903 eal 904 ual 905
906 ny- 907 10. 908
909
NN 7 11 9 16
NNFC 4 3 4 5
NNRC 3 8 5 11
EN 16 27 20 38
ASI 0.572 0.448 0.426 0.378
Table 1 shows the skill level assessed by a professional boxi
coach, as well as the number of postures and actions, for each
the boxers considered.
Table 2 shows the statistics related to the posture-based grap
including the number of fat nodes and fat edges, as well as t
Connectivity Index calculated with Eq. (3) . The index evaluates t
richness of actions and the flexibility of transitions. It aligns wi
the boxers’ skill level and more skillful boxers have higher Conne
tivity Indexes.
Table 3 shows the statistics related to the action-based grap
including the number of HMM nodes (which is further separat
into the number for the frequent class and the rare class respe
tively) and HMM edges, as well as the Action Strategy Index c
culated with Eq. (5) . It indicates the unpredictability of a box
and more skillful boxers are generally more unpredictable. Aga
it aligns with the boxer’ skill level and more skillful boxers ha
higher Action Strategy Indexes.
In terms of the computational cost, we run the proposed sy
tem on a laptop computer with a Core i7-6820HQ CPU, 16GB
RAM and a NVIDIA Quadro M10 0 0M graphic card. The comput
tional time to analyze the captured motion ( Section 3.2 ) and com
puting the graphs ( Sections 4 and 5 ) ranges from 6 to 9 s. The va
ation of computational time is mainly due to the iterative k-mea
clustering algorithm for both postures and actions, as a larger
requires longer computational time. The run-time cost is low a
we achieve frame rate higher than real-time (i.e. 60Hz). The fram
rate tends to be lower when there are more characters shown
the graphs.
7. Conclusion and discussions
In this paper, we proposed a method to visualize the high-lev
skills of boxers using an automatic motion analysis and visualiz
tion framework. The proposed posture-based graph is a customiz
Fat Graph that allows us to evaluate the quality of standard po
tures for launching and finishing actions. The action-based graph
a customized Hidden Markov Model that visualizes the transiti
probability among actions. We further introduce the Connectiv
Index that is deduced from the posture-based graph and allow
evaluation of the richness of actions and the flexibility of tran
tions, as well as the Action Strategy Index that is deduced fro
the action-based graph and allows evaluation of the unpredictab
ity of action patterns. The system is applied on the motion ca
Please cite this article as: Y. Shen et al., Posture-based and Action-
(2017), https://doi.org/10.1016/j.cag.2017.09.007
In the future, we wish to extend the proposed algorithm to t
field of computer animation. Currently, when synthesizing anim
tions by motion graphs, experienced animators are required to t
what motions are missed or badly captured. With our system,
is possible to analyze the connectivity and variety of a motion s
which are two critical factors in motion synthesis. However, ho
to generalize these findings to give high-level suggestion, such
proposing the motions to capture, remains an open problem.
addition, we would like to develop a visualization system to ta
the adversarial nature of sports. For instance, although two bo
ers might have roughly the same skill level, in a match, one’s sk
composition might give him/her advantages over the other. Th
kind of analysis would be very useful in preparation for a gam
or predicting the result.
Acknowledg ment
This work was supported in part by the Engineering and Phy
cal Sciences Research Council (EPSRC) (Ref: EP/M002632/1 ) and t
Royal Society (Ref: IE160609).
Supplementary material
Supplementary material associated with this article can
found, in the online version, at 10.1016/j.cag.2017.09.007
References
[1] Komura T, Kuroda A, Shinagawa Y. Nicemeetvr: facing professional basebpitchers in the virtual batting cage. In: Proceedings of the 2002 ACM sy
posium on applied computing. In: SAC ’02. New York, NY, USA: ACM; 20p. 1060–5. ISBN 1-58113-445-2. doi: 10.1145/508791.5090 0 0 .
[2] Bideau B, Kulpa R, Menardais S, Fradet L, Multon F, Delamarche P, et al. R
handball goalkeeper vs. virtual handball thrower. Presence: Teleoper VirtEnviron 2003;12(4):411–21.
[3] Molet T, Aubel A, Capin T, Carion S, Lee E, Magnenat-Thalmann N, et al. Aone for tennis? Presence: Teleoper Virtual Environ 1999;8(2):140–56. doi:
1162/105474699566134 .
based Graphs for Boxing Skill Visualization, Computers & Graphics
12 Y. Shen et al. / Computers & Graphics xxx (2017) xxx–xxx
ARTICLE IN PRESS
JID: CAG [m5G; October 16, 2017;22:13 ]
[4] Hyun K, Lee K, Lee J. Motion grammars for character animation. Comput Graph 910 Forum 2016;35(2):103–13. doi: 10.1111/cgf.12815 . 911
[5] Wampler K, Andersen E, Herbst E, Lee Y, Popovi ́c Z. Character animation 912 in two-player adversarial games. ACM Trans Graph 2010;29(3):26-1–26-13. 913 doi: 10.1145/1805964.1805970 . 914
[6] Shin HJ, Oh HS. Fat graphs: constructing an interactive character with contin- Q3 915 uous controls. Proceedings of ACM SIGGRAPH / Eurographics Symposium on 916 computer animation; 2006. p. 291–8. 917
[7] Rabiner LR. A tutorial on hidden Markov models and selected applications in 918 speech recognition. Proc IEEE 1989;77(2):257–86. doi: 10.1109/5.18626 . 919
[8] Shum HPH, Wang H, Ho ESL, Komura T. Skillvis: a visualization tool for box- 920 ing skill assessment. In: Proceedings of the 2016 international conference on 921 motion in games. In: MIG ’16. New York, NY, USA: ACM; 2016. p. 145–53. ISBN 922 978-1-4503-4592-7. doi: 10.1145/2994258.2994266 . 923
[9] Yeadon M. The simulation of aerial movement- iv. a computer simulation 924 model. J Biomech 1990;23(1):85–9. 925
[10] Yeadon M. The biomechanics of the human in flight. Am J Sports Med 926 1997;25(4):575–80. 927
[11] Arikan O, Forsyth D. Motion generation from examples. ACM Trans Graph 928 2002;21(3):483–90. 929
[12] Lee J, Chai J, Reitsma PSA, Hodgins JK, Pollard NS. Interactive control of avatars 930 animated with human motion data. ACM Trans Graph 2002;21(3):491–500. 931
[13] Kovar L, Gleicher M, Pighin F. Motion graphs. ACM Trans Graph 932 2002;21(3):473–82. 933
[14] Grochow K, Martin SL, Hertzmann A, Popovic’ Z. Style-based inverse kinemat- 934 ics. ACM Trans Graph (TOG) 2004;22(3). 935
[15] Shin HJ, Lee J. Motion synthesis and editing in low-dimensional spaces. Com- 936 put Animat Virtual Worlds (Special Issue: CASA 2006) 2006;17(3–4):219–27. 937
[16] Min J, Chai J. Motion graphs++: a compact generative model for semantic mo- 938 tion analysis and synthesis. ACM Trans Graph 2012;31(6):153. 939
[17] Beaudoin P, Coros S, van de Panne M, Poulin P. Motion-motif graphs. In: Pro- 940 ceedings of the 2008 ACM SIGGRAPH/Eurographics symposium on computer 941 animation. In: SCA ’08. Aire-la-Ville, Switzerland, Switzerland: Eurographics 942 Association; 2008. p. 117–26. ISBN 978-3-905674-10-1 943
[18] Arikan O, Forsyth DA, O’Brien JF. Motion synthesis from annotations. ACM 944 Trans Graph 2003;22(3):402–8. doi: 10.1145/882262.882284 . 945
[19] Li Y, Wang T, Shum H-Y. Motion texture: a two-level statistical model for 946 character motion synthesis. ACM Trans Graph 2002;21(3):465–72. doi: 10.1145/ 947 566654.566604 . 948
[20] Reitsma PSA, Pollard NS. Evaluating motion graphs for character animation. 949 ACM Trans Graph 2007;26(4). doi: 10.1145/1289603.1289609 . 950
[21] Heck R, Gleicher M. Parametric motion graphs. In: Proceedings of the 2007 951 symposium on interactive 3D graphics and games. In: I3D ’07. New York, NY, 952 USA: ACM; 2007. p. 129–36. ISBN 978-1-59593-628-8. doi: 10.1145/1230100. 953 1230123 . 954
[22] Xia S, Wang C, Chai J, Hodgins J. Realtime style transfer for unlabeled heteroge- 955 neous human motion. ACM Trans Graph 2015;34(4):119-1–119-10. doi: 10.1145/ 956 2766999 . 957
[23] Safonova A, Hodgins JK. Construction and optimal search of interpolated mo- 958 tion graphs. ACM Trans Graph 2007;26(3). doi: 10.1145/1276377.1276510 . 959
[24] Shum HPH, Komura T, Shiraishi M, Yamazaki S. Interaction patches for multi- 960 character animation. ACM Trans Graph 2008;27(5):114-1–114-8. doi: 10.1145/ 961 1409060.1409067 . 962
[25] Shum HPH, Komura T, Yamazaki S. Simulating multiple character interac- 963 tions with collaborative and adversarial goals. IEEE Trans Vis Comput Graph 964 2012;18(5):741–52. doi: 10.1109/TVCG.2010.257 . 965
[26] Ho ESL, Komura T. A finite state machine based on topology coordinates for 966 wrestling games. Comput Animat Virtual Worlds 2011;22(5):435–43. doi: 10. 967 1002/cav.376 . 968
[27] Ho ES, Komura T. Character motion synthesis by topology coordinates. Comput 969 Graph Forum 2009. doi: 10.1111/j.1467-8659.2009.01369.x . 970
[28] Plantard P, Shum HPH, Multon F. Filtered pose graph for efficient kinect 971 pose reconstruction. J Multimed Tools Appl 2017;76(3):4291–312. doi: 10.1007/ 972 s11042- 016- 3546- 4 . 973
[29] Plantard P, Shum HPH, Pierres A-S L, Multon F. Validation of an ergonomic 974 assessment method using kinect data in real workplace conditions. Appl Ergon 975 2016. doi: 10.1016/j.apergo.2016.10.015 . 976
[30] Shum HPH, Komura T, Shiratori T, Takagi S. Physically-based character control 977 in low dimensional space. In: Proceedings of the third international conference 978 on motion in games. In: MIG ’10, vol. 6459. Berlin, Heidelberg: Springer-Verlag; 979 2010. p. 23–34. ISBN 978-3-642-16957-1. doi: 10.1007/978- 3- 642- 16958- 8 _ 3 . 980
[31] Shum HPH, Ho ESL, Jiang Y, Takagi S. Real-time posture reconstruction for mi- 981 crosoft kinect. IEEE Trans Cybern 2013;43(5):1357–69. doi: 10.1109/TCYB.2013. 982 2275945 . 983
[32] Ho ESL, Shum HPH, Cheung Y-m, Yuen PC. Topology aware data-driven inverse 984 kinematics. Comput Graph Forum 2013;32(7):61–70. doi: 10.1111/cgf.12212 . 985
[33] Lawrence ND. Gaussian process latent variable models for visualisation of high 986 dimensional data. In: Proceedings of the 2003 neural information processing 987 systems; 2003. 988
[34] Wang JM, Fleet DJ, Hertzmann A. Gaussian process dynamical models for hu- 989 man motion. IEEE Trans Pattern Anal Mach Intell 2008;30(2):283–98. doi: 10. 990 1109/TPAMI.2007.1167 . 991
[35] Holden D, Saito J, Komura T. A deep learning framework for character motion 992 synthesis and editing. ACM Trans Graph 2016;35(4):138-1–138-11. doi: 10.1145/ 993 2897824.2925975 . 994
[36] Liu Z, Zhou L, Leung H, Shum HPH. Kinect posture reconstruction based on a 995 local mixture of gaussian process models. IEEE Trans Vis Comput Graph 2016. 996 doi: 10.1109/TVCG.2015.2510 0 0 0 . 997
[37] Hara K, Omori T, Ueno R. Detection of unusual human behavior in intelligent 998 house. In: Proceedings of the 12th IEEE workshop on neural networks for sig- 999 nal processing; 2002. p. 697–706. doi: 10.1109/NNSP.2002.1030081 . 10 0 0
[38] Françoise J, Roby-Brami A, Riboud N, Bevilacqua F. Movement sequence analy- 1001 sis using hidden Markov models: a case study in tai chi performance. In: Pro- 1002 ceedings of the 2nd international workshop on movement and computing. In: 1003 MOCO ’15. New York, NY, USA: ACM; 2015. p. 29–36. ISBN 978-1-4503-3457-0. 1004 doi: 10.1145/2790994.2791006 . 1005
[39] Brand M, Hertzmann A. Style machines. In: Proceedings of the 27th annual 1006 conference on computer graphics and interactive techniques. In: SIGGRAPH 1007 ’00. New York, NY, USA: ACM Press/Addison-Wesley Publishing Co.; 20 0 0. 1008 p. 183–92. ISBN 1-58113-208-5. doi: 10.1145/344779.344865 . 1009
[40] Tanco LM, Hilton A. Realistic synthesis of novel human movements from a 1010 database of motion capture examples. In: Proceedings of workshop on human 1011 motion; 20 0 0. p. 137–42. doi: 10.1109/HUMO.20 0 0.897383 . 1012
[41] Ren L, Patrick A, Efros AA, Hodgins JK, Rehg JM. A data-driven approach 1013 to quantifying natural human motion. ACM Trans Graph 2005;24(3):1090–7. 1014 doi: 10.1145/1073204.1073316 . 1015
Pl
(2
ease cite this article as: Y. Shen et al., Posture-based and Action-based
017), https://doi.org/10.1016/j.cag.2017.09.007
Graphs for Boxing Skill Visualization, Computers & Graphics