Example-Based Caricature Generation with Exaggeration Control
Wei Yang · Masahiro Toyoura · Jiayi Xu · Fumio Ohnuma · Xiaoyang Mao
Abstract Caricature is a popular artistic medium widely used for effective communication. The fascination of caricature lies in its expressive depiction of a person's prominent features, which is usually realized through the so-called exaggeration technique. This paper proposes a new example-based automatic caricature generation system supporting the exaggeration of both the shape of facial components and the spatial relationships among the components. Given the photograph of a face, the system automatically computes feature vectors representing the shape of the facial components as well as the spatial relationships among them. Those features are exaggerated and then used to search the learning database for the corresponding caricature components and to arrange the retrieved components into the caricature. Experimental results show that our system can generate caricatures in the example style that capture the prominent features of the subjects.
Keywords Caricature · Example-based · Exaggeration · Visual appearance facial feature
1 Introduction
Caricatures serve as an effective medium for communication in many settings, ranging from artistic portraits to satirical or parodic illustrations and comics. By emphasizing a person's most prominent features, a caricature enables observers to identify the subject at a glance. It is a highly individualized form of art, but not everyone is capable of drawing a caricature on his or her own. Given the challenges involved, people who want caricatures for personal use would benefit greatly from a generation system that anyone could use to produce caricatures without much difficulty. This paper proposes a new automatic caricature generation technique: an example-based caricature generation system that can create exaggerated representations of both the facial component shapes and the spatial relationships among the components.

W. Yang · M. Toyoura · X. Mao
University of Yamanashi, 4-3-11, Kofu, Yamanashi, 400-8511, Japan
E-mail: [email protected]

Jiayi Xu
Hangzhou Dianzi University, Jianggan, Hangzhou, Zhejiang, China

Fumio Ohnuma
Institute of Your Branding, 1-5-20-502, Nagamachi, Taihaku, Sendai, Japan
Since the 1980s, a tremendous number of research projects have been conducted on the computer generation of caricatures [21]. Most early techniques rely more or less on user input for either extracting features from the input face or controlling the style of the resulting caricatures. Recently, example-based approaches [6–8,14–17,22,25] have attracted much attention for their advantage of being able to reflect the unique style of individual artists. Those methods rely on training data comprising many pairs of input images and their corresponding caricatures. They allow users to express divergent artistic styles by simply substituting the appropriate examples into the learning database. However, most existing example-based methods learn either textures or features in an eigenspace and hence give no high-level control over the exaggeration of individual facial components.
Our technique takes an example-based approach built on the learning of visual appearance features. The proposed method learns the relationship between photographs and caricatures in a component-by-component way to reproduce the expressive styles of artists faithfully while
giving users control over the degree of exaggeration. The system comprises the construction of a learning database and the runtime generation of caricatures. The construction of the learning database links pairs of facial images, between photographs and corresponding caricatures, in a component-by-component way. Given the photograph of a face, the system automatically computes feature vectors characterizing the shape of each facial component. Those features are exaggerated and then used to search the learning database for the corresponding caricature components [26]. In this paper, we further enhance the system by also considering the spatial relationships among the facial components. It is known that the relative position of facial components is important in the perception of a face [20]. Recently, Klare et al. [12] used crowdsourcing to qualitatively describe 25 features and labeled the importance of each feature through machine learning. They concluded that, in recognizing a person, Level 1 features (including "face length", "face shape", etc.) are the most discriminative, and Level 2 features (including "eye separation", "nose to eye distance", "nose to mouth distance", "mouth to chin distance", etc.) are more discriminative than Level 3 features, which characterize individual facial components. Based on those research results, the system presented in this paper deforms the face outline so that it better resembles that of the input image and exaggerates the relative positions among components at the final composition stage. User studies have been conducted to investigate the effect of exaggeration and to validate whether the generated caricature can capture the prominent features of the input face.

The remainder of the paper is organized as follows: after reviewing related work in Section 2, Section 3 first presents the overall structure of the proposed system and then describes the details of the algorithms. Section 4 describes the results of the experiments and Section 5 concludes the paper.
2 Related Research
There is a considerable amount of past research on the computer generation of caricatures. Generally, existing studies fall into three broad categories: interactive, rule-based and example-based.

As an interactive approach, Akleman et al. [1] provided a simple morphing template for the user to manually deform the facial features. Later, they improved the method with a new deformation algorithm that uses simplicial complexes [2]. Gooch et al. [11] converted a photograph to a simple line illustration, and then manipulated a feature grid imposed on the illustration. Interactive methods give users control over the features to be exaggerated as well as the degree of exaggeration, but on the other hand usually place a heavier load on users [19]. It can also be difficult for a user without knowledge of caricature to produce ideal results.
Rule-based approaches apply predefined rules to draw caricatures. Most rule-based methods produce an exaggerated representation by analyzing the subject's features and then manipulating an average model accordingly. The first such work, by Brennan [5], puts 165 feature points on the "average face". The feature points are moved by an amount proportional to their difference from the corresponding reference points, and are connected to create a line-drawing caricature. Koshimizu et al. [13] applied the same idea in their interactive system (PICASSO), which can generate very impressive line-drawing-style caricatures. Chiang et al. [9] proposed a method that morphs a caricature drawn by the artist based on the difference from the average model. Mo et al. [18] used the normalized deviation from the average model to exaggerate the distinctive features. Tseng et al. [23,24] used both inter- and intra-correlations of size, shape and position features for exaggeration, subduing some of the features to emphasize the others. Chen et al. [8] considered the two relative principles described in [20], and proposed the "T-Shape" rule for emphasizing the relative positions among facial elements. They measured the similarity between the caricature and the photograph with the Modified Hausdorff Distance and minimized the distance to improve their results. Under a rule-based method, reflecting differences in drawing styles normally requires changing the parameter extraction processes and creation rules for each style.
Example-based methods rely on training data comprising many pairs of input images and their corresponding caricatures; they allow users to express divergent artistic styles by simply substituting the appropriate examples into the learning database. Chen et al. [6–8] first built a successful system for learning from a collection of actual professional portraits using non-parametric texture synthesis. Liang et al. [14] further extended their method by separating a portrait into two models: polyline-based shapes and shading-based textures. These methods can reproduce the rendering style of a portrait sketch faithfully but give no high-level control over the exaggerated depiction of individual facial components, because the underlying learning algorithm uses texture synthesis. Other researchers have combined hierarchical or graph facial models with the example-based approach [17,23–25], but paid no attention to the exaggeration of prominent features either. Shet et al. [22] used a cascade correlation neural network to
learn the exaggeration degree of a caricaturist. Liu et al. [15] applied PCA to obtain the principal components of the facial features, and then used Support Vector Regression (SVR) to predict the result for a given face image. They further proposed a non-linear mapping model using semi-supervised manifold regularization learning [16].

In rule-based and example-based methods alike, feature vector design plays an integral part in improving overall caricature quality. Many existing caricature generation techniques use eigenspaces defined via PCA. However, when people draw caricatures, they normally capture the visually perceptible features of individual facial components, such as thin eyes, drooping eyebrows, and so on, as well as the composite arrangement of those components. In that sense, methods that use eigenspaces may fail to reflect artistic styles fully. Our study takes the example-based approach but incorporates component-specific visual appearance feature learning to gain high-level control over the exaggeration of the prominent features of individual components as well as the spatial relationships among them.
3 Proposed Method
Figure 1 provides an overview of the proposed method, which comprises the construction of a learning database and the generation of caricatures. The construction of the learning database uses pairs of facial images and corresponding caricatures as input. Based on geometrical shape information, extracted from the facial image using the active shape model (ASM) for facial feature point detection [10], and hair regions, extracted using our original method, the construction process calculates visual appearance feature vectors and links them to the corresponding caricature parts. Given that males and females have different facial features, we created new ASM average models for males and females, respectively, in order to improve the ASM fitting accuracy.

When generating caricatures, our system first calculates a feature vector for each facial component, using the same method as in the database construction step, as well as a feature vector describing the relative positions of the facial components. Next, the system performs exaggeration processing on the feature vectors. The exaggerated feature vector of each facial component is used to search the example database for the corresponding exaggerated caricature component. To arrange the gathered caricature components, the system first deforms the retrieved face outline component so that its shape resembles that of the input face.
Then the exaggerated relative position feature vectors are used to decide the positions of the caricature components.

Fig. 1 System framework

Fig. 2 The feature vectors used in the proposed system (widths W1–W8, heights H1–H8, and angles θ1–θ10)
3.1 Designing and computing visual appearance feature vectors
Humans perceive faces based more on characteristic information, such as large, thin, or drooping eyes, than on precise shape information. For our study, we sought advice from professional caricaturists and designed visual appearance feature vectors for the hairstyle and four facial components (eyebrows, eyes, nose, mouth) together with the face outline, as well as for the relative positions of those components. The feature vectors for the different components have different numbers of dimensions, with the coordinate of each dimension normalized to real numbers in the range 0.0 to 1.0. The following subsections describe the details of these feature vectors together with the algorithms for computing them from the input image.
3.1.1 Facial components
As shown in Figure 2, the eyebrow features with the strongest visual impact are thickness and angle. As eyebrow thickness normally tapers off gradually near the side of the head, our system measures vertical thickness at two locations: the inner end and the outer end of the eyebrow. Eyebrow angle, meanwhile, varies according to the change in the vertical position of the outer end relative to that of the inner end. We thus select two points on the outer end and measure the angles of the straight lines connecting the two selected points to the inner end. Eyes also have two main defining traits: shape characteristics (thin, slit eyes or large eyes, for instance) and angle characteristics (eyes that slant upward or droop downward, for example). Shape features are defined by the aspect ratio of the eye, while angle features are defined by the angle of the straight line connecting the inner corner of the eye to the outer corner. Nose shape corresponds to the ratio between nose width and nose height, while nose size depends on the ratio of nose width to overall face width. The amount of space between the nose and mouth is another person-to-person variable; as shown in Figure 2, we also incorporate two types of top-down angles into the features. For the mouth, we use the ratio between mouth width and mouth height to define shape and the ratio between mouth width and overall face width to define size.
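As a concrete illustration, the ratio and angle features above can be computed directly from landmark coordinates. The following sketch assumes hypothetical point names rather than the paper's actual ASM indices:

```python
import math

def eye_features(inner, outer, top, bottom):
    """Visual appearance features for one eye (Sec. 3.1.1): shape is the
    aspect ratio (height over width), angle is the tilt of the line from
    the inner corner to the outer corner. Point selection is illustrative."""
    width = math.hypot(outer[0] - inner[0], outer[1] - inner[1])
    height = abs(top[1] - bottom[1])
    shape = height / width                   # thin/slit eyes vs. large eyes
    angle = math.atan2(outer[1] - inner[1],  # upward-slanting vs. drooping
                       outer[0] - inner[0])
    return shape, angle

def nose_features(nose_w, nose_h, face_w):
    """Nose shape = width/height ratio; nose size = width over face width."""
    return nose_w / nose_h, nose_w / face_w
```

Each returned value is a single dimension of the component's feature vector; in the system these dimensions are subsequently normalized to the range 0.0 to 1.0.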
3.1.2 Hairstyle
In many cases, especially for caricatures with long hair, a large part of the face outline is implicitly represented by the boundary of the hair region. To avoid visible artifacts caused by compositing the hairstyle and face outline, we treat the hairstyle and face outline as one single component. We use the feature vector of the hairstyle to search the example database and then deform the face outline of the retrieved caricature to resemble that of the input face. To determine the hair feature vector, one first needs to extract the hair region. However, the sheer person-to-person diversity of hairstyles has prevented researchers from developing a hairstyle-extraction equivalent of the ASM method for facial shape extraction. For our study, we implemented our own hair region extraction method based on the watershed algorithm [3]. The algorithm places multiple seeds on the image, expands the seeds along gradients, and then divides the image into regions according to the boundaries that form where the gradients are high. Our system automatically sets seeds for three regions: the skin region, the hair region, and the "other" region, by making use of the ASM control points.

Fig. 3 Direction vectors used in the proposed system

Fig. 4 Hair vectors used in the proposed system
The system then uses the obtained hair region to calculate the feature vector capturing the visual appearance of the hairstyle. Generally, hairstyles fall into basic length categories: short, semi-short, semi-long, and long are several examples. In other words, how far a person's hair goes down his or her head is one important feature of the hairstyle. In addition to length, volume is another key element of how a person's hairstyle looks. To establish a feature vector that expresses both length and volume, our method draws straight radial lines out from the center of the face at certain angle intervals and uses the distance separating the two intersections between each line and the hair region boundary. As Figure 4 shows, the system uses the proportion between Hi (the distance between the two intersections) and Fi (the distance from the interior intersection to the center). An Hi value of 0 means that there is no hair at that position, while an Hi value greater than 0 indicates the volume of the hair. This makes it possible to deal with volume and length in a uniform fashion. The hair feature vector created when straight lines are drawn at angle intervals of 2π/n has n dimensions.
3.1.3 Relative positions of facial components
To represent the spatial relationships among facial components, we choose the nose as the reference component and
Fig. 5 Arranging facial components. (a) Obtaining the facial component positions from the input image. (b) Placing components in the obtained positions.
compute the positions of the other components relative to that of the nose. As shown in Figure 3, the relative position feature vector of each facial component is defined as the vector from the position of the nose to that of the component. The obtained ASM control point information provides the basis for calculating the positions of the facial components. As shown in Figure 5(a), we first compute the circumscribed rectangle of the ASM points of the face outline. The position feature vector P_nose of the nose is defined as the vector from the lower-left corner of the circumscribed rectangle to the point O, which has the lowest y-coordinate among all the ASM points of the nose, normalized by W5, the width of the face. Having defined the feature vector describing the position of the nose on the face, we can take the nose as the reference and define the feature vectors representing the relative positions of the other facial components. As shown in Figure 5(a), we compute the circumscribed rectangle of the ASM control points constituting each of the remaining facial components. The position of each facial component is then defined as the center of its circumscribed rectangle, normalized by the face width W5.
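The construction above can be sketched as follows, assuming image coordinates with y growing downward (so the "lowest" nose point has the largest y); the data structures are hypothetical simplifications of the ASM output:

```python
def bounding_box(points):
    """Circumscribed rectangle (x_min, y_min, x_max, y_max) of a point set."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)

def position_features(nose_pts, other_components, face_pts):
    """Relative-position features (Sec. 3.1.3): the nose anchor O is the
    lowest nose ASM point; every other component is represented by its
    box center relative to O, normalized by the face width W5."""
    fx0, _, fx1, _ = bounding_box(face_pts)
    w5 = fx1 - fx0                                   # face width W5
    o = max(nose_pts, key=lambda p: p[1])            # lowest nose point O
    feats = {}
    for name, pts in other_components.items():
        x0, y0, x1, y1 = bounding_box(pts)
        cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0    # rectangle center
        feats[name] = ((cx - o[0]) / w5, (cy - o[1]) / w5)
    return feats
```

Normalizing by W5 makes the position features comparable across faces of different sizes, which is what allows them to be matched against the example database.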
3.2 Searching for similar parts
Using the feature vectors defined above, the system calculates the distance between the components of the input image and the components of the examples in the learning database, and then searches for similar components. This requires comparing feature vectors that include dimensions with different properties, such as length ratios and angles. For our method, we thus normalize each feature vector along each feature axis in advance.
We use the Euclidean norm (L2) to calculate the distances between vectors, taking the component with the lowest distance as the similar component. Formula (1) expresses the face component similarity d(v^{in}, v^{db}), where the vectors for comparison are V^{in} = (v_1^{in}, v_2^{in}, \cdots, v_n^{in}) and V^{db} = (v_1^{db}, v_2^{db}, \cdots, v_n^{db}), respectively:

d(v^{in}, v^{db}) = \sum_{i=1}^{n} (v_i^{in} - v_i^{db})^2. \quad (1)

Fig. 6 Exaggerating feature vectors (feature space: the input feature is shifted away from the example average).
As the presence or lack of hair has a significant impact on a person's visual appearance, our system calculates hair similarity based on both the Euclidean norm of the feature vector and a term reflecting differences in hair presence in the corresponding directions. Formula (2) expresses the hair similarity used in the search process:

d'(v^{in}, v^{db}) = \sum_{i=1}^{n} (v_i^{in} - v_i^{db})^2 + r \sum_{i=1}^{n} \left( \delta(v_i^{in}) - \delta(v_i^{db}) \right)^2,
\quad \text{where } \delta(v) = \begin{cases} 0, & v = 0 \\ 1, & v \neq 0 \end{cases} \quad (2)

In Formula (2), r is a coefficient for adjusting the weight of the differences in hair presence.
Using the formulas above, the system calculates the similarity between each feature vector obtained from the input image and the corresponding feature vectors in the database, and then uses the most similar item as the caricature component.
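Formulas (1) and (2) and the nearest-neighbor search translate directly into code; the helper names below are ours:

```python
def d_component(v_in, v_db):
    """Formula (1): squared-difference distance between feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(v_in, v_db))

def d_hair(v_in, v_db, r=1.0):
    """Formula (2): Formula (1) plus a term penalizing differences in hair
    presence, where delta(v) = 0 iff v == 0. r weights the mismatch."""
    delta = lambda v: 0.0 if v == 0 else 1.0
    return d_component(v_in, v_db) + r * sum(
        (delta(a) - delta(b)) ** 2 for a, b in zip(v_in, v_db))

def most_similar(v_in, database, dist=d_component):
    """Index of the nearest example (lowest distance) in the database."""
    return min(range(len(database)), key=lambda i: dist(v_in, database[i]))
```

Passing `dist=d_hair` selects the hairstyle-specific measure; all other components use the plain Formula (1) distance.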
3.3 Exaggerating feature vectors of facial components
Our method achieves exaggerated depictions of the prominent features of the input face by exaggerating the visual appearance feature vectors of the various components and using the exaggerated vectors to obtain similar caricature components from the example database. As shown in Figure 6, for each component of the input image, our system calculates its difference from the database average in the feature vector space. The system then shifts the feature vector in the direction of the difference vector in accordance with the required exaggeration level and uses the resulting feature vector for the search.
Although it is logical to assume that a feature is more prominent when there is a larger difference from the average, the actual perceived prominence also depends on the distribution of coordinate values in that dimension. Mo et al. [18] proposed a method that supplements the subject's variations from the average face with the distribution of the features themselves, which thereby determines the exaggeration quantities. Consider, for example, the horizontal spans of a person's eyes and mouth, and assume that the difference from the average for both features is 2 cm. As mouth width generally varies more broadly from person to person than eye width does, a 2 cm variation in eye width represents a more prominent feature than a 2 cm variation in mouth width. Our system uses the distribution of the example data in the database to normalize the difference in each dimension, thereby making it possible to calculate difference vectors that produce more prominently exaggerated features. The following formula determines the coordinate value for each dimension, where V' is the exaggerated feature vector:
v'_i = v_i^{in} + k \, \frac{v_i^{in} - m_i}{\sigma_i}, \quad (3)

where
v^{in} = (v_1^{in}, v_2^{in}, \cdots, v_n^{in}): feature vector of the input image,
M = (m_1, m_2, \cdots, m_n): example data average,
\sigma = (\sigma_1, \sigma_2, \cdots, \sigma_n): example data standard deviation.
In Formula (3), k is a coefficient determining the overall exaggeration rate. Setting k > 0 exaggerates the subject's features, while setting k < 0 brings the subject's features closer to the average.

It is known that, rather than exaggerating all components, it is more effective to exaggerate the one or two components with the most prominent features [20]. In our implementation, we select the two components with the largest difference vectors (after normalization by the variance).
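Formula (3) and the prominence-based selection can be sketched as follows. Scoring a component by its summed squared z-scores is our reading of "largest difference vector after normalization by the variance":

```python
def exaggerate(v_in, mean, std, k):
    """Formula (3): shift each dimension away from the example-data mean
    by k times the sigma-normalized difference. k > 0 exaggerates,
    k < 0 moves the feature toward the average."""
    return [v + k * (v - m) / s for v, m, s in zip(v_in, mean, std)]

def pick_prominent(components, means, stds, n=2):
    """Select the n components whose normalized difference from the
    database average is largest; the paper exaggerates only 1-2
    components. All arguments map a component name to a vector."""
    def score(name):
        return sum(((v - m) / s) ** 2 for v, m, s in
                   zip(components[name], means[name], stds[name]))
    return sorted(components, key=score, reverse=True)[:n]
```

Only the components returned by `pick_prominent` would have their feature vectors passed through `exaggerate` before the database search.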
3.4 Composition of retrieved components
After obtaining caricatures for all the components from the example database, the system generates the output caricature by first reshaping the face outline and then placing the other components accordingly, with the prominent features of the relative positions exaggerated at a user-specified level.
Fig. 7 TPS deformation of the face contour. (a) Control points on the photograph. (b) Control points on the caricature. (c) Deformed caricature.
3.4.1 Deformation of face outline
As mentioned in Section 3.1.2, we treat the hairstyle and face outline as one single component. We use the feature vector of the hairstyle to search the example database and then deform the face outline of the retrieved caricature component to resemble that of the input face. As shown in Figure 7, 14 corresponding feature points, obtained by applying ASM fitting to the face outlines in the photograph and the caricature, respectively, are used as the control points for Thin Plate Spline (TPS) deformation [4]. TPS deformation preserves the smoothness of the contour while moving the control points to their target positions.
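The contour warp can be prototyped with a textbook 2-D thin plate spline solver; this NumPy sketch follows the standard TPS formulation with kernel U(r) = r^2 log r and is not the authors' implementation:

```python
import numpy as np

def tps_warp(src_pts, dst_pts, query):
    """Minimal 2-D thin plate spline mapping (Sec. 3.4.1): fit a smooth
    map sending src_pts to dst_pts and evaluate it at query points.
    src_pts/dst_pts are (n, 2) control-point arrays, query is (m, 2)."""
    src = np.asarray(src_pts, float)
    dst = np.asarray(dst_pts, float)
    n = len(src)

    def u(r2):
        # U(r) = r^2 log r = (1/2) r^2 log r^2; define U(0) = 0
        with np.errstate(divide="ignore", invalid="ignore"):
            out = 0.5 * r2 * np.log(r2)
        return np.nan_to_num(out)

    d2 = ((src[:, None, :] - src[None, :, :]) ** 2).sum(-1)
    K = u(d2)
    P = np.hstack([np.ones((n, 1)), src])      # affine part [1, x, y]
    L = np.zeros((n + 3, n + 3))
    L[:n, :n], L[:n, n:], L[n:, :n] = K, P, P.T
    rhs = np.zeros((n + 3, 2))
    rhs[:n] = dst
    w = np.linalg.solve(L, rhs)                # kernel weights + affine

    q = np.asarray(query, float)
    d2q = ((q[:, None, :] - src[None, :, :]) ** 2).sum(-1)
    return u(d2q) @ w[:n] + np.hstack([np.ones((len(q), 1)), q]) @ w[n:]
```

In the system the 14 ASM outline points of the caricature play the role of `src_pts`, the matching points on the photograph the role of `dst_pts`, and every pixel (or contour vertex) of the retrieved caricature is passed as `query`.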
3.4.2 Arranging facial components
To exaggerate the prominent features of the spatial relationships among facial components, a component is moved along the difference vector from the database average by a user-given exaggeration level. For each component j except the nose, we compute the difference of its relative position feature vector p_j^{in} from the database average and decide the new position p'_j of the component as follows:

p'_j = p_j^{in} + t_j \left( p_j^{in} - p_j^{m} \right) \,\text{div}\, \sigma_j. \quad (4)

Here, p_j^{m} and \sigma_j are the average and standard deviation of the relative position feature vector over all faces in the example database, t_j is a coefficient determining the overall exaggeration rate, and div denotes element-by-element division. Setting t_j > 0 exaggerates the subject's features, while setting t_j < 0 brings the subject's features closer to the average. Empirically, t_j should be set within the range shown in Table 1 to keep the components inside the face outline and without overlap with each other. Note that, as in the case of component shape, only the relative positions with the largest difference from the average are exaggerated.
To create the output caricature, the system starts with the caricature of the hairstyle component with the deformed face contour line and arranges the other facial components onto it at the positions defined by the exaggerated relative position feature vectors. The nose component is first placed at the same position as in the input image; the other components are then placed relative to the position of the nose based on their relative position vectors.

Fig. 8 Exaggeration and normalization. Top: component shape; bottom: relative position. (The exaggeration coefficients k and t are shown under each caricature.)

Table 1 Range of tj for exaggeration and normalization.

          Exaggeration    Normalization
Eyebrows  0.3 < t < 0.7   −0.4 < t < −0.2
Eyes      0.4 < t < 0.7   −0.3 < t < −0.2
Mouth     0.3 < t < 0.5   −0.3 < t < −0.1
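Formula (4) and the nose-anchored placement can be sketched as follows; the data structures are hypothetical, and undoing the W5 normalization when converting back to pixel coordinates is our assumption:

```python
def exaggerate_position(p_in, p_mean, sigma, t):
    """Formula (4): move a component's relative-position vector away from
    the database average, with element-wise (div) division by sigma.
    t > 0 exaggerates, t < 0 normalizes toward the average."""
    return tuple(p + t * (p - m) / s for p, m, s in zip(p_in, p_mean, sigma))

def place_components(nose_pos, rel_positions, face_w):
    """Compose the layout (Sec. 3.4.2): the nose keeps its input position,
    and each other component lands at nose + W5 * relative vector,
    undoing the face-width normalization of the position features."""
    placed = {"nose": nose_pos}
    for name, (rx, ry) in rel_positions.items():
        placed[name] = (nose_pos[0] + face_w * rx, nose_pos[1] + face_w * ry)
    return placed
```

In the full system, `rel_positions` would hold the (possibly exaggerated) feature vectors of the one or two most prominent relative positions and the unmodified vectors of the rest.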
4 Results and experiments
An example database was constructed, consisting of facial components extracted from 405 pairs of photographs and caricatures (272 female and 133 male).
Figure 8 shows some results generated with the proposed method and the effect of adjusting the exaggeration coefficients k and t for component shape and relative position exaggeration, respectively. Among the five columns of results, the middle one is generated without exaggeration; to the right are the exaggerated ones and to the left are the normalized (averaged) ones. The exaggeration coefficients are shown under each caricature. We empirically found that it is difficult to find a corresponding caricature component when k exceeds ±2, due to the limited number of samples in the current example database. So in this experiment we set k = −2, −1, 1, 2. t was chosen by bisecting the intervals shown in Table 1.

We conducted two subject studies to validate the effectiveness of the proposed method.
4.1 Study 1
Fig. 9 A sample stimulus for subject study 1: (a) input photograph; (b) samples for evaluation.

The first experiment was designed to test whether the caricature generated with the proposed method can capture the prominent features of a human face, which is
supposed to be the most important function of caricatures. Based on the assumption that human beings can capture prominent facial features at a glance, the subject is presented with the photograph of a face for 5 seconds and then asked to select, from 6 caricatures, the one most closely resembling the face in the photograph at each trial. Among the 6 caricatures, one is generated with the proposed method, with the exaggeration applied to the prominent features of facial component shape and relative location, and the other 5 are generated by placing randomly chosen components at the same positions as in the original photograph. The exaggeration coefficients for component shape and position are set to 2 and the middle of the range shown in Table 1, respectively.

Figure 9 shows an example set of photograph and caricatures used for one trial. In total, 8 sets of different ages and sexes were used. 8 subjects (3 male and 5 female) in their 20s participated in the experiment, and each of them was asked to complete the 8 trials. Table 2 shows the result of the experiment. As the result of a binomial test, for 6 out of the 8 faces, the null hypothesis that the caricature generated with the proposed method was chosen with the same probability as the other randomly generated caricatures was rejected at the 1% significance level. For the two failed cases, faces 2 and 7, we found that the failure was mainly caused by ASM fitting errors on the facial outline.
Table 2 Result of Experiment 1. ∗: 1% significance.

Dataset   1   2   3   4   5   6   7   8
Selected  5∗  2   6∗  7∗  4∗  7∗  1   5∗
4.2 Study 2
The second study uses Thurstone's paired comparison method to investigate the effects of component shape exaggeration and relative position exaggeration, compared with each other and with the case without exaggeration. For each of 5 faces of different ages and sexes, caricatures are generated with the following 4 parameter settings:
Case 1: Without exaggeration
Case 2: With component shape exaggeration and without relative location exaggeration
Case 3: With relative location exaggeration and without component shape exaggeration
Case 4: With both relative location exaggeration and component shape exaggeration
For Cases 2–4, the exaggeration coefficients for component shape and position are set to 2 and the middle of the range shown in Table 1, respectively. Figure 10 shows examples of the results used for the study: Figure 10(a) is the photograph, and Figures 10(b), (c), (d), (e) are the results for Cases 1–4, respectively. Figure 11 shows an example of the stimulus for the paired comparison test. In the center is the photograph, and different pairs of the 4 cases are presented on the two sides of the photograph. The subject is asked to select, from the caricatures on the two sides, the one that better resembles the photograph in the middle. The same 8 subjects from Study 1 participated in the test. To eliminate the learning effect, 5 faces different from those used in Study 1 were used in Study 2. We obtained 240 instances (8 subjects × 5 faces × 6 pairs) in total. Figure 12 is a bar scale showing the results of a Thurstone method-based analysis. The left side of the scale represents a lower winning rate, while the right side represents a higher winning rate. Although the difference is not remarkable, we can see that the highest winning rate belonged to the caricatures with exaggeration of both component shape and relative position, and that the exaggeration of component shape is more effective than the exaggeration of relative position. The results without exaggeration gained a rate much lower than the other 3 cases.
In both studies, the exaggeration coefficients k and t were set to 2 and the middle of the range shown in Table 1, respectively.
Fig. 10 Examples of the results used in Study 2: (a) input photograph; (b) Case 1; (c) Case 2; (d) Case 3; (e) Case 4.

Fig. 11 A sample stimulus for subject study 2.

Fig. 12 Results of the Thurstone method-based analysis in Study 2 (winning rates on a scale from −0.05 to 0.35, from lowest to highest: Case 1, Case 3, Case 2, Case 4).

5 Conclusions

A new example-based approach for creating caricatures reflecting the style of artists has been proposed. Although the experiment did not use an extensive collection of example data, the results demonstrated the potential of the approach to create caricatures capturing the prominent facial features of individuals. Currently, the degree of exaggeration is still limited due to the limited number and variation of examples. Moving forward, we plan to gather more example data of different styles and conduct subjective evaluation experiments to improve the design of the feature vectors, especially those dealing with the exaggeration of the facial component arrangement, which is considered to be even more important than the individual facial components.
Acknowledgements
This research was supported by JSPS KAKENHI Grant Numbers 25280037 and 26240015, the Natural Science Foundation of Zhejiang Province, China (No. Q12F020007), and the Zhejiang Provincial Education Department, China (No. Y201121352).
References
1. Akleman, E.: Making caricature with morphing. In: SIGGRAPH Visual Proceedings: The art and interdisciplinary programs of SIGGRAPH, p. 145 (1997)
2. Akleman, E., Palmer, J., Logan, R.: Making extreme caricatures with a new interactive 2d deformation technique with simplicial complexes. In: Proceedings of International Conference on Visual Computing, pp. 165-170 (2000)
3. Beucher, S., Meyer, F.: The morphological approach to segmentation: the watershed transformation. In: Mathematical Morphology in Image Processing, chap. 12. Marcel Dekker, New York (1993)
4. Bookstein, F.L.: Shape Models II: The Thin Plate Spline.The Palaeontological Association (1979)
5. Brennan, S.: Caricature generator: the dynamic exagger-ation of faces by computer. Leonardo 18(3), 392–400(1985)
6. Chen, H., Liu, Z., Rose, C., Xu, Y., Shum, H.Y.,Salesin, D.: Example-based composite sketching of hu-man portraits. In: International Symposium on Non-Photorealistic Animation and Rendering, pp. 95–102(2004)
7. Chen, H., Xu, Y.Q., Shum, H.Y., Zhu, S.C., Zheng,N.N.: Example-based facial sketch generation with non-parametric sampling. In: International Conference onComputer Vision, 2, pp. 433–438 (2001)
8. Chen, W., Yu, H., Zhang, J.: Example based caricaturesynthesis. In: Proceedings of Conference on ComputerAnimation and Social Agents (2009)
9. Chiang, P., Liao, W., Li, T.: Automatic caricature gen-eration by analyzing facial feature. In: Proceedings ofAsian Conference on Computer Vision (2004)
10. Cootes, T.F., Taylor, C.J., Cooper, D.H., Graham, J.:Active shape models - their training and application.Computer Vision and Image Understanding 61(1), 38–59 (1995)
11. Gooch, B., Reinhard, E., Gooch, A.: Human facial illustrations: creation and psychophysical evaluation. ACM Transactions on Graphics 23(1), 27-44 (2004)
12. Klare, B.F., Bucak, S.S., Jain, A.K., Akgul, T.: To-wards automated caricature recognition. In: IAPR In-ternational Conference on Biometrics (ICB), pp. 139–146(2012)
13. Koshimizu, H., Tominaga, M., Fujiwara, T., Murakami,K.: On kansei facial image processing for computerizedfacial caricaturing system picasso. In: Proceedings of In-ternational Conference on Systems, Man, and Cybernet-ics (1999)
14. Liang, L., Chen, H., Xu, Y., Shum, H.: Example-basedcaricature generation with exaggeration. In: PacificGraphics, pp. 386–393 (2002)
15. Liu, J., Chen, Y., Gao, W.: Mapping learning ineigenspace for harmonious caricature generation. In: Pro-ceedings of ACM Multimedia, pp. 683–686 (2006)
16. Liu, J., Chen, Y., Xie, J., Gao, X., Gao, W.: Semisuper-vised learning of caricature pattern from manifold regu-larizations. Advances in Multimedia Modeling 5371(1),413–424 (2009)
17. Min, F., Suo, J.L., Zhu, S.C., Sang, N.: An automatic por-trait system based on and-or graph representation. In:International Conference on Energy Minimization Meth-ods in Computer Vision and Pattern Recognition (EMM-CVPR), pp. 184–197 (2007)
18. Mo, Z., Lewis, J., Neumann, U.: Improved automatic car-icature by feature normalization and exaggeration. In:SIGGRAPH Sketches (2004)
19. Nakasu, T., Naemura, T., Harashima, H.: Applying var-ious artists’ style of exaggeration to a facial caricaturedrawing system with an interactive genetic algorithm.Journal of Institute of Image Information and TelevisionEngineers 63(9), 1241–1251 (2009)
20. Redman, L.: How to Draw Caricatures. McGrawHill(1984)
21. Sadimon, S.B., Sunar, M.S., Mohamad, D., Haron, H.:Computer generated caricature: A survey. In: Proceed-ings of International Conference on Cyberworlds, pp.383–390 (2010)
22. Shet, R.N., Lai, K.H., Edirisinghe, E.A., Chung, P.W.H.:Use of neural networks in automatic caricature genera-tion: an approach based on drawing style capture. In:Proceedings of International Conference on Visual Infor-mation Engineering (2005)
23. Tseng, C.C., Lien, J.J.J.: Synthesis of exaggerative cari-cature with inter and intra correlations. In: Proceedingsof Asian Conference on Computer Vision, pp. 314–323(2007)
24. Tseng, C.C., Lien, J.J.J.: Colored exaggerative carica-ture creation using inter-and intra-correlations of fea-ture shapes and positions. Image and Vision Computing30(1), 15–25 (2012)
25. Xu, Z., Chen, H., Zhu, S.C., Luo, J.: A hierarchical com-positional model for face representation and sketching.IEEE Transactions on Pattern Analysis and Machine In-telligence 30(6), 955–969 (2008)
26. Yang, W., Tajima, K., Xu, J., Toyoura, M., Mao, X.:Example-based automatic caricature generation. In: Cy-berworlds, pp. 237–244 (2014)
Wei Yang received the B.Sc. degree in Engineering and the M.Sc. and Ph.D. degrees in Computer Science from the University of Yamanashi in 2010, 2012, and 2015, respectively. Her research interests include non-photorealistic rendering and computer aesthetics.
Masahiro Toyoura received the B.Sc. degree in Engineering and the M.Sc. and Ph.D. degrees in Informatics from Kyoto University in 2003, 2005, and 2008, respectively. He is currently an Assistant Professor at the Interdisciplinary Graduate School, University of Yamanashi, Japan. His research interests are augmented reality and computer and human vision. He is a member of ACM and IEEE Computer Society.
Jiayi Xu received her B.Sc.,M.Sc. and Ph.D. in ComputerScience from Zhejiang Univer-sity. She is currently an Assis-tant Professor at the School ofComputer Science and Tech-nology, Hangzhou Dianzi Uni-versity, China. Her research in-terests include texture design,crowd animation, and general-purpose GPU computation.
Xiaoyang Mao received her B.Sc. in Computer Science from Fudan University, China, and her M.Sc. and Ph.D. in Computer Science from the University of Tokyo. She is currently a Professor at the Interdisciplinary Graduate School, University of Yamanashi, Japan. Her research interests include texture synthesis, non-photorealistic rendering, and their application to scientific visualization. She is a member of ACM and IEEE Computer Society.
Fumio Ohnuma received the B.Sc. degree in Engineering and the M.Sc. and Ph.D. degrees in Computer Science from the University of xxx. His research interests include caricature synthesis.