Journal of Information & Computational Science 10:12 (2013) 3645–3658 August 10, 2013
Available at http://www.joics.com
Data-driven Enhancement of Chinese Calligraphy
Aesthetic Style ⋆
Wei Li a,b,∗, Changle Zhou a,b
aCognitive Science Department, Xiamen University, Xiamen 361005, China
bFujian Key Laboratory of the Brain-like Intelligent Systems, Xiamen University
Xiamen 361005, China
Abstract
Generating a large-scale Chinese character set from a small one is notoriously challenging
because of the complexity of character topology and the elusive sophistication of glyph aesthetics.
This paper proposes an approach to character-glyph synthesis in which a large-scale
character glyph set is generated from a small-scale set of a desired style. The approach
starts by sampling designated calligraphic works to build a character stroke
database, and then represents character topology with F-histograms. Drawing on aesthetic
intuition, we abstract glyph aesthetics into several evaluative rules and design
an algorithm that evaluates character topology with a Support Vector Machine (SVM).
Finally, we adopt simulated annealing to optimize character glyphs toward the
desired style(s). Compared with existing approaches, the F-histogram representation
accommodates more types of character topology, and the glyph synthesis
integrates stroke shape with character topology.
Keywords: Glyphic Synthesizing; Chinese Character Topology; Machine Learning; Optimization
1 Introduction
The art and beauty of calligraphy have fascinated human beings since the emergence of text, inspiring
countless artists and philosophers. However, an absolute definition of aesthetic style remains
difficult. For example, when a group of human raters is presented with a collection of calligraphy
characters and asked to classify them according to their aesthetic style, the results often
indicate a statistical consensus among the raters. Yet it can be hard to define a
succinct set of rules that captures the raters' aesthetic perceptions. Furthermore, such perceptions
vary among different classes of shapes, and sometimes differ significantly from culture to
⋆This work was supported in part by National Natural Science Foundation of China (No. 61273338) and the
Open Project Foundation of Chinese Font Design and Research Center (No. CCF2012-01-06).
Email address: email@example.com (Wei Li).
1548–7741 / Copyright © 2013 Binary Information Press
culture. Therefore, in this work, we explore the feasibility of a data-driven approach to aesthetic
style enhancement. Chinese characters are a hieroglyphic script and consist of strokes. Chinese
calligraphy styles are, to a large extent, embodied by the spatial relations of strokes (i.e., Chinese
character topology). On the one hand, user-drawn trajectories that imitate a calligrapher's strokes are
a more convenient input than a keyboard; on the other hand, these trajectories also distinguish the
various stroke types. Specifically, we focus on the challenging problem of enhancing the aesthetic style
of character topology in the user's trajectory input while maintaining the correctness of the character.
Here, data-driven means that the properties of a particular set of character topology features are the
same irrespective of the perceiver. The universality of the notion of calligraphy style, along with
the ability to predict the style of calligraphy reliably and automatically, has motivated this work.
We present a novel tool capable of automatically enhancing the aesthetic style of
Chinese calligraphy characters in given trajectories. Although for brevity we often refer to this
process as optimization, it should be understood that we merely claim that Chinese calligraphy
characters generated by our tool are more likely to receive a higher rating when presented to a
group of human observers.
Although few ancient Chinese calligraphy works have survived, their styles can be learned. One
potential application of our technique is therefore to generate new characters that enrich font
libraries and to repair eroded characters, which helps protect cultural heritage. Another interesting
application is to write documents in a personal style or to design a logo or an advertisement. For
instance, the computer can generate a whole email in handwritten style, as if it had been written
character by character by the human author. Emails in handwriting style produced this way are more
“personal” and can draw the reader closer to the author than “typed” emails.
The key component in our approach is an optimization engine trained on datasets of calligraphy
works with a certain style. The entire optimization process is depicted in Fig. 1. Given trajectories
as input, we first recognize the character strokes. Using the F-histogram to represent the spatial
relations of strokes, we then extract a vector of character topology in the graph. This vector is
fed into the optimization engine, which yields a modified topology vector possessing a higher
predicted score than that of the original vector. Next, the strokes are readjusted in the glyph
so that the new character topology is as close as possible to the modified character
topology. The resulting spatial relations of strokes define a character with a certain style.
Our results indicate that the proposed method is capable of effectively increasing the perceived
character topology style for most user trajectories.
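The optimization step described above can be sketched roughly as follows. This is a minimal illustration only, assuming that the topology vector is a plain list of floats and that some scoring function is available. The names `anneal` and `toy_score`, the Gaussian perturbation, the geometric cooling schedule, and all parameter values are assumptions made for this sketch and are not taken from the paper. In particular, `toy_score` is a hypothetical stand-in for the trained evaluator, which the abstract describes as SVM-based, and simulated annealing is used here because the abstract names it as the optimizer.

```python
import math
import random


def anneal(vector, score, steps=2000, t0=1.0, cooling=0.995,
           step_size=0.05, seed=0):
    """Maximise score(vector) by simulated annealing.

    Returns (best_vector, best_score). All defaults are illustrative.
    """
    rng = random.Random(seed)
    current = list(vector)
    current_score = score(current)
    best, best_score = list(current), current_score
    temperature = t0
    for _ in range(steps):
        # Perturb one randomly chosen component of the topology vector.
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] += rng.gauss(0.0, step_size)
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept a worse candidate with
        # probability exp(delta / T), which shrinks as T cools.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = list(current), current_score
        temperature *= cooling
    return best, best_score


# Hypothetical stand-in for the trained style evaluator: the score is
# higher the closer the vector is to an arbitrary "ideal" topology.
TARGET = [0.2, 0.5, 0.8]


def toy_score(v):
    return -sum((a - b) ** 2 for a, b in zip(v, TARGET))


start = [0.0, 0.0, 0.0]
best, best_score = anneal(start, toy_score)
print("start score:", toy_score(start))
print("best score: ", best_score)
```

Because the best vector seen so far is tracked separately from the current one, the returned score is never lower than the score of the input vector, which mirrors the requirement that the modified vector have a higher predicted score than the original.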
Much of the research on computerized calligraphic handwriting synthesis has focused on
English or Japanese characters [2, 3, 7, 11, 12]. These works mainly imitate the user's
trajectory from samples. Chinese calligraphy synthesis is more difficult by contrast: a Chinese
calligraphy character consists of strokes, which are 2D areal objects, so the relations between
strokes are relative positions between areal objects. Existing methods for Chinese calligraphy
synthesis can be roughly divided into two categories. The first is based on interpolation [4,
Fig. 1: The overview of Chinese calligraphy synthesis with personal style
13, 14, 16], in which corresponding points or topology are found between samples and a character
with a new style is generated by a weighted average of these corresponding components. This method
depends heavily on having many samples of the same character, and non-rigid point matching is itself
a thorny issue in pattern recognition. The second category comprises rule-based methods, in which
a character in a new style is reconstructed according to a set of rules. Given the complexity of
Chinese characters, a little over ten rules cannot fully capture their style. In addition,
some researchers directly replace the trajectories with strokes to generate
Chinese calligraphy. This method depends greatly on the user's initial trajectory input. How
to represent the topology of a Chinese character is key to synthesizing calligraphy characters. Because
strokes, the elements of a Chinese calligraphy character, are 2D areal objects owing to their width
variation, their relations are likewise viewed as relative positions of areal objects. Freeman proposed
that fuzzy set theory be applied because “all-or-nothing” standard mathematical relations are
clearly not suited to modeling spatial relations. Therefore, in contrast to all previous methods of
character topology representation [13, 14, 6, 8], we employ a fuzzy-based method in this paper to
represent the spatial relations between strokes. The main challenges in this work are as follows:
1) how to represent and optimize Chinese character topology; 2) how to generate a large-scale
Chinese character set from a small one in an analogous style.
2 Topological Representation of Chinese Characters
Chinese characters consist of radicals, which in turn are composed of basic strokes (see Fig. 2).
We favor the proposal by Lai et al. to define the relations among radicals. Lai postulates
that Chinese characters are topologically characterized by horizontal, vertical or bounding patterns.
Representing characters as arrangements of these patterns is logically and hierarchically clear, but
it tends not to cover all the complex relations among strokes. For example, characters of the
horizontal topology may vary in horizontal proximity, vertical deviation and stroke shape. A
one-size-fits-all treatment is doomed to oversimplify. Calligraphic strokes are
planar regions, and the relations among strokes are spatial relationships between planar objects.
Relationships among planar strokes involve such complex factors as distances, directions, areas
Fig. 2: The primitive strokes and Chinese character topology
and shapes, and they cannot be enumerated exhaustively. The approach by Matsakis et al. accommodates
these factors and is adopted in this paper to analyze the relationships among strokes. It
relies on the intersections of the objects with lines having the desired direction.
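To give a rough feel for a direction-indexed descriptor of the relative position of two planar objects, the sketch below computes a plain histogram of angles between pixel pairs. This is a coarser, related descriptor chosen for brevity; it is not the line-intersection computation attributed to Matsakis et al. above, and it ignores the distance-dependent weighting that a full F-histogram may apply. The function name `angle_histogram`, the representation of a stroke as a list of (x, y) pixel coordinates, the upward-pointing y axis, and the choice of 8 bins are all assumptions made for this illustration.

```python
import math


def angle_histogram(pixels_a, pixels_b, bins=8):
    """Normalised histogram of the directions from pixels of B to pixels of A.

    Bin 0 is centred on direction 0 (A to the right of B), and bins
    proceed counter-clockwise, assuming the y axis points upward.
    """
    counts = [0] * bins
    width = 2 * math.pi / bins
    for xa, ya in pixels_a:
        for xb, yb in pixels_b:
            if (xa, ya) == (xb, yb):
                continue  # coincident pixels define no direction
            angle = math.atan2(ya - yb, xa - xb) % (2 * math.pi)
            # Shift by half a bin so that each bin is centred on its direction.
            index = int((angle + width / 2) // width) % bins
            counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Two toy "strokes": A lies entirely to the right of B.
stroke_b = [(0, 0), (1, 0)]
stroke_a = [(5, 0), (6, 0)]
print(angle_histogram(stroke_a, stroke_b))
```

A histogram concentrated in a single bin indicates a crisp relation such as "A is to the right of B", whereas mass spread over several bins reflects the graded, fuzzy relations between areal strokes that motivate this kind of representation.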
2.1 Hierarchical Representation of Chinese Character Topology