Parallel Network Organization Algorithm for Graph Matching
and Subgraph Isomorphism Detection
Keita Maehara
Department of Computer and Systems Engineering, Faculty of Engineering, Kobe University, Kobe, Japan 657-8501
Kuniaki Uehara
Research Center for Urban Safety and Security, Kobe University, Kobe, Japan 657-8501
SUMMARY
Data representations using graphs are very flexible and
are used in a wide variety of fields. The development of algo-
rithms to perform basic processing at high speeds is vital for
detecting subgraphs with important meaning from a graph set
and for searching for subgraphs which match a given graph.
However, as the number of graphs in question increases, the
computation costs required for processing rise dramatically. In
this paper, the authors describe an algorithm which detects isomorphic subgraphs that several graphs have in common from among a set of labeled directed graphs and then organizes the graphs into a network based on the detected isomorphisms. This algorithm also provides greater processing speed through heuristics using the MDL (Minimum Description Length) principle.
© 2000 Scripta Technica, Syst Comp Jpn, 31(8): 68-78, 2000
Key words: Graph matching; MDL principle; parallel processing; organization algorithm.
1. Introduction
Graph representations are used in a variety of fields,
including circuit design, knowledge representation, and
image recognition, due to their flexibility and applicabil-
ity as a means to represent data. Using graph repre-
sentations for databases is extremely valuable for
efficient management. In addition, valuable information
can be found in graph representations, making searches
of graphs with particular structures a vital topic. For
instance, if a structure which appears repeatedly in a
circuit can be extracted, the circuit can be designed
efficiently using the circuit element which corresponds
to this structure. However, in general, the computation costs of algorithms that operate on graph representations are high, and there is considerable need for faster algorithms. In this paper, the authors detect isomorphic subgraphs shared by multiple graphs within a set of labeled directed graphs and then describe an algorithm to organize the detected structures as a network. This algorithm provides faster processing through heuristics using the MDL (Minimum Description Length) principle [6].
2. Basic Approach
2.1. Organization of a graph set
Let us consider finding a graph with a particular
isomorphism from within a given set of graphs. In this
instance, matching must be performed for the isomorphism with respect to all of the given graphs. However, as the number of graphs becomes larger, the cost of matching rises, making a search in a realistic amount of time impossible. Therefore, in order to search efficiently, redundant repetition within matching must be curtailed.
Systems and Computers in Japan, Vol. 31, No. 8, 2000. Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J82-D-I, No. 1, January 1999, pp. 111-120.
Contract grant sponsor: Japanese Ministry of Education Grant-in-Aid for Scientific Research; Contract grant sponsor: Research for the Future Program of the Japan Society for the Promotion of Science.
In expert systems, the RETE algorithm [3] has been
proposed in order to accelerate matching of elements in
working memory and conditional rules. The RETE algo-
rithm is a method which attempts acceleration by creat-
ing a RETE network from the various elements of the
conditional rules and then matching them to the elements
in working memory in only one pass. Here, if individual
graphs correspond to a production rule and the subgraphs
correspond to the elements in working memory, graph
matching can be made more efficient by using a method
like this one. For instance, given Graph1 and Graph2 in
a set of graphs as in Fig. 1, if the common subgraphs are
focused on, the network structure shown in the lower area
can be created.
If the structures common to the graphs can be extracted and organized, then search speeds can be raised. However, when the appropriate common structures are not reflected in the network structure, searches cannot be made more efficient. For instance, Fig. 2 shows a case in which a different network structure has been created from the same graphs. In this figure, the isomorphisms common to the two graphs are not extracted as network nodes. As a result, there will be no reduction in matching costs even if organization is performed. As such, the following three points must be considered when creating networks.
• Improvements in search efficiency
In order to increase search efficiency through organi-
zation, the network structures must be optimized. In order
to do this, the appropriate isomorphisms must be extracted
after understanding the structure of the whole graph, and
the organization must be performed based on the results.
• Efficient descriptions of graph sets
Being able to represent the original data with the
smallest amount of information is desirable from the stand-
point of the amount of memory space. In other words,
decreasing the description costs of the network structure
means increasing the efficiency of memory management.
As is the case when using common data compression
algorithms, in graph processing the expression of the origi-
nal data using a smaller amount of information must be
considered based on the structures that repeatedly appear in
multiple graphs.
• Discovery of the important isomorphisms
The isomorphisms that appear repeatedly in a graph
set are considered to have an important conceptual signifi-
cance. Therefore, if such structures can be extracted, they
will have great practical value. From this perspective,
paying attention to isomorphisms common among sev-
eral graphs is vital.
Fig. 1. Network structure for graph retrieval.
Fig. 2. Inefficient network structure.
3. Common Structures in a Graph Set
3.1. Discovering isomorphisms using the MDL
principle
Let us consider extracting the subgraph common to
several graphs from a given graph set. Figure 3 represents
an example of a case in which a different subgraph is given
attention with respect to the same graph set. Figure 3(a)
represents a case in which attention is given to a total of two
types of subgraphs, one that is common to Graph 1 and
Graph 4, the other common to Graph 2 and Graph 3. Figure
3(b) represents a case in which attention is given to a total
of four types of subgraphs, two of which are common to
Graph 1 and Graph 2, the other two of which are common
to Graph 3 and Graph 4.
Here, if the selected subgraph is represented as a
single vertex, the graph set can be described more effi-
ciently. For instance, given the four graphs shown in Fig. 3,
if the subgraphs considered in each of the figures are
represented using a single vertex, the number of vertices
and such for the graph set can be reduced (Fig. 4).
In Fig. 4(a), the subgraph considered in Fig. 3(a) is
replaced with one vertex, and the number of vertices for
each graph is reduced to three, and the number of edges is
reduced to two. In Fig. 4(b), the number of vertices for each
graph is reduced to two, and the number of edges to one. In
this fashion, there are numerous possible choices of subgraphs that several graphs have in common. As a result, some kind of selection criterion must be defined.
A policy of choosing the subgraph that is shared by the largest number of graphs could represent one such criterion. This would be ideal in the sense of being able to describe many graphs using the same subgraph. However, a subgraph that is shared by many graphs is generally small, and as a result this policy would not be optimal. Alternatively, a policy of choosing a large subgraph could be considered. However, although this would be ideal in terms of being able to describe the graph using a small number of vertices, few graphs generally share a large common subgraph. As a result, this policy cannot be considered optimal either.
The MDL principle adopted in the authors' research is intermediate between these two policies. Under the MDL principle, the optimal model is the one that minimizes the sum of the description length of the model itself, which explains the data, and the description length of the data when they are described using the model. If this is applied to the problem of describing graph sets, the ideal model is the subgraph that minimizes the sum of the description length of the subgraph itself and the description length of the graph set when it is described using that subgraph.
In other words, when considering the graph set GSet, if one combination of subgraphs that the graphs included in GSet have in common is taken to be SGList, then the evaluation function Eval(SGList) for SGList is given by the following equation, according to the MDL principle:
\[ \mathrm{Eval}(SGList) = \mathrm{DL}(SGList) + \mathrm{DL}(GSet \mid SGList) \qquad (1) \]
Note that DL(SGList) is the description length for SGList, and DL(GSet|SGList) is the description length when GSet is rewritten using SGList.
DL(SGList) can be defined as follows. When the number of vertices of the subgraph SG is v, the number of vertex labels is l_v, the number of edges is e, and the number of edge labels is l_e, the numbers of bits required to encode the vertices and the edges are given by Eqs. (2) and (3), respectively. In addition, the number of bits required to encode which vertices each edge connects is given by Eq. (4). Therefore, the description length of SG is given by Eq. (5).
Furthermore, the description length DL(G|SG) of G when each occurrence of the subgraph SG in the graph G is replaced with one vertex is defined as follows. Suppose that G contains n_SG subgraphs equivalent to SG, and that after the replacement the number of vertices becomes v_c and the number of edges e_c. Encoding is then performed according to Eqs. (6) to (9), without losing the information on the vertices to which both ends of each edge of the graph are connected.
[The bodies of Eqs. (2) to (9) are not preserved in this copy.]
Fig. 3. Graphs and their subgraphs.
Fig. 4. Efficient description of graphs.
The evaluation function Eval(SGList) measures how efficiently the graph set is represented: the smaller the value, the more efficient the representation. For instance, the description length of the original graph set in Figs. 3(a) and 3(b) is 113.4 bits for both, according to Eq. (5). However, the evaluation values when describing the original graphs by focusing on the subgraphs are, based on Eq. (1), 86.2 bits when using Fig. 4(a) and 59.0 bits when using Fig. 4(b). Therefore, from the standpoint of the MDL principle, the four types of subgraphs in Fig. 4(b) should be used in order to describe the given graph set more efficiently.
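The trade-off expressed by Eq. (1) can be sketched in Python as follows. The encoding in `description_length` is an assumption made purely for illustration (each vertex and edge costs log2 of the number of available labels, and each edge costs a further 2 log2 v bits to name its endpoints); it is not the encoding of Eqs. (2) to (9), so its output will not reproduce the 113.4, 86.2, and 59.0 bits quoted above. The function names, the label counts, and the edge counts in the example are likewise hypothetical.

```python
from math import log2

def description_length(v, e, lv, le):
    """Bits to encode a graph with v vertices, e edges, lv vertex labels and
    le edge labels. ASSUMED encoding, for illustration only."""
    if v == 0:
        return 0.0
    label_bits = v * log2(lv) + e * log2(le)
    endpoint_bits = e * 2 * log2(v)  # source and target vertex of each edge
    return label_bits + endpoint_bits

def evaluate(subgraphs, rewritten_graphs, lv, le):
    """Eval(SGList) = DL(SGList) + DL(GSet | SGList).

    subgraphs        -- (v, e) pairs, one per selected subgraph
    rewritten_graphs -- (v, e) pairs, one per graph after every occurrence
                        of a selected subgraph was collapsed into one vertex
    """
    # Assumption: each selected subgraph adds one new vertex label.
    lv_rewritten = lv + len(subgraphs)
    dl_model = sum(description_length(v, e, lv, le) for v, e in subgraphs)
    dl_data = sum(description_length(v, e, lv_rewritten, le)
                  for v, e in rewritten_graphs)
    return dl_model + dl_data

# Hypothetical input shaped like Fig. 3: four graphs of 4 vertices each.
original = evaluate([], [(4, 3)] * 4, lv=4, le=3)
two_subgraphs = evaluate([(2, 1)] * 2, [(3, 2)] * 4, lv=4, le=3)   # cf. Fig. 4(a)
four_subgraphs = evaluate([(2, 1)] * 4, [(2, 1)] * 4, lv=4, le=3)  # cf. Fig. 4(b)
print(original, two_subgraphs, four_subgraphs)
```

With these assumed inputs the three values decrease in the same order as the figures reported in the text, although the magnitudes differ because the encoding is not the authors'.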
3.2. Narrowing the search space
Even when using the MDL principle, every subgraph must be evaluated in order to find the most efficient representation of the original graphs. For instance, given a graph with n vertices, the number of ways to choose vertices from among them is \( \sum_{i=1}^{n} {}_{n}C_{i} \). Moreover, when the ways in which the chosen vertices can be combined or linked are considered, the number of subgraphs to be evaluated rises dramatically. In addition, in practice it is also necessary to consider combinations of subgraphs shared by several graphs. As the scale of the problem increases, the number of possible combinations explodes.
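To give a sense of scale, the sum of nCi over i = 1, ..., n equals 2^n - 1, the number of non-empty vertex subsets. The short Python check below evaluates it for the graph-set sizes used later in Section 5.1; the function name is hypothetical.

```python
from math import comb

def count_vertex_subsets(n):
    """Number of non-empty vertex subsets: sum of nCi for i = 1..n."""
    return sum(comb(n, i) for i in range(1, n + 1))

for n in (12, 14, 16):
    # The sum has the closed form 2**n - 1.
    assert count_vertex_subsets(n) == 2**n - 1
    print(n, count_vertex_subsets(n))
# prints:
# 12 4095
# 14 16383
# 16 65535
```

This counts vertex subsets only; edge patterns and combinations of subgraphs multiply the number of candidates further.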
As a result, the authors attempted in their research to accelerate processing by using the following heuristic: if the description length is small when a graph set is described using a particular subgraph, then the description length will also be small when the graph set is described using that subgraph in combination with other subgraphs. For instance, Fig. 5 shows the state in which Subgraph1, common to Graph1, Graph2, and Graph3, has already been selected as part of the solution and Subgraph2 and Subgraph3 exist as candidates to be selected next. Although Subgraph2 and Subgraph3 have the same number of vertices and the same number of edges, Subgraph2 appears twice in the graph set and Subgraph3 appears three times. As a result, the description length obtained when the graph set is described using each candidate alone is smaller for Subgraph3. Therefore, Subgraph3 is used in combination with Subgraph1 when describing the graph set.
Fig. 5. Selection policy of subgraph.
4. Network Generation Algorithm
In this section, the parallel network organization algorithm for the graph set using the MDL principle is described. A network is created using this algorithm in order to search a given graph set efficiently. Each node within the network corresponds to a detected subgraph. In addition, the nodes which correspond to subgraphs common to multiple graphs are shared in the network, and as a result, duplication of matching can be avoided, and graph searches can be performed more efficiently.
4.1. Details of the algorithm
Only one node exists in the initial state. This node is called the input node. A given graph is always passed to this input node first. The graph is broken down into its vertices at the input node, and the corresponding vertex nodes are generated. The vertex nodes are linked to the input node as child nodes of the input node (Fig. 6).
Figure 7 shows this procedure using pseudo-code. In the initial state, the input node has no child nodes at all. As a result, for each vertex vertex of the graph, a vertex node newNode which has the same label as vertex is created, and an instance of vertex is stored in it.*
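The vertex-node step can be sketched in Python as below. This follows the prose description only and is not the authors' pseudo-code from Fig. 7; the class `Node`, the function `generate_vertex_nodes`, and the representation of an instance as a (graph id, vertex id) pair are assumptions introduced for illustration.

```python
class Node:
    """A network node holding the instances (occurrences) routed to it."""
    def __init__(self, label):
        self.label = label
        self.children = {}   # child label -> Node
        self.instances = []  # (graph_id, vertex_id) pairs

def generate_vertex_nodes(input_node, graph_id, vertices):
    """Route every vertex of one graph to the vertex node for its label.

    vertices -- dict mapping vertex_id -> label
    """
    for vertex_id, label in vertices.items():
        node = input_node.children.get(label)
        if node is None:  # no vertex node with this label exists yet
            node = Node(label)
            input_node.children[label] = node
        # Vertices sharing a label are stored as separate instances
        # in the same vertex node.
        node.instances.append((graph_id, vertex_id))
    return input_node

input_node = Node("input")
generate_vertex_nodes(input_node, "Graph1", {0: "x", 1: "y", 2: "x"})
generate_vertex_nodes(input_node, "Graph2", {0: "y", 1: "z"})
print(sorted(input_node.children))  # labels of the vertex nodes created
```

Because vertex nodes are keyed by label, a label that occurs in several graphs is represented by a single shared node.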
Next the internodes are generated. An internode rep-
resents a subgraph generated using a combination of vertex
nodes and/or other internodes. Figure 8 illustrates an exam-
ple of the generation of internodes. Because the evaluation
of the nodes themselves is performed independently in
various parts of the network, higher speeds can be realized
through parallel processing.
Figure 9 shows this procedure using pseudo-code. The procedure considers pairs of instances, instance1 and instance2, that are held in nodes of the network at a particular point in time and are linked by an edge within the same graph. For each such pair, it calculates the description length of the graph set when the original graph set is described using the subgraph obtained by combining the two instances. In accordance with the heuristics based on the MDL principle, a subgraph that decreases the description length of the original graph set is expected to make the description length even smaller when it is combined with other subgraphs. As a result, a new node named newNode is generated from the combination that is evaluated most highly.
In the authors' research, several processors evaluate the candidate nodes independently, and then the one which is evaluated most highly is actually generated. This is the same as searching in parallel over all subgraphs included in the given graph set and then selecting the one which is evaluated most highly. For instance, in Fig. 8, Process1 evaluates the graph which has edge p from vertex y to vertex x, and Process2 evaluates the graph which has edge r from vertex y to vertex z. As a result, when the evaluation of the graph which has edge q from vertex y to vertex x is high, a node corresponding to that graph is actually generated, as shown in Fig. 10.
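The evaluate-in-parallel, generate-the-best pattern can be sketched with Python's standard `concurrent.futures` module. The function `best_combination`, the candidate tuples, and the scores are hypothetical; "evaluated most highly" is taken to mean the smallest description length. Threads are used here only to keep the sketch short: in CPython they do not run CPU-bound work in parallel, so a process pool would be the closer analogue of the multiprocessor setting.

```python
from concurrent.futures import ThreadPoolExecutor

def best_combination(candidates, evaluate, workers=2):
    """Evaluate all candidates concurrently; return (candidate, score)
    for the smallest score, or None when there are no candidates.

    candidates -- list of candidate combinations, e.g. ("y", "q", "x")
    evaluate   -- function: candidate -> description length in bits
    """
    if not candidates:
        return None
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so scores[i] belongs to candidates[i].
        scores = list(pool.map(evaluate, candidates))
    best = min(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]

# Hypothetical scores for the two candidate edges of Fig. 8.
scores = {("y", "q", "x"): 10.0, ("y", "r", "z"): 12.5}
print(best_combination(list(scores), scores.get))
```

Only the evaluation is concurrent here; the selection of the single winner happens afterwards in one place.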
Fig. 6. Generation of vertex nodes.
Fig. 7. Vertex-node generation algorithm.
Fig. 8. Generation of internodes (a).
*When there is more than one vertex with the same label, the same number of instances is stored in one vertex node.
When a new node is generated, the instances held in the two parent nodes are eliminated and stored as a single new instance in the new node. In Fig. 10, because the graph which had edge q from vertex y to vertex x was evaluated most highly, the instances stored at vertex node y and vertex node x are eliminated and stored as a single new instance at the newly generated node. This instance then becomes a candidate when creating further new nodes. In Fig. 10, the generated node is evaluated using Process1. In this fashion, new nodes are generated one after the other in all parts of the network.
The nodes corresponding to the individual graphs in
the original graph set are ultimately generated in the low-
ermost part of the network by repeating the above process.
The combinatorial process is complete when the nodes
corresponding to all the graphs have been generated.
5. Experiments
5.1. Validity of heuristics based on the MDL
principle
In order to demonstrate the validity of the heuristics described in Section 3.2, the three-graph sets with totals of 12 and 14 vertices and the four-graph set shown in Fig. 3 (16 vertices in total) were input, and an experiment was performed to search for a subgraph combination that could describe the original graph set more efficiently. Table 1 shows the processing time and the evaluation value from Eq. (1) for search methods that were completely random, that used the heuristics described in Section 3.2, and that took into consideration all combinations of subgraphs (brute force). Note that the results represent the means and variances for experiments performed 10 times.
Fig. 9. Internode generation algorithm.
Fig. 10. Generation of internodes (b).
Table 1. Comparison of three subgraph search methods (number of vertices 12, 14, 16)
When the random search method was used, one candidate was first selected at random from among the initially generated subgraphs and evaluated using Eq. (1). When this evaluation value was smaller than the description length of the original graph set, that candidate was adopted as a solution, and then another candidate was selected at random. Next, the subgraph adopted as a solution and the newly selected candidate were combined, and the result was evaluated again using Eq. (1). If this evaluation value was smaller than the previous one, the newly selected candidate was added to the solution. This procedure was repeated until the evaluation value ceased to become smaller. On the other hand, when the heuristic search method was used, candidates for which the evaluation value of the subgraph according to Eq. (1) was small were given priority in selection.
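The two search procedures differ only in the order in which candidates are tried, which the following Python sketch makes explicit. It is one reading of the description above, not the authors' code: `greedy_search`, its parameters, and the toy evaluation function are hypothetical, and stopping at the first candidate that fails to lower the value is an assumption about what "until the evaluation value ceased to become smaller" means.

```python
import random

def greedy_search(candidates, evaluate, baseline, order="heuristic", rng=None):
    """Add candidates one at a time while the Eq. (1) value keeps falling.

    candidates -- list of candidate subgraphs
    evaluate   -- function: list of subgraphs -> evaluation value
    baseline   -- description length of the original graph set
    order      -- "heuristic": smallest individual value first;
                  "random": random order
    """
    pool = list(candidates)
    if order == "heuristic":
        pool.sort(key=lambda c: evaluate([c]))
    else:
        (rng or random).shuffle(pool)

    solution, best = [], baseline
    for candidate in pool:
        value = evaluate(solution + [candidate])
        if value >= best:
            break  # the value ceased to become smaller
        solution.append(candidate)
        best = value
    return solution, best

# Toy evaluation: each subgraph changes the description length by a fixed amount.
gain = {"a": -5, "b": -3, "c": 2}
toy_eval = lambda sgs: 100 + sum(gain[s] for s in sgs)
print(greedy_search(["c", "a", "b"], toy_eval, baseline=100))
# → (['a', 'b'], 92)
```

In this toy setting the random order can stop immediately if an unhelpful candidate happens to be drawn first, which is consistent with the larger variance the experiment reports for random search.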
As can be seen in Table 1, all combinations are considered when brute force is used. As a result, there is a combinatorial explosion for 16 vertices, making measurements impossible. Consequently, the same experiment was performed using only random and heuristic searches for five subgraphs with 257 vertices. Note that the results represent the means and variances for experiments performed 10 times. From these experimental results it can be seen that if the above heuristics are used, the processing time is reduced. In addition, as the respective variances show, in contrast to the large dispersion in the processing time for random searches, the processing time for heuristic searches is kept within a comparatively narrow range. Furthermore, in comparison with the random search, the heuristic search describes the given graph set more efficiently. However, in both cases there was a large scatter in the evaluation values, and it is difficult to say that an optimum substructure has been detected.
5.2. Effects of parallelization
In order to evaluate the effects of using parallel processing, the content of the Japanese children's story "Omusubi Kororin" was described and then represented as graphs using the conceptual dependency representation [5] proposed by Schank. The data in question were four graph sets with total numbers of vertices of 257, 514, 771, and 1285. Note that for the experiment a Silicon Graphics Origin 2000 (CPU: R10000 × 8; main memory: 512 Mbytes; OS: IRIX 6.4) was used. Figure 11 shows the relationship between the number of processors used and the processing time. Each result represents the average value obtained by performing the experiment 10 times. With the evaluation of the newly generated network nodes done in parallel, a clear increase in processing speed was seen as the number of processors increased.
As has already been described, the processing which generates the internodes in Fig. 9 consists of two parts. The first is the node evaluation processing, which calculates the description length of the graph set for each combination of instances linked by an edge of the original graph. The second is the processing which selects the combination with the smallest evaluation value from among those evaluated and then determines and generates the internode from that combination.
Because the node evaluation processing in this system is performed independently by each processor, the evaluation of one node may be performed many times. If such duplication can be eliminated, far greater speeds can be achieved.
In the part which determines and generates the nodes, the instances stored in the two parent nodes which generate a node are eliminated, and one new instance is delivered to the newly generated child node. However, caution is required when several processors are performing such processes, as is the case with this algorithm. For instance, the dotted lines in Fig. 12 indicate the situation in which several processors simultaneously generate two graphs for a graph which has edge r from vertex y to vertex z.
Table 2. Comparison of two subgraph search methods (number of vertices 257)
Fig. 11. Experimental result of parallel network organization algorithm.
As shown in Fig. 12, if a separate processor generates a new child node using an instance before that instance has been erased from its parent node, instances which were originally one and the same (in this case, the instance held in the vertex node corresponding to vertex y) end up being duplicated. As a result, an element which originally represented one node in the graph can end up being represented in the network as several elements. To put this a different way, the node evaluation processing can be performed independently in each part of the network, but there must not be more than one processor engaged in determining and generating nodes at any one time. In other words, the process of determining and generating nodes represents a critical region of this algorithm.
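A minimal Python sketch of such a critical region is given below, using a lock from the standard `threading` module. The class `InstanceStore` and its method are hypothetical and do not come from the paper; the point illustrated is only that checking for and removing the parent instances must happen as one indivisible step.

```python
import threading

class InstanceStore:
    """Instances currently held in the network; each may be consumed once."""
    def __init__(self, instances):
        self._available = set(instances)
        self._lock = threading.Lock()  # guards the critical region

    def generate_node(self, inst1, inst2):
        """Replace inst1 and inst2 by one combined instance.

        Returns the combined instance, or None when another worker has
        already consumed one of the two parent instances.
        """
        with self._lock:
            if inst1 not in self._available or inst2 not in self._available:
                return None
            self._available -= {inst1, inst2}
            combined = (inst1, inst2)
            self._available.add(combined)
            return combined

store = InstanceStore(["x", "y", "z"])
print(store.generate_node("y", "x"))  # first worker consumes y and x
print(store.generate_node("y", "z"))  # y is gone, so no duplicate is made
# prints:
# ('y', 'x')
# None
```

Evaluation can stay outside the lock; only the short check-and-replace step is serialized, which matches the observation that the critical region is cheap.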
Furthermore, although the number of combinations and hence the processing time for node evaluation increase as the scale of the graph rises, the processing to determine and generate nodes generates only one node from among the evaluated combinations. Consequently, its cost is small enough that it can be ignored, regardless of the scale of the graph.* Therefore, even when the scale of the graph grows larger, the time required for the critical region is expected to remain small, and its effect on parallelization minimal.
5.3. Detection of subgraph isomorphism
Figure 13 shows the results of applying this organizing algorithm to the data described above. In the original graph representation, the relationships among objects were complex, and understanding the structure was very difficult. However, by organizing them, the data take on layers, and the relationships between them become clearer. This is useful for searches. Moreover, in Fig. 13, structures which correspond to various scenes can be identified, from small ones such as "a scene in which an old man holds chopsticks" to comparatively large ones such as "a scene in which a mouse's daughter holds a dish and brings it to the old man from outside the room." However, because this algorithm pays attention only to the description length of the graph set, semantically significant elements are not necessarily found. To address this problem, methods that find semantically significant structures by using background knowledge, in addition to the MDL principle, as a criterion for selecting isomorphisms have been considered.
6. Related Research
Several algorithms similar to the one proposed in this paper have been proposed for organization based on isomorphism and detection of important structures from among large amounts of data. Among these, SEQUITUR [2] is an algorithm that detects grammar rules in a symbol sequence by scanning the sequence from its start and focusing on partial sequences that appear repeatedly. Examples of applying SEQUITUR would be the detection of groups of words with the same meaning or structure within a sentence written using many words, or the extraction of a phrase repeated in the lyrics of a song.
SEQUITUR is designed with two restrictions for identifying grammar rules: Digram Uniqueness and Rule Utility. Digram Uniqueness is a restriction under which no pair of neighboring symbols may appear more than once in the grammar. If such a repeated partial sequence is detected, it is immediately replaced with a new nonterminal symbol. This can be seen as being the same as the procedure to "combine small structures that appear in several graphs and detect larger structures" in the authors' research. Rule Utility is a restriction under which every detected rule must be used more than once. This restriction is set up in order to guarantee that the rules are used effectively. This can be seen as being the same as the criterion to "detect structures which more graphs have in common" used in the authors' research. Therefore, the authors' algorithm can be thought of as a generalization of SEQUITUR from simple symbol sequences to graph representations.
SUBDUE [1] is an algorithm which detects hierarchical structures in graphs using the MDL principle. SUBDUE starts by treating one vertex of a graph as an isomorphism. Then it performs evaluations for each subgraph composed of that vertex and other vertices linked to it. The one that is evaluated most highly is adopted and becomes the new isomorphism. SUBDUE uses a strategy in which the above procedure is repeated and the one isomorphism included in a graph is expanded. Therefore, the number of isomorphisms detected from a graph under SUBDUE will always be 1. In addition, searches of graphs are not considered, which is another difference between SUBDUE and the algorithm described in this paper.
Fig. 12. Instance passing in the network.
*As the scale of the graph grows larger, the number of internodes created increases. As a result, the processing time for node generation may rise slightly due to instance processing. This is, however, small enough to be ignored when compared to the overall processing time.
NA [4] is an algorithm which, like the authors', organizes a graph set into a network by focusing on the subgraphs that several graphs have in common. NA is a system which searches a given graph set for graphs whose structures are identical to a newly input graph. NA is an incremental algorithm which processes the given graphs one at a time in sequence. NA does not use a method to select appropriate structures from among the many subgraph combinations, and as a result the structure finally obtained is not necessarily an appropriate isomorphism with respect to the original graph set. When organization is completed without detecting an appropriate structure, matching for the same isomorphism may be performed more than once. Consequently, the matching costs may not be reduced, even though organization is performed.
The GBI (Graph Based Induction) [8] method has been proposed as a way to extract isomorphisms from graphs based on the idea of "expanding and combining patterns discovered in previous iterations," and has been applied to automated user modeling [7] and to learning classification rules from graphs. Under the sequential pair expansion algorithm in the GBI method, the original graph set can be described using a combination of several graphs, as is the case with the authors' algorithm. However, GBI differs from the authors' algorithm in that it uses the statistical criteria of classification rule learning as the criterion for extracting similar structures which appear in several graphs. Here, "similar" refers to the frequency of occurrence in the data. With the MDL principle, the evaluation criterion used in the authors' algorithm, what is at issue is the efficiency of the final description. The effect that errors in this criterion have on the isomorphisms ultimately obtained is a topic for future study.
7. Conclusions
In this paper the authors explained a parallel algorithm which organizes a graph set into a network for graph searches based on the isomorphisms detected in the graph set using the MDL principle, and described the usefulness of a multiprocessor environment for this algorithm.
Fig. 13. Interesting subgraphs discovered from network.
Load distribution among the processors is not considered sufficiently in the present parallel algorithm. As was described in Section 5, each processor evaluates nodes in a completely independent fashion. As a result, an evaluation of one particular node may be performed several times. The authors believe that far greater speeds can be achieved by eliminating such duplication to the extent possible and making the processing performed by each processor more efficient.
Future topics of study will include the internal representation of the data. In order to calculate the evaluation value when generating new nodes from instances of two internodes within a network, whether or not there is an edge connecting these two instances must be checked. For instance, for an instance with m vertices and an instance with n vertices, the number of combinations of vertices to be checked is nm. At present, an adjacency list representation, which requires minimal storage area, is used as the internal representation of the graph. If the size of each graph is not too large, however, the storage area is not expected to be a serious problem. Therefore, if each check can be reduced to bit operations by using an array representation, faster processing can be realized.
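One way such a bit-based check might look is sketched below in Python, using arbitrary-precision integers as bit rows of an adjacency matrix. This is an illustrative assumption about the proposed array representation, not the authors' design, and all function names are hypothetical. Each vertex of one instance is tested against the whole of the other instance with a single AND, so the check needs m + n row tests instead of nm vertex-pair tests.

```python
def build_adjacency_rows(num_vertices, edges):
    """rows[i] is an int whose bit j is set when there is an edge i -> j."""
    rows = [0] * num_vertices
    for src, dst in edges:
        rows[src] |= 1 << dst
    return rows

def vertex_mask(vertices):
    """Bit mask with one bit set per vertex id."""
    mask = 0
    for v in vertices:
        mask |= 1 << v
    return mask

def instances_connected(rows, inst_a, inst_b):
    """True when a directed edge joins inst_a and inst_b in either direction."""
    mask_a, mask_b = vertex_mask(inst_a), vertex_mask(inst_b)
    return (any(rows[v] & mask_b for v in inst_a) or
            any(rows[v] & mask_a for v in inst_b))

rows = build_adjacency_rows(5, [(0, 1), (1, 2), (3, 4)])
print(instances_connected(rows, [0, 1], [2]))     # edge 1 -> 2 joins them
print(instances_connected(rows, [0, 1], [3, 4]))  # no edge between them
# prints:
# True
# False
```

The price is storage proportional to the square of the number of vertices per graph, which is the trade-off against the adjacency list mentioned above.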
Acknowledgments. This project was supported in part by a Japanese Ministry of Education Grant-in-Aid for Scientific Research on Priority Area: "Research and Development of Advanced Database Systems for Integration of Media and User Environments," and by the Research for the Future Program of the Japan Society for the Promotion of Science under the project "Researches on Advanced Multimedia Contents Processing." The authors thank Mr. Yoshinori Nakanishi, a doctoral candidate at Kobe University, for supervising the follow-up experiments.
REFERENCES
1. Cook DJ, Holder LB. Substructure discovery using minimum description length and background knowledge. J Artif Intell Res 1994;1:231-255.
2. Nevill-Manning CG, Witten IH. Identifying hierarchical structure in sequences: A linear-time algorithm. J Artif Intell Res 1997;7:67-82.
3. Forgy CL. RETE: A fast algorithm for the many pattern/many object pattern match problem. Artif Intell 1982;19:17-37.
4. Messmer BT, Bunke H. A network based approach to exact and inexact graph matching. Tech Rep LAM-93-021, University of Berne, 1993.
5. Schank RC, Riesbeck CK. Inside computer understanding: Five programs plus miniatures. Erlbaum; 1981.
6. Yamanishi K. Introduction to MDL from viewpoints of computational learning theory. J Jpn Soc Artif Intell 1992;7:435-442.
7. Yoshida K, Motoda H. Automated user modeling for intelligent interface. Int J Hum Comput Interact 1996;8:237-258.
8. Yoshida K, Motoda H. Inductive inference by stepwise pair expansion. J Jpn Soc Artif Intell 1997;12:58-67.
AUTHORS
Keita Maehara graduated from the Department of Computer and Systems Engineering of Kobe University in 1996. He
completed the first half of his doctoral studies there in 1998. Currently, he is working at Yamaha, Inc. While a student, he was
primarily pursuing research related to machine learning and multimedia databases.
Kuniaki Uehara graduated from the Engineering and Information Department of Osaka University in 1978. He completed
the second half of his doctoral program in 1983. After serving as a lecturer and then an instructor at the Institute of Scientific
and Industrial Research at Osaka University, and subsequently an assistant professor in the Department of Computer and Systems
Engineering at Kobe University, he is now a professor at the Research Center for Urban Safety and Security at Kobe University.
He also serves in the Department of Computer and Systems Engineering at Kobe University. He was a visiting assistant professor
at Oregon State University in 1989 and 1990. He served as assistant director of the General Information Processing Center at
Kobe University from 1994 to 1996. He holds a D.Eng. degree. He is pursuing research related to artificial intelligence, in
particular machine learning, multimedia databases, and human interfaces using natural language. He was the recipient of the
1990 Japanese Society for Artificial Intelligence's Research Scholarship. He is a member of the Japanese Society for Artificial
Intelligence, the Information Processing Society of Japan, the Mathematical Linguistic Society of Japan, the Japan Society for
Software Science and Technology, and AAAI.