
Parallel Computing 20 (1994) 1303-1321

Efficient parallel recognition of context-free languages

Abhay Jain a, N.S. Chaudhari b

a Computer Centre, Shri G.S. Institute of Technology & Science, Indore MP, India; b School of Computer Science, Devi Ahilya University, Khandwa Road Campus, Indore MP, India

Received 13 November 1992; revised 1 October 1993

Abstract

This paper presents an efficient parallel approach for the recognition of context-free languages. The algorithm takes O(log n) computation time on a fixed configuration of a processors array with a reconfigurable bus system. The number of processors required is O(n^3). The processor architecture consists of multiple buses and multiple internal switches within the processors, and a specific arrangement of processors is required. The gains in time and processor requirements are further supplemented by eliminating the dependence of the processor complexity on the size of the grammar.

Keywords: Context-free language recognition; Multiple buses; Parallel processing; Processors array with reconfigurable bus system

1. Introduction

Recognition is a main component of language processing systems [12,15,16]. Earley's approach [8], the Cocke-Younger-Kasami (CKY) method [5] and Tomita's algorithm [13] for recognition and parsing of general context-free languages are well known. Valiant's variant of the CKY method, which runs in time O(n^2.81), is the fastest known sequential parsing algorithm for general context-free languages [11]. Earley's method takes time O(n^3) but has the advantage that it does not need a grammar in Chomsky Normal Form.

Attempts to speed up processing by using parallel algorithms and parallel architectures have been one of the areas of work for researchers. Recently, advancements in hardware technology have made it possible to design various computer architectures, many of which can easily be implemented in VLSI [9,14,18-20]. Some of the parallel techniques have evolved from cognitive viewpoints [6] and others have been investigated from computational linguistic viewpoints [2]. Kosaraju [17] showed that the CKY method can be used to recognize general context-free languages in O(n) time on a two-dimensional array of processors. Parallel versions of Earley's method indicate that the solution can be obtained in linear time using O(n^2) processors and O(n^2) total space. An object-oriented parallel approach has been reported in [3]. Gibbons and Rytter [1] have proposed a parallel algorithm with O(log^2 n) parallel time and O(n^6) processor complexity. Chu and Fu [10] have presented a parallel version of the CKY algorithm on a VLSI architecture taking O(n) time using O(n^2) processors with concurrent writes.

This paper describes context-free recognition in O(log n) time with O(n^3) processors on a processors array with a reconfigurable bus system, using the partial syntactic tree approach [1]. The solution is efficient in the sense that it takes O(log n) time with no write conflicts. The number of processors required is O(n^3); the precise number is (14n^3 + 9n^2 - 8n)/12 for even values of n and (14n^3 + 9n^2 - 8n - 3)/12 for odd values of n. In the existing methods the constant of proportionality is a function of the size of the grammar, which further increases the processor complexity. In the proposed architecture, this heavy dependence of the number of processors on the size of the grammar has been eliminated by using O(K) switches in the processors, where K is the size of the grammar.

The rest of this section describes some terminology used in the paper. Section 2 describes the approach and the log n solution to the context-free recognition problem. Section 3 details the processors array with reconfigurable bus systems and its configuration; the internal architecture of the processors required for the proposed solution, along with the algorithm, is also given in that section. Section 4 concludes the paper with suitable remarks.

A Context-Free Grammar (CFG) is defined as a 4-tuple G = (T, NT, P, S) where T is a non-empty finite set of terminal symbols (or constants), NT is a finite non-empty set of non-terminal symbols (or variables), P is a finite set of production rules of the form α → β with α a non-terminal symbol and β in (T ∪ NT)*, and S is the starting non-terminal symbol in NT. A language is said to be a Context-Free Language (CFL) if it is generated by a CFG G.

A terminal symbol of grammar G is represented by a lower-case letter from the alphabet {a, b, c, ...} and a non-terminal symbol is denoted by a capital letter from the character set {S, A, B, C, ...}.

A context-free rule in Chomsky Normal Form (CNF) is either of the form A → BC (a branching rule) or A → a_i (a terminal rule), where A, B, C are non-terminal symbols and a_i is a terminal symbol. We assume here that the empty string is not generated by the CNF grammar.

Recognition is the procedure of determining whether a given input string is accepted by the grammar or not.
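For concreteness, a CNF grammar of this kind can be written down by separating its branching rules from its terminal rules. The short Python sketch below is an illustration added to this text (it is not part of the original paper); it encodes the grammar used later in Example 1 of Section 2.2, P = {S → SS, S → SA, S → b, A → a}.

# Hypothetical encoding of a CNF grammar: branching rules A -> BC and
# terminal rules A -> a, for the grammar of Example 1 (Section 2.2).
BRANCHING = [("S", "S", "S"),   # S -> SS
             ("S", "S", "A")]   # S -> SA
TERMINAL = {"b": {"S"},         # S -> b
            "a": {"A"}}         # A -> a
START = "S"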

2. The approach

2.1. Partial syntactic trees with gaps

Almost all algorithms for context-free language recognition are based on tree data structures. The recognition problem is equivalent to the problem of determining, for a given string w, whether there is a syntactic tree with root S and with leaves spelling w.

Fig. 1. Growth of a syntactic tree.

Fig. 2. An initial syntactic tree.

A syntactic tree represents a derivation of a string w in (T ∪ NT)* from some non-terminal symbol. The internal nodes of such a tree are non-terminals and the leaves are labeled by consecutive symbols of w (from left to right). If A is the label of a particular node and its left and right sons are labeled B and C respectively, then A → BC is a production of the grammar. Any syntactic tree can be obtained by successive composition of pairs of syntactic trees (Fig. 1). An initial syntactic tree in this process is a derivation tree of height 1 generating one non-terminal symbol, as shown in Fig. 2.

The detailed concept of the partial syntactic tree has been described in [1]. It has been shown that, using partial syntactic trees with gaps, the recognition of a given string can be performed in O(log n) time in the WRAM model.

A partial syntactic tree is described by its root R and by an interval [i : j]. This interval specifies the substring w[i : j] as the sequence of leaf labels, i.e. the substring a_{i+1} ... a_j of the input string w = a_1 a_2 ... a_n. The partial syntactic tree corresponding to Fig. 1(c) is given in Fig. 3.

A triple (R, i, j) can be used to denote the partial syntactic tree with root labeled R and with interval [i : j]. The node (R, i, j) is treated as a vertex in a graph, and it is realizable iff there is a derivation R →* w[i : j]. The composition rule y, z ⊢ x (where x, y, z are of the form x = (A, i, j), y = (B, i, k), z = (C, k, j) and A → BC is a production) is the rule by which a bigger partial syntactic tree may be obtained from smaller ones.

Fig. 3. Partial syntactic tree corresponding to the syntactic tree of Fig. 1(c).

Fig. 4. A partial syntactic tree.

A partial syntactic tree with a gap is a tree deriving some substring w[i : j] of w in which an internal substring w[k : l], i ≤ k < l < j, is missing. This substring is replaced with a non-terminal symbol B. Hence it is a tree corresponding to a derivation of the string w[i : k] B w[l : j] from some non-terminal A. Fig. 4 represents a partial syntactic tree with root A and leaves w[i : k] B w[l : j]. It is convenient to represent such a tree by a pair of nodes ((A, i, j), (B, k, l)). Such a pair of nodes can be interpreted as an edge in a graph whose vertices are partial syntactic trees (Fig. 5).

A pair of nodes ((A, i, j), (B, k, l)) is realizable iff A →* w[i : k] B w[l : j], i ≤ k < l < j, and (A, i, j) is not equal to (B, k, l).

For the grammar G and string w, a directed graph F_{G,w} = (NV, E) can be defined, where NV is the set of all nodes (partial syntactic trees without gaps) and (x, y) is in E iff for some realizable node z we have y, z ⊢ x (or, equivalently, z, y ⊢ x). Thus, as mentioned above, the edges of F_{G,w} correspond to partial syntactic trees with gaps.

Fig. 5. The edge ((A, 2, 7), (C, 4, 7)) corresponds to the partial syntactic tree (A, 2, 7) with the gap (C, 4, 7).

Algorithm 1
begin
  for all i, 0 ≤ i < n, and A in NT such that A → a_{i+1} in parallel do
    pebbled((A, i, i + 1)) ← true;
  repeat log n times
  begin
    activate; square; square; pebble
  end
end.

The execution of Algorithm 1 results in the recognition of the given string on the WRAM model if the node (S, 0, n) is pebbled [1]. The various operations in the algorithm are defined as follows:

activate: for all x, y, z such that y, z ⊢ x and pebbled(z), in parallel do
  EDGE(x, y) ← true

square: for all x, y, z such that EDGE(x, z) and EDGE(z, y), in parallel do
  EDGE(x, y) ← true

pebble: for all x, y such that EDGE(x, y) and pebbled(y), in parallel do
  pebbled(x) ← true
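To make the activate/square/pebble cycle concrete, the following sequential Python sketch simulates Algorithm 1 on the sets pebbled and EDGE. It is an illustration added to this text, not code from the paper: the names recognize_wram, BRANCHING and TERMINAL are ours, the grammar is the one of Example 1 in Section 2.2, and being sequential it only illustrates the logic, not the parallel time bounds discussed in the text.

from math import ceil, log2

# Grammar of Example 1 (Section 2.2): S -> SS | SA | b, A -> a.
BRANCHING = [("S", "S", "S"), ("S", "S", "A")]   # (A, B, C) for A -> BC
TERMINAL = {"b": {"S"}, "a": {"A"}}              # a -> set of A with A -> a

def recognize_wram(w, start="S"):
    """Sequential simulation of Algorithm 1 (pebble game on partial
    syntactic trees).  A node (A, i, j) stands for A ->* w[i+1..j]."""
    n = len(w)
    pebbled = {(A, i, i + 1) for i in range(n) for A in TERMINAL.get(w[i], ())}
    edges = set()                                # (x, y): x realizable if y is

    def activate():
        # pebbled nodes contribute edges via the composition rule y, z |- x
        for (A, B, C) in BRANCHING:
            for i in range(n):
                for j in range(i + 2, n + 1):
                    for k in range(i + 1, j):
                        x, left, right = (A, i, j), (B, i, k), (C, k, j)
                        if left in pebbled:
                            edges.add((x, right))
                        if right in pebbled:
                            edges.add((x, left))

    def square():
        # EDGE := EDGE union EDGE*EDGE, computed from a snapshot
        snapshot = set(edges)
        edges.update((x, y) for (x, z) in snapshot
                     for (z2, y) in snapshot if z == z2)

    def pebble():
        pebbled.update(x for (x, y) in edges if y in pebbled)

    for _ in range(max(1, ceil(log2(n)))):
        activate(); square(); square(); pebble()
    return (start, 0, n) in pebbled

print(recognize_wram("baba"))    # True: S => SS => (S A)(S A) =>* (b a)(b a)
print(recognize_wram("aab"))     # False: every string derived from S starts with b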

2.2. The log n time solution

Theorem 1. For the graph F_{G,w} of a given string w of length n, the number of possible edges (each corresponding to a partial syntactic tree with a gap) is given by the expression

(n - 1)n(n + 1)/3.    (1)

Proof. As said above, a directed graph F_{G,w} is defined in which nodes are partial syntactic trees without gaps and edges correspond to partial syntactic trees with gaps. This graph-theoretic approach can be used simultaneously with syntactic terminology for the same set of objects, as Fig. 5 implies. For each terminal symbol of a given string of length n, the terminal rule is applicable first. Corresponding to each terminal symbol a, a node may be taken, and the node is said to be pebbled with non-terminal symbol A if there exists a terminal rule A → a. In the next step, a pair of consecutive nodes from the above nodes causes the formation of a syntactic tree with root A if a branching rule A → BC is applicable. Thus there can be a maximum of n - 1 roots, each having a substring of length 2 at its leaves. Similarly, one step further there are n - 2 roots of partial syntactic trees, each having 3 nodes as leaves. Continuing in this manner it is seen that, if all possible roots of partial syntactic trees for a given input string are considered, they form a pyramid as shown in Fig. 6.

Consider the pyramid representation of nodes: each node corresponds to a partial syntactic tree, as shown in Fig. 6 for n = 5. Designate the elements of the pyramid by a two-dimensional representation. The bottom-most row with 5 elements is indexed as row 1, the row above it with four elements is row 2, and so on. Similarly, the left-most five elements are designated as column 1, the next four elements from the left as column 2, and so on.

Fig. 6. The upper nodes of the possible edges contributed by pebbled node x (dark nodes).

The pebbled node x of the pyramid can contribute to the edges which have their upper node on the straight line joining the nodes from x to the upward left-side diagonal elements and the elements of the column of x, as shown in Fig. 6. The lower nodes of the edges contributed by pebbled node x can be obtained by drawing straight lines from these upper nodes downward at a distance of k nodes: in the column of the upper nodes which are on the left-hand side of x, and on the right-hand-side diagonal for the nodes which are on the right-hand side of x, as shown in Fig. 7 (the nodes with a cross), where k is the row index of x. In fact the straight lines drawn in Fig. 7 are the edges which may be contributed by the pebbled node x.

The above explanation is based on the fact that the node x at the kth row is the root of a partial syntactic tree covering a substring with k elements. This partial syntactic tree can be joined with another partial syntactic tree to form a bigger tree. These other partial syntactic trees can be formed with m elements, 1 ≤ m ≤ n - 1, occurring on either side of the k elements. In Fig. 7, x is the root of the partial syntactic tree formed with substring a_2 a_3. The node x can form y_3 by composition with z_3, the only node on the right-hand side. On the left-hand side of x there exist two possibilities of composition, as there are two nodes: the first case is composition with the node z_2, and the second is with the partial syntactic tree formed by the composition of a_1 and z_2, i.e. node z_1.

In the pyramid representation of nodes shown in Fig. 6, the number of nodes lying on the straight lines drawn from a pebbled node upward is n - k (in fact, the left upward diagonal elements and the upward elements of the column of the pebbled node, in the two-dimensional array representation), where k is the row index of the pebbled node. The number of nodes occurring in the kth row is n + 1 - k.

Fig. 7. Nodes with a cross represent the lower nodes of the edges which can be contributed by the pebbled node x.


Therefore, the number of possible edges for a kth-row element is (n + 1 - k)(n - k). Taking the summation over each value of k from 1 to n, the total number of edges corresponding to all nodes of the pyramid is (n - 1)n(n + 1)/3. □
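For completeness, the closed form in Eq. (1) follows from a standard summation; this short derivation is added here and is not part of the original proof:

\[
\sum_{k=1}^{n}(n+1-k)(n-k)
  = \sum_{m=0}^{n-1} m(m+1)
  = \frac{(n-1)n(2n-1)}{6} + \frac{(n-1)n}{2}
  = \frac{(n-1)n(n+1)}{3},
\]

where the first step substitutes m = n - k.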

It can be seen that, for a pebbled node whose location is represented by the row index i and the column index j, 1 ≤ j ≤ n + 1 - i, 1 ≤ i ≤ n, the edges are given by the node-pair indices obtained from expression (2) and the node-pair indices obtained from expression (3), where h = i + j:

((i + k, j), (k, h)),  1 ≤ k ≤ n - h + 1,  h ≤ n    (2)

((h - m, m), (j - m, m)),  1 ≤ m ≤ j - 1,  h ≤ n + 1    (3)
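The following Python sketch (added here for illustration; the function name and data layout are ours) simply transcribes Exp. (2) and (3): given the row index i and column index j of a pebbled node in the pyramid, it lists the (upper node, lower node) pairs of the edges that the node can contribute.

def edges_for_pebbled(i, j, n):
    """(upper, lower) node pairs of the edges contributed by a pebbled node
    at row i, column j of the pyramid, following Exp. (2) and (3)."""
    h = i + j
    pairs = []
    if h <= n:                        # Exp. (2): compositions on one side of x
        for k in range(1, n - h + 2):
            pairs.append(((i + k, j), (k, h)))
    if h <= n + 1:                    # Exp. (3): compositions on the other side
        for m in range(1, j):
            pairs.append(((h - m, m), (j - m, m)))
    return pairs

# The node x of Fig. 7 is at row 2, column 2 (it covers a2 a3) for n = 5;
# it contributes n - i = 3 edges.
print(edges_for_pebbled(2, 2, 5))
# [((3, 2), (1, 4)), ((4, 2), (2, 4)), ((3, 1), (1, 1))]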

Theorem 2. If a signal is transmitted upward in F_{G,w} from the initially pebbled nodes (row 1), then the signal will be received by other nodes iff these are connected by edges to the initially pebbled nodes, directly or through other nodes.

Proof. Initially, the nodes of row 1 will be pebbled as described by Algorithm 1. By application of the composition rule described in the proof of Theorem 1, some of the nodes will be connected by edges. Considering the edges as paths for transmission of the signal, it can be seen that if z is an initially pebbled node and there exists an edge from y to z, then y will receive the transmitted signal. Similarly, if there is an edge from x to y, then the signal will also be received by x, which means there exists a path from x to z through y. In general, if there exist edges between the pairs of nodes (x, x_1), (x_1, x_2), (x_2, x_3), ..., (x_p, x_{p+1}) and (x_{p+1}, z), and z is pebbled, then there is a path from x to z and therefore the signal transmitted from the pebbled node z shall reach node x. □

Example 1. Consider the grammar G = (NT, T, P, S) where NT = {S, A}, T = {a, b}, P = {S → SS, S → SA, S → b, A → a} and S is the starting symbol. Consider the string w = ababa. Fig. 8 shows a part of the process. (S, 1, 2) and (S, 3, 4) are amongst the initially pebbled nodes, which activate the edges ((S, 3, 5), (A, 4, 5)) and ((S, 1, 3), (A, 2, 3)). Similarly, the node (S, 1, 3) activates the edge ((S, 1, 5), (S, 3, 5)). A signal transmitted from (A, 4, 5), an initially pebbled node, reaches (S, 1, 5) as there exists a path between the two.

Fig. 8. Existence of paths due to the pebbled nodes.

2.3. The algorithm

Assume the following terminology: a node is said to be pebbled if it is realizable; EDGE1(x, y) means the establishment of a physical link between the nodes x and y, directed from x to y (the pair x, y is said to be realizable). Let us define the following operations:

activate1: for all pebbled(z) and y, z ⊢ x as per expressions (2) and (3), in parallel do
  EDGE1(x, y) ← true

link(i): transmit the signal i from all initially pebbled nodes upward

pebble1(i): for all unpebbled nodes z which receive the signal i, in parallel do
  pebbled(z) ← true

Algorithm 2
begin
  for all i, 0 ≤ i < n, and A in NT such that A → a_{i+1} in parallel do
    pebbled((A, i, i + 1)) ← true;
  for i = 1 to log n do
  begin
    activate1; link(i); pebble1(i);
    if (S, 0, n) is pebbled then
      write 'recognized'; stop;
  end;
  write 'does not recognize'
end.

Algorithm 2 is in fact an extension of Algorithm 1 that takes physical links between the nodes. But it results in a processor complexity of O(n^3), as compared to O(n^6) for Algorithm 1. Similarly, O(log n) time complexity is obtained, as compared to O(log^2 n) for Algorithm 1 on the PRAM model. The main point to show now is the architecture to implement Algorithm 2, which is described in the next section.
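The difference between the two algorithms can again be illustrated sequentially. In the sketch below (ours, not from the paper) the activate1 step records the physical links as ordinary edges over the interval triples (A, i, j), and link/pebble1 are modelled by propagating a signal from the pebbled nodes along the links until no new node is reached; the square operation is no longer needed. The grammar encoding is the same as in the earlier sketch.

from math import ceil, log2

# Grammar of Example 1: S -> SS | SA | b, A -> a.
BRANCHING = [("S", "S", "S"), ("S", "S", "A")]
TERMINAL = {"b": {"S"}, "a": {"A"}}

def recognize_links(w, start="S"):
    """Sequential sketch of Algorithm 2: pebbled nodes activate physical
    links (edges), and a signal from the pebbled nodes is propagated along
    the links; every node reached by the signal gets pebbled."""
    n = len(w)
    pebbled = {(A, i, i + 1) for i in range(n) for A in TERMINAL.get(w[i], ())}
    links = set()                       # EDGE1(x, y): x realizable if y is

    for _ in range(max(1, ceil(log2(n)))):
        # activate1: pebbled nodes contribute links via the composition rule
        for (A, B, C) in BRANCHING:
            for i in range(n):
                for j in range(i + 2, n + 1):
                    for k in range(i + 1, j):
                        x, left, right = (A, i, j), (B, i, k), (C, k, j)
                        if left in pebbled:
                            links.add((x, right))
                        if right in pebbled:
                            links.add((x, left))
        # link(i) + pebble1(i): propagate the signal along the links
        frontier = set(pebbled)
        while frontier:
            frontier = {x for (x, y) in links if y in frontier} - pebbled
            pebbled |= frontier
        if (start, 0, n) in pebbled:
            return True
    return False

print(recognize_links("baba"))    # True
print(recognize_links("ababa"))   # False: w of Example 1 is not derivable
                                  # from S (strings in L(G) start with b)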

3. The architecture

3.1. Processors array with a reconfigurable bus system, PARBS

A one-dimensional Processors Array with a Reconfigurable Bus System (an N PARBS) consists of an array of N processors connected by a linearly reconfigurable bus system [4,7]. Each processor can be identified by a unique index i, 1 ≤ i ≤ N. Within each processor two ports, denoted by X1 and X2, are built; the processors are connected to the reconfigurable bus system through these ports. The configuration of the bus system is dynamically changeable by adjusting the local connection among ports within each processor. For example, by connecting ports X1 and X2 within each processor, a linear straight bus can be established connecting the processors together; this bus can be denoted as the X bus. The bus can split into subbuses if some processors disconnect their local connections between ports X1 and X2 (see Fig. 9). Each processor can communicate with other processors by broadcasting values on the bus system [9].

Fig. 9. Two subbuses are established by disconnecting the connection between the X1 and X2 ports of P3.
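The bus-splitting behaviour can be mimicked in software. The sketch below (ours, not from the paper) models a one-dimensional bus whose processors may open their internal X1-X2 connection, so that a broadcast only reaches the processors on the same subbus; the class and method names are hypothetical. The same primitive reappears in Step (iii) of Algorithm 3, where the Z bus is split.

from collections import defaultdict

class LinearPARBS:
    """1-D processors array with a reconfigurable bus (sketch).  Each
    processor i has ports (i, 'X1') and (i, 'X2'); external wires join X2 of
    processor i to X1 of processor i+1, and an internal switch joins X1 to
    X2 inside each processor.  Opening the switch splits the bus (cf. Fig. 9)."""

    def __init__(self, n):
        self.n = n
        self.switch_closed = [True] * n

    def subbuses(self):
        """Return the list of subbuses, each a list of ports (i, 'X1'/'X2')."""
        buses, current = [], []
        for i in range(self.n):
            current.append((i, "X1"))
            if not self.switch_closed[i]:   # the bus is cut inside processor i
                buses.append(current)
                current = []
            current.append((i, "X2"))
        buses.append(current)
        return buses

    def broadcast(self, i, port, value, received):
        """Processor i drives `value` through `port`; every processor with a
        port on the same subbus records it in `received`."""
        for bus in self.subbuses():
            if (i, port) in bus:
                for (j, _) in bus:
                    received[j].add(value)
                return

# Five processors; P3 (index 2) opens its switch as in Fig. 9.
bus = LinearPARBS(5)
bus.switch_closed[2] = False
received = defaultdict(set)
bus.broadcast(0, "X2", "A", received)   # reaches only the left subbus
print(sorted(received.keys()))          # [0, 1, 2]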

A two-dimensional N1 × N2 PARBS can be viewed similarly. Each processor can be identified by indices (i, j), 1 ≤ i ≤ N1, 1 ≤ j ≤ N2. Within each processor there are four ports, denoted by, say, X1, X2, Y1 and Y2, and two buses X and Y. The idea can be extended to a higher number of buses. The proposed algorithm requires a five-directional configuration of processors. The reason for taking five directions is the realization of five buses U, X, Y, Z and V, which are shared by three types of processors.

The algorithm under consideration requires two kinds of processors: the first kind has four buses and the second kind has two buses. The processors used by the algorithm can be divided into three groups: one group consists of processors with four buses and two groups consist of processors with two buses. The three types of processors and their arrangements are described below.

3.1.1. Types of processors and configuration

Type P1. Consider a processor with four buses. The eight ports of the processor can be designated as X1, X2, Y1, Y2, Z1, Z2, U1 and U2, corresponding to the four buses X, Y, Z and U respectively. The buses can be reconfigured. Fig. 10(a) shows a processor with four buses. The X bus has K pairs of switches, where K is the number of non-terminal symbols in the grammar. In each pair the two switches, lower (A_L) and upper (A_R), are connected in series. Between the two switches there is a connection from which the Y bus passes; at this point the X and Y buses are permanently connected. The Y bus can be considered as a set of K parallel buses. As shown in Fig. 10(a) for K = 3, there will be a set of 3 subbuses, each corresponding to one non-terminal symbol from {S, A, B}. The six ports for these three buses are shown as Y1S, Y2S, Y1A, Y2A, Y1B and Y2B. The Y bus does not require any reconfiguration and hence all ports Y1 and Y2 are always connected. The Z and U buses are reconfigurable, with one switch on each bus in each processor.

Fig. 10. Ports and bus systems of processors: (a) processor of type P1; (b) processor of type P2; (c) processor of type P3.

Processors of type P1 are connected together using X, Y and Z buses as follows:

Consider a three-dimensional arrangement of processors. In two directions, the processors are arranged in a pyramid: the bottom-most row has n processors, the next row has n - 1, and so on; the top row has one processor. The top-row processor is a P1-type processor, but all its internal switches except S_L (the lower switch corresponding to symbol S on the X bus) are made permanently open. Only the switch S_L can be closed, if required. This is so because recognition of a given string requires that (S, 0, n) be realizable. In the third direction, 2(n - 1) such two-dimensional arrangements of processors are taken. The three-dimensional arrangement of processors can be represented by an array P1(i, j, k), 1 ≤ j ≤ n + 1 - i, 1 ≤ i ≤ n, 1 ≤ k ≤ 2(n - 1), where n is the number of input symbols in the string.

On the Y bus, all processors for each value of i and j are connected together. To make connections on the X bus, Exp. (2) and (3) are used. For each value of k from 1 to n - 1, 1 ≤ j ≤ n + 1 - i, 1 ≤ i ≤ n, the processor pairs given by Exp. (2) are connected on bus X (the X1 port of the first processor of the pair is connected to the X2 port of the second processor of the pair). Similarly, for each value of k from n to 2(n - 1), 1 ≤ j ≤ n + 1 - i, 1 ≤ i ≤ n, the processor pairs given by Exp. (3) are connected on bus X. For each value of i and j, all processors of type P1 are also connected on the Z bus. Fig. 11 depicts the arrangement of the X buses of the P1-type processors for n = 5. For simplicity, other buses have not been shown, and the 2(n - 1) pyramids of processors in the third direction have been shown separately for each value of k.

Only the processors given by Exp. (2) and (3) are connected on the X bus; no other processors are required by the algorithm. Therefore the remaining processors (shown in Fig. 11 as unconnected) need not be placed at those locations. It can be seen that the number of processors required is given by Exp. (4a) and (4b).

n(2n^2 + n - 2)/4   for even n    (4a)

(n - 1)(n + 1)(2n + 1)/4   for odd n    (4b)

Type P2. Consider a processor with two buses, Z and V. Port Z1 of bus Z is required for receiving information from one direction; port Z2 is not required and hence no reconfiguration is needed. Similarly, only port V1 is required for broadcasting information on the V bus in one direction, and hence the V2 port and reconfiguration for this bus are also not necessary (Fig. 10(b)).

Processors of type P2 are connected as follows: at one end of each Z bus running from the processors of type P1, the Z1 port of one of these processors (type P2) is connected. On the processors of type P1 there exist n(n + 1)/2 Z buses; therefore, the number of processors of type P2 required is n(n + 1)/2. The pyramid arrangement of these processors can be represented by P2(i, j), 1 ≤ j ≤ n + 1 - i, 1 ≤ i ≤ n.

Type P3. A type P3 processor is a processor with two buses, V and U. Bus V is reconfigurable, with one switch between the ports. The U bus is established by making the local connection between U1 and U2 permanent for all processors (Fig. 10(c)). Processors of type P3 are connected to the system as follows:

To each of the processors P2(i, j), 1 ≤ j ≤ n + 1 - i, 1 ≤ i ≤ n, two groups of processors of type P3 are connected on the V bus. The indices of the first group of processors are given by Exp. (5) and the indices of the second group of processors are given by Exp. (6), where h = i + j.

Fig. 11. Arrangement of type P1 processors for each value of k, which indicates the number of two-dimensional arrangements of processors in the third direction.

Fig. 12. Connections of type P3 processors with each of the type P2 processors. The P2-type processor is shown in black. Each of the diagrams from (a) to (n) is labelled with the P2-type processor to which the processors of type P3 are connected.

Processors are assumed to be arranged in three dimensions as shown in Fig. 12 for the case n = 5. The two-dimensional pyramids of processors for the third direction have been shown separately for each value of P2(i, j). Since for each P2(i, j) the two groups are required, the total number of groups of processors will be 2[{n(n + 1)/2} - 1] (P2(n, 1) is not required to be connected).

((i + k, j), (n - h + 2 - k, h)),  1 ≤ k ≤ n - h + 1    (5)

((i + m, j - m), (j - m, m)),  1 ≤ m ≤ j - 1    (6)


The number of processors of type P3 for each P2(i, j) is given by 2(n - i); these are divided into two groups, say the left group and the right group. In Fig. 12(a) to Fig. 12(n), the processors to the left-hand side of the processor shown in black form the left group, and the processors to its right-hand side form the right group. (P2(1, 1), being the left-most processor, has no left-side processors.)

The U bus of the P3-type processors is connected to the U bus of the P1-type processors as follows:

Represent a processor of type P3 by an array P3(i, j, P2(k, l), m), where i and j represent the pyramid arrangement of processors in two directions, P2(k, l) represents the processor of type P2 to which the processors of type P3 are connected, and m has two values, say 2 for left and 1 for right, to indicate the processors on the left side and the right side of P2(k, l) respectively.

The right-hand processors P3(i, j, P2(k, l), m) for each value of k of P2(k, l) are connected together on the U bus for corresponding values of i and j. For example, the right-hand processors of Fig. 12(a) to 12(e), i.e. those corresponding to P2(1, 1), P2(1, 2), P2(1, 3), P2(1, 4) and P2(1, 5), will be connected together. It may be noted that in this manner a maximum of two processors of type P3 will be connected together. Similarly, the left-hand processors P3(i, j, P2(k, l), m) for each value of k of P2(k, l) are connected together for corresponding values of i and j. In this case also, it can be seen that a maximum of two processors of type P3 will be connected together.

The U buses of processors P1(i, j, k) for 1 ≤ k ≤ n - 1 are connected to the U buses of P3(i, j, P2(l, m), q), with l = k, q = 1 and 1 ≤ m ≤ n + 1 - l, for corresponding values of i and j. Similarly, the U buses of processors P1(i, j, k) for n ≤ k ≤ 2(n - 1) are connected to the U buses of P3(i, j, P2(l, m), q), with l = k - n + 1, q = 2, 1 ≤ m ≤ n + 1 - l, for corresponding values of i and j. It may be noted that in this manner only one processor of type P1 is connected on a U bus to the maximum of two processors (as said above) of type P3, for each value of i and j.

It can be seen that the processors detailed by Exp. (5) and (6) occupy only a part of the array; for all values of i and j, the corresponding locations are not required to be filled up. As shown in Fig. 12, the processors which are not connected need not be taken. It can easily be seen that the number of processors required is given by the expression

2(n - 1)n(n + 1)/3.    (7)
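This count can be checked directly from the group sizes above: each P2(i, j) has 2(n - i) processors of type P3 attached and row i contains n + 1 - i such P2 processors, so (a verification added here, not in the original text)

\[
\sum_{i=1}^{n} (n+1-i)\cdot 2(n-i)
  = 2\sum_{k=1}^{n}(n+1-k)(n-k)
  = \frac{2(n-1)n(n+1)}{3},
\]

which is twice the edge count of Theorem 1, in agreement with Exp. (7).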

3.1.2. Processor complexity

The processor complexity can be computed from the processor requirements mentioned in the above subsection. Exp. (4) and (7) give the numbers of processors required of type P1 and type P3 respectively. The number of type P2 processors required is n(n + 1)/2. Therefore the total number of processors required is given by Exp. (8a) and (8b) as the sum of the three types of processors P1, P2 and P3.

(14n^3 + 9n^2 - 8n)/12   for even n    (8a)

(14n^3 + 9n^2 - 8n - 3)/12   for odd n    (8b)
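As a quick check (ours, not in the paper), adding Exp. (4a), Exp. (7) and n(n + 1)/2 for even n gives

\[
\frac{n(2n^2+n-2)}{4} + \frac{2(n-1)n(n+1)}{3} + \frac{n(n+1)}{2}
  = \frac{(6n^3+3n^2-6n) + (8n^3-8n) + (6n^2+6n)}{12}
  = \frac{14n^3+9n^2-8n}{12},
\]

and the odd case with Exp. (4b) reduces to Exp. (8b) in the same way.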


3.2. Implementation of the algorithm on PARBS

The informal description of the algorithm is as follows:

Algorithm 3
(i) Processors P1(i, j, k) with i = 1, all k, 1 ≤ k ≤ 2(n - 1), and all j, 1 ≤ j ≤ n, such that A → a_j for A in NT, are in parallel provided with the non-terminal symbols A on their X1 ports. (The X1 ports of these processors with i = 1 are unconnected, these being the bottom-most processors.) There can be a maximum of K symbols to be provided at the ports, and therefore the time required will be proportional to K.

(ii) The lower switch for each symbol A (denoted by A_L) of the processors which have received the non-terminal symbols in Step (i) will be closed if the symbol A has been received by the processor. The processors are said to be pebbled with the symbols A. The time required for this operation will be of the order of K.

(iii) Using the bus-splitting technique, the information about pebbled nodes is transmitted from the P1-type processors to the P2-type processors. A straight bus Z is formed connecting all the processors of type P1(i, j, k) for each value of i and j by establishing the local connections between Z1 and Z2. Now the processors which are pebbled with a symbol A disconnect the local connection between Z1 and Z2, and the straight bus is split into several disjoint subbuses. Then these processors with disconnected connections broadcast the symbol A on the subbus to which they are connected, through the port Z2 (Z1). The processor P2(i, j) connected on the bus will receive the symbol if it exists on any of the processors of type P1. The process is repeated for each of the K non-terminal symbols and will take a time proportional to K. At the end of the step, the processors P2(i, j) will be equipped with the non-terminal symbols for which the processors P1(i, j, k) are pebbled, for all values of i, j and k.

(iv) The local connection between V1 and V2 for each of the type P3 processors is established. All processors P2(i, j) (for all i and j) broadcast the non-terminal symbols received in the above step on the bus V. The information is associated with the indices i and j of the broadcasting processor. All processors of type P3 which are connected to the V bus will receive the information. The broadcast is performed synchronously, i.e. at the first instant the first symbol is broadcast by all the processors that hold it (a processor that does not hold it broadcasts nothing on the bus); at the second instant, the second symbol is broadcast in the same manner. Continuing with each symbol like this, the Kth symbol is broadcast at the Kth instant.

(v) All processors P3(i, j, P2(k, l), m) broadcast the information collected in the above step, namely the non-terminal symbols associated with the processor indices, on the U buses, to be received by the processors of type P1. As stated earlier, there are at most two processors of type P3 and one processor of type P1 on a U bus. The broadcasting is completed in two stages. In the first stage, the processors with m = 1 and index value j of P3 equal to index l of the processor of type P2 (which has been received by the processors of type P3), i.e. processors which are at the right-hand side of P2(i, j) and have the same column number (see Fig. 12), and the processors with m = 2 and having i + j = k + l, i.e. processors which are at the left-hand side of P2(i, j) and on the left-side diagonal of P2(i, j), will broadcast. In the second stage the remaining processors will broadcast. In fact, the processors involved in the first stage will form the upper nodes of the links to be established on the P1-type processors, and the second-stage processors will form the lower nodes of the same.

(vi) Processors of type P1 which have received information from the P3-type processors will use the information for closing the switches on the X bus inside the processors. The processors will close the lower switches for the symbols which have been received in the first stage; the switches corresponding to symbols which are not received in the first stage will remain open. Similarly, the upper switches of all the processors of type P1 will be closed for the corresponding symbols if those are received in the second stage.

(vii) The signal 'iteration number' is connected at the X1 ports of processors P1(i, j, k) for i = 1 and all j and k (the same ports as are used in Step (i)). The processors of type P1 which receive the iteration number (except the processors P1(i, j, k) for i = 1 and all j and k) and have not received any iteration number earlier are said to be pebbled with the non-terminal symbols for which the lower switches are closed inside the processors.
(viii) Processors P1(i, j, k) with j = 1, i = n and all k are checked; if these are pebbled with S, then the given string is recognized by the grammar and execution of the algorithm can be terminated.

(ix) The execution enters the next iteration and Step (iii) to Step (viii) are repeated until the iteration number crosses log n. If recognition is not obtained in log n iterations, the string is not recognizable by the grammar.

Lemma 2.1. Algorithm 3 accepts an input string of length n iff the processors P1(n, 1, k) for 1 ≤ k ≤ 2(n - 1) receive the signal transmitted from any of the processors P1(1, j, k), 1 ≤ j ≤ n, 1 ≤ k ≤ 2(n - 1).

Proof. With reference to the partial syntactic trees with gaps, a processor P1(i, j, k) receiving the transmitted signal from the initially pebbled nodes indicates that the node corresponding to processor P1(i, j, k) and its closed switch A_L, namely (A, j - 1, i + j - 1), is realizable. It further means that there exists a syntactic tree with the symbol A as root and the string a_j a_{j+1} ... a_{i+j-1} as leaves. Again, in terms of partial syntactic trees with gaps, the physical link, through the edges, between a processor P1(i, j, k) and any of the processors P1(i, j, k) with i = 1, 1 ≤ j ≤ n, 1 ≤ k ≤ 2(n - 1) is the indication that the symbol A represented by P1(i, j, k) and its closed A_L switch is realizable, or equivalently that the corresponding substring is accepted by the grammar.

If the processors P1(i, j, k) for index i = n and j = 1 receive the signal transmitted from P1(1, j, k) for 1 ≤ j ≤ n, 1 ≤ k ≤ 2(n - 1), then it means that one or more of the processors at the top of the pyramid have received the signal, and since all P1-type processors are connected to each other for each value of i and j through the Y bus, all the processors P1(i, j, k) for i = n, j = 1 and 1 ≤ k ≤ 2(n - 1) will share the same information. Further, as these processors have only the S_L switch closeable (the switches corresponding to the remaining non-terminals are permanently open), the received signal indicates that (S, 0, n) is realizable, i.e. there exists at least one syntactic tree with root at P1(i, j, k) with symbol S for i = n and j = 1, and leaves being the input string of size n, w = a_1 a_2 a_3 ... a_n.

Log n time is sufficient for the recognition, as shown for Algorithm 2 and Algorithm 1. For the proof of the log n time bound for Algorithm 1, see [1]. □

4. Conclusion

The paper has described an efficient method for the recognition of context-free languages. The time complexity of O(log n) and processor complexity of O(n^3) are better than those of the available methods.

The philosophy used to achieve the efficient solution is based on the concept of syntactic trees. The methodology of partial syntactic trees with gaps has been employed on a processors array with a reconfigurable bus system. Three different types of processors, multiple buses and multiple internal local connections between ports have been proposed. One type of processor requires 2K switches for local connections on one of its buses, where K is the number of non-terminal symbols in the grammar. At first glance this may seem to be a fairly high complexity in the processor architecture, but a closer look at the overall system shows that this increased complexity eliminates the requirement for a large number of processors. The dependence of the number of processors on the grammar size, which is present in almost all parallel architectures proposed earlier, is no longer there.

It is worth mentioning here that it is possible to obtain the same time complexity using processors with two switches in place of processors with 2K switches, but the processor complexity will then increase by a factor of at least K, along with some additional complexity in the system.

Acknowledgment

We thank our colleague Prof. M. Chandwani for his contributions through active participation in the discussions during the course of this work. We are also thankful to the anonymous referees for their valuable suggestions and comments on the paper.

References

[1] A. Gibbons and W. Rytter, Efficient Parallel Algorithms (Cambridge University Press, Cambridge, UK, 1988).


[2] A. Nijholt, Parallel parsing strategies in natural language processing, in: Proc. Int. Workshop on Parsing, Pittsburgh, PA (Aug. 1989) 240-253.

[3] A. Yonezawa and I. Ohsawa, Object-oriented parallel parsing for context-free grammars, in: COLING 88, Proc. 12th Int. Conf. on Computational Linguistics, Budapest (1988) 773-778.

[4] B.-F. Wang and G.-H. Chen, Constant time algorithms for transitive closure and some related graph problems on processor arrays with reconfigurable bus systems, IEEE Trans. Parallel Distributed Syst. 1 (4) (Oct. 1990) 500-506.

[5] D.H. Younger, Recognition and parsing of context-free languages in time n^3, Inform. Control 8 (1965) 607-639.

[6] D.L. Waltz and J.B. Pollack, Massively parallel parsing: a strongly interactive model of natural language interpretation, Cognitive Sci. 9 (1985) 51-74.

[7] G.-H. Chen, B.-F. Wang and C.-J. Lu, On the parallel computation of algebraic path problem, IEEE Trans. Parallel Distributed Syst. 3 (2) (March 1992) 251-256.

[8] J. Earley, An efficient context-free parsing algorithm, in: B.J. Grosz, K.P. Jones and B.L. Webber, eds., Readings in Natural Language Processing (Morgan Kaufmann, Los Altos, CA, 1986) 25-34.

[9] J.L. Aravenda and A.O. Barbir, A class of low complexity high concurrence algorithms, IEEE Trans. Parallel Distributed Syst. 2 (4) (Oct. 1991) 495-502.

[10] K.H. Chu and K.S. Fu, VLSI architecture for high-speed recognition of context-free languages, in: Proc. 9th Symp. on Computer Architecture, SIGARCH Newsletter 10 (3) (1983) 43-49.

[11] L. Valiant, General context-free recognition in less than cubic time, J. Comput. System Sci. 10 (1975) 308-315.

[12] M.A. Harrison, Introduction to Formal Language Theory (Addison-Wesley, Reading, MA, 1978).

[13] M. Tomita, Efficient Parsing for Natural Language: A Fast Algorithm for Practical Systems (Kluwer Academic, Norwell, MA, 1986).

[14] R. Miller et al., Image computation on reconfigurable VLSI arrays, in: Proc. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition (1988) 935-950.

[15] R.N. Moll, M.A. Arbib and A.J. Kfoury, An Introduction to Formal Language Theory (AKM Series in Texts and Monographs in Computer Science, Springer-Verlag, 1988).

[16] S. Graham, M.A. Harrison and W.L. Ruzzo, An improved context-free recognizer, ACM TOPLAS 2 (3) (1980) 415-460.

[17] S.R. Kosaraju, Speed of recognition of context-free languages by array automata, SIAM J. Comput. 4 (1975) 335-340.

[18] S.-Y. Kung, VLSI array processors, IEEE ASSP Mag. (July 1985) 4-22.

[19] S.-Y. Kung, S.C. Lo and P.S. Lewis, Optimal systolic design for the transitive closure and the shortest path problems, IEEE Trans. Comput. C-36 (5) (May 1987) 603-614.

[20] Y.-C. Chen et al., Designing efficient parallel algorithms on mesh-connected computers with multiple broadcasting, IEEE Trans. Parallel Distributed Syst. 1 (2) (April 1990) 241-246.