
2010 International Conference on Computer and Communication Technology (ICCCT), Allahabad, Uttar Pradesh, India, 17-19 September 2010


|ICCCT’10|

978-1-4244-9034-/10/$26.00 ©2010 IEEE

Context Free Grammar Induction Library Using Genetic Algorithms Hari Mohan Pandey*

*Computer Engineering Department, MPSTME Shirpur Campus, SVKM’s NMIMS University, Mumbai, India *[email protected]

Abstract: Evolutionary Algorithms (EAs) are modern techniques for searching for an optimum. Communication takes place through two media: oral (speech processing) and written (text processing). This work focuses on the second medium, text processing, where we communicate in a language by writing something down. Genetic Algorithms were developed as randomized search methods that are not highly sensitive to the initial data of a problem; they can be used to estimate system parameters and obtain good solutions. Here, Genetic Algorithms are applied to grammar induction. Grammar inference, or language learning, is the process of learning a grammar from training data. This paper discusses various methods for learning a context-free grammar (CFG) from a corpus of strings and presents results of the informant-learning approach on two standard grammar problems: the balanced-parenthesis problem and two-symbol palindromes over {a, b}.

Keywords: Machine Learning, Grammatical Inference, Context Free Grammar, Genetic Algorithms.

I. INTRODUCTION

The idea of grammar induction is simple: given a string of symbols, a grammar is constructed by replacing any repeated sequence of two or more symbols with a non-terminal symbol [1]. Many previous grammar induction systems have worked on the assumption that grammar induction is a subset of inductive logic programming (ILP) and have therefore depended primarily on ILP techniques [2]. In this paper we use Genetic Algorithms (GAs), which were invented by John Holland in the 1960s. Wyard [3] explored the impact of different grammar representations; his experimental results show that an evolutionary algorithm using standard context-free grammars (BNF) outperformed other representations. The paper is organized as follows: Section II gives a detailed description of the problem definition; Section III presents the proposed method; Section IV reviews previous work in the field; Section V explains the implementation methodology; Section VI describes the experimental work; and the final section summarizes the road traveled by the author and future enhancements of the work.

II. PROBLEM DEFINITION

This paper addresses how to generate a grammar from a piece of well-structured data. A computer can be programmed to infer a grammar, that is, to discover and represent the grammatical structure of a language. For any language, the grammar is commonly represented as a context-free grammar. Here, the program is trained on a sample of sentences (e.g., simple English sentences in written form) and then evaluated. General techniques for syntactic induction are given in [4]. These are supervised learning techniques that require a teacher with detailed knowledge of the language. The problem becomes more complex when the input data are samples of natural language (e.g., a set of sentences). Each language has a set of rules called its grammar; our example (a set of English sentences) likewise has grammatical structure (nouns, pronouns, verbs, adverbs, determiners, adjectives, etc.). Since we intend to induce a grammar from a set of sentences, there is a probability of obtaining an ambiguous grammar. Ambiguity in a sentence means that one's understanding of its syntactic structure is informed by linguistic data beyond syntax [5]. Hence, to deal with this problem we use the concept of parsing. To manage the complexity of the problem, we first implement it using GAs on a single machine; after obtaining results there, we plan to process the same set of sentences using a parallel genetic algorithm, which should yield better results.

III. PROPOSED METHOD

To solve the problem given in Section II, the author proposes that a genetic algorithm, with some assistance, can induce the grammar. Here, to induce the grammar we train on a set of simple examples. Over time, the set of sentences grows, which also increases the complexity of the system.


Hence, for a step-by-step solution we use GAs. Using a GA, we create a random population of grammars following the grammar-design steps, then calculate the fitness of every grammar in the population and select the best. The loop ends when a best individual with fitness greater than the required threshold is found or the given number of generations has completed. In each generation we apply the crossover and mutation operators, at their respective rates, to create a new population, and calculate the fitness of the newly constructed individuals. Finally, we merge the two populations (the original one and the one produced by crossover and mutation) and update the best individual using an appropriate replacement method. To obtain better results, we will later explore running this process on multiple machines using parallel genetic algorithms.
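The loop described above can be sketched compactly. This is an illustrative Python sketch, not the paper's Java library: the fitness function here is a stand-in (count of 1-bits) rather than the paper's grammar-based fitness, and all parameter values are placeholders.

```python
import random

def evolve(fitness, chrom_len=20, pop_size=10, max_gen=15,
           p_cross=0.5, p_mut=0.05, threshold=None, seed=0):
    """Generic GA loop: evaluate, crossover, mutate, merge, keep the best."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(chrom_len)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(max_gen):
        if threshold is not None and fitness(best) > threshold:
            break                                          # required threshold reached
        new_pop = []
        for _ in range(pop_size):
            a, b = rng.sample(pop, 2)
            if rng.random() < p_cross:                     # single-point crossover
                point = rng.randrange(1, chrom_len)
                child = a[:point] + b[point:]
            else:
                child = a[:]
            child = [bit ^ 1 if rng.random() < p_mut else bit
                     for bit in child]                     # bit-inversion mutation
            new_pop.append(child)
        merged = pop + new_pop                             # merge both populations
        pop = sorted(merged, key=fitness, reverse=True)[:pop_size]
        best = max(best, pop[0], key=fitness)              # update best individual
    return best

# Stand-in fitness: number of 1-bits (the paper scores grammars on a corpus).
best = evolve(fitness=sum, threshold=19)
print(sum(best))
```

With a fixed seed the run is reproducible, which helps when comparing operator rates.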

IV. PREVIOUS WORK

Many previous grammar induction systems have worked on the assumption that grammar induction is a subset of inductive logic programming (ILP) [2]. A few papers apply evolutionary techniques to grammar induction. Keller and Lutz [6] used a genetic algorithm to evolve stochastic context-free grammars for finite languages; a limitation of [6] is that only positive examples of the language data were presented. In [2], Margaret Aycinena, Mykel Kochenderfer, and David Mulford gave an approach for evolving natural language grammars using a genetic algorithm based on part-of-speech-tagged natural language examples; their fitness was based on the number of sentences correctly parsed by the grammar from a selection of language examples, inversely related to the number of random sentences parsed by the grammar, and discounted by the length of the string representing the grammar. Olgierd Unold [7] described in detail the task of training a Grammar-Based Classifier System (GCS) to learn a grammar from data; GCS has been proposed to address both natural language grammar induction and learning formal grammars for DNA sequences. From this related work the author can say that similar experiments have been performed, but all have limited scope. The methods above demonstrate the ability of genetic algorithms to induce the structure of many types of well-formed data, and they use GAs to obtain a well-structured grammar in context-free form.

V. IMPLEMENTATION METHODOLOGY

In this section, we describe the methods used to induce a grammar for a set of sentences or natural language. Although the approach is based on the genetic algorithm as originally presented in [8] and later in [9], in this paper we have made a few significant adaptations that make the process easier for grammar induction.

A. Grammar Induction Process: In formal language theory, a context-free grammar (CFG) is a grammar in which every production rule is of the form

α → β

where α is a single non-terminal symbol, and β is a string of terminals and/or non-terminals (possibly empty) [10].

The term "context-free" expresses the fact that non-terminals can be rewritten without regard to the context in which they occur. A formal language is context-free if some context-free grammar generates it. These languages are exactly those that can be recognized by a non-deterministic pushdown automaton.

Now we discuss the process of grammar induction, also called grammar inference or language learning: the process of learning a grammar from a set of samples. A variety of algorithms are available for learning regular languages, which form the largest class of languages that can be efficiently learned. Here the focus is on grammatical inference, i.e., the inference of formal languages such as those of the Chomsky hierarchy from positive as well as negative sample strings.

Peter Wyard [3] studied the impact of different grammar representations; his experimental results show that an evolutionary algorithm using standard context-free grammars (BNF) performed best. Wyard [3] also explored various issues such as the representation of the grammars and methods for evaluating the chromosomes.

In general grammatical induction methodology, a language acceptor is designed to accept all the positive examples. Grammar inference from positive examples alone is called text learning. A more powerful technique, informant learning, uses negative examples as well: the language acceptor is constructed so as to accept all the positive examples and reject all the negative ones. A comparative analysis shows that context-free grammar learning requires more information than a set of positive and negative samples, for example a set of skeleton parse trees, which makes grammar induction a more challenging task. In a broader sense, a learner has access to some sequential or structured data and is asked to return a grammar that should in some way explain that data. Parsing according to a grammar amounts to assigning one or more structures to a given sentence of the language for which the grammar has been defined. We will obtain an ambiguous grammar if some sentences have more than one structure, as is common for natural language. Parsing can be used as a search process that looks for correct structures for the input sentence; if we can establish some preference among the set of correct structures, the process can be regarded as an optimization.


The idea given here suggests considering evolutionary programming techniques, which are acknowledged to be practical search and optimization methods [12].

Grammar induction has several practical applications outside theoretical linguistics, such as structural pattern recognition [13][15] (in both visual images and more general patterns), automatic computer program synthesis and programming by example, information retrieval, programming languages, and bioinformatics. Syntactic processing has always been paramount to a wide range of applications, such as machine translation, information retrieval, speech recognition, and the like. It is therefore natural that syntax has always been one of the most active research areas in the field of language technology [14]. All of the typical pitfalls in language, like ambiguity, recursion, and long-distance dependencies, are prominent problems in describing syntax in a computational context. The field of evolutionary computing applies problem-solving techniques that are similar in intent to machine learning recombination methods. Most evolutionary computing approaches have in common that they try to find a solution to a particular problem by recombining and mutating individuals in a society of possible solutions. This is an attractive technique for problems involving large, complicated, and non-linearly divisible search spaces.

B. Genetic Algorithms: In the 1950s and 1960s several computer scientists independently studied evolutionary systems with the idea that the evolutionary process could be used as an optimization tool for engineering problems. The idea in all these systems was to evolve a population of candidate solutions to a given problem, using operators inspired by natural genetic variation and natural selection. Rechenberg (1965, 1973) introduced "Evolution Strategies", a method he used to optimize real-valued parameters; this idea was further developed by Schwefel (1975, 1977), and Evolution Strategies remains an active area of research. Fogel, Owens, and Walsh (1966) developed "Evolutionary Programming", a technique in which candidate solutions to given tasks were represented as finite-state machines, evolved by randomly mutating their state-transition diagrams and selecting the fittest. Together, Evolution Strategies, Evolutionary Programming, and Genetic Algorithms form the backbone of the field of Evolutionary Computation. Genetic algorithms (GAs) are a technique for searching for solutions to a problem in an intelligent and efficient manner. In this paper we present a simple description of how GAs work for structuring chromosomes for context-free grammar evolution. Simon Lucas [16] examined the use of genetic algorithms for inferring small regular and context-free grammars. In [16] Lucas found that applying a plain genetic algorithm is not very effective for structuring grammars; to overcome this problem he presented two methods for structuring the chromosomes. The first biases the distribution of 1's in the population of chromosomes according to an algebraic expansion technique previously developed by the author; the second performs the evolution in a different space where the grammars are represented in embedded normal form.

Genetic algorithms differ from conventional optimization techniques in the following ways [17]:

1. GAs operate with coded versions of the problem parameters rather than the parameters themselves, i.e., a GA works with a coding of the solution set and not with the solutions themselves.
2. Almost all conventional optimization techniques search from a single point, but GAs always operate on a whole population of points (strings), i.e., a GA uses a population of solutions rather than a single solution for searching. This contributes greatly to the robustness of genetic algorithms: it improves the chance of reaching the global optimum and helps avoid local stationary points.
3. GAs use a fitness function for evaluation rather than derivatives. As a result, they can be applied to any kind of continuous or discrete optimization problem. The key task is to identify and specify a meaningful decoding function.
4. GAs use probabilistic transition operators, while conventional methods for continuous optimization apply deterministic transition operators, i.e., GAs do not use deterministic rules.
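Point 4 above, and the roulette-wheel selection named later in the algorithm's properties, can be illustrated with a short sketch (illustrative code, not the paper's implementation): each individual is chosen with probability proportional to its share of the total fitness.

```python
import random

def roulette_select(population, fitnesses, rng):
    """Fitness-proportionate (roulette-wheel) selection of one individual."""
    total = sum(fitnesses)
    spin = rng.uniform(0, total)          # where the wheel stops
    cumulative = 0.0
    for individual, fit in zip(population, fitnesses):
        cumulative += fit
        if spin <= cumulative:
            return individual
    return population[-1]                 # guard against floating-point round-off

rng = random.Random(1)
pop = ["weak", "medium", "strong"]
fits = [1.0, 3.0, 6.0]                    # "strong" holds 60% of total fitness
picks = [roulette_select(pop, fits, rng) for _ in range(10000)]
print(picks.count("strong") / len(picks))  # close to 0.6
```

Fitter individuals are selected more often, yet every individual retains a non-zero chance: the probabilistic transition the list item describes.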

C. Learning Approach for CFG Using GAs: This section defines the basic procedure used for grammar induction from a corpus constructed from sets of positive (S+) and negative (S−) strings for the grammar to be constructed. The fitness function is based on the nature of the strings supplied in the corpus [18].

Algorithm-1: CFG Induction using genetic algorithm

Input: a random sequence of 0's and 1's up to the size of the chromosome (array p). Output: the optimal search time (t) and the set of grammar rules (R).

Parameters:
a) Genetic population size (popsize)
b) Length of the chromosome (len_chrom)
c) Maximum number of generations (max_gen)
d) Probability of the crossover operator (poss_cross)
e) Probability of the mutation operator (poss_mut)
f) Seed value for random-number generation (prnum)

Properties:
a) Coding style: binary (0/1) coding
b) Selection operator: Roulette Wheel


c) Crossover operator: single point
d) Mutation operator: inversion of selected bits
e) Termination condition: maximum number of generations
f) Fitness function:

Fitness = ((n(PS) + n(NS)) − ((PN) + (PP))) * PW + PW * CL − (PW − MR)

Where,
CL: Corpus Length
PW: Positive Weight
PS: Positive Strings
NS: Negative Strings
PP: Penalty of Positive Strings
PN: Penalty of Negative Strings
MR: Addition factor of the maximum number of expected rules
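The fitness function can be computed directly from a candidate grammar's accept/reject behaviour on the corpus. The sketch below is one plausible reading of the formula, not code from the paper: the PW and MR values are assumptions, as is the interpretation of n(PS)/n(NS) as correctly classified strings and PP/PN as misclassification counts.

```python
def fitness(accepts, positives, negatives, PW=2.0, MR=5):
    """Corpus-based fitness in the shape of the paper's formula.
    PW, MR and the classification reading are assumptions for illustration."""
    n_PS = sum(1 for s in positives if accepts(s))       # positives accepted
    n_NS = sum(1 for s in negatives if not accepts(s))   # negatives rejected
    PP = len(positives) - n_PS       # penalty: positives wrongly rejected
    PN = len(negatives) - n_NS       # penalty: negatives wrongly accepted
    CL = len(positives) + len(negatives)                 # corpus length
    return ((n_PS + n_NS) - (PN + PP)) * PW + PW * CL - (PW - MR)

def accepts_balanced(s):
    """Perfect acceptor for balanced parentheses, standing in for a grammar."""
    depth = 0
    for c in s:
        depth += 1 if c == "(" else -1
        if depth < 0:
            return False
    return depth == 0

pos = ["()", "()()", "(())"]
neg = ["(", ")(", "(()"]
print(fitness(accepts_balanced, pos, neg))   # 27.0: perfect classification
print(fitness(lambda s: True, pos, neg))     # 15.0: accept-everything scores lower
```

Under this reading, the penalty terms push the GA toward grammars that match the informant corpus on both sides, which is the informant-learning behaviour the paper evaluates.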

1. Input (n, p)
2. Input (popsize, max_gen, poss_cross, poss_mut, prnum)
3. Population ← random population of grammars (popsize)
4. Fitness ← ((n(PS) + n(NS)) − ((PN) + (PP))) * PW + PW * CL − (PW − MR)
5. Repeat steps 6 to 8 until either:
   the best individual exceeds the required threshold, OR
   the given number of generations is completed
6. New Population ← population after crossover and mutation
7. Fitness ← fitness of the newly constructed individuals
8. Merge both populations
9. Update (best individual in the population)
10. Stop.

Algorithm-2: Steps used for construction of a grammar from each individual chromosome

S0. Input: set of random strings of 0's and 1's.
S1. First generate a random sequence of 0's and 1's up to the size of the chromosome.
S2. Now use BNF (Backus-Naur form) and chromosome mapping to extract the rules of the CFG.
S3. Apply rules to eliminate left recursion.
S4. Apply rules for left factoring to remove multiple production rules with the same leading terminals.
S5. Stop.
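Step S3 follows the standard transformation for immediate left recursion: a non-terminal with rules A → Aα | β is rewritten as A → βA′, A′ → αA′ | ε. A sketch of that single step (illustrative, not the library's code):

```python
def eliminate_left_recursion(rules, nt):
    """Rewrite immediate left recursion for non-terminal `nt`.
    rules: {non-terminal: [right-hand sides as tuples of symbols]}."""
    recursive = [rhs[1:] for rhs in rules[nt] if rhs and rhs[0] == nt]  # the alphas
    others = [rhs for rhs in rules[nt] if not rhs or rhs[0] != nt]      # the betas
    if not recursive:
        return rules                      # nothing to do
    new_nt = nt + "'"
    out = dict(rules)
    out[nt] = [beta + (new_nt,) for beta in others]          # A  -> beta A'
    out[new_nt] = [alpha + (new_nt,) for alpha in recursive] # A' -> alpha A'
    out[new_nt].append(())                                   # A' -> epsilon, as ()
    return out

# E -> E + T | T   becomes   E -> T E',  E' -> + T E' | epsilon
g = eliminate_left_recursion({"E": [("E", "+", "T"), ("T",)]}, "E")
print(g)   # {'E': [('T', "E'")], "E'": [('+', 'T', "E'"), ()]}
```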

D. Grammar Induction Library: The implementation of the "Grammar Induction Library" requires support from the operating system in controlling the execution of Java. Control over the classes and functions is made available to users and applications via calls supported by the Java virtual machine.

The classes that participate as building blocks of the grammar induction library are given below:

a) SYM_TAB class: This class provides the facility to declare variables in the form of symbols and binary.

b) VAR_TER class: It is used to represent the terminals and non-terminals in a binary 3-bit or 4-bit representation. This class declares various methods to represent variables and terminals in binary form.

c) Chromosome class: This class is used to generate random chromosomes. In this class we have also declared functions to obtain a context-free grammar.

d) Check_Parser class: We use this class to check the correctness of the grammar against a set of strings. In this class we have declared functions to remove useless productions from the production rules, to overcome the problem of left recursion, to resolve the problem of left factoring, to find context-free grammars, to parse the grammar, and others.

e) Gen_Ope class: This class deals with the operations of the genetic algorithm described earlier; in it we apply the crossover and mutation operators. It also handles the two test problems, palindromes and balanced parentheses; in the present work we use two-symbol palindromes. We deal with positive and negative corpora, as explained earlier, for both problems. In this class we call the functions of the other classes through objects of the respective classes.

VI. EXPERIMENTAL SETUP

We have used Java for the implementation of the present work, since the proposed work has to run in both sequential and parallel environments, and Java suits both. Java was used to obtain the experimental results for the presented genetic algorithm. The language includes many standard parallel-programming facilities and offers a machine-independent method of distributing code to perform computations on different computer architectures. The communication performance between processors running Java programs is a crucial issue [19]; in [19] Narendar Yalamanchilli and William Cohen showed that a Java implementation may be slower, but performance improvements are possible.

A. Test Data: To understand the working of the genetic algorithm, it is tested on two different languages. Both problems selected are common test cases for the effectiveness of


grammatical inference methods. The test languages are given in the following table:

Table-1: Test Languages
Language-id | Description
Language-1 | Balanced parentheses problem
Language-2 | Two-symbol palindrome over {a, b}

The problems selected for implementation are based on sets of positive and negative strings. The corpora are given below:

Language-1: Balanced parenthesis problem (positive and negative example)

S+ = {"()()", "()()()", "()()()()", "(()())", "((()()))", "(()()())", "(())()", "()((()))", "((()()))", "()((())())", "()()((()))", "()(((()))())", "(()(()))()", "(((()()()())))", "()()(((())))", "(((())))()()"}
S− = {"()())((", "(((()(())(", "()(()((", "(()(", "((())(", "((())(((", "(()))((((", "((())(((", "()))(", "((())((", "(()))(((", "((())((((", "(((()))(", "((())))((", "((())(((", "(((()))((((", "(())))((((("}

Language-2: Two symbol palindrome over {a, b} (positive and negative example)

S+ = {"baaaaaaaab", "aa", "bb", "abba", "baab", "aaaa", "bbbb", "aaaaaa", "aabbaa", "abaaba", "abbbba", "baaaab", "babbab","bbaabb","bbbbbb","aaaaaaaa","aaabbaaa","aabaabaa","aabbbbaa","abaaaaba","ababbaba","abbaabba","abbbbbba","baaaaaab","baabbaab","babaabab","babbbbab","bbaaaabb","bbabbabb","bbbaabbb","bbbbbbbb"}

S − ={"a","b","ab","ba","abab","baba", "aaab","aaba","abababababab", "aabaab","aaa","bbb","aaaaa", "bbbbb","aaabb","aabbb","aaaabb", "aabbaabb","aaaabaa","aaaabbba", "babababa","aabaabab","ababbaaba", "abbaabbaa","abbababba", "abbaaabba","baababaab", "bababababa"}
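Ground-truth membership in both test languages is easy to check directly, which is how a corpus like the one above can be labelled. These helpers are illustrative, not part of the paper's library:

```python
def is_balanced(s):
    """Language-1: balanced parentheses over '(' and ')'."""
    depth = 0
    for c in s:
        depth += 1 if c == "(" else -1
        if depth < 0:          # a ')' closed more than was opened
            return False
    return depth == 0

def is_palindrome(s):
    """Language-2: two-symbol palindromes over {a, b}."""
    return set(s) <= {"a", "b"} and s == s[::-1]

print(is_balanced("()((())())"), is_balanced("()))("))   # True False
print(is_palindrome("abba"), is_palindrome("abab"))      # True False
```

Running every corpus string through the matching checker confirms that the positive and negative sets are labelled consistently.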

B. GA Parameters: The genetic algorithm parameters selected for the implementation of the problem are given in the table below:

Table-2: Parameters of the Genetic Algorithm
S.No. | Parameter | Size/Value
1 | Population Size | 30
2 | Chromosome Size | 120
3 | Maximum Generation | 10
4 | Probability of Crossover | 0.5
5 | Probability of Mutation | 0.8
6 | Selection Strategy | Roulette Wheel Selection

C. Results:

Language-1: Balanced Parenthesis Problem

Table-3: Generation of chromosomes (Language-1)
G | W | A | B
1 | 370 | 411.6 | 432
2 | 414 | 416.3666666666667 | 432
3 | 414 | 417.56666666666666 | 432
4 | 414 | 421.73333333333335 | 432
5 | 431 | 431.9 | 433
6 | 432 | 432.06666666666666 | 433
7 | 432 | 432.1666666666667 | 433
8 | 432 | 432.3333333333333 | 433
9 | 432 | 432.6666666666667 | 433
10 | 433 | 435.0 | 493
G: Generation, W: Worst, A: Average, B: Best

Best Chromosome is 000110110110110000100010000101001111110001011001100010110011000101101111011011000101101111011110011111000000110000111001

The best chromosome's fitness in 10 generations is 493. The best grammar over the generations is:

1. S->?
2. S->(S)

Total number of rules: 2 Its fitness is: 493

Total time required: 10000 milliseconds (Days: 0, Hours: 0, Minutes: 0, Seconds: 10, Milliseconds: 0).

Language-2: Two symbol palindrome over {a, b}.

Table-4: Generation of chromosomes (Language-2)
G | W | A | B
1 | 368 | 407.56666666666666 | 432
2 | 413 | 416.7 | 433


3 | 414 | 419.6 | 451
4 | 414 | 431.46666666666664 | 472
5 | 431 | 449.6333333333333 | 511
6 | 433 | 462.23333333333335 | 527
7 | 470 | 476.1666666666667 | 527
8 | 472 | 482.23333333333335 | 548
9 | 472 | 488.56666666666666 | 567
10 | 472 | 496.93333333333334 | 567
G: Generation, W: Worst, A: Average, B: Best

Best Chromosome is 011000010010101001000000110000000100000001110001001101011001000010000000111000101000011110010000110010110001100001101010

The best chromosome's fitness in 10 generations is 567.

The Best over generations is

1. S->bSC
2. S->aSA
3. S->?
4. M->bCAM
5. M->?
6. C->Sb
7. A->aAbM
8. A->SSSM

Total number of rules: 8 Its fitness is: 567

Total time required: 51140 milliseconds (Days: 0, Hours: 0, Minutes: 0, Seconds: 51, Milliseconds: 140).

Now, on the basis of the data given in Table-3 and Table-4, together with the best chromosomes and their fitness values, we can form a result set, given in Table-5 below:

Table-5: Resultant Grammar with Fitness Value
L-id | G | FV | EG
L1 | 10 | 493 | <{S}, {(, )}, {S->?, S->(S)}, S>
L2 | 10 | 567 | <{S, A, C, M}, {a, b}, {S->bSC, S->aSA, S->?, M->bCAM, M->?, C->Sb, A->aAbM, A->SSSM}, S>
L-id: Language-id, G: Generation, FV: Fitness Value, EG: Equivalent Grammar. Note: epsilon is denoted by "?".

The grammars shown in Table-5 are the grammars equivalent to the chromosomes shown above. The grammars with the fitness values shown in Table-5 accept all the positive examples and reject all the negative examples considered in the experiment. The grammar obtained after 10 generations is represented as <V, T, P, S>, where V is a finite set of variables, T is a finite set of terminals, P is a finite set of production rules, and S is the start variable.

VII. CONCLUSION AND FUTURE ENHANCEMENT

In this paper the author has given a detailed account of grammar induction using a genetic algorithm. The contributions of the work are as follows:

1. Basic concepts of inducing grammars.
2. Detailed explanation of genetic algorithms.
3. The learning approach for context-free grammars.
4. A context-free grammar induction library.
5. Solutions to two problems: palindromes (two-symbol palindromes) and balanced parentheses.

As future work we can do the following:

1. We can apply a similar method on parallel hardware for the adjustment of the population.
2. Using the method given in this paper, one can implement parallel genetic algorithms on different topologies.
3. A similar approach can be applied to other categories of parallel genetic algorithms.
4. We can also seek better results by further tuning this approach.

Acknowledgement: The author thanks Dr. N. S. Choubey, Head, Computer Engineering Department, MPSTME Shirpur Campus, NMIMS University, Mumbai, Maharashtra, for his kind support in providing the laboratory infrastructure and other facilities required to conduct the experiment.

REFERENCES

1. Craig G. Nevill-Manning, Ian H. Witten, and David L. Maulsby, "Compression by Induction of Hierarchical Grammars", Computer Science, University of Waikato, Hamilton, New Zealand.

2. An evolutionary approach to natural language grammar induction. Margaret Aycinena, Mykel J. Kochenderfer, and David Carl Mulford.

3. Wyard, P., "Representational Issues for Context-Free Grammar Induction Using Genetic Algorithms", in Proceedings of the 2nd International Colloquium on Grammatical Inference and Applications, Lecture Notes in Artificial Intelligence, Vol. 862, pp. 222-235, 1994.

4. K.S. Fu and T.L. Booth. Grammatical inference: Introduction and Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8:343-375, 1986.

5. Inducing a Context Free Grammar for a Natural Language with a Genetic Algorithm by Joseph Lewis and Benjamin Collar December 14, 2000.

6. B. Keller and R. Lutz, Evolving stochastic context-free grammars from examples using a minimum description length principle, Workshop on Automata Induction Grammatical Interference and Language Acquisition, ICML-97 (1997).

7. Olgierd Unold, "Grammatical Inference with Grammar-based Classifier System", Institute of Computer Engineering, Control and Robotics, Wroclaw University of Technology, Wyb. Wyspianskiego 27, 50-370 Wroclaw, Poland.

8. J. H. Holland, Adaptation in natural and artificial system, University of Michigan Press, Ann Arbor, 1975.

9. D.E. Goldberg, Genetic Algorithms in search, optimization, and machine learning, Addison-Wesley, Boston, 1989.

10. Introduction to Automata Theory, Languages, and Computation, 3/E ,John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman, Addison-Wesley, 2007.

11. F. Javed, B. R. Bryant, M.Crepinek, Mernik, Sprague, “Context-free Grammar Induction using Genetic Programming”, ACMSE, Huntzville, 2004.

12. C. de la Higuera, “A Bibliographical Study of Grammatical Inference”, Pattern Recognition, v. 38, no. 9, 2005, pp. 1332-1348.

13. Evolutionary Computing as a Tool for Grammar Development, Guy De Pauw, CNTS – Language Technology Group, UIA – University of Antwerp, Antwerp – Belgium, E. Cant´u-Paz et al. (Eds.): GECCO 2003, LNCS 2723, pp. 549–560, 2003.,c_Springer-Verlag Berlin Heidelberg 2003.

14. Genetic Programming with Incremental Learning for Grammatical Inference Ernesto Rodrigues and Heitor Silvério Lopes, Graduate Program in Electrical Engineering and Computer Science, Federal University of Technology – Paraná, Av. 7 de setembro, 3165 80230-901, Curitiba, Brazil.

15. Structuring Chromosomes for Context-free Grammar Evolution, Simon Lucas, Department of Electronics Systems Engineering University of Essex Colchester CO4 3SQ.

16. Sivanandam, Deepa “Introduction to Genetic Algorithm”, Springer, 2008.

17. Dr. N.S. Choubey, Department of Computer Engineering, M.P.S.T.M.E., N.M.I.M.S., University, Shirpur Campus, Shirpur, Dhule, India. Dr. M. U. Kharat, Department of Information Technology, G.H. Raisoni College of Engineering and Management, Nagpur, India. “Grammar Induction and Genetic Algorithm” 2009.

18. Narendar Yalamanchilli and William Cohen Department of Electrical and Computer Engineering University of Alabama in Huntsville, Huntsville, AL 35899, USA “Communication Performance of Java based Parallel Virtual Machines”, February 7, 1998