Learning Regular - University of Pennsylvania...(deterministic finite automaton) and the syntactic right congruence ~ of a language. L* works due to the Myhill-NerodeTheorem, stating

1

LearningRegular

ω-Languages

The Goal

Device an algorithm for learning

2

an unknown regularω-language L

The Goal


3


set of infinite words (ω-words)

The Goal


4


accepted by a finite automaton

The Goal


5

Learner


The Goal


6

Learner


L

Teacher

The Goal


7

Learner


L

Teacher

membership queries

equivalencequeries

Coping with ω-words

8

Learner

L

Teacher

Isabcccccabdebbbaaaabcdaa…

in L?


9

Learner

L

Teacher

Isabcccccabdebbbaaaabcdaa…

in L?


10

Learner

L

Teacher

Iswingardium laviosaω in L?


11

Learner

L

Teacher


wingardium laviosa laviosa laviosa laviosa laviosa …


12

Learner

L

Teacher


wingardium laviosa laviosa laviosa laviosa laviosa …

THM:

Two regular ω-languages are equivalent iff

they agree on the set of ultimately periodic words

The Goal

13

Is uvω in L ?

Learner

L

Teacher

The Goal

14

Is uvω in L ?

Yes / No

Learner

L

Teacher

The Goal

15

Is uvω in L ?

Yes / No

Is H same as L ?

Learner

L

Teacher

The Goal

16

Is uvω in L ?

Yes / No

Is H same as L ?

Yes / No, c.e: uvω

Learner

L

Teacher

The Goal

17

Is uvω in L ?

Yes / No

Is H same as L ?

Yes / No, c.e: uvω

H s.t. H=L

Learner

L

Teacher

Motivation

Same problem for regular (finitary) languages is solved by the L* algorithm [Angluin]

L* has found applications in many areas including AI, neural networks, geometry, data mining, verificationand synthesis and many more.

Reactive systems concerns ω-languages, using L* for this limits application to safety (and excludes liveness, fairness)

18

Previous Work on Learning ω-Langs.

19

allregular

ω-languages

Safe

DP / DMDS / DR

DWP

DCDB

DWCDWB

strictly less expressive


20

allregular

ω-languages

Safe

DP / DMDS / DR

DWP

DCDB

DWCDWB


• [de la Higuera & Janodet, 2004]

• [Jayasrirani et al, 2012]

• [Saoudi & Yokomori, 1993]

From Prefixes

From Lassos


21

allregular

ω-languages

Safe

DP / DMDS / DR

DWP

DCDB

DWCDWB





• [Maler & Pnueli ,1995]

From Prefixes

From Lassos

From Lassos


22

allregular

ω-languages

Safe

DP / DMDS / DR

DWP

DCDB

DWCDWB






From Prefixes

From Lassos

From Lassos

goal


23

allregular

ω-languages

Safe

DP / DMDS / DR

DWP

DCDB

DWCDWB






From Prefixes

From Lassos

From Lassos

goal


24

allregular

ω-languages

Safe

DP / DMDS / DR

DWP

DCDB

DWCDWB






From Prefixes

From Lassos

From Lassos

goal

However …

25

L$

SyntacticFORC

RecurrentFDFA

[Calbrix, Nivat & Podelski ‘93]

[Maler & Staiger ‘93]

However …

26

L$

SyntacticFORC

RecurrentFDFA

May be exp. bigger

than



However …

27

L$

SyntacticFORC

RecurrentFDFA

May be exp. bigger

than

May be quadr.

bigger than



Main Contribution

28

Learner

Lω –

a learning algorithm for all 3 representations

(periodic, syntactic, recurrent)

Why is it difficult?

29

ω-aut.

L*

1962

1987

2014

1993

1994

2005

2012

2008


30

states of the minimal DFA (deterministic finite automaton)

and the syntactic right congruence ~ of a language.

L* works due to the Myhill-NerodeTheorem, stating a 1-to-1 relationship between

1

2

53

4

a

b

c

a

bc

a

12

3 54

ω-aut.

L*

1962

1987

2014

1993

1994

2005

2012

2008


31

states of the minimal DFA (deterministic finite automaton)

and the syntactic right congruence ~ of a language.

L* works due to the Myhill-NerodeTheorem, stating a 1-to-1 relationship between

1

2

53

4

a

b

c

a

bc

a

12

3 54

But for ω-langs, the syntactic right congruence is not informative enough!

ω-aut.

L*

1962

1987

2014

1993

1994

2005

2012

2008

For finite words

The Syntactic Right Congruence

32

x »L y iff v2*. xv2L , yv2L

For finite words

Example:

L = { w w has an even number of a’s}


33


For finite words

Example:

L = { w w has an even number of a’s}

Then »L has two equivalence classes:

words with an odd number of a’s and

words with an even number of a’s


34


a

a

b b10

For finite words:

For ω-words:


35


x »L y iff w2!. xw2L, yw2L

For finite words:

For ω-words:


36


x »L y iff w2!. xw2L, yw2Lx »L y iff u,v2*. xuv!2L, yuv!2L

For ω-words:


37

x »L y iff u,v2*. xuv!2L, yuv!2L

Example:

L = { w | w has finitely many a’s}

For ω-words:


38


Example:


Then xuv!2L iff uv! has finitely may a’s.

For ω-words:


39


Example:



Regardless of x. Thus »L has just one equivalence class.

For ω-words:


40


Example:




For ω-words:


41


Example:




But clearly an ω-automaton for L requires at least two states.

Solution Foundation

Families of DFAs (FDFA)

A non-traditional representation of ω-automata

inspired by Maler & Staiger’s ’97 families of right congruences (FORC)

and their canonical representation of a Syntactic FORC

for which they have shown a Myhill-Nerode Theorem

42

Family of Right Congruences [MS97]

43

Leading Right Congruence

2

2

1

1

4

4

5

5

Plus some restriction (details ommited)

33

(»,¼1,¼2,¼3,¼4,¼5)

ProgressLeading

Family of DFAs (FDFA)

44

Leading DFAM

(M, P1, P2, P3, P4, P5 )

ProgressLeading


45

Leading DFAM

11P1

(M, P1, P2, P3, P4, P5 )

ProgressLeading


46

Leading DFAM

11P1

2

2P2

(M, P1, P2, P3, P4, P5 )

ProgressLeading


47

Leading DFAM

11P1

2

2P2

33P3

(M, P1, P2, P3, P4, P5 )

ProgressLeading


48

Leading DFAM

11P1

2

2P2

33P3

4

4P4

(M, P1, P2, P3, P4, P5 )

ProgressLeading


49

Leading DFAM

11P1

2

2P2

33P3

4

4P4

5

P5

(M, P1, P2, P3, P4, P5 )

ProgressLeading


50

Leading DFAM

11P1

2

2P2

33P3

4

4P4

5

P5

(M, P1, P2, P3, P4, P5 )

ProgressLeading

That restriction removed.

FDFA Exact Acceptance

(u,v)2 M, P1, P2, P3, P4, P5

P1P1P1

22

44

33 55

MM

2211

44

33 552

4

3 5

M

21

4

3 5

P4P4P4 P5P5P5

P3P3P3

P1P1P1

?uv!


(u,v)2 M, P1, P2, P3, P4, P5

P1P1P1

22

44

33 55

MM

2211

44

33 552

4

3 5

M

21

4

3 5

P4P4P4 P5P5P5

P3P3P3

P1P1P1

u

?uv!


(u,v)2 M, P1, P2, P3, P4, P5

P1P1P1

22

44

33 55

MM

2211

44

33 552

4

3 5

M

21

4

3 5

P4P4P4 P5P5P5

P3P3P3

P1P1P1

u

v

?uv!


(u,v)2 M, P1, P2, P3, P4, P5

P1P1P1

22

44

33 55

MM

2211

44

33 552

4

3 5

M

21

4

3 5

P4P4P4 P5P5P5

P3P3P3

P1P1P1

u

v

?uv!

FORC and FDFA – Example


1

a,b

1

No a’s

Some a’s

FORC FDFA

Leading

Progress

All prefixes are equally good



1

a,b

a,ba

P1

b

1

1

No a’s

Some a’s

FORC FDFA

Leading

Progress




1

a,b

a,ba

P1

b

1

1

No a’s

Some a’s

FORC FDFA

(abba,bbb)

(abba,bab)

Leading

Progress


4-state automaton

58

0

1

2

2

1

2

1

00,1,2

4-state automaton

59

0

1

2

2

1

2

1

00,1,2

Some periods of are easy to find:

11

22

100201

10020122

4-state automaton

60

0

1

2

2

1

2

1

00,1,2


11

22

100201

10020122

Some are hard:

1012

4-state automaton

61

0

1

2

2

1

2

1

00,1,2


11

22

100201

10020122

Some are hard:

10121012 1012

1012 1012

4-state automaton

62

0

1

2

2

1

2

1

00,1,2


11

22

100201

10020122

Some are hard:

10121012 1012

1012 1012

The DFA for periods of has 23 states

=> L$ > 23

4-state automaton

63

0

1

2

2

1

2

1

00,1,2

All periods that loop back are easy to find:

11

22

100201

10020122

4-state automaton

64

0

1

2

2

1

2

1

00,1,2

All periods that loop back are easy to find:

11

22

100201

10020122

The DFA that accepts only periods of that loop back has only 5 states !

Saturation (Consistency)

How can we use this without losing saturation?

65



66

We say that a language is saturated if

uv!=xy! implies (u,v)2 L, (x,y)2 L



To achieve saturation, we change the definition of acceptance.

67

We say that a language is saturated if

uv!=xy! implies (u,v)2 L, (x,y)2 L

FDFA Normalized Acceptance

P1P1P1 22

44

33 55

MM

2211

44

33 552

4

3 5

M

21

4

3 5

P4 P5P5P5

P3P3P3

P1P1P1

(u,v)2 M, P1, P2, P3, P4, P5 ?


P1P1P1 22

44

33 55

MM

2211

44

33 552

4

3 5

M

21

4

3 5

P4 P5P5P5

P3P3P3

P1P1P1

Normalization seeks for the smallest repetition

of the period that loops back

(u,v)2 M, P1, P2, P3, P4, P5 ?


P1P1P1 22

44

33 55

MM

2211

44

33 552

4

3 5

M

21

4

3 5

P4 P5P5P5

P3P3P3

P1P1P1



(u,v)2 M, P1, P2, P3, P4, P5 ?


P1P1P1 22

44

33 55

MM

2211

44

33 552

4

3 5

M

21

4

3 5

P4 P5P5P5

P3P3P3

P1P1P1



(u,v)2 M, P1, P2, P3, P4, P5 ?


P1P1P1 22

44

33 55

MM

2211

44

33 552

4

3 5

M

21

4

3 5

P4 P5P5P5

P3P3P3

P1P1P1



(u,v)2 M, P1, P2, P3, P4, P5 ?


P1P1P1 22

44

33 55

MM

2211

44

33 552

4

3 5

M

21

4

3 5

P4 P5P5P5

P3P3P3

P1P1P1



We term Recurrent FDFA the FDFA where progress DFA recognize only periods that loop back. It is saturated under normalized acceptance!

(u,v)2 M, P1, P2, P3, P4, P5 ?

More generally

74

SyntacticFDFA

RecurrentFDFA

May be exp. bigger

than

Periodic FDFA (L$)

Generalizing to arbitrary n we get the following lower bound

The Learner Lω

75

Same scheme for all 3

canonical FDFAs

Minimality?

The Recurrent FDFA for a given L is not necessarilythe minimal FDFA for L.

According to the normalized acceptance criterion a progress DFA Pu should give correct results only to extensions that close a cycle to u. On other extensions it has freedom.

This has the flavor of minimization with unknowns, which is NP complete.

The Recurrent FDFA chooses to treat all don’t cares as rejecting.

76

Time Complexity

Interestingly, [Klarlund, 1994] has shown that choosing a leading automaton which is more refined (bigger) than the syntactic right congruence, may yield an overall smaller FDFA.

If we are given such a leading automaton, we can feedit to the learning algorithm, in which case it will yield the smaller FDFA.

However, the same phenomenon can cause the learning algorithm for the syntactic/recurrent FDFAs to work as hard as the one for the periodic FDFA

77

Time Complexity

The worst case time complexity for all 3 families is thus polynomial in the size of the periodic FDFA.

Some positive results on sizes obtained via recurrent FDFA learner vs. periodic FDFA learner on randomly generated Muller automata.

78

Future Direction

Find smaller canonical representations ?

Find smallest FDFA ?

Learning the leading first ?

L*-learning for (traditional) ω-automata ?

Other types of learning for ω-languages ?

80

81

THE ENDThank you for your

attention!

Comments or questions?

Learning

Regular

Languagesvia

Alternating

Automata

IJCAI’15

Time Complexity

The worst case time complexity for all 3 families is thus polynomial in the size of the periodic FDFA.

Some positive results on sizes obtained via recurrent FDFA learner vs. periodic FDFA learner on randomly generated Muller automata.

83

Documents

Learning Regular - University of Pennsylvania...(deterministic finite automaton) and the syntactic right congruence ~ of a language. L* works due to the Myhill-NerodeTheorem, stating