Information Processing Letters 91 (2004) 285–292
www.elsevier.com/locate/ipl

Formal language identification: query learning vs. Gold-style learning

Steffen Lange (a), Sandra Zilles (b,*)

(a) FH Darmstadt, FB Informatik, Haardtring 100, 64295 Darmstadt, Germany
(b) TU Kaiserslautern, FB Informatik, Postfach 3049, 67653 Kaiserslautern, Germany

Received 6 January 2004; received in revised form 28 April 2004
Available online 24 June 2004
Communicated by P.M.B. Vitányi

* Corresponding author. E-mail addresses: [email protected] (S. Lange), [email protected] (S. Zilles).

Abstract

A natural approach towards powerful machine learning systems is to enable options for additional machine/user interactions, for instance by allowing the system to ask queries about the concept to be learned. This motivates the development and analysis of adequate formal learning models.

In the present paper, we investigate two different types of query learning models in the context of learning indexable classes of recursive languages: Angluin's original model and a relaxation thereof, called learning with extra queries. In the original model the learner is restricted to query languages belonging to the target class, while in the new model it is allowed to query other languages, too. As usual, the following standard types of queries are considered: superset, subset, equivalence, and membership queries.

The learning capabilities of the resulting query learning models are compared to one another and to different versions of Gold-style language learning from only positive data and from positive and negative data (including finite learning, conservative inference, and learning in the limit). A complete picture of the relation of all these models has been elaborated. A couple of interesting differences and similarities between query learning and Gold-style learning have been observed. In particular, query learning with extra superset queries coincides with conservative inference from only positive data. This result documents the naturalness of the new query model.
© 2004 Published by Elsevier B.V.

Keywords: Formal languages; Query learning; Gold-style learning

0020-0190/$ – see front matter © 2004 Published by Elsevier B.V. doi:10.1016/j.ipl.2004.05.010

1. Introduction

In machine learning, the quite natural approach of learning by 'asking questions' was first modeled and investigated by Angluin [2]. An example for its use in machine learning systems is Shapiro's Algorithmic Debugging System, cf. [10]. Since Angluin's pioneering paper [2], the query learning model has been receiving a lot of attention (see [3] for a quite recent overview).

Angluin's [2] model deals with 'one-shot' learning. Here, a learning algorithm (henceforth called query



learner) receives information about a target concept by asking queries which will be truthfully answered by an oracle. After at most finitely many queries, the learner is required to stop this process and to output its one and only hypothesis. The learning process is considered successful, if this hypothesis correctly describes the target concept.

Angluin's work and ensuing work in this area mainly address the aspect of efficiency of query learning, measured in terms of the number of queries maximally needed to satisfy the learning goal, see [3] and the references therein. Thus several interesting polynomial-time query learners for different concept classes have been designed. In particular, this has revealed close relations between query learning and PAC-learning, see [11,2].

In the present paper, we study the pros and cons of Angluin's [2] query learning model in the context of learning indexable classes of recursive languages (indexable classes, for short). The learnability of indexable classes has been intensively studied within different formal frameworks (see [12] for a survey). This is motivated by the fact that many interesting and natural classes, including regular, context free, context sensitive, and pattern languages, constitute indexable classes.

We investigate learning of indexable classes with superset, subset, equivalence, and membership queries, comparing the learning capabilities of the resulting query learners to one another. In contrast to former studies, we neglect complexity issues. Regarding finite classes of concepts, apparently, every class can be learned with the usual types of queries. This is no longer valid, if infinite classes form the learning task. For illustration, membership queries do not suffice to learn the class of all extended pattern languages, see [8]. This motivates a detailed analysis of the power and the limitations of the query learning models.

Moreover, the resulting query learning models are compared to models of Gold-style language learning such as finite learning, conservative inference, and learning in the limit from only positive data as well as from positive and negative data. In Gold-style learning, the learner may change its recent hypothesis when more information is provided, but it receives only 'local' information about the object to be learned. In contrast, in the query model the learner receives rather 'global' information and can affect the sample of information it receives, but it may never revise its hypothesis. So a comparison between query inference and Gold-style learning may help to explain the relevance of these different features of learning. Again, for this purpose it is useful to neglect efficiency issues in query learning.

A complete picture displaying the relations between all discussed versions of query learning and Gold-style learning is obtained. For example, our analysis shows that any query learner using superset queries can be simulated by a Gold-style learner receiving only positive data. In contrast to that, there are classes learnable using subset queries, but not Gold-style learnable from positive data only. This can be traced back to a duality of superset and subset queries: the relevance of positive data for simulating a superset query learner matches the relevance of negative data for simulating a subset query learner.

From a theoretical point of view, Angluin's [2] query learning model has the drawback that learnable classes may possess non-learnable subclasses. Moreover, a couple of quite simple indexable classes are not learnable with superset or subset queries. The observed weakness is often caused by the fact that the learners are constrained to query exclusively languages belonging to the target class. In many cases, the learners are simply not allowed to make the 'appropriate' queries. In the present paper, we therefore modify Angluin's [2] original model by relaxing this constraint and introduce a new model, called learning with extra queries. We analyze the learning power of the resulting query learners by comparing them to the capabilities of query learners in the original model and to Gold-style language learners. As it turns out, an indexable class is learnable using extra superset queries if and only if there is a conservative Gold-style language learner that identifies this class from only positive examples. To a certain extent, this result proves the naturalness of the new model of query inference.

2. Notions and notations

Familiarity with standard mathematical and recursion theoretic notions and notations as well as with basic language theoretic concepts is assumed, cf. [9,5].

From now on, a fixed finite alphabet Σ with {a, b} ⊆ Σ is assumed. By Σ* we denote the set of all words over Σ and any subset L ⊆ Σ* is called a language. The complement L̄ of a language L is the set Σ*\L. If C is a class of languages, then we denote by C̄ the class {L̄ | L ∈ C} of all complements of languages in C.

Let C be a class of recursive languages over Σ. C is said to be an indexable class, if there is an effective enumeration (L_i)_{i∈N} of all and only the languages in C such that membership is uniformly decidable, i.e., there is a computable function that, for any w ∈ Σ* and i ∈ N, returns 1, if w ∈ L_i, and 0, otherwise. Such an enumeration will subsequently be called an indexing of C.

In the query learning model, a learner has access to an oracle that truthfully answers queries of a specified kind. A query learner M is an algorithmic device that, depending on the replies to the previous queries, either computes a new query or returns a hypothesis and halts. Its queries and hypotheses are coded as natural numbers; both will be interpreted with respect to an underlying hypothesis space. When learning an indexable class C, any indexing H = (L_i)_{i∈N} of C may form a hypothesis space, such that M, when learning C, is only allowed to query languages belonging to C, see [2].

More formally, let C be an indexable class, let L ∈ C, let H = (L_i)_{i∈N} be an indexing of C, and let M be a query learner. M learns L with respect to H using some type of queries if it eventually halts and its only hypothesis, say i, correctly describes L, i.e., L_i = L. So M returns its unique and correct guess i after only finitely many queries. Moreover, M learns C with respect to H using some type of queries, if it learns every L' ∈ C with respect to H using queries of the specified type. Below we consider:

Membership queries. The input is a string w and the answer is 'yes' or 'no', depending on whether or not w belongs to the target language L.

Equivalence queries. The input is an index j of some language L' ∈ C. If L = L', the answer is 'yes'. Otherwise, together with the answer 'no', a counterexample from (L'\L) ∪ (L\L') is supplied.

Superset queries. The input is an index j of some language L' ∈ C. If L ⊆ L', the answer is 'yes'. Otherwise, together with the answer 'no', a counterexample from L\L' is supplied.

Subset queries. The input is an index j of some language L' ∈ C. If L' ⊆ L, the answer is 'yes'. Otherwise, together with the answer 'no', a counterexample from L'\L is supplied.

Equivalence, superset, and subset queries are also studied in a restricted form, for which the answer 'no' is no longer supplemented by a counterexample.

MemQ, EquQ, SupQ, and SubQ denote the collections of all indexable classes C' for which there are a query learner M' and a hypothesis space H' such that M' learns C' with respect to H' using membership, equivalence, superset, and subset queries, respectively. If the learner uses only the restricted form of queries, this is indicated by the prefix 'res' connected to EquQ, SupQ, or SubQ. Note that, by definition, resEquQ ⊆ EquQ, resSupQ ⊆ SupQ, and resSubQ ⊆ SubQ.

Moreover, it will be helpful to notice the following simple observation stating that superset and subset queries yield dual learning models.

Proposition 1. Let C be an indexable class.

(a) C ∈ SupQ iff C̄ ∈ SubQ,
(b) C ∈ resSupQ iff C̄ ∈ resSubQ.
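To make the four query types and the duality behind Proposition 1 concrete, here is a small illustrative sketch (our own code, not from the paper): languages are modeled as finite Python sets over a finite universe U standing in for Σ*, whereas the paper deals with arbitrary recursive languages and an abstract truthful oracle.

```python
# Toy versions of the four query types; languages are finite sets over a
# finite universe U standing in for Sigma* (illustration only, our naming).
U = {"", "a", "b", "ab", "ba", "aa"}

def membership_query(target, w):
    # 'yes' iff the word w belongs to the target language L.
    return w in target

def equivalence_query(target, queried):
    # 'yes' iff L = L'; on 'no', a counterexample from (L'\L) u (L\L').
    return (True, None) if target == queried else (False, min(queried ^ target))

def superset_query(target, queried):
    # 'yes' iff L is a subset of L'; on 'no', a counterexample from L\L'.
    return (True, None) if target <= queried else (False, min(target - queried))

def subset_query(target, queried):
    # 'yes' iff L' is a subset of L; on 'no', a counterexample from L'\L.
    return (True, None) if queried <= target else (False, min(queried - target))

# Duality behind Proposition 1: L <= L' holds iff U\L' <= U\L, so a superset
# query can be answered by a subset query about the complements.
L, Lq = {"a", "ab"}, {"a"}
assert superset_query(L, Lq)[0] == subset_query(U - L, U - Lq)[0]
```

The `min(...)` calls merely make the supplied counterexample deterministic; in the formal model the oracle may return any witness.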

Comparing query learning with the standard models in Gold-style language learning requires some more notions explained in brief below, see also [4,1,12]. Let L be a language. Any infinite sequence t = (w_i)_{i∈N} with {w_i | i ∈ N} = L is called a text for L. Then, for any n ∈ N, t_n denotes the initial segment (w_0, ..., w_n) and content(t_n) denotes the set {w_0, ..., w_n}. Any infinite sequence ((w_i, b_i))_{i∈N} with b_i ∈ {+, −}, {w_i | b_i = +} = L, and {w_i | b_i = −} = L̄ is said to be an informant for L.

Let C be an indexable class, H = (L_i)_{i∈N} a hypothesis space (i.e., an indexing possibly comprising a proper superclass of C), and L ∈ C. An inductive inference machine (IIM) is an algorithmic device that reads longer and longer initial segments σ of a text (informant) and outputs numbers M(σ) as its hypotheses. As above, an IIM M returning some i is construed to hypothesize the language L_i. Given a text (an informant) t for L, M identifies L from t with respect to H, if the sequence of hypotheses output by M, when fed t, stabilizes on a number i (i.e., past some point M always outputs the hypothesis i) with L_i = L. M identifies C from text (informant) with respect to H, if it identifies every L' ∈ C from every corresponding text (informant). As above, LimTxt (LimInf) denotes the

Page 4: Formal language identification: query learning vs. Gold-style learning

288 S. Lange, S. Zilles / Information Processing Letters 91 (2004) 285–292

collection of all indexable classes C' for which there are an IIM M' and a hypothesis space H' such that M' identifies C' from text (informant) with respect to H'. A quite natural and often studied modification of LimTxt is defined by the model of conservative inference. M is a conservative IIM for C with respect to H, if M performs only justified mind changes, i.e., if M, on some text t for some L ∈ C, outputs hypotheses i and later j, then M must have seen some element w ∉ L_i before returning j. The collection of all indexable classes identifiable from text by a conservative IIM is denoted by ConsvTxt.

In contrast to query learners, an IIM is allowed to change its mind finitely many times before returning its final and correct hypothesis. In general, it is not decidable whether or not an IIM has already output its final hypothesis. In case this is decidable we allude to finite learning, see [4]. Similar to query learning, finite learning can be understood as a kind of 'one-shot' learning, where the first hypothesis already has to be correct. The corresponding models FinTxt and FinInf are defined as above. Some helpful results on Gold-style learning are summarized in the following proposition, see [4,6,12] for the details.

Proposition 2.

FinTxt ⊂ FinInf ⊂ ConsvTxt ⊂ LimTxt ⊂ LimInf.

3. Comparison results

The scope of this paper is to compare the learning capabilities of Angluin's query learning models to one another and to the different versions of Gold-style learning defined above. First, note that learning with (restricted) equivalence queries coincides with Gold's model of limit learning from positive and negative data, while learning with membership queries equals finite learning from positive and negative data.¹ Second, the power of superset query learners is rather limited, because they will only be successful for classes that are Gold-style learnable from only positive data. In contrast, there are indexable classes not learnable from only positive data, but learnable with subset queries. Thus, subset query learners may exploit the kind of additional information which negative data normally provide. Still, the potential of subset query learners is severely limited: there are indexable classes in LimTxt, even in FinInf, which are not learnable using subset queries. In contrast to the case when learning with equivalence queries is considered, learners that are allowed to ask superset (subset) queries are more powerful than learners that are constrained to ask restricted superset (subset) queries. The formal details are summarized in Theorems 3 and 4.

¹ These results are somehow folklore and will not be proven here.

Theorem 3.

(a) FinTxt ⊂ resSupQ ⊂ SupQ ⊂ LimTxt ⊂ LimInf,
(b) FinTxt ⊂ resSubQ ⊂ SubQ ⊂ LimInf.

Proof. (a) First, FinTxt ⊂ resSupQ can be verified using the following fact known from [12]: if C ∈ FinTxt, then L # L' for all distinct languages L and L' in C.

Second, we verify SupQ\resSupQ ≠ ∅. Consider the class C_rsup containing L_0 = {a}* and all languages L_i = ({a}*\{a, a^i}) ∪ {b^z | z ≥ i}, i ≥ 1. To learn C_rsup, a query learner M starts with a query corresponding to L_0. If the answer is 'yes', M knows that the target language equals L_0. If the answer is 'no', an element from L̄_0, say b^z, z ≥ 1, is supplied as a counterexample. Since b^z ∉ L_i for all i > z, M can easily identify the target language by posing queries corresponding to L_1, L_2, ..., L_z. To see that C_rsup ∉ resSupQ, suppose there is a superset query learner M' for C_rsup. Consider M' when learning L_1. If M' queries L_1, provide the answer 'yes'. In all other cases provide the answer 'no'. Let z be the maximal index of a language M' queries before returning its sole hypothesis. Since the answers provided are consistent with the two distinct languages L_1 and L_{z+1}, M' fails for at least one of them. So C_rsup ∉ resSupQ and hence SupQ\resSupQ ≠ ∅.
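The superset query learner for C_rsup can be simulated over a finite horizon. The sketch below is our own code (the cutoff N and all names are ours); words of length at most N stand in for the infinite languages. It exploits that, for a target L_j with a counterexample b^z returned on the query for L_0, the largest index among L_1, ..., L_z whose superset query is answered 'yes' identifies the target.

```python
# Finite-horizon model of the class C_rsup from the proof of Theorem 3(a).
# Words up to length N stand in for the infinite languages; names are ours.
N = 12

def a_star():                      # L_0 = {a}*, truncated at length N
    return {"a" * k for k in range(N + 1)}

def L(i):                          # L_i = ({a}*\{a, a^i}) u {b^z | z >= i}
    return (a_star() - {"a", "a" * i}) | {"b" * z for z in range(i, N + 1)}

def superset_query(target, queried):
    if target <= queried:
        return True, None
    return False, min(target - queried)        # deterministic counterexample

def learn_crsup(target):
    ok, cex = superset_query(target, a_star())
    if ok:
        return 0                               # target must be L_0
    z = len(cex)                               # counterexample is some b^z
    yes = [i for i in range(1, z + 1)
           if superset_query(target, L(i))[0]]
    return max(yes)                            # largest 'yes' index is the target

assert learn_crsup(L(5)) == 5
```

Note that the learner genuinely uses the counterexample b^z to bound its search, which is exactly what a restricted superset query learner cannot do.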

Third, LimTxt\SupQ ≠ ∅ follows via Theorem 4(e) and FinInf ⊆ LimTxt. To verify SupQ ⊆ LimTxt, let C ∈ SupQ and M a query learner identifying C with respect to an indexing (L_i)_{i∈N}. The following IIM M' LimTxt-identifies C. Given a text segment σ of length n, M' interacts with M simulating a learning process for n steps. In step k, k ≤ n, depending on how M' has replied to the previous queries posed by M, the learner M computes either (i) a new query

Page 5: Formal language identification: query learning vs. Gold-style learning

S. Lange, S. Zilles / Information Processing Letters 91 (2004) 285–292 289

i or (ii) a hypothesis i. In case (ii), M' returns the hypothesis i and stops simulating M. In case (i), M' checks whether there is a word in content(σ)\L_i. If such a word exists, M' replies 'no' using one such word as a counterexample; else M' replies 'yes'. If k < n, M executes step k + 1, else M' returns any auxiliary hypothesis and stops simulating M. Given segments σ of a text for some target language, if their length n is large enough, M' answers all queries of M correctly and M returns its sole hypothesis within n steps. So, the hypotheses returned by M' stabilize on this correct guess.
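The simulation just described can be sketched in code. The generator-based query learner below is our own illustration (not the paper's notation): the IIM answers each superset query from the words seen so far, replying 'no' with a counterexample whenever content(σ)\L_i is non-empty.

```python
# Illustrative sketch (our code) of the IIM M' that simulates a superset
# query learner M using only the positive data seen so far. A query learner
# is modeled as a generator: it yields queried languages (finite sets) and
# finishes by returning a hypothesis.

def iim_from_query_learner(learner_factory, content):
    # content = content(sigma), the words in the current text segment.
    learner = learner_factory()
    try:
        queried = next(learner)
        while True:
            witnesses = content - queried          # content(sigma)\L_i
            if witnesses:
                reply = (False, min(witnesses))    # 'no' + counterexample
            else:
                reply = (True, None)               # 'yes' (truthful once
                                                   # sigma is long enough)
            queried = learner.send(reply)
    except StopIteration as stop:
        return stop.value                          # M's sole hypothesis

# A toy target class: the chain L_i = {a^k | k <= i}, and a query learner
# that asks L_0, L_1, ... until a superset query is answered 'yes'.
def chain(i):
    return {"a" * k for k in range(i + 1)}

def chain_learner():
    i = 0
    while True:
        ok, _ = yield chain(i)
        if ok:
            return i
        i += 1

# Once the text segment contains all of L_3, the simulated answers are
# truthful and the hypothesis stabilizes on 3.
assert iim_from_query_learner(chain_learner, chain(3)) == 3
```

The step bound k ≤ n and the auxiliary hypothesis from the proof are omitted here; the sketch assumes M halts on the given answers. On a shorter segment the simulated 'yes' answers may be untruthful, which is why the IIM's hypothesis may still change before it stabilizes.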

(b) First, FinTxt ⊂ resSubQ holds, since L # L' for all distinct languages L and L' in any C ∈ FinTxt. The remaining assertions follow from (a) and Proposition 1. □

Theorem 4. (a) resSupQ # SubQ; (b) resSubQ # SupQ; (c) resSupQ # FinInf; (d) resSubQ # LimTxt; (e) FinInf # SupQ; (f) FinInf # SubQ.

Proof. For (d) and (f) we provide the separating classes only. The class that contains L = {a}* and L_i = L\{a^i} for all i ≥ 1 proves (d). Assertion (f) can be shown using the class consisting of all languages {a, a^i} ∪ {b, ..., b^i}, i ≥ 1. Now (b) follows from (d), because SupQ ⊆ LimTxt. Moreover, (a) is obtained from (b) via Proposition 1. Assertions (c) and (e) follow from (d) and (f), respectively, via Proposition 1, taking into account that FinInf ⊆ LimTxt and that an indexable class C is in FinInf iff C̄ is in FinInf. □

4. Learning with extra queries

Angluin's [2] query learning model has one drawback: learnable classes may possess non-learnable subclasses. For instance, C_rsup can be identified using superset queries, while its subclass C'_rsup = C_rsup\{{a}*} cannot. This drawback is a direct consequence of Angluin's assumption that the learner may only ask queries referring to languages in the target class. As the class C'_rsup illustrates, this assumption may bar the learner from asking the 'appropriate' queries. By Proposition 1, an analogous result holds for learning with subset queries.

To overcome these difficulties, we introduce a modified version of Angluin's [2] model. Let C be an indexable class. An extra query learner for C is permitted to query languages in any indexing (L'_i)_{i∈N} of a superclass C' of C. We say that C is learnable using extra superset (subset) queries with respect to (L'_i)_{i∈N} iff there is an extra query learner M learning C with respect to (L'_i)_{i∈N} using superset (subset) queries concerning C'. Then xSupQ (xSubQ) denotes the collection of all indexable classes C learnable with extra superset (subset) queries; the notions xresSupQ and xresSubQ are defined analogously.

The above discussion immediately yields the following result.

Theorem 5. (a) SupQ ⊂ xSupQ; (b) SubQ ⊂ xSubQ.

In contrast to the original model, learning with superset queries and learning with restricted superset queries now coincide (analogously for subset queries).

Theorem 6. (a) xresSupQ = xSupQ; (b) xresSubQ = xSubQ.

Proof. (a) Obviously, it suffices to verify xSupQ ⊆ xresSupQ. So let C ∈ xSupQ and let M be a corresponding query learner. In order to learn some target language L, the required xresSupQ-learner M' uses M as a subroutine. In the long run, M' is supposed to adopt M's hypothesis. For this purpose, M' asks the same queries as M does and hands over the obtained replies to M. In order to make M work, M' has to find supplementing counterexamples in case a superset query of M', referring to some L', receives the reply 'no'. But this is easily done. M' simply asks, for all elements w ∈ L̄', a superset query corresponding to the language Σ*\{w}, until it receives the reply 'no'. If this happens, M' has found the counterexample needed, since Σ*\{w} ⊉ L iff w ∈ L.

(b) Using (a) and an adaptation of Proposition 1 for learning with extra queries yields the desired proof. □

Although the learning power of query learners increases with the permission to ask extra queries, the learners are not always able to identify also the extra languages they query.
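The counterexample-manufacturing trick from the proof of Theorem 6(a) can be sketched as follows (our code, over a finite universe U standing in for Σ*): after a restricted superset query about some L' is answered 'no', scanning the complement of L' with co-singleton queries U\{w} recovers a word of L\L'.

```python
# Sketch of the trick in the proof of Theorem 6(a): a restricted superset
# query learner can manufacture counterexamples itself, because the query
# "is Sigma*\{w} a superset of L?" is answered 'no' exactly when w is in L.
# A finite universe U stands in for Sigma*; all names are ours.

U = {"a" * i + "b" * j for i in range(4) for j in range(4)}  # toy Sigma*

def restricted_superset_query(target, queried):
    return target <= queried                     # bare 'yes'/'no', no witness

def find_counterexample(target, queried):
    # Called after "is queried a superset of target?" was answered 'no'.
    # Scan the complement of the queried language; any w with a 'no' answer
    # for U\{w} lies in the target, hence in target\queried.
    for w in sorted(U - queried):
        if not restricted_superset_query(target, U - {w}):
            return w

L_target = {"ab", "aabb"}
w = find_counterexample(L_target, {"ab"})        # the query {"ab"} missed "aabb"
assert w in L_target and w not in {"ab"}
```

Because the scan ranges only over U\L', every hit is automatically outside the queried language, so it is a valid counterexample for the original query.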

Theorem 7. (a) There is some C ∈ xSupQ such that C' ∉ SupQ for all superclasses C' of C.


(b) There is some C ∈ xSubQ such that C' ∉ SubQ for all superclasses C' of C.

Proof. (a) Let C_xsup be the class consisting of Σ* and all finite subsets of Σ*\{a}.

First, to verify C_xsup ∈ xSupQ, choose an indexing of Σ*\{a} and all languages in C_xsup. A learner M for C_xsup may initially make a query for Σ*\{a}. If the answer is 'no', then the target language L must be Σ*. If the answer is 'yes', M can determine L by (i) asking finite subsets of Σ*\{a} until some finite L_i ⊇ L is found and (ii) asking subsets of L_i until the minimal subset of L_i forcing the answer 'yes' is known.

Second, no superclass of C_xsup belongs to SupQ. Assuming the contrary, suppose C' is a superclass of C_xsup in SupQ. Then distinguish two cases.

Case 1. Σ*\{a} ∈ C': Since C' also contains all finite subsets of Σ*\{a}, arguments used already in [4] imply C' ∉ LimTxt and thus C' ∉ SupQ by Theorem 3.

Case 2. Σ*\{a} ∉ C': Assume some learner M identifies C' with superset queries. Consider a special scenario S of M when learning Σ*. If M queries Σ*, let M get the answer 'yes'. If M queries some L_i ≠ Σ*, let M get the answer 'no' together with some counterexample w_i ≠ a. Note that some w_i ∈ Σ*\L_i with w_i ≠ a exists, because L_i ≠ Σ*\{a}. After finitely many queries M guesses Σ*.

Let L_{i_1}, ..., L_{i_n} be those languages queried in the scenario S, which have been answered 'no'. Let L = {w_{i_1}, ..., w_{i_n}} be the set of the corresponding counterexamples. Notice that L is a finite subset of Σ*\{a} and hence belongs to C'. But, since all queries in the scenario S are answered truthfully with respect to L, M fails to learn L. So C' ∉ SupQ.

(b) This follows from (a), as Proposition 1 holds analogously for learning with extra queries. □

Considering the classes C'_rsup and C_xsup in xSupQ\SupQ, it is remarkable that for both classes there is a successful learner which uses restricted superset queries and which is additionally allowed to ask membership queries. So one might suspect that the full capabilities of xSupQ-learners can already be achieved by learners using restricted superset and membership queries. But this is not the case. Indeed, [7] provides a class in xSupQ which is not learnable with superset and membership queries. An adaptation of Proposition 1 for extra query learning implies an analogous result concerning subset queries.

It remains to compare learning with extra queries to the Gold-style learning types. A method similar to that used in the proof of Theorem 3 yields xSupQ ⊆ LimTxt. But interestingly, there is an even stronger relation between Gold-style language learning from text and learning with extra superset queries. The learning capabilities of conservative IIMs and of extra superset query learners coincide. Note that, by Proposition 2 and Theorem 8, it then follows that xSupQ ⊂ LimTxt.

Theorem 8. xSupQ = ConsvTxt.

Proof. 'ConsvTxt ⊆ xSupQ': Fix C ∈ ConsvTxt. Then there is an indexing (L_i)_{i∈N} comprising C and an IIM M, such that M ConsvTxt-identifies C with respect to (L_i)_{i∈N}. Note that, if L ∈ C and t is a text for L, then M never returns an index i with L ⊂ L_i on any initial segment of t.

Now the indexable class used for the queries contains all languages in (L_i)_{i∈N} and all languages Σ*\{w} for w ∈ Σ*. An xSupQ-learner M' identifying any L ∈ C may work as follows. M' uses queries concerning languages of the form Σ*\{w} to construct a text for L and simulates M on the obtained text segments until M returns an index of a superset of L. This index is then returned by M'.

First, to effectively enumerate a text t for L, M' determines the set T of all words w ∈ Σ*, for which the query representing Σ*\{w} is answered with 'no'. Since T = L and T is recursively enumerable, any recursive enumeration of T yields a text for L.

Second, to compute its hypothesis, M' executes steps 0, 1, 2, ... until it receives a stop signal. In general, step n, n ∈ N, consists of the following instructions:

Determine i := M(t_n), where t is a recursive enumeration of the set T. Pose a query referring to L_i.² If the answer is 'no', execute step n + 1. Otherwise hypothesize the language L_i and stop. (* In the latter case, as M never hypothesizes a proper superset of L, M' returns an index for L. *)

² Since one cannot ensure that L_i belongs to C, this indicates why restricted superset and membership queries do not suffice for achieving the power of xSupQ.
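The first half of the proof can be sketched concretely (our code; the finite universe, the chain class, and the simple conservative IIM are all our own illustrative choices): co-singleton queries enumerate a text for L, and the IIM is run on ever longer segments until its hypothesis passes a superset query.

```python
# Sketch of the construction in the first half of the proof of Theorem 8:
# an extra superset query learner M' recovers a text for L by asking about
# the co-singletons Sigma*\{w}, and runs a conservative IIM M on it until
# M outputs a superset of L. Finite universe and all names are ours.

U = {"a" * i for i in range(6)} | {"b"}           # toy word universe

def superset_query(target, queried):
    return target <= queried                       # restricted answers suffice

def xsupq_learner(target, iim, hypothesis_space):
    # Step 1: the co-singleton answers enumerate a text for the target,
    # since U\{w} fails to cover the target exactly when w is in the target.
    text = [w for w in sorted(U)
            if not superset_query(target, U - {w})]
    # Step 2: run the IIM on ever longer segments; stop at the first
    # hypothesis confirmed to be a superset of the target. A conservative
    # IIM never outputs a proper superset, so this hypothesis is correct.
    for n in range(1, len(text) + 1):
        i = iim(text[:n])
        if superset_query(target, hypothesis_space[i]):
            return i

# A toy conservative IIM for the chain class L_i = {a^k | k <= i}: its
# guess only grows when a longer word appears in the data.
space = {i: {"a" * k for k in range(i + 1)} for i in range(6)}
def iim(segment):
    return max(len(w) for w in segment)

assert xsupq_learner(space[3], iim, space) == 3
```

The sketch glosses over the effectivity issues handled in the paper (T is only recursively enumerable in general, and the simulation must interleave enumeration and querying); with finite sets both steps terminate outright.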


Further details are omitted.

'xSupQ ⊆ ConsvTxt': Fix some C ∈ xSupQ. Then there is an indexing (L_i)_{i∈N} comprising C and a query learner M, such that M learns C with extra superset queries in (L_i)_{i∈N}. Define a new indexing (L'_i)_{i∈N}:

• L'_0 is the empty language.
• If i ≥ 1 and i is the canonical index³ of the non-empty set {i_1, ..., i_n}, then L'_i = L_{i_1} ∩ ... ∩ L_{i_n}.

An IIM M' identifying C in the limit from text with respect to the hypothesis space (L'_i)_{i∈N}, given a text t, may work as follows: M'(t_0) := 0. To compute M'(t_{n+1}), the learner M' interacts with M simulating a learning process for n steps. In step k, k ≤ n, depending on how M' has replied to the previous queries posed by M, the learner M computes either (i) a new query i or (ii) a hypothesis i.

If case (i) occurs, then M' checks whether there is a word in content(t_n)\L_i. If such a word, say w, is found, then M' provides the reply 'no', accompanied by w as a counterexample, to the query learner M. If no such word is found, then M' provides the reply 'yes'. If k < n, then M' proceeds with step k + 1, otherwise M' returns the hypothesis M'(t_{n+1}) := M'(t_n) and stops.

If case (ii) occurs, then M' computes the hypothesis M'(t_{n+1}) according to the following directives:

• Let L_{i⁺_1}, ..., L_{i⁺_m} be the languages represented by the queries answered with 'yes' in the currently simulated scenario.
• Compute the canonical index i' of the set {i, i⁺_1, ..., i⁺_m} and return the hypothesis M'(t_{n+1}) = i'.

It is not hard to verify that M' learns C in the limit from text; the relevant details are omitted. Moreover, as we will see next, M' avoids overgeneralized hypotheses, that means, if t is a text for some L ∈ C, n ∈ N, and M'(t_n) = i', then L'_{i'} ⊅ L. Therefore, M' can easily be transformed into a learner M'' which identifies the class C conservatively in the limit from text.⁴

³ See [9] for the definition of canonical indices of finite sets.
⁴ Note that a result in [12] states that any indexable class which is learnable in the limit from text without overgeneralizations belongs to ConsvTxt.

To prove that M' avoids overgeneralizations, assume to the contrary, that there is an L ∈ C, a text t for L, and an n ∈ N, such that the hypothesis i' = M'(t_n) fulfills L'_{i'} ⊃ L. Then i' ≠ 0. By definition of M', there must be a learning scenario S for M, in which

• M poses queries representing L_{i⁻_1}, ..., L_{i⁻_k}, L_{i⁺_1}, ..., L_{i⁺_m} (in some particular order);
• the answers are 'no' concerning L_{i⁻_1}, ..., L_{i⁻_k} and 'yes' concerning L_{i⁺_1}, ..., L_{i⁺_m};
• afterwards M returns the hypothesis i.

Hence i' is the canonical index of {i, i⁺_1, ..., i⁺_m}. This implies L'_{i'} = L_i ∩ L_{i⁺_1} ∩ ... ∩ L_{i⁺_m}. As by assumption L'_{i'} ⊃ L, each of the languages L_{i⁺_1}, ..., L_{i⁺_m} is a superset of L. By definition of M', L_{i⁻_j} ⊉ content(t_n) for 1 ≤ j ≤ k. Therefore none of the languages L_{i⁻_1}, ..., L_{i⁻_k} are supersets of L. So the answers in the learning scenario S above are truthful respecting the language L. As M learns C with extra superset queries, the hypothesis i must be correct for L, i.e., L_i = L. This yields L'_{i'} ⊆ L in contradiction to L'_{i'} ⊃ L.

So M' learns C in the limit from text without overgeneralizations, which—by the argumentation above—implies C ∈ ConsvTxt. □

5. Summary

The following figure summarizes the observed comparison results. Concerning the indicated results not proved yet, note that MemQ ⊆ xSupQ and MemQ ⊆ xSubQ follow from the fact that each membership query for a word w can be simulated by a superset query for Σ*\{w} or a subset query for {w}. An example for a class in resSupQ\xSubQ is the class of all finite languages.
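The simulations MemQ ⊆ xSupQ and MemQ ⊆ xSubQ mentioned above can be made concrete over a finite universe standing in for Σ* (our code and naming): a membership query for w becomes a superset query for Σ*\{w}, or a subset query for {w}.

```python
# Membership queries simulated by extra superset/subset queries, over a
# finite universe U standing in for Sigma* (illustration only, our names).

U = {"", "a", "b", "ab", "ba"}                    # toy Sigma*

def superset_query(target, queried):
    return target <= queried

def subset_query(target, queried):
    return queried <= target

def membership_via_superset(target, w):
    # w in L  iff  Sigma*\{w} is NOT a superset of L.
    return not superset_query(target, U - {w})

def membership_via_subset(target, w):
    # w in L  iff  {w} is a subset of L.
    return subset_query(target, {w})

L = {"a", "ab"}
assert membership_via_superset(L, "a") and membership_via_subset(L, "a")
assert not membership_via_superset(L, "b") and not membership_via_subset(L, "b")
```

Both simulations need the restricted answers only, and both query languages (Σ*\{w} and {w}) generally lie outside the target class, which is why extra queries are required.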

LimInf = resEquQ = EquQ

LimTxt

ConsvTxt = xresSupQ = xSupQ          xresSubQ = xSubQ

SupQ          FinInf = MemQ          SubQ

resSupQ          resSubQ

FinTxt


Each learning type is represented as a vertex in a digraph. A directed path from A to B indicates that the learning type B outperforms type A, i.e., B is a proper superset of A. A missing path between A and B means that A and B are incomparable, i.e., there are learning problems solvable with respect to the learning type A, but not solvable with respect to the learning type B, and vice versa.

Acknowledgements

Many thanks are due to Jochen Nessel for fruitful discussions.

References

[1] D. Angluin, Inductive inference of formal languages from positive data, Inform. and Control 45 (1980) 117–135.

[2] D. Angluin, Queries and concept learning, Machine Learning 2 (1988) 319–342.

[3] D. Angluin, Queries revisited, in: Proc. 12th Internat. Conference on Algorithmic Learning Theory, Lecture Notes in Artificial Intelligence, vol. 2225, Springer-Verlag, Berlin, 2001, pp. 12–31.

[4] E.M. Gold, Language identification in the limit, Inform. and Control 10 (1967) 447–474.

[5] J.E. Hopcroft, J.D. Ullman, Introduction to Automata Theory, Languages, and Computation, Addison–Wesley, Reading, MA, 1979.

[6] S. Lange, Algorithmic Learning of Recursive Languages, Mensch und Buch Verlag, 2000.

[7] S. Lange, S. Zilles, Replacing limit learners with equally powerful one-shot query learners, in: Proc. 17th Annual Conference on Learning Theory, Lecture Notes in Artificial Intelligence, Springer-Verlag, Berlin, 2004, in press.

[8] J. Nessel, S. Lange, Learning erasing pattern languages with queries, in: Proc. 11th Internat. Conference on Algorithmic Learning Theory, Lecture Notes in Artificial Intelligence, vol. 1968, Springer-Verlag, Berlin, 2000, pp. 86–100.

[9] H. Rogers, Theory of Recursive Functions and Effective Computability, MIT Press, Cambridge, MA, 1987.

[10] E. Shapiro, Algorithmic Program Debugging, MIT Press, Cambridge, MA, 1983.

[11] L. Valiant, A theory of the learnable, Commun. ACM 27 (1984) 1134–1142.

[12] T. Zeugmann, S. Lange, A guided tour across the boundaries of learning recursive languages, in: Algorithmic Learning for Knowledge-Based Systems, Lecture Notes in Artificial Intelligence, vol. 961, Springer-Verlag, Berlin, 1995, pp. 190–258.