The undecidability of self-embedding for term rewriting systems

Information Processing Letters 20 (1985) 61-64 15 February 1985

North-Holland

THE UNDECIDABILITY OF SELF-EMBEDDING FOR TERM REWRITING SYSTEMS *

David A. PLAISTED

Department of Computer Science, University of Illinois, 1304 W. Springfield Avenue, Urbana, IL 61801, U.S.A.

Communicated by J. Nievergelt
Received 10 October 1983
Revised 10 March 1984; 25 July 1984

The self-embedding property of term rewriting systems is closely related to the uniform termination property, since a nonself-embedding term rewriting system is uniform terminating. The self-embedding property is shown to be undecidable but partially decidable. It follows that the nonself-embedding property is not partially decidable. This is true even for globally finite term rewriting systems. The same construction gives an easy alternate proof that uniform termination is undecidable in general and also for globally finite term rewriting systems. Also, the looping property is shown to be undecidable in the same way.

Keywords: Term rewriting system, self-embedding, termination

1. Introduction

The uniform termination property of term rewriting systems has been the subject of much research. Applications of term rewriting systems to theorem proving and programming languages often require methods for proving uniform termination. The main techniques now in use for this are the termination theorems of Dershowitz [1]. These theorems make use of the homeomorphic embedding relation on terms. Unfortunately, the uniform termination problem for general term rewriting systems is undecidable [3]. However, there are decidable properties of term rewriting systems which imply uniform termination. Many such properties can be found in the literature, such as those based on the 'recursive path ordering' of Dershowitz and Plaisted [1,4] and termination tests based on many other orderings. We introduce another property which implies uniform termination, and show that it is undecidable.

* This research was supported in part by the National Science Foundation under Grants MCS-81-09831 and MCS-83-07755.

2. Definitions

A term rewriting system is a set $R = \{(\ell_i, r_i)\}$ of rules where $\ell_i$ and $r_i$ are terms possibly containing variables, and for all $i$, all variables in $r_i$ must also occur in $\ell_i$. For example, the set $\{(x*1, x), (x*(y+z), x*y + x*z)\}$ is a term rewriting system, where $x$, $y$, and $z$ are variables and $*$, $+$, and $1$ are operators. This means informally that $x*1$ may be replaced by $x$ and that $x*(y+z)$ may be replaced by $x*y + x*z$. Often we write a rule $(\ell, r)$ as $\ell \Rightarrow r$. We write $t \Rightarrow_R t'$ (or just $t \Rightarrow t'$) for terms $t$ and $t'$ if $t'$ may be obtained from $t$ by replacing some subterm of $t$ using a rule of $R$. Thus if $R$ is the above set, then

$a*1 \Rightarrow_R a$ and $(a*(b+c))*d \Rightarrow_R (a*b + a*c)*d$.

We write $t \Rightarrow_R^* t'$ if $t'$ may be obtained from $t$ by zero or more replacements using rules in $R$. Thus, for the above term rewriting system, $(a*1)*1 \Rightarrow_R^* a$ using two applications of the rule $(x*1, x)$. A term rewriting system $R$ is uniform terminating if there is no infinite sequence $t_1, t_2, \ldots, t_n, \ldots$ such that, for all $i$, $t_i \Rightarrow_R t_{i+1}$. It turns out that the above term rewriting system is uniform terminating, and this may be shown using the recursive path ordering.

0020-0190/85/$3.30 © 1985, Elsevier Science Publishers B.V. (North-Holland)
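The one-step relation defined above can be illustrated by a small Python sketch. It is only an illustration under stated assumptions: the representation (a variable is a string, an application is a tuple headed by its operator) and the names `match`, `substitute`, `rewrite_once` and `RULES` are ours and do not come from the paper.

```python
# Assumed representation: a variable is a str; an application is a tuple
# (operator, arg1, ..., argk); a constant is a 1-tuple such as ("a",).

def match(pattern, term, subst):
    """Extend subst so that pattern instantiated by subst equals term, else None."""
    if isinstance(pattern, str):                      # pattern is a variable
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def substitute(term, subst):
    """Replace every variable of term by its image under subst."""
    if isinstance(term, str):
        return subst[term]
    return (term[0],) + tuple(substitute(a, subst) for a in term[1:])

def rewrite_once(term, rules):
    """All terms obtainable from term by one rule application at some subterm."""
    results = []
    for lhs, rhs in rules:                            # rewrite at the top
        s = match(lhs, term, {})
        if s is not None:
            results.append(substitute(rhs, s))
    if not isinstance(term, str):                     # rewrite inside an argument
        for i in range(1, len(term)):
            for new_arg in rewrite_once(term[i], rules):
                results.append(term[:i] + (new_arg,) + term[i + 1:])
    return results

# The example system {(x*1, x), (x*(y+z), x*y + x*z)}.
RULES = [
    (("*", "x", ("1",)), "x"),
    (("*", "x", ("+", "y", "z")), ("+", ("*", "x", "y"), ("*", "x", "z"))),
]
a, b, c, d, one = ("a",), ("b",), ("c",), ("d",), ("1",)
print(rewrite_once(("*", a, one), RULES))
print(rewrite_once(("*", ("*", a, ("+", b, c)), d), RULES))
```

The two calls correspond to the two example steps given above, $a*1 \Rightarrow_R a$ and $(a*(b+c))*d \Rightarrow_R (a*b + a*c)*d$.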

The homeomorphic embedding relation $\trianglelefteq$ on terms is defined as below, following Dershowitz [1]:

$s = f(s_1, s_2, \ldots, s_m) \trianglelefteq g(t_1, t_2, \ldots, t_n) = t$

if and only if

(a) $f = g$ and $s_i \trianglelefteq t_{j_i}$ for all $i$, $1 \le i \le m$, where $1 \le j_1 < j_2 < \cdots < j_m \le n$, or
(b) $s \trianglelefteq t_j$ for some $j$, $1 \le j \le n$, or
(c) $s = t$.

(Actually, condition (c) is included in (a) and is superfluous.) For example, $f(a, g(b, c))$ is embedded in $h(f(h(a), d, k(g(b, h(c)), d)))$. We say that a term rewriting system $R$ is self-embedding if there is a derivation $t_1 \Rightarrow t_2 \Rightarrow \cdots \Rightarrow t_n$ such that $n > 1$ and $t_1 \trianglelefteq t_n$. If no such derivation exists, then $R$ is nonself-embedding.
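The embedding relation can be checked mechanically on ground terms. The following Python sketch is one possible reading of conditions (a), (b) and (c), assuming terms are tuples headed by their operator; the function name `embeds` and this representation are illustrative assumptions, not part of the paper.

```python
from itertools import combinations

# Assumed representation: a ground term is a tuple (operator, arg1, ..., argk).

def embeds(s, t):
    """True if the ground term s is homeomorphically embedded in the ground term t."""
    if s == t:                                        # condition (c)
        return True
    if any(embeds(s, tj) for tj in t[1:]):            # condition (b)
        return True
    if s[0] == t[0]:                                  # condition (a)
        m, n = len(s) - 1, len(t) - 1
        # choose positions j_1 < j_2 < ... < j_m among the n arguments of t
        for js in combinations(range(1, n + 1), m):
            if all(embeds(s[i], t[j]) for i, j in zip(range(1, m + 1), js)):
                return True
    return False

a, b, c, d = ("a",), ("b",), ("c",), ("d",)
small = ("f", a, ("g", b, c))
big = ("h", ("f", ("h", a), d, ("k", ("g", b, ("h", c)), d)))
print(embeds(small, big), embeds(big, small))
```

Here `small` and `big` are the two terms of the example above. The search over increasing index tuples is exponential in the worst case, which is acceptable for a sketch but not tuned for large terms.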

3. Undecidability of nonself-embedding

The main reason for the interest in the self-embedding property is the following result.

Theorem 3.1 ([1]). If a term rewriting system R is not uniform terminating, then R is self-embedding.

Therefore, one way to show that R is uniform terminating is to show that R is not self-embedding. In fact, if there exists a simplification ordering [1] which can be used to show that R is uniform terminating, then R is nonself-embedding. It is conceivable that the nonself-embedding property might be decidable even though uniform termination is not. However, we have the following result.

Theorem 3.2. It is undecidable whether a term rewriting system is nonself-embedding.

Proof. Note that self-embedding is partially decidable since we can search through all derivations to show that a term rewriting system is self-embedding. However, nonself-embedding is undecidable (and therefore not partially decidable) by the following argument: We give a systematic method of constructing a term rewriting system $S_M$ from a Turing machine $M$ such that $S_M$ is self-embedding iff $M$ accepts blank tape. For a discussion of Turing machines and decidability, see [2]. Since the blank tape non-accepting problem is undecidable, so is the nonself-embedding problem. We give the construction for $S_M$ and show some properties of $S_M$ using some lemmas. The term rewriting system $S_M$ is constructed from another term rewriting system $R_M$, which we now define.

Given a Turing machine $M$, let $R_M$ be a term rewriting system having the following property: $M$ accepts blank tape iff there is a sequence $t_1, t_2, \ldots, t_n$ of terms such that, for all $i$, $1 \le i < n$, the pair $(t_i, t_{i+1})$ is an instance of a rule of $R_M$, and such that $t_1$ is START and $t_n$ is ACCEPT. We say $(t_i, t_{i+1})$ is an instance of $(r, s)$ if there is a substitution $\Theta$ replacing variables by terms, such that $t_i$ is $r\Theta$ and $t_{i+1}$ is $s\Theta$. Thus $\mathrm{START} \Rightarrow_{R_M}^* \mathrm{ACCEPT}$ by a derivation in which all replacements are at the 'top level'. It is easy to give an effective procedure to construct $R_M$ from $M$. One way to do this is to simulate a Turing machine using two stacks. For example, we may represent a binary string $01101\ldots$ by a term $g_0(g_1(g_1(g_0(g_1(\cdots(c)\cdots)))))$, where $c$ represents an infinite string of blanks. We may represent a configuration $\alpha q \beta$ of $M$, in which $\alpha$ is the string to the left of the head, $q$ is the current state, and $\beta$ is the string to the right of the head, including the currently scanned symbol, by a term $h(u, q, v)$ where $u$ is a term encoding the reverse of $\alpha$ and $v$ is a term encoding $\beta$. Then the moves of $M$ may be represented as rules. For example, if $M$, in state $q$, scanning a 1, prints zero, enters state $r$, and moves right, we have $h(u, q, g_1(v)) \Rightarrow h(g_0(u), r, v)$. By adding appropriate rules for START and ACCEPT and for moves to blank regions of tape, $R_M$ may easily be constructed. For example, it is necessary to add a rule $\mathrm{START} \Rightarrow h(c, q, c)$ to start $M$ on blank tape, where $q$ is the start state of $M$.
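The two-stack encoding can be made concrete with a short Python sketch. It covers only what is stated above: the string encoding, the configuration term $h(u, q, v)$, and the single example move (state $q$, scanning 1, print 0, enter $r$, move right). The function names `encode`, `config` and `move_right_on_1`, and the tuple representation of terms, are assumptions made for illustration.

```python
# Assumed representation: a term is a tuple (operator, arg1, ..., argk).

def encode(bits):
    """Encode a binary string such as "01101" as g0(g1(g1(g0(g1(c)))))."""
    term = ("c",)                        # c stands for the infinite run of blanks
    for bit in reversed(bits):
        term = ("g" + bit, term)
    return term

def config(left, state, right):
    """Encode the configuration left-state-right as h(u, q, v).

    u encodes the reverse of left, so its outermost symbol is the tape cell
    immediately to the left of the head; v encodes right, whose first symbol
    is the currently scanned one.
    """
    return ("h", encode(left[::-1]), (state,), encode(right))

def move_right_on_1(term, q="q", r="r"):
    """Apply h(u, q, g1(v)) => h(g0(u), r, v) at the top level, or return None."""
    if term[0] == "h" and term[2] == (q,) and term[3][0] == "g1":
        u, v = term[1], term[3][1]
        return ("h", ("g0", u), (r,), v)
    return None

before = config("01", "q", "101")        # tape 01101 with the head on the third cell
after = move_right_on_1(before)
print(after == config("010", "r", "01"))
```

Moving right pops the scanned symbol from the right stack and pushes the printed symbol onto the left stack, which is why reversing the left string makes each move a purely local change at the top of the two stacks.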

The system $S_M$ consists of the following rules. (We write $s \Rightarrow t$ if $(s, t)$ is in $S_M$.)

(a) $f_1(x, s(y), t_1) \Rightarrow f_1(x, y, t_2)$ for rules $t_1 \Rightarrow t_2$ in $R_M$.
(b) $f_2(x, s(y), t_1) \Rightarrow f_2(x, y, t_2)$ for rules $t_1 \Rightarrow t_2$ in $R_M$.
(c) $f_1(x, 0, \mathrm{ACCEPT}) \Rightarrow f_2(x, x, \mathrm{START})$.
(d) $f_2(x, 0, \mathrm{ACCEPT}) \Rightarrow f_1(x, x, \mathrm{START})$.

Here, ACCEPT and START are as above in the description of $R_M$. Intuitively, $S_M$ simulates $M$, keeping a counter in the first two arguments of $f_1$ and $f_2$. The first argument is a maximum value of the counter. The second argument keeps decreasing. If the second argument reaches zero when $M$ accepts, then the counter is reset, and the computation is repeated, but with $f_1$ replaced by $f_2$ and vice versa. The decreasing counter ensures that no self-embedding will occur unless the Turing machine accepts blank tape. However, if the Turing machine does accept blank tape, then there will be a derivation that loops, hence $S_M$ is self-embedding.

Lemma 3.3. If $M$ accepts blank tape, then $S_M$ is self-embedding.

Proof. Let $n$ be the number of moves required for $M$ to accept blank tape. Then the term $f_1(s^{n+1}(0), s^{n+1}(0), \mathrm{START})$ derives itself repeatedly, so $S_M$ is self-embedding. □
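The looping derivation can be traced on a toy instance. The Python sketch below replaces $R_M$ by a hypothetical two-rule ground system (START to MID, MID to ACCEPT) standing in for an accepting machine, and applies the four rule schemes of $S_M$ at the top level only. The names `numeral`, `step` and `TOY_RM` are ours; because the toy rules are ground, matching reduces to an equality test.

```python
# Assumed representation: a term is a tuple (operator, arg1, ..., argk).

def numeral(k):
    """Build the unary numeral s(s(...(0)...)) with k occurrences of s."""
    term = ("0",)
    for _ in range(k):
        term = ("s", term)
    return term

# Hypothetical stand-in for R_M: START reaches ACCEPT in exactly two steps.
TOY_RM = [(("START",), ("MID",)), (("MID",), ("ACCEPT",))]

def step(term, rm_rules):
    """One top-level S_M step on f_i(x, y, t); returns the next term or None."""
    f, x, y, t = term
    if y[0] == "s":                          # rules (a) and (b): count down
        for lhs, rhs in rm_rules:
            if lhs == t:
                return (f, x, y[1], rhs)
        return None
    if t == ("ACCEPT",):                     # rules (c) and (d): reset and swap
        other = "f2" if f == "f1" else "f1"
        return (other, x, x, ("START",))
    return None

start = ("f1", numeral(2), numeral(2), ("START",))
term, trace = start, [start]
for _ in range(6):
    term = step(term, TOY_RM)
    trace.append(term)
print(trace[-1] == start)
```

With the counter equal to the number of $R_M$ steps, the term passes through an $f_2$ phase and returns to itself after six steps. If the counter is too large, for example `numeral(3)`, ACCEPT is reached with a nonzero counter and no rule applies, which illustrates why the counter must match the length of the accepting computation.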

Lemma 3.4. Let $d_1(u)$ be the maximum depth of nesting of $f_1$ and $f_2$ together in $u$. Thus, if the operator of $u$ is $f_1$ or $f_2$, then $d_1(u)$ is one greater than the maximum value $d_1(v)$ for any top-level proper subterm $v$ of $u$. Otherwise, if $u$ is a constant, then $d_1(u)$ is 0. Otherwise, $d_1(u)$ is the maximum value $d_1(v)$ for any top-level proper subterm $v$ of $u$. Then, if $r \Rightarrow_{S_M} s$, then $d_1(r) = d_1(s)$.

Proof. All rules in $S_M$ preserve $d_1$. □

Lemma 3.5. Let $d_2(u)$ for term $u$ be defined as follows: If the top-level function symbol of $u$ is $f_1$ or $f_2$, then $d_2(u) = d_1(u)$. Otherwise, if $u$ is a constant, then $d_2(u) = 1$. Otherwise, $d_2(u)$ is one plus the maximum value $d_2(v)$ for $v$ a top-level proper subterm of $u$. Thus $d_2(u)$ includes the depth in $u$ of a term $v$ having maximum value $d_1(v)$. Then if $r \Rightarrow_{S_M} s$, then $d_2(r) = d_2(s)$.

Proof. All rules in $S_M$ preserve $d_2$. □
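The two measures are easy to compute, and doing so on instances of rule (c) shows the kind of check the two proofs rely on. The sketch below is a direct transcription of the definitions of $d_1$ and $d_2$ for ground terms written as tuples; the names `d1`, `d2`, `F_SYMBOLS` and the sample terms are assumptions made for illustration.

```python
# Assumed representation: a ground term is a tuple (operator, arg1, ..., argk).

F_SYMBOLS = {"f1", "f2"}

def d1(u):
    """Maximum nesting depth of f1 and f2 together in u."""
    args = u[1:]
    if u[0] in F_SYMBOLS:
        return 1 + max((d1(v) for v in args), default=0)
    if not args:                             # constant
        return 0
    return max(d1(v) for v in args)

def d2(u):
    """d1(u) if u is headed by f1 or f2; otherwise depth counted as in Lemma 3.5."""
    args = u[1:]
    if u[0] in F_SYMBOLS:
        return d1(u)
    if not args:                             # constant
        return 1
    return 1 + max(d2(v) for v in args)

zero = ("0",)
one, two = ("s", zero), ("s", ("s", zero))
lhs = ("f1", two, zero, ("ACCEPT",))         # instance of the left side of rule (c)
rhs = ("f2", two, two, ("START",))           # the corresponding right side
print(d1(lhs), d1(rhs), d2(lhs), d2(rhs))
print(d2(two), d2(one))
```

On these sample terms both measures agree on the two sides of rule (c), while $d_2$ of the counter argument drops by one when an $s$ is removed, which is the behaviour of rules (a) and (b) used in the proof of Lemma 3.6.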

Lemma 3.6. Suppose that $S_M$ is self-embedding. Then

$f_i(n, n, \mathrm{START}) \Rightarrow^* f_i(n, 0, \mathrm{ACCEPT})$

by a derivation which uses none of the last two rules of $S_M$.

Proof. Let $r$ and $s$ be terms such that $r$ has minimal depth, $r$ can be rewritten to $s$ by one or more rewrites in $S_M$, and $r \trianglelefteq s$. Now, both $r$ and $s$ must have $f_1$ or $f_2$ as the top-level operator, since all rules in $S_M$ have $f_1$ or $f_2$ at the top level. Also, $d_1(r) = d_1(s)$ by Lemma 3.4. Let $r$ be $f_i(t_1, t_2, t_3)$ and let $s$ be $f_i(u_1, u_2, u_3)$. We cannot have $r \trianglelefteq u_j$ for any $j$ since $d_1(u_j) < d_1(r)$. Thus $r$ cannot be embedded in a proper subterm of $s$, so the operators of $r$ and $s$ must be identical and $t_i \trianglelefteq u_i$ for $i = 1, 2, 3$. Assume without loss of generality that $r$ is $f_1(t_1, t_2, t_3)$ and that $s$ is $f_1(u_1, u_2, u_3)$. If $t_i$ can be rewritten to $u_i$ for all $i$ then $r$ and $s$ are not minimal satisfying the specified conditions. Thus at least one rewrite in going from $r$ to $s$ must be at the top level. Now, by Lemma 3.5, $d_2(r) = d_2(s)$. Also, all the rules of $S_M$ except the last two decrease $d_2$ of the second argument of $f_i$. Since $t_2 \trianglelefteq u_2$, $d_2(t_2) \le d_2(u_2)$. Thus one of the last two rules must be used at least once at the top level in rewriting $r$ to $s$. However, the operators of $r$ and $s$ are identical, so there must be an equal number of applications of each of the last two rules of $S_M$, and these applications must occur alternately. Thus the derivation from $r$ to $s$ must have a subderivation of the specified form. □

Proof of Theorem 3.2 (continued). We note that the construction of $S_M$ from $M$ is effective. Therefore, it suffices to show that $S_M$ is self-embedding iff $M$ accepts blank tape. By Lemma 3.3, if $M$ accepts blank tape, then $S_M$ is self-embedding. Conversely, suppose $S_M$ is self-embedding. Then, by Lemma 3.6,

$f_i(n, n, \mathrm{START}) \Rightarrow^* f_i(n, 0, \mathrm{ACCEPT})$

by a derivation which uses none of the last two rules of $S_M$. Therefore, $\mathrm{START} \Rightarrow_{R_M}^* \mathrm{ACCEPT}$ by a derivation in which all replacements are done at the top level. Therefore, by construction of $R_M$, $M$ accepts blank tape. □



Corollary 3.7. It is undecidable whether a term rewriting system R is looping, that is, whether there exists a term t that rewrites to itself using one or more applications of rules in R.

Proof. Similar to the above. In fact, $S_M$ is looping iff $M$ accepts blank tape. □

Corollary 3.8. The properties nonself-embedding, non-looping, and uniform terminating are not partially decidable, even for globally finite term rewriting systems.

Proof. The system $S_M$ is globally finite. Also, $S_M$ is uniform terminating iff it is nonself-embedding. This gives a simple alternate proof of the undecidability of uniform termination. Uniform termination is also not partially decidable: if it were, then we could partially decide nonself-embedding of $S_M$, hence self-embedding of $S_M$ would be decidable. □

4. Comments

The undecidability of nonself-embedding does not follow from the undecidability of uniform termination, since a term rewriting system can be self-embedding even if it is uniform terminating. However, if there exists a simplification ordering which can be used to show R uniform terminating, then R is nonself-embedding. We note that it is decidable, given a term rewriting system $R$ and a term $r$, whether there exists a term $s$ such that $r \trianglelefteq s$ and $r$ can be rewritten to $s$ in one or more applications of rules from $R$. This follows because if no such $s$ exists, no derivations from $r$ can be infinite, so the total number of terms derivable from $r$ is finite (by König's lemma).

References

[1] N. Dershowitz, Orderings for term-rewriting systems, Theoret. Comput. Sci. 17 (1982) 279-301.

[2] J. Hopcroft and J.D. Ullman, Introduction to Automata Theory, Languages, and Computation (Addison-Wesley, Reading, MA, 1979).

[3] G. Huet and D. Lankford, On the uniform halting problem for term rewriting systems, INRIA Tech. Rept. 283, 1978.

[4] D. Plaisted, A recursively defined ordering for proving termination of term-rewriting systems, Rept. No. 943, Department of Computer Science, University of Illinois at Urbana-Champaign, 1978.
