Arch. Math. Logic (1991) 30:409-441
Archive for Mathematical Logic
© Springer-Verlag 1991

Herbrand analyses

Wilfried Sieg
Department of Philosophy, Carnegie Mellon University, Pittsburgh, PA 15213, USA

This paper is dedicated to Kurt Schütte on the occasion of his 80th birthday

Received June 13, 1990

Abstract. Herbrand's Theorem, in the form of $\exists$-inversion lemmata for finitary and infinitary sequent calculi, is the crucial tool for the determination of the provably total function(al)s of a variety of theories. The theories are (second order extensions of) fragments of classical arithmetic; the classes of provably total functions include the elements of the Polynomial Hierarchy, the Grzegorczyk Hierarchy, and the extended Grzegorczyk Hierarchy $\mathcal{E}^\alpha$, $\alpha < \varepsilon_0$. A subsidiary aim of the paper is to show that the proof theoretic methods used here are distinguished by technical elegance, conceptual clarity, and wide-ranging applicability.

Introductory remarks

Statements $\psi$ of the form $(\exists y)\varphi y$ express a functional dependence of the quantified variable $y$ on the parameters occurring in $\psi$. In case $\psi$'s matrix is quantifier-free, the uniformity of formal proofs $D$ for such $\psi$ provides the basis for Herbrand analyses, i.e. the extraction of a term $t$ from $D$ and the generation of an associated proof $D^*$ of $\varphi t$. The extracted term $t$ reflects both the expressiveness of the term language and the formal structure of the given derivation.¹ Furthermore, if the basic terms in $D$ are computable, $t$ also represents a computation of a restricted sort. The very idea of computation as analyzed by Turing is, after all, tied to rule-governed, mechanical procedures. This is most sharply pointed out when the computable functions are characterized as those functions that can be systematically evaluated in "calculi" satisfying only general recursiveness conditions²; computations literally are derivations from this perspective. Even when logical and mathematical machinery extends the basic calculus to a formal theory $T$, the class of computable functions is not changed. This is the notion's absoluteness emphasized by Gödel in his remarks before the Princeton Bicentennial Conference. However, once such machinery is added, two questions arise immediately: Which computable functions can be shown in $T$ to be total? and In which subclass of computable functions lies a Skolem function for $(\forall x)(\exists y)\varphi xy$, $\varphi$ atomic, when the latter is provable in $T$?

1 I assume that we are concerned with "Herbrand theories", defined in Sect. 1.2. The proof of the $\exists$-inversion lemma for such theories will make it clear that and how both aspects are involved. Herbrand analyses can be carried out not only for numerical quantifiers, but also for function quantifiers; see [Feferman and Sieg]

The second issue is a concrete form of Kreisel's general question: What more than its truth have we recognized, when we have established a theorem in a formal theory? Interesting results have been obtained, e.g. for fragments of elementary number theory; they have been used mathematically to determine (crude) bounds and to carry out independence arguments. During the last few years, the converse of this task has also been attacked, namely to find for a given complexity class $\mathcal{C}$ a theory whose provably total³ functions are exactly the elements of $\mathcal{C}$. Here the hope is that relations between the theories might reveal connections between the complexity classes. I am considering theories $T$ of arithmetic and some second order extensions; that allows me to focus on the connection between the syntactic complexity of induction used in $T$-derivations and the computational complexity of $T$-provably total functions (and functionals). Clearly, induction is needed to prove existence conditions and recursion equations when introducing functions in a definitional extension of $T$; recursion is needed to functionally analyze applications of the induction principle in $T$-derivations. The first point involves only standard considerations, but for weak fragments of arithmetic they can be quite delicate. The second point is most naturally addressed by proof theoretic methods, since formal derivations are being analyzed and used as (data for) computations. It seems to me to be of considerable methodological interest to bring out the uniform character of the metamathematical proofs given here: at their heart is an Herbrand analysis within finitary and infinitary sequent calculi, expanded on occasion by (what amounts to) $\varepsilon$-terms and infinitary terms. The tools are, consequently, taken from the classical repertoire in proof theory.

The choice of theories is largely due to work in proof theory aiming for foundational reductions. After all, a "constructive" consistency proof for analysis, the classical theory of the continuum, was viewed by Hilbert and Bernays as the central programmatic goal. They chose as the frame for the formal development of analysis full second order arithmetic, allowing third order parameters. Supplement IV of Grundlagen der Mathematik II offers a detailed and quite beautiful presentation that can actually be given in a subsystem of second order arithmetic, namely the "restricted system with $\Pi^1_1$-comprehension" $(\Pi^1_1\text{-CA})\restriction$. This theory is obtained from a base theory (BT), a second order version of primitive recursive arithmetic, by adding the comprehension principle for $\Pi^1_1$-formulas.⁴ Note, and that is clearly the reason for considering $(\Pi^1_1\text{-CA})\restriction$ as "restricted", that the induction principle is available only for $\Pi^1_1$-formulas. As $(\Pi^1_1\text{-CA})\restriction$ is conservative with respect to the class $\mathcal{K}$ of negative arithmetic and $\Pi^0_2$-sentences over the intuitionistic theory $ID_{<\omega}(\mathcal{O})$ for the finite constructive number classes, we do have a foundational reduction for classical analysis; it provides an intuitionistically acceptable consistency proof and thus a constructive justification for a blatantly impredicative theory.⁵

2 Such a characterization was given by Hilbert and Bernays in "Grundlagen der Mathematik II", Supplement II; they introduced the concept of a function that can be evaluated according to rules ("regelrecht auswertbare Funktion"). The "Rekursivitätsbedingungen" (recursiveness conditions) are formulated within their general analysis of finitist operations as primitive recursive ones
3 Or provably recursive; compare the definition and discussion in [Girard, p. 231]
4 (BT) is introduced in Sect. 2.2. In general, subsystems of (CA) are obtained from (BT) by adding function existence principles and strengthening the induction principle

Analysis can be developed in much weaker theories that are actually conservative with respect to the same class $\mathcal{K}$ of formulas over intuitionistic number theory (HA).⁶ One such theory is the theory of arithmetic properties $(\Pi^0_\infty\text{-CA})\restriction$, obtained from (BT) by adding the comprehension principle for just arithmetic formulas. But we have reached here a limit for further global refinements, as a number of important mathematical theorems are actually equivalent to the arithmetic comprehension principle; for example, König's lemma and the statement that each bounded sequence of real numbers has a least upper bound. Clearly, particular arguments are carried out in proper parts of such theories. That pushes us to make further local refinements, serving as a basis for computational reductions. The systematic and historical context in which these considerations are embedded is, for once, not further discussed in Sect. 1. That section contains only a precise description of basic tools and examples of their use, namely a proof theoretic characterization of the classes $\mathcal{E}_n$ of the Grzegorczyk hierarchy, $n > 2$, and of classes in the polynomial hierarchy. The latter result is due to Buss, but the proof given in Sect. 1.3 is new and, I think, much more direct and perspicuous. Sections 2 and 3 are devoted to the investigation of (second order extensions of) fragments of primitive recursive and full classical arithmetic (Z). You will find there, in particular, (1) an informative and useful second proof theoretic characterization of the classes $\mathcal{E}_n$ via fragments of arithmetic with the $\Pi^0_2$-induction rule and (2) a characterization of the provably total function(al)s of analytic extensions of fragments of arithmetic. For the investigation of (2) I use the most versatile tool of modern proof theory: semi-formal systems as introduced by Schütte.⁷

1 Normality & bounds

To get at the computational content of number theoretic $\Pi^0_2$-sentences⁸ I use appropriate sequent calculi. Their normalizability guarantees bounds on the logical complexity of formulas occurring in derivations; the invertibility of the quantifiers $\forall$ and, with suitable restrictions, $\exists$ yields a functional analysis of the combination $\forall\exists$. In this section I describe the method and use it to prove (refinements of) one of the first results obtained in the pursuit of Hilbert's program. You will see that this is not "accidental". The program's aim, a finitist consistency proof for (standard) formal theories $P$, can be reformulated in a most informative way, as the consistency of $P$ is equivalent to the reflection principle for $P$, i.e.

5 These general philosophical issues are discussed in my paper "Relative consistency and accessible domains", Synthese 84(2), 1990
6 See, for example, Feferman (1977) or Friedman, Simpson, and Smith
7 The basic considerations of this paper were presented to the Colloquium Logicum (Kiel, June 4, 1988) and the Symposium on Proof Theory (Heidelberg, July 1, 1988) under the title "Proof Theory and Complexity"; some of the results were obtained in a manuscript "Derivations as Computations" and announced in an abstract for the 1989 European Summer Meeting of the ASL in Berlin
8 The classes of arithmetic formulas have their standard definitions; I call a formula existential iff it is in prenex normal form and has a purely existential prefix (the quantifiers may range over numbers or, when we are dealing with second order languages, also over functions)

\[ (\forall x)\,(\mathrm{Pr}_P(x, \ulcorner\varphi\urcorner) \Rightarrow \varphi). \]

$\mathrm{Pr}_P$ is the finitistically formulated proof predicate for $P$, $\varphi$ a finitist statement, and $\ulcorner\varphi\urcorner$ the corresponding formula in the language of $P$. A finitist consistency proof would allow one to transform any $P$-derivation of $\ulcorner\varphi\urcorner$ into a finitist proof of $\varphi$ and, in this way, make possible the elimination of "ideal elements" from proofs of "real (i.e. finitist) statements". Doesn't the normalization of proofs of finitist statements quite directly perform this elimination? Indeed, the subformula property of normal derivations guarantees just that.

1.1 Bounding logical complexity

The sequent calculus is taken here in the form given to it by Tait. Its underlying language $L$ is that of elementary arithmetic, expanded possibly to $L(\mathcal{F})$ by adding function symbols for a variety of classes $\mathcal{F}$ of (primitive) recursive functions. Formulas are built up from literals, i.e. from atomic and negated atomic formulas, with just the logical symbols $\&$, $\vee$, $\forall$, $\exists$. The remaining sentential connectives ($\neg$, $\rightarrow$, $\leftrightarrow$) have their ordinary definitions. The calculus allows the derivation of finite sets $\Gamma, \Delta, \ldots$ of formulas, semantically understood as the disjunction of their elements. The rules of the calculus are as follows:

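\[
\begin{array}{lll}
\text{LA} & \Gamma, \varphi, \neg\varphi & (\varphi \text{ atomic}) \\[6pt]
\& & \dfrac{\Gamma, \varphi \qquad \Gamma, \psi}{\Gamma, \varphi \,\&\, \psi} & \\[10pt]
\vee & \dfrac{\Gamma, \varphi}{\Gamma, \varphi \vee \psi} \qquad \dfrac{\Gamma, \psi}{\Gamma, \varphi \vee \psi} & \\[10pt]
\text{C(ut)} & \dfrac{\Gamma, \varphi \qquad \Gamma, \neg\varphi}{\Gamma} & \\[10pt]
\forall & \dfrac{\Gamma, \varphi a}{\Gamma, (\forall x)\varphi x} & (a \notin P(\Gamma)) \\[10pt]
\exists & \dfrac{\Gamma, \varphi t}{\Gamma, (\exists x)\varphi x} &
\end{array}
\]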

Here $a \in P(\Gamma)$ means that the parameter $a$ occurs in some formula contained in $\Gamma$; we will use $a \in P(t)$ [$P(\varphi)$, $P(D)$] for terms $t$ [formulas $\varphi$ and derivations $D$] with the same meaning. The theories for (parts of) arithmetic always contain the equality axioms $\mathrm{EA}_1$, i.e. $\Gamma, t = t$, and $\mathrm{EA}_2$, i.e. $\Gamma, s \neq t, \neg\varphi s, \varphi t$ for atomic $\varphi$. To obtain full normalization we use at first also all those sequents as axioms that are obtained from $\mathrm{EA}_1$ and $\mathrm{EA}_2$ by C with one of the p.f.s (principal formulas)⁹ of $\mathrm{EA}_1$ and $\mathrm{EA}_2$ as cut-formula.

9 Unexplained standard terminology is always taken in the sense of Schwichtenberg (1977)


Any derivation in this calculus can be turned into a derivation of the same sequent, but without using the cut rule C. To do that one first establishes inversion lemmata for $\&$ and $\forall$. The $\&$-inversion lemma asserts: if $D$ is a derivation of $\Gamma, \varphi_0 \,\&\, \varphi_1$, then there are derivations $D_i$ of $\Gamma, \varphi_i$ with $|D_i| \le |D|$ and $\rho(D_i) \le \rho(D)$, $i \le 1$. Here $D, D_0, D_1, \ldots$ range over derivations; $|D|$ is the length of $D$; $\rho(D)$ is the cut-rank of $D$, i.e. $\sup\{|\varphi| + 1 : \varphi \text{ is a cut-formula in } D\}$, where $|\varphi|$ is the length of the formula $\varphi$. The latter complexity measure is defined recursively by $|\varphi| = |\neg\varphi| = 0$ in case $\varphi$ is atomic, $|\varphi_0 \vee \varphi_1| = |\varphi_0 \,\&\, \varphi_1| = \sup\{|\varphi_i| + 1 : i \le 1\}$, and $|(\forall x)\varphi| = |(\exists x)\varphi| = |\varphi| + 1$. Derivations with cut-rank 0 are called normal or cut-free, those with cut-rank 1 quasi-normal. The universal quantifier rule is invertible without any restriction.

1.1.1 $\forall$-Inversion. If $D$ is a derivation of $\Gamma, (\forall x)\varphi x$, then there is a derivation $E$ of $\Gamma, \varphi c$ with $|E| \le |D|$ and $\rho(E) \le \rho(D)$.

The parameter $c$ can either be chosen to be new for the given derivation $D$, i.e. $c \notin P(D)$, or we can make a general assumption that $\forall$-inferences have unique eigenparameters; in that case derivations are always taken to be normal w.r.t. eigenparameters. The inversion lemmata are crucial for the proof of the reduction lemma: if $D_0$ is a derivation of $\Gamma, \varphi$ and $D_1$ of $\Gamma, \neg\varphi$ with $\rho(D_i) \le |\varphi|$, $i \le 1$, then there is a derivation $D$ of $\Gamma$ with $|D| \le |D_0| + |D_1|$ and $\rho(D) \le |\varphi|$. The reduction lemma makes the proof of the normalization theorem almost immediate.

1.1.2 Normalization Theorem. If $D$ is a derivation of $\Gamma$ with $\rho(D) = n$, then there is a derivation $E$ of $\Gamma$ with $\rho(E) = 0$ and $|E| \le 2_n^{|D|}$.
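The complexity measures entering this theorem can be made concrete in a few lines. The following is only an illustrative sketch: the formula representation (`Lit`, `Bin`, `Quant`) is hypothetical, and the iterated exponential is read with the usual convention $2_0^k = k$, $2_{n+1}^k = 2^{2_n^k}$, which is an assumption about the notation rather than something stated in the text.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Lit:          # atomic or negated atomic formula
    name: str

@dataclass(frozen=True)
class Bin:          # conjunction (&) or disjunction (v)
    op: str
    left: "Formula"
    right: "Formula"

@dataclass(frozen=True)
class Quant:        # universal or existential quantifier
    kind: str
    var: str
    body: "Formula"

Formula = Union[Lit, Bin, Quant]

def length(phi: Formula) -> int:
    """|phi|: 0 for literals, sup + 1 for & and v, + 1 for quantifiers."""
    if isinstance(phi, Lit):
        return 0
    if isinstance(phi, Bin):
        return max(length(phi.left), length(phi.right)) + 1
    return length(phi.body) + 1

def cut_rank(cut_formulas) -> int:
    """rho(D) = sup{|phi| + 1 : phi a cut-formula of D}; 0 if D has no cuts."""
    return max((length(phi) + 1 for phi in cut_formulas), default=0)

def tower(n: int, k: int) -> int:
    """2_n^k under the assumed convention 2_0^k = k, 2_{n+1}^k = 2 ** (2_n^k)."""
    for _ in range(n):
        k = 2 ** k
    return k

# (forall x)(x < y  v  (exists z) z = x), a hypothetical cut-formula
phi = Quant("forall", "x",
            Bin("or", Lit("x < y"), Quant("exists", "z", Lit("z = x"))))
print(length(phi), cut_rank([phi]), tower(2, 3))
```

Even for small cut-rank the bound grows very quickly, which is why the later sections care about how much arithmetic is needed to formalize the normalization argument.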

Normal derivations satisfy the subformula property; thus, the logical complexity of formulas in such derivations is bounded by the complexity of formulas in the endsequent. Derivations in the calculus with just $\mathrm{EA}_1$ and $\mathrm{EA}_2$ (i.e., without requiring closure under cut as we did above) can be quasi-normalized. When extending the purely logical framework by universal axioms we choose not to close under cut, but rather employ quasi-normal derivations. More precisely, universal axioms $\varphi$ are represented as follows: first, $\varphi$'s matrix is brought into conjunctive normal form, so that $\varphi = (\forall x)(\varphi_0 x \,\&\, \ldots \,\&\, \varphi_n x)$, where each $\varphi_i x$ is a disjunction of literals $\varphi_{ij} x$, $j \le n_i$, $i \le n$; then we take as axioms the sequents $\Gamma \cup \{\varphi_{ij} t : j \le n_i\}$ for each $i \le n$, for each term $t$, and any finite set $\Gamma$.¹⁰ So the basic axioms of elementary arithmetic, for example, are given by: $\mathrm{BA}_1$: $\Gamma, 0 \neq s'$; $\mathrm{BA}_2$: $\Gamma, s' \neq t', s = t$; $\mathrm{BA}_3$: $\Gamma, s + 0 = s$ and $\Gamma, s + t' = (s + t)'$; $\mathrm{BA}_4$: $\Gamma, s \cdot 0 = 0$ and $\Gamma, s \cdot t' = (s \cdot t) + s$. Finally, we have $\mathrm{BA}_5$ consisting of the four sequents:

\[ \Gamma, \neg\, s < 0; \qquad \Gamma, \neg\, s < t', s < t, s = t; \qquad \Gamma, \neg\, s < t, s < t'; \qquad \text{and} \quad \Gamma, s \neq t, s < t'. \]

The last three sequents represent $(\forall x)(\forall y)(x < y' \leftrightarrow (x < y \vee x = y))$. Such routine reformulations will not be carried out; for example, we write $\mathrm{BA}_5$: $\Gamma, s < t' \leftrightarrow (s < t \vee s = t)$. Additional axioms for primitive recursive functions in $\mathcal{F}$ consist of all instances of their defining equations. If $T(\mathcal{F})$ is any such theory, it is clear how to define $T(\mathcal{F})$-derivation and $T(\mathcal{F}) \vdash \Delta$. For the latter we also write $\vdash_{T(\mathcal{F})} \Delta$. The argument for the normalization theorem can be adapted to yield:

10 Similarly for statements with finitely many initial universal quantifiers. I call such axiomatizations Post-systems, following (almost) [Girard]


1.1.3 T-Normalization. Let $D$ be a $T(\mathcal{F})$-derivation of $\Gamma$; then there is a quasi-normal $T(\mathcal{F})$-derivation $E$ of $\Gamma$ with $|E| \le 2_m^{|D|}$ and $m = \rho(D) - 1$.¹¹

Quasi-normal derivations do not have the full subformula property; but that is not needed for significant uses of normalization theorems. What is important is the bounding of the logical complexity of formulas occurring in a derivation, and that is still achieved: any formula in a quasi-normal $T(\mathcal{F})$-derivation of $\Gamma$ is either atomic or a subformula of some element of $\Gamma$. As a matter of fact, even when the induction principle $\Phi$-IA

\[ \varphi 0 \;\&\; (\forall x < a)(\varphi x \rightarrow \varphi x') \rightarrow \varphi a \]

for a class of formulas $\Phi$ is added, a bounding can be obtained by considering quasi-normal derivations supporting $\vdash_{T(\mathcal{F})} [\neg\Phi\text{-IA}], \varphi$.¹² It is, however, more convenient to consider an equivalent formalization of parts of arithmetic using the $\Phi$-induction rule¹³ $\Phi$-IR*:

\[ \dfrac{\Gamma, \varphi 0 \qquad \Gamma, \neg\varphi a, \varphi a'}{\Gamma, \varphi t} \]

Here $a \notin P(\Gamma \cup \{\varphi t\})$; $t$ is a term and $\varphi t$ is contained in the class $\Phi$ of formulas. The theory obtained from (an extension of) elementary arithmetic $T(\mathcal{F})$ by adding this induction rule is denoted by $(\Phi(\mathcal{F})\text{-IA})$. For derivations in such systems we employ the even weaker notion of I-normalization and formulate an appropriate normalization theorem: all derivations can be transformed into I-normal ones.¹⁴

1.1.4 Definition. (1) A cut with cut-formula $\varphi$ is called an I-cut iff one of its premises is the conclusion of the induction rule with p.f. $\varphi$ or $\neg\varphi$.

(2) A derivation is called I-normal iff all its cuts are either I-cuts or have atomic cut-formulas.

Again, the argument for the normalization theorem is readily adapted to allow the transformation of derivations into I-normal ones. Note that for this theorem $\rho$ is modified so as to take into account only the complexity of cuts that are not I-cuts.

1.1.5 I-Normalization. If $D$ is a derivation of $\Gamma$ in $(\Phi(\mathcal{F})\text{-IA})$, then there is an I-normal derivation $E$ of $\Gamma$ in $(\Phi(\mathcal{F})\text{-IA})$ with $|E| \le 2_m^{|D|}$, $m = \rho(D) - 1$.

The complexity of formulas in I-normal derivations is bounded as follows:

1.1.6 Corollary. Let $\Phi$ and $\Psi$ be classes of formulas closed under substitution; if $D$ is an I-normal derivation of $\Gamma$ in $(\Phi(\mathcal{F})\text{-IA})$ and $\Gamma$ is contained in $\Psi$, then any formula in $D$ is either atomic or a subformula of an element in $\Phi \cup \{\neg\varphi : \varphi \in \Phi\} \cup \Psi$.

11 One can sharpen this result to obtain $T$-derivations with only atomic cuts whose cut-formulas are in addition p.f.'s in some axiom sequent of $T$; in that case $m = \rho(D)$
12 $[\neg\Phi\text{-IA}], \varphi$ indicates a sequent consisting of $\varphi$ and finitely many negated instances and instantiations of $\Phi$-IA. Similar notation will be used below for other (schematic) principles
13 This form of the induction rule is different from the restricted form introduced by Parsons and used in my "Fragments of Arithmetic"; see p. 40 and note 6 of that paper. That form requires that the formulas in $\Gamma$ are also in the class $\Phi$; we denote it by $\Phi$-IR. If $\Phi$ consists of all quantifier-free formulas, then the two formulations are equivalent
14 This notion is related to Takeuti's concept of free cut free derivation, i.e., derivations that are free of free cuts

Now I want to describe the crucial steps in a finitist consistency proof for the fragment $(QF(\mathcal{F})\text{-IA})$ of elementary arithmetic (Z), where $\mathcal{F}$ is a finite set of primitive recursive functions. [$QF(\mathcal{F})$ consists of all quantifier-free formulas in $L(\mathcal{F})$.] This fragment was shown consistent by Ackermann, von Neumann, and Herbrand. Gentzen exploited the possibility of bounding the complexity of formulas in normal derivations to re-establish this result, but one can do this also by using the bounding in I-normal derivations. That feature allows the proof of formal soundness by means of a partial truth definition¹⁵, based on a valuation function for terms. The latter is used to recognize the truth of the endsequent of any I-normal derivation by induction on its length in a suitable meta-theory. These considerations, including the proof of the normalization theorem, can be carried out for quantifier-free $\Phi$ most directly in the first-order extension $(\Sigma^0_1\text{-IA})$ of primitive recursive arithmetic (PRA). $(\Sigma^0_1\text{-IA})$ is shown in Sect. 2.1 to be a conservative extension of (PRA) for $\Pi^0_2$-sentences in the following sense: if $(\Sigma^0_1\text{-IA})$ proves $(\forall x)(\exists y)Rxy$ and $R$ is quantifier-free, then there is a primitive recursive function $f$ such that (PRA) proves $Raf(a)$. As this conservation result is established in (PRA), we have proved the reflection principle and thus the consistency for $(QF(\mathcal{F})\text{-IA})$ in (PRA). (PRA) is certainly a part of finitist mathematics; so we have established the consistency of $(QF(\mathcal{F})\text{-IA})$ finitistically.
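To make the role of the valuation function tangible, here is a minimal sketch of a term evaluator and of the induced truth predicate for sequents of literals. The tuple encoding of terms and literals is purely hypothetical, and only the symbols 0, successor, + and · with the relations = and < are covered; the sketch illustrates the idea of a partial truth definition, not the formalized version used in the paper.

```python
def val(term, env):
    """Value of a term built from 0, successor, + and * under an assignment env."""
    tag = term[0]
    if tag == "zero":
        return 0
    if tag == "var":
        return env[term[1]]
    if tag == "succ":
        return val(term[1], env) + 1
    if tag == "add":
        return val(term[1], env) + val(term[2], env)
    if tag == "mul":
        return val(term[1], env) * val(term[2], env)
    raise ValueError(f"unknown term constructor: {tag}")

def true_literal(lit, env):
    """Truth of a literal (positive?, relation, s, t); relation is '=' or '<'."""
    positive, rel, s, t = lit
    a, b = val(s, env), val(t, env)
    holds = (a == b) if rel == "=" else (a < b)
    return holds if positive else not holds

def true_sequent(sequent, env):
    """A sequent is read as the disjunction of its elements."""
    return any(true_literal(lit, env) for lit in sequent)

# Instances of BA3 (s + t' = (s + t)') and BA1 (0 != s') with s := a, t := b
a, b = ("var", "a"), ("var", "b")
ba3 = [(True, "=", ("add", a, ("succ", b)), ("succ", ("add", a, b)))]
ba1 = [(False, "=", ("zero",), ("succ", a))]
ok = all(true_sequent(seq, {"a": m, "b": n})
         for seq in (ba3, ba1) for m in range(6) for n in range(6))
print(ok)
```

A soundness proof then proceeds by induction on the derivation, checking that every rule preserves truth under all assignments; the bound on formula complexity is what keeps the truth predicate definable.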

I want to sharpen and generalize these considerations; to do that I remind the reader of an elegant characterization of the G-hierarchy, G for Grzegorczyk, that is due to Ritchie and based on the following sequence of number theoretic functions:

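\[
\begin{aligned}
A_0(x, y) &= y' \\
A_{n+1}(x, 0) &= \begin{cases} x & \text{if } n = 0 \\ 0 & \text{if } n = 1 \\ 1 & \text{if } 2 \le n \end{cases} \\
A_{n+1}(x, y + 1) &= A_n(x, A_{n+1}(x, y)).
\end{aligned}
\]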

$A_n$ denotes $\lambda xy.A_n(x, y)$, the $n$-th branch of the Ackermann function $\lambda nxy.A_n(x, y)$; $A_1$, $A_2$, and $A_3$ are addition, multiplication, and exponentiation respectively. The $n$-th G-class $\mathcal{E}_n$ is the smallest class of number-theoretic functions that contains $\lambda x.0$, $\lambda x.x'$, $A_m$ with $m \le n$, and that is closed under explicit definition and bounded recursion; $\mathcal{E}_3$ coincides with the class $\mathcal{E}$ of elementary functions in Kalmár's sense. We consider formal theories $(\Delta_0(\mathcal{A}_n)\text{-IA})$, $3 \le n$, whose languages expand that of elementary arithmetic by function symbols for the elements of $\mathcal{A}_n := \{A_m : 3 \le m \le n\}$. $\Delta_0(\mathcal{F})$-formulas of any language $L(\mathcal{F})$ considered here are obtained from literals by closing under conjunction, disjunction, and universally as well as existentially bounded quantification.¹⁶ The set of quantifier-free formulas of $L(\mathcal{F})$ is denoted by $QF(\mathcal{F})$.
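The recursion for the branches $A_n$ can be checked directly on small arguments. The sketch below is an illustration only; the base values $x$, $0$, $1$ are the ones forced by the statement that $A_1$, $A_2$, $A_3$ are addition, multiplication, and exponentiation.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def A(n: int, x: int, y: int) -> int:
    """n-th branch of the Ackermann function in the form given above."""
    if n == 0:
        return y + 1                        # A_0(x, y) = y'
    if y == 0:
        # base values of A_n(x, 0): x for n = 1, 0 for n = 2, 1 for n >= 3
        return x if n == 1 else 0 if n == 2 else 1
    return A(n - 1, x, A(n, x, y - 1))      # A_n(x, y + 1) = A_{n-1}(x, A_n(x, y))

small = range(4)
print(all(A(1, x, y) == x + y for x in small for y in small),
      all(A(2, x, y) == x * y for x in small for y in small),
      all(A(3, x, y) == x ** y for x in small for y in small))
```

The next branch $A_4$ is iterated exponentiation, e.g. `A(4, 2, 3)` evaluates $2^{2^2}$; arguments must stay tiny, since the values (and the recursion depth of this naive evaluator) grow extremely fast.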

All functions in $\mathcal{E}_n$ can be introduced in $(\Delta_0(\mathcal{A}_n)\text{-IA})$, $3 \le n$. The considerations are particularly simple if we use an equivalent characterization of $\mathcal{E}_n$ as the smallest class of number-theoretic functions that contains $\lambda x.0$, $\lambda x.x'$, $\lambda xy.x \dot{-} y$, $A_m$ with $m \le n$, and that is closed under explicit definition and bounded search (the bounded $\mu$-operator). The functions characterized in this way can be bounded by functions that are denoted by terms of the language and whose graphs are definable by $\Delta_0(\mathcal{A}_n)$-formulas. From these facts one can easily infer the first part of the next lemma:

15 I.e., a truth definition for formulas of bounded complexity
16 The quantifiers are bounded by arbitrary terms of the language; but note, they are just abbreviations, not basic logical symbols

1.1.7 Lemma. (i) $(\Delta_0(\mathcal{E}_n)\text{-IA})$ is a definitional extension of $(\Delta_0(\mathcal{A}_n)\text{-IA})$; (ii) $(QF(\mathcal{E}_n)\text{-IA})$ is equivalent to $(\Delta_0(\mathcal{E}_n)\text{-IA})$.

The second part of 1.1.7 expresses that, in $(QF(\mathcal{E}_n)\text{-IA})$, every formula in $\Delta_0(\mathcal{E}_n)$ is equivalent to a quantifier-free formula; we have a "bounded quantifier elimination lemma". But what is the generalization of the fact concerning $(QF(\mathcal{F})\text{-IA})$? It is the following theorem, of which the earlier fact is indeed a very special case:

1.1.8 Theorem. $(\Delta_0(\mathcal{A}_{n+1})\text{-IA})$ proves the consistency of $(\Delta_0(\mathcal{A}_n)\text{-IA})$, for $3 \le n$.

To establish the theorem it is only necessary, thanks to Lemma 1.1.7, to establish the consistency of $(QF(\mathcal{E}_n)\text{-IA})$ in $(\Delta_0(\mathcal{E}_{n+1})\text{-IA})$; but that can be done in complete analogy to the argument above, exploiting two crucial facts: (i) $\mathcal{E}_{n+1}$ contains a valuation function for $\mathcal{E}_n$ and thus a partial truth definition for quantifier-free formulas, and (ii) the necessary proof theoretic considerations can be formalized in $(\Delta_0(\mathcal{E}_4)\text{-IA})$.

1.2 Bounding existential quantifiers

We saw that all functions of $\mathcal{E}_n$ can be introduced in $(\Delta_0(\mathcal{A}_n)\text{-IA})$. This is the first step in answering the question: are the provably total functions of $(\Delta_0(\mathcal{A}_n)\text{-IA})$ exactly the elements of the $n$-th class of the G-hierarchy? (Here, as above, we always have $3 \le n$.) The second step for giving a positive answer to this question, the functional analysis of $\Pi^0_2$-sentences, is the crucial proof theoretic one and is based on a form of Herbrand's Theorem for $\Sigma^0_1$-formulas.

1.2.1 Herbrand's Theorem. Let $\Gamma = \{\varphi_0, \ldots, \varphi_n\}$ contain only purely existential formulas; if $D$ is a derivation of $\Gamma$, then there is a quasi-normal derivation of $\Delta_0, \ldots, \Delta_n$, where $\Delta_j = \{\varphi_j^i : i \le n_j \text{ and } \varphi_j^i \text{ is an instance of the matrix of } \varphi_j\}$, $j \le n$. The terms occurring in these instances are built up from terms occurring in $D$.

For applications a corollary is most useful; it expresses the restricted invertibility of $\exists$.

1.2.2 Corollary. Let $\Gamma$ contain only purely existential formulas and let $\varphi$ be quantifier-free; if $D$ is a derivation of $\Gamma, (\exists x_0)\ldots(\exists x_n)\varphi$, then there are sequences of terms $t^i_0, \ldots, t^i_n$, $i \le p$, and a quasi-normal derivation of $\Gamma, \varphi[t^0_0, \ldots, t^0_n], \ldots, \varphi[t^p_0, \ldots, t^p_n]$.
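A small propositional check may help to see why several instances are needed in general. The example formula $(\exists x)(\neg P(x) \vee P(f(x)))$, the constant $c$, and the symbols $P$, $f$ are hypothetical illustrations, not taken from the paper: the single instance at $c$ is not valid, but the disjunction of the instances at $c$ and $f(c)$ is.

```python
from itertools import product

def instance(p_t: bool, p_ft: bool) -> bool:
    """Matrix instance  not P(t) or P(f(t)),  given truth values of P(t), P(f(t))."""
    return (not p_t) or p_ft

def valid(disjunction) -> bool:
    """Check a propositional formula over the atoms P(c), P(f c), P(f f c)."""
    return all(disjunction(pc, pfc, pffc)
               for pc, pfc, pffc in product([False, True], repeat=3))

def one_instance(pc, pfc, pffc):
    return instance(pc, pfc)                         # term c only

def two_instances(pc, pfc, pffc):
    return instance(pc, pfc) or instance(pfc, pffc)  # terms c and f(c)

print(valid(one_instance), valid(two_instances))
```

In the notation of the corollary this corresponds to $p = 1$ with $t^0_0 = c$ and $t^1_0 = f(c)$. The $\exists$-inversion lemma below shows how, in Herbrand theories, such a disjunction can be collapsed into a single instance by definition by cases.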

The proof of the corollary can be given directly by induction on quasi-normal derivations; indeed, that argument can be extended to the case of I-normal derivations. We are going to do that now in a more special setting. Consider theories $T(\mathcal{F})$ of the form $(QF(\mathcal{F})\text{-IA})$, such that $T(\mathcal{F})$ and $\mathcal{F}$ satisfy the following two conditions:

(H.1) $\mathcal{F}$ is provably closed under explicit definitions and definition by cases (thus under Boolean operations, max, min);

(H.2) $\mathcal{F}$ is provably closed under bounded search, i.e., for any $\varphi$ in $QF(\mathcal{F})$ there is an $h$ in $\mathcal{F}$ such that $T(\mathcal{F})$ proves $(\exists y \le x)\varphi y \leftrightarrow \varphi h(x)$.

Such $T(\mathcal{F})$ are called Herbrand theories; and we have:


1.2.3 $\exists$-Inversion. Let $T(\mathcal{F})$ be an Herbrand theory, let $\Gamma$ contain only purely existential formulas, and let $\psi$ be quantifier-free; if $D$ is a $T(\mathcal{F})$-derivation of $\Gamma, (\exists x)\psi x$, then there is a term $t^*$ and a(n I-normal) $T(\mathcal{F})$-derivation $D^*$ of $\Gamma, \psi t^*$.

Proof (by induction on I-normal $T(\mathcal{F})$-derivations). I focus on the central step in the argument, concerned with the induction rule. $D$ must end in an inference of the form

\[ \dfrac{\Gamma, \varphi 0, (\exists x)\psi x \qquad \Gamma, \neg\varphi a, \varphi a', (\exists x)\psi x}{\Gamma, \varphi t, (\exists x)\psi x} \]

The induction hypothesis, applied to the derivations $D_0$ and $D_a$ leading to the premises of the inference, yields terms $r$ and $s[a]$, possibly containing other parameters as well, and derivations $D_0^*$ of (1) $\Gamma, \varphi 0, \psi r$ and $D_a^*$ of (2) $\Gamma, \neg\varphi a, \varphi a', \psi s[a]$. $T(\mathcal{F})$ obviously proves $\neg\varphi 0, \varphi t, (\exists x < t)(\varphi x \,\&\, \neg\varphi x')$ and, using condition (H.2), both

(3) $\neg\varphi 0, \varphi t, \varphi h(t)$ and (4) $\neg\varphi 0, \varphi t, \neg\varphi h(t)'$.

From (2), replacing the parameter $a$ by $h(t)$, one obtains

(5) $\Gamma, \neg\varphi h(t), \varphi h(t)', \psi s[h(t)]$.

Cutting (5) successively with (3), (4), and (1) yields a derivation of

(6) $\Gamma, \varphi t, \psi r, \psi s[h(t)]$.

Using (H.1), definition by cases, we can define an $f$ in $\mathcal{F}$ such that $\Gamma, \varphi t, \psi f(t)$ is provable in $T(\mathcal{F})$. Q.E.D.
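The computational content of this step can be sketched as follows, with decidable Python predicates standing in for the quantifier-free formulas $\varphi$ and $\psi$ and with $\Gamma$ taken to be empty. The names `bounded_search` and `extract`, and the concrete example, are hypothetical; the search runs up to the bound $t$ inclusive, which differs harmlessly from the strict bound used in the proof.

```python
from typing import Callable

Pred = Callable[[int], bool]

def bounded_search(pred: Pred, bound: int) -> int:
    """Witness function in the spirit of (H.2): the least y <= bound with pred(y)
    if there is one, and bound otherwise, so that
    (exists y <= bound) pred(y)  <->  pred(bounded_search(pred, bound))."""
    for y in range(bound + 1):
        if pred(y):
            return y
    return bound

def extract(phi: Pred, psi: Pred, r: int, s: Callable[[int], int]) -> Callable[[int], int]:
    """Definition by cases in the spirit of (H.1): f(t) = r if psi(r), else s(h(t)),
    where h(t) searches for a point at which the induction step for phi fails."""
    def f(t: int) -> int:
        if psi(r):
            return r
        h_t = bounded_search(lambda x: phi(x) and not phi(x + 1), t)
        return s(h_t)
    return f

# Hypothetical premises: (1) phi(0) or psi(r), (2) not phi(a) or phi(a + 1) or psi(s(a))
phi = lambda a: a * a < 20
psi = lambda x: x * x >= 20
r, s = 0, (lambda a: a + 1)
f = extract(phi, psi, r, s)
print(f(10), all(phi(t) or psi(f(t)) for t in range(50)))
```

For $t = 10$ the search finds $h(t) = 4$, the last point where $\varphi$ holds, and the extracted witness is $s(4) = 5$. The case distinction on $\psi r$ is what turns the two-element disjunction in (6) into a single instance.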

For Herbrand theories I can give a straightforward answer to the question concerning Skolem functions.

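1.2.4 Term extraction. Let $T(\mathcal{F})$ be an Herbrand theory and let $\varphi$ be quantifier-free; if $T(\mathcal{F})$ proves $(\forall x)(\exists y)\varphi xy$, then there is a term $t[a]$ in $L(\mathcal{F})$ such that $T(\mathcal{F})$ proves $(\forall x)\varphi x t[x]$. $\lambda x.t[x]$ denotes a function in $\mathcal{F}$.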

The proof of this theorem is immediate by $\forall$-inversion and subsequent $\exists$-inversion. A few remarks are in order.

1.2.5 Remarks. (1) The results are insensitive to extensions of the theories by $\Pi^0_1$-sentences. Thus the Theorem and Corollary below hold not only for the theories explicitly described, but also for any of their $\Pi^0_1$-extensions. All the considerations can be carried out for standardly formulated open theories when the induction principle for quantifier-free formulas is captured by an open axiom schema.

(2) We will, in the investigation of bounded arithmetic, modify the notion of Herbrand theory by using a different induction principle and a corresponding "strictly bounded search".

(3) We did not worry about parameters occurring in terms; if $t[a]$ contained additional parameters, we would simply replace them in $t[a]$ and the derivation of $(\forall x)\varphi x t[x]$ by a constant, say 0. That obviously suffices to see that $\mathcal{F}$ contains a Skolem function for $(\forall x)(\exists y)\varphi xy$; if, however, one is concerned to obtain good bounds, then the choice of appropriate (replacement) terms becomes one of the central issues.

Now let us return to the special theories we treated above:


1.2.6 Theorem. The provably total functions of $(\Delta_0(\mathcal{A}_n)\text{-IA})$, $3 \le n$, are exactly the elements of $\mathcal{E}_n$.

$(\Delta_0(\mathcal{A}_3)\text{-IA})$ is $I\Delta_0 + \exp$ in the terminology of [Paris, Wilkie]; $(QF(\mathcal{PR})\text{-IA})$ is primitive recursive arithmetic (PRA) expanded by first-order logic and equivalent to $\bigcup\{(\Delta_0(\mathcal{A}_n)\text{-IA}) : n \in \mathbb{N}\}$. Thus we have as a corollary the well-known facts:

1.2.7 Corollary. (i) The provably total functions of $I\Delta_0 + \exp$ are exactly the elementary ones; (ii) the provably total functions of $(QF(\mathcal{PR})\text{-IA})$ and (PRA) are exactly the primitive recursive ones.

The argument for the theorem is brief. We know that $(QF(\mathcal{E}_n)\text{-IA})$ and $(\Delta_0(\mathcal{A}_n)\text{-IA})$ are equivalent and thus have the same class of provably total functions; $\mathcal{E}_n$ is exactly that class by the term extraction lemma, as $(QF(\mathcal{E}_n)\text{-IA})$ is an Herbrand theory. The theorem is not uninteresting; but I presented the argument mainly to point to a schema for the determination of the class of provably total functions of some theory $T(\mathcal{F})$. Instead of directly analyzing $T(\mathcal{F})$, here $(\Delta_0(\mathcal{A}_n)\text{-IA})$, we reduce the theory proof theoretically via a suitable definitional extension $T(\mathcal{G})$, here $(\Delta_0(\mathcal{E}_n)\text{-IA})$, to an Herbrand theory $T^*(\mathcal{G})$, here $(QF(\mathcal{E}_n)\text{-IA})$. The proof theoretic reduction preserves at least $\Pi^0_2$-sentences, in our special case even all sentences of $L(\mathcal{F})$. The crucial proof theoretic step is easy here, as only first order logic is added to essentially quantifier-free theories: their mathematical substance is untouched.¹⁷

1.3 Bounded arithmetic

The schema of the above considerations can be used to prove Buss' theorem, that the provably total functions of the theory S~ of bounded arithmetic are exactly the polynomial time computable functions. 18 I will give a perspicuous argument for what Buss considered to be the difficult part of the theorem, namely that the provably total functions of S~ are contained in ~ ; in particular, witnessing functions will not be used. L(~B), the language of bounded arithmetic, is the language of elementary arithmetic expanded by function symbols I" I, ~, and ~ , where [al yields the length of the binary representation of a,~ is the shift-right- function, and a ~ b is 2 lal Ibl. The language L(~3) is obtained from L(~B) by adding function symbols for each element of ~. The latter class of functions is defined inductively as the smallest class of functions that contains certain initial functions (0, ', .~, 2., )~, choice 19) and that is closed under composition and bounded iteration; a function f is said to be defined by iteration from g and h with time bound p and space bound q (p and q suitable polynomials 2~ iffthe following holds: ifz is defined by

$$\tau(\mathbf{x}, 0) = g(\mathbf{x})$$

$$\tau(\mathbf{x}, y') = h(\mathbf{x}, y, \tau(\mathbf{x}, y)),$$

17 A reader unfamiliar with bounded arithmetic might wish to skip Sect. 1.3 18 They were already used in [Buchholz and Sieg] to show that ~ is also the class of provably recursive functions of a certain theory of binary trees introduced by [Ferreira] 19 2, X, and choice are the shift-left-function, the characteristic function of <, and the definition by cases function, respectively 20 A polynomial is called suitable if it has only nonnegative integers as coefficients; thus suitable polynomials are monotonically increasing


then we must have

$$(\forall y \le p(|\mathbf{x}|))\ |\tau(\mathbf{x}, y)| \le q(|\mathbf{x}|)$$

and

$$f(\mathbf{x}) = \tau(\mathbf{x}, p(|\mathbf{x}|));$$

$\mathbf{x}$ indicates a sequence of variables. Letting $\mathfrak{F}$ stand for $\mathfrak{B}$ or $\mathfrak{P}$, QF($\mathfrak{F}$) denotes the set of quantifier-free formulas of $L(\mathfrak{F})$. The bounded quantifiers $(\forall x \le |t|)$ and $(\exists x \le |t|)$, understood again as abbreviations, are called sharply bounded. $\Delta^b_0(\mathfrak{F})$, the class of sharply bounded formulas, is built up from literals in $L(\mathfrak{F})$ using $\&$, $\lor$, and sharply bounded quantifiers; if closure under bounded existential quantification is also required, the resulting set of formulas is called $\Sigma^b_1(\mathfrak{F})$. A formula of $L(\mathfrak{F})$ is in s-$\Sigma^b_1(\mathfrak{F})$ just in case it is of the form $(\exists x \le t)\varphi$, where $\varphi$ is in QF($\mathfrak{F}$). The theories of bounded arithmetic to be investigated contain the basic axioms for the non-logical symbols of $L(\mathfrak{B})$, the defining equations for the elements of $\mathfrak{P}$ in case the theory is formulated in $L(\mathfrak{P})$, and one of the induction principles $\Phi$-PIND or $\Phi$-LIND. The latter are formulated as rules:

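$$\frac{\Gamma, \varphi 0 \qquad \Gamma, \neg\varphi\lfloor\tfrac{1}{2}a\rfloor, \varphi a}{\Gamma, \varphi t} \qquad\qquad \frac{\Gamma, \varphi 0 \qquad \Gamma, \neg\varphi a, \varphi a'}{\Gamma, \varphi |t|}$$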

where $\varphi$ is in $\Phi$ (and the parameter $a$ must not occur in the lower sequent). The resulting theories are denoted by ($\Phi$-PIND) and ($\Phi$-LIND); the theory ($\Sigma^b_1(\mathfrak{B})$-PIND) is Buss' theory $S^1_2$ and allows - via a delicate boot-strapping - the introduction of all elements of $\mathfrak{P}$: ($\Sigma^b_1(\mathfrak{P})$-PIND) is a definitional extension of ($\Sigma^b_1(\mathfrak{B})$-PIND) and by Theorem 13a in Buss (p. 52) equivalent to ($\Sigma^b_1(\mathfrak{P})$-LIND). Let me formulate some facts whose proofs require care, but are standard and will not be given.

1.3.1 Lemma. (i) ~3 is provably in (QF(~3)-LIND) closed under explicit definitions and definition by cases.

(ii) ~ is provably in (QF(~3)-LIND) closed under strictly bounded search, i.e., for any ~b in QF(~) there is an h in ?(3, such that (QF(~)-LIND) proves: (3y ~ Ixl)~b y~-~ e~h(lxl).

The last part of the lemma asserts that in (QF(~3)-LIND) every formula in Ag(~3) is provably equivalent to a quantifier-free formula. By inspecting the proof of Theorem 14 in Buss (p. 53) one can see that QF(~3)-replacement is provable in (s-S~(~)-LIND); that fact in turn allows us to show that in (s-S~(~)-LIND) every 2;~(~3)-formula is equivalent to one in s-Zlb(~3). Thus we have:

1.3.2 Lemma. (i) (A~(~)-LIND) is equivalent to (QF(~3)-LIND). (ii) (s-Z~(~)-LIND) is equivalent to (sb(~3)-LIND).
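The definition of bounded iteration and the bounded search of Lemma 1.3.1 (ii) can be made concrete by a small sketch. It is only an illustration under stated assumptions: the names `length`, `half`, `smash`, `bounded_iteration`, and `sharply_bounded_search` are hypothetical, the convention $|0| = 0$ is assumed, a violated space bound is reported by an exception, and the search returns 0 when no witness exists (so that the equivalence of the lemma holds trivially in that case).

```python
from typing import Callable


def length(a: int) -> int:
    """|a|: length of the binary representation of a (assumed convention: |0| = 0)."""
    return a.bit_length()


def half(a: int) -> int:
    """Shift-right function: floor(a / 2)."""
    return a >> 1


def smash(a: int, b: int) -> int:
    """a # b = 2^(|a| * |b|)."""
    return 1 << (length(a) * length(b))


def bounded_iteration(g: Callable[[int], int],
                      h: Callable[[int, int, int], int],
                      p: Callable[[int], int],
                      q: Callable[[int], int]) -> Callable[[int], int]:
    """Return f with f(x) = tau(x, p(|x|)), where tau(x, 0) = g(x) and
    tau(x, y + 1) = h(x, y, tau(x, y)); every tau(x, y) with y <= p(|x|)
    must satisfy |tau(x, y)| <= q(|x|)."""
    def f(x: int) -> int:
        n = length(x)
        tau = g(x)
        if length(tau) > q(n):
            raise ValueError("space bound violated at step 0")
        for y in range(p(n)):
            tau = h(x, y, tau)
            if length(tau) > q(n):
                raise ValueError(f"space bound violated at step {y + 1}")
        return tau
    return f


def sharply_bounded_search(phi: Callable[[int, int], bool]) -> Callable[[int], int]:
    """Return h such that (exists y <= |x|) phi(x, y) holds iff phi(x, h(x)) holds."""
    def h(x: int) -> int:
        for y in range(length(x) + 1):
            if phi(x, y):
                return y
        return 0  # no witness: phi(x, 0) is then false as well
    return h


# Example: x |-> 2^|x|, iterating doubling |x| times within space |x| + 1.
pow2len = bounded_iteration(lambda x: 1, lambda x, y, t: 2 * t,
                            lambda n: n, lambda n: n + 1)
```

The time bound `p` limits the number of iteration steps by a polynomial in the length of the input, and the space bound `q` keeps all intermediate values polynomially long; both restrictions together are what keeps the iteration within polynomial time.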

This completes the preparation for the central considerations involving 3_-inversion and the extraction of terms. First note, that (QF(~)-LIND) is a modified Herbrand Theory in the sense of Remark 1.2.5(2) and that the considerations for the 3_-Inversion Lemma in Sect. 1.1 and 1.2 can be carried through for this theory. The modified Term Extraction Lemma shows then that


the provably total functions are exactly the elements of ~. Secondly, (s-S~(~)- LIND) is equivalent to S~. So it is sufficient for a proof of Buss' theorem for S~ to show the following theorem:

1.3.3 Theorem. The provably total functions of (s-S~(~)-LIND) are exactly the polynomial time computable functions.

We only have to establish that the theory (s-$\Sigma^b_1(\mathfrak{P})$-LIND) is conservative over (QF($\mathfrak{P}$)-LIND) for $\Pi^0_2$-sentences. That is obtained directly from the next lemma.

1.3.4 Lemma. Let $\Gamma$ contain only $\Sigma^0_1$-formulas; if D is an I-normal derivation of $\Gamma$ in (s-$\Sigma^b_1(\mathfrak{P})$-LIND), then there is an I-normal derivation of $\Gamma$ in (QF($\mathfrak{P}$)-LIND).

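Proof. The argument proceeds by induction on the number # of applications of the s-$\Sigma^b_1(\mathfrak{P})$-induction rule in D. The claim is trivial if # = 0. So assume that # > 0 and consider an application of the s-$\Sigma^b_1(\mathfrak{P})$-induction rule, such that no further application occurs above it. The subderivation E determined in this way ends with the inference

$$\frac{\Delta, \psi\mathbf{a}0 \qquad \Delta, \neg\psi\mathbf{a}a, \psi\mathbf{a}a'}{\Delta, \psi\mathbf{a}|s|}$$

$\psi\mathbf{a}a$ is of the form $(\exists x)(x \le t[\mathbf{a}, a] \;\&\; \psi^* x\mathbf{a}a)$, where $\psi^*$ is in QF($\mathfrak{P}$) and $\mathbf{a}$ indicates the sequence of parameters occurring in $\Delta, \psi$. Let $E_0$ be the derivation of the left premise and $E_a$ that of the right premise. $\exists$-inversion allows us 21 to extract from $E_0$ a term $\sigma[\mathbf{a}]$ and a derivation in (QF($\mathfrak{P}$)-LIND) of

$$\Delta,\ \sigma[\mathbf{a}] \le t[\mathbf{a}, 0] \;\&\; \psi^*\sigma[\mathbf{a}]\mathbf{a}0. \tag{1}$$

The application of $\forall$-inversion and then of $\exists$-inversion to $E_a$ yields a new parameter $c$, a term $\tau[\mathbf{a}, c, a]$, and a derivation of

$$\Delta,\ \neg(c \le t[\mathbf{a}, a] \;\&\; \psi^* c\mathbf{a}a),\ \tau[\mathbf{a}, c, a] \le t[\mathbf{a}, a'] \;\&\; \psi^*\tau[\mathbf{a}, c, a]\mathbf{a}a'. \tag{2}$$

Now define:

$$\rho(\mathbf{a}, 0) = \sigma[\mathbf{a}]$$

$$\rho(\mathbf{a}, a') = \begin{cases} \tau[\mathbf{a}, \rho(\mathbf{a}, a), a] & \text{if } a < |s| \\ \rho(\mathbf{a}, a) & \text{otherwise;} \end{cases}$$

$\rho$ can be shown to be in $\mathfrak{P}$. For that note first that the term $s$ contains neither $a$ nor $c$: the former parameter not by the restrictive condition on the rule LIND, the latter not by choice in $\forall$-inversion; then note also that $t$ does not contain $c$. Using (1) and (2) we obtain derivations in (QF($\mathfrak{P}$)-LIND) of

$$\Delta,\ \rho(\mathbf{a}, 0) \le t[\mathbf{a}, 0] \;\&\; \psi^*\rho(\mathbf{a}, 0)\mathbf{a}0 \tag{3}$$

and

$$\Delta,\ \neg(\rho(\mathbf{a}, a) \le t[\mathbf{a}, a] \;\&\; \psi^*\rho(\mathbf{a}, a)\mathbf{a}a),\ \rho(\mathbf{a}, a') \le t[\mathbf{a}, a'] \;\&\; \psi^*\rho(\mathbf{a}, a')\mathbf{a}a'. \tag{4}$$

Now we can infer from (3) and (4) by QF($\mathfrak{P}$)-LIND

$$\Delta,\ \rho(\mathbf{a}, |s|) \le t[\mathbf{a}, |s|] \;\&\; \psi^*\rho(\mathbf{a}, |s|)\mathbf{a}|s| \tag{5}$$

and from (5) by the rule for $\exists$

$$\Delta,\ (\exists x)(x \le t[\mathbf{a}, |s|] \;\&\; \psi^* x\mathbf{a}|s|). \tag{6}$$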

21 As D is I-normal we can assume without loss of generality that $\Delta$ contains only existentially quantified formulas


But (6) is the endsequent of E, established now by a derivation E* in (QF(~)-LIND). The induction hypothesis yields the claim of the lemma, when applied to the derivation obtained from D by replacing E through E*. Q.E.D.
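The witness function constructed in this proof can be sketched as follows. This is an illustration only, not part of the formal argument: the side parameters are suppressed, `make_rho` is a hypothetical name, and the particular `sigma`, `tau`, bounding term `t`, and matrix `psi_star` (triangular numbers) are toy assumptions chosen so that the semantic counterparts of (1) and (2) hold.

```python
from typing import Callable


def make_rho(sigma: Callable[[], int],
             tau: Callable[[int, int], int],
             bound: int) -> Callable[[int], int]:
    """rho(0) = sigma(); rho(a + 1) = tau(rho(a), a) if a < bound, else rho(a).

    `bound` plays the role of |s|; beyond it the value is frozen, so the
    number of genuine iteration steps is limited by the length of s.
    """
    def rho(a: int) -> int:
        value = sigma()
        for i in range(a):
            if i < bound:
                value = tau(value, i)
        return value
    return rho


# Toy instance (assumed for illustration): psi_star(x, a) says that x is the
# a-th triangular number, and t(a) = a*a + 1 bounds the witness.
def psi_star(x: int, a: int) -> bool:
    return x == a * (a + 1) // 2


def t(a: int) -> int:
    return a * a + 1


s = 37                      # |s| = 6
rho = make_rho(lambda: 0,   # sigma: witness for the case a = 0, as in (1)
               lambda c, a: c + a + 1,  # tau: turns a witness for a into one for a + 1, as in (2)
               s.bit_length())
```

In this toy instance `rho(a)` is a bounded witness for every `a` up to $|s|$, which is the semantic content of (3)-(5); the freezing clause mirrors the case distinction in the definition of $\rho$.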

Buss' general result that, for all k, the provably total functions of S k+l are exactly the functions of level ~3 k + 1 (in Buss's terminology: [3~+ 1) of the polynomial hierarchy, is obtained by extending our considerations to suitable operator theories. 22 Recall, that (Xkb+ ~(~)-LIND) is equivalent to S k+~ and allows the introduction of all functions in ~k + r Let me sketch a new proof of the fact that the provably total functions of (Skb+ I(~)-LIND) are contained in ~k + 1" (For k = 0, we just established it.) We expand (s-Xbl(~)-LIND) by Skolem-functions vx < t. (...) of polynomial growth rate for certain polynomially bounded quantified formulas (3x < t)(...). The functions that are denoted by symbols of the extended theory are those in the polynomial time closure of these Skolem-functions PTC(vk): the k indicates that the nesting of Skolem-functions is bounded, i.e. their v-depth, as defined in Sieg (1985, p. 41), is at most k. The appropriate axioms are

SK k (a<=t&dpa)~(vx<=t.(c~x)<=t&c~vx<=t.((bx))

for each term t and each quantifier-free formula ~ba with v-depth < k. The resulting theory is denoted by (s-Sb(PTC(vk))-LIND); it is not difficult to see that (,~kb+ ~-LIND) is a sub-theory of the new theory, as every formula in Skb+ 1 is provably equivalent to one in s-2~b(PTC(vk)). By considerations as in the proof of Lemma 1.3.4 we reduce the new theory (s-Xb(PTC(vk))-LIND) to its sub theory with just QF(PTC(vk))-LIND. The provably total functions of the latter theory are exactly the elements of PTC(vk). In this class lie the characteristic functions for all predicates determined by a formula in Skb; i.e., using Theorem 8 of Buss (p. 20) S~ is included in PTC(v k) and, thus, PTC(X~) is a subset of PTC(vk). Conversely, the Skolem functions of PTC(v k) can be defined in PTC(S~) and, thus, PTC(v k) is a subset of PTC(~). Consequently, we have PTC(v k) = PTC(XP). As the latterclass of functions is, by Proposition 4 of Buss (p. 15), exactly ~3 k + a, we actually established:

1.3.5 Theorem (Buss). The provably total functions of S k+l are exactly the functions of the (k + 1)-st level ~k+l of the polynomial hierarchy, for all k.

The analysis given here is again insensitive to extensions of the various theories by H~ Thus, the main results hold also for o Hi-extensions of the theories involved.

2 Finitary term extraction

The arguments that led to the proof theoretic characterization of the G-classes ~, could be carried out with such ease, because we considered essentially quantifier- free theories. And the term extraction lemma allows to exploit directly the uniformity properties of existential statements that are provable in Herbrand theories. In this section I will be concerned with theories T(~i) whose induction principles are genuinely stronger than those of T*(~). Here I use, as in the investigation of bounded arithmetic in Sect. 1.3, the match-up of recursion and

22 As used in Sieg (1985)


induction: the function classes are closed under recursion schemata that allow the analysis of the induction rule. Only $\Pi^0_2$-conservation results can be obtained, but that evidently suffices for my purposes. In the first subsection results for (fragments of) primitive recursive arithmetic will be established, in particular a useful refinement of the fact that ($\Pi^0_2$-IR) is conservative over (PRA) for $\Pi^0_2$-formulas; 23 in the second subsection we come closer to traditional reductive results in proof theory when considering second order extensions of fragments of (PRA).

2.1 (Fragments of) PRA and the G-hierarchy

It is well-known that (Z~ the usual name for (X~ has exactly the primitive recursive functions as its provably total ones. That claim will be established first; it was proved by Parsons and, independently, by Mints and Takeuti. I gave a different proof in Sieg (1985); the proof given here is new and more perspicuous than any I know. In addition, to emphasize this once more, it is generalizable and gives a uniform way of determining the provably total functions of many theories.

2.1.1 Theorem. The provably total functions of ($\Sigma^0_1$-IA) are exactly the primitive recursive functions.

(Z~ is a definitional extension of (S~ Corollary 1.2.7 determined ~ R as the class of provably total functions of (QF(~9~)-IA). Thus, to prove the theorem, we just establish that (S~ is conservative over (QF(~9~)-IA) for //~ The y-inversion lemma permits us to focus on S~ and so it is sufficient to prove the next lemma. But before doing so, notice that by the corollary to the I-normalization theorem, in case both �9 and ~ consist of S~ every formula in an I-normal derivation is in X~176

2.1.2 Lemma. Let $\Gamma$ contain only $\Sigma^0_1$-formulas; if D is an I-normal derivation of $\Gamma$ in ($\Sigma^0_1(\mathfrak{PR})$-IA), then there is an I-normal derivation of $\Gamma$ in (QF($\mathfrak{PR}$)-IA).

Proof. The argument proceeds by induction on the number # of applications of the $\Sigma^0_1$-induction rule in D. The claim is trivial if # = 0. So assume that # > 0 and consider an application of the $\Sigma^0_1$-induction rule, such that no further application occurs above it. The subderivation E determined in this way ends with the inference

$$\frac{\Delta, (\exists x)\psi x0 \qquad \Delta, \neg(\exists x)\psi xa, (\exists x)\psi xa'}{\Delta, (\exists x)\psi xt}$$

$\psi$ is in QF. Let $E_0$ be the derivation of the left premise and $E_a$ that of the right premise. $\exists$-inversion allows us 24 to extract from $E_0$ a term $\sigma[0]$ and a derivation in (QF($\mathfrak{PR}$)-IA) of

$$\Delta,\ \psi\sigma[0]0. \tag{1}$$

23 It was used to estimate the complexity of functionals involved in proofs of Ramsey's Theorem in [Bellin] 24 Without loss of generality we can assume that A contains no universally quantified formulas (we know by the remark before the lemma that A is contained in Z~176 otherwise we can use the Y-inversion first and carry out the subsequent steps with additional parameters that will be removed in the very last step by applying first the rule for 3 and then for u


The application of $\forall$-inversion and then of $\exists$-inversion to $E_a$ yields a new parameter $c$, a term $\tau[a, c]$, and a derivation of

$$\Delta,\ \neg\psi ca,\ \psi\tau[a, c]a'. \tag{2}$$

Now we define a function $f$ by primitive recursion

$$f(0) = \sigma[0]$$

$$f(a') = \tau[a, f(a)]$$

and verify easily - using (1), (2), and quantifier-free induction - that there is a derivation in (QF($\mathfrak{PR}$)-IA) of $\Delta, \psi f(a)a$ and then of

$$\Delta,\ (\exists x)\psi xt.$$

If one replaces that derivation for E in D, the induction hypothesis can be employed to infer the claim of the lemma. Q.E.D.
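The extraction step of this proof can be illustrated by a small sketch. It is a toy instance under stated assumptions, not the construction for an arbitrary derivation: the matrix `psi` (integer square root), the base witness `sigma0`, and the step term `tau` are hypothetical choices for which the semantic counterparts of (1) and (2) hold.

```python
import math
from typing import Callable


def primitive_recursion(base: int, step: Callable[[int, int], int]) -> Callable[[int], int]:
    """Return f with f(0) = base and f(a + 1) = step(a, f(a))."""
    def f(a: int) -> int:
        value = base
        for i in range(a):
            value = step(i, value)
        return value
    return f


def psi(x: int, a: int) -> bool:
    """Toy quantifier-free matrix: x is the integer square root of a."""
    return x * x <= a < (x + 1) * (x + 1)


sigma0 = 0  # witness for a = 0, the analogue of (1)


def tau(a: int, c: int) -> int:
    """If psi(c, a) holds, then psi(tau(a, c), a + 1) holds, the analogue of (2)."""
    return c + 1 if (c + 1) * (c + 1) <= a + 1 else c


f = primitive_recursion(sigma0, tau)
```

Here `f(a)` witnesses `(exists x) psi(x, a)` for every `a`; the point of the lemma is that the closure of the function class under primitive recursion is exactly what is needed to form `f` from the two extracted terms.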

What is crucial in these considerations is the closure property of the appropriate function class, here ~39~, that allows the functional analysis of the induction schema. This point is also brought out by (the proofs of) additional results for the theories (H~ (2;~ and (H~ If ~ consists of all primitive recursive functions these theories are all equivalent; however, if the class

of Kalmar-elementary functions is chosen as $\mathfrak{F}$, then the first two theories are still equivalent, but the last theory is weaker, as its class of provably total functions is contained in $\mathfrak{E}_4$. This fact was established by C.D. Parsons in unpublished notes. Let me start out with the investigation of ($\Pi^0_1(\mathfrak{F})$-IR), where $\mathfrak{F}$ is one of the G-classes $\mathfrak{E}_n$, $n > 2$. $\mathfrak{F}^\circ$ is the class of functions obtained from $\mathfrak{F}$ by 1-closing under the following restricted schema of primitive recursion 25:

$$f(0, a, c) = c$$

$$f(d', a, c) = g(a \mathbin{\dot{-}} d', f(d, a, c)).$$

The 1-closing of $\mathfrak{E}_n$, $n > 2$, under the usual schema of primitive recursion yields $\mathfrak{E}_{n+1}$. Thus $\mathfrak{E}_n^\circ$ is contained in $\mathfrak{E}_{n+1}$, but I do not know whether this inclusion is proper; clearly, $\mathfrak{PR}^\circ = \mathfrak{PR}$.

2.1.3 Theorem. (H~ and (QF(~~ prove the same sentences in the language L(~j).

The theorem follows immediately from Parsons' observation that all functions of ~o can be introduced in (H~ and the next lemma.

2.1.4 Lemma. I f D is an 1-normal derivation of F in (a H~ o f ) (H~ then there is an I-normal derivation of F in (a H~ of ) (QF(~~

Proof. The argument proceeds by induction on the number 4b of applications of /-/~ in 1-normal derivations. As the case ~ = 0 is trivial, assume ~ > 0 and consider an application of this 1-rule in D such that no further application occurs above it. The subderivation E determined in that way ends with the inference

A, (Vx)zxO A, --7 (Vx))~xa, (Vx)zxa'. A, (u163


2s This schema plays also a role in other contexts; see: Troelstra's "Mathematical Investigation of Intuitionistic Arithmetic and Analysis", p. 55 and its application on p. 236


$\chi$ is in QF and $\Delta$ can also be assumed to be contained in QF. (The elements of $\Delta$ are by the restriction on IR certainly in $\Pi^0_1$. If they are not already quantifier-free, apply $\forall$-inversion.) Let $E_0$ be the derivation of the left premise and $E_a$ that of the right premise; these derivations are I-normal ones in (QF($\mathfrak{F}$)-IR). Using $\forall$-inversion with a parameter $c$ that is new for both $E_0$ and $E_a$, we have derivations of

$$\Delta,\ \chi c0 \tag{1}$$

and also of $\Delta, (\exists x)\neg\chi xa, \chi ca'$. $\exists$-inversion yields a term $\tau$ and a derivation of

$$\Delta,\ \neg\chi\tau[a, c]a,\ \chi ca'. \tag{2}$$

$\tau$ denotes a function in $\mathfrak{F}$ and is used to define a function $f$ in $\mathfrak{F}^\circ$ by:

$$f(0, a, c) = c$$

$$f(d', a, c) = \tau(a \mathbin{\dot{-}} d', f(d, a, c)).$$

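Now we can prove in (QF($\mathfrak{F}^\circ$)-IR)

$$\Delta,\ \neg\chi f(b, a, c)(a \mathbin{\dot{-}} b),\ \chi ca \tag{*}$$

by induction on $b \le a$. The case $b = 0$ is trivial, as $\neg\chi f(0, a, c)(a \mathbin{\dot{-}} 0)$ just is $\neg\chi ca$, and (*) is consequently a logical axiom. If $0 < b \le a$ we have by induction hypothesis

$$\Delta,\ \neg\chi f(b \mathbin{\dot{-}} 1, a, c)(a \mathbin{\dot{-}} (b \mathbin{\dot{-}} 1)),\ \chi ca. \tag{3}$$

Replacing $a$ by $a \mathbin{\dot{-}} b$ and $c$ by $f(b \mathbin{\dot{-}} 1, a, c)$ in the derivation leading to (2) yields $\Delta, \neg\chi\tau[a \mathbin{\dot{-}} b, f(b \mathbin{\dot{-}} 1, a, c)](a \mathbin{\dot{-}} b), \chi f(b \mathbin{\dot{-}} 1, a, c)(a \mathbin{\dot{-}} b)'$ and thus, using the definition of $f$ and the fact that $(a \mathbin{\dot{-}} b)' = a \mathbin{\dot{-}} (b \mathbin{\dot{-}} 1)$ for $0 < b \le a$,

$$\Delta,\ \neg\chi f(b, a, c)(a \mathbin{\dot{-}} b),\ \chi f(b \mathbin{\dot{-}} 1, a, c)(a \mathbin{\dot{-}} (b \mathbin{\dot{-}} 1)). \tag{4}$$

Cutting (3) and (4) with cut-formula $\chi f(b \mathbin{\dot{-}} 1, a, c)(a \mathbin{\dot{-}} (b \mathbin{\dot{-}} 1))$ yields a derivation of the sequent $\Delta, \neg\chi f(b, a, c)(a \mathbin{\dot{-}} b), \chi ca$. Thus we established (*) in (QF($\mathfrak{F}^\circ$)-IR). - Setting $b = a$ in (*) gives a derivation of

$$\Delta,\ \neg\chi f(a, a, c)0,\ \chi ca. \tag{5}$$

If we replace in the derivation leading to (1) the parameter $c$ by $f(a, a, c)$ we obtain by one cut with $\chi f(a, a, c)0$ a derivation of $\Delta, \chi ca$ and thus of $\Delta, (\forall x)\chi xt$ in (QF($\mathfrak{F}^\circ$)-IR). But the latter sequent is the conclusion of the derivation E we started out with.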

The claim of the theorem follows by the induction hypothesis on # , when we consider the derivation D' obtained from D by removing the subderivations E o and Ea:D' is a derivation of F in (FI~ with only #-1 applications of H~ so there is a derivation of F in (QF(~~ + A, (Vx)zxt. By the above we know that A, (Vx)zxt is provable in (QF(~~ consequently, F is. Q.E.D.
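The role of the restricted recursion schema in this proof can be checked on a toy instance. The sketch below is only an illustration under stated assumptions: `monus` and `restricted_recursion` are hypothetical names, and the predicate `chi` together with the term `tau` are chosen so that the semantic counterpart of (2) holds, i.e. `chi(tau(a, c), a)` implies `chi(c, a + 1)`.

```python
from typing import Callable


def monus(a: int, b: int) -> int:
    """Cut-off subtraction on the natural numbers."""
    return a - b if a >= b else 0


def restricted_recursion(g: Callable[[int, int], int]) -> Callable[[int, int, int], int]:
    """Return f with f(0, a, c) = c and f(d + 1, a, c) = g(a -. (d + 1), f(d, a, c))."""
    def f(d: int, a: int, c: int) -> int:
        value = c
        for i in range(1, d + 1):
            value = g(monus(a, i), value)
        return value
    return f


# Toy instance (assumed for illustration only).
def chi(x: int, n: int) -> bool:
    return (x + n) % 2 == 0


def tau(n: int, c: int) -> int:
    return c + 1


f = restricted_recursion(tau)


def star(b: int, a: int, c: int) -> bool:
    """Semantic reading of (*): chi(f(b, a, c), a -. b) implies chi(c, a)."""
    return (not chi(f(b, a, c), monus(a, b))) or chi(c, a)
```

Read contrapositively, `f(b, a, c)` carries a potential counterexample `c` at stage `a` back to a counterexample at stage `a -. b`; after `a` steps it reaches stage 0, where (1) excludes counterexamples.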

As a direct consequence of the theorem and the earlier observation on the provably total functions of (Z~ we have Parsons' result:

2.1.5 Corollary. (Z~ is properly stronger than (//~

This fact, as well as the others, holds for //~ of the respective theories. - Now we turn our attention to the investigation of (//~ We define ~o as ~ and ~,+ 1 as the 1-closure of ~n under the ordinary schema of


primitive recursion; we take ~ again as one of the ~,, n > 2. We will show, if F is provable in (//~ with k applications of H~ then F is provable in (QF(~k)-IR); this gives in particular the old reduction of (H~ to

, ( Q F ( ~ ) - I R ) . The more informative new result is made possible by a direct analysis of the/ /~ rule; the crucial tool is the inversion lemma for the quantifier combination 3V.

2.1.6 $\exists\forall$-Inversion. Let T($\mathfrak{F}$) be an Herbrand theory, let $\Delta$ contain only purely existential formulas, and let $\chi$ be quantifier-free; if D is a T($\mathfrak{F}$)-derivation of $\Delta, (\exists x)(\forall y)\chi xy$, then there is a sequence of terms $t_0, \ldots, t_n$, a sequence of parameters $a_0, \ldots, a_n$, and an I-normal T($\mathfrak{F}$)-derivation $D_0$ of $\Delta, \chi t_0 a_0, \ldots, \chi t_n a_n$. The terms satisfy the conditions $a_i \notin P(\Delta) \cup \bigcup\{P(t_j) : j \le i\}$.

Proof [by induction on I-normal T($\mathfrak{F}$)-derivations]. We distinguish two cases. In case 1 we assume that $(\exists x)(\forall y)\chi xy$ is NOT the p.f. of the last inference. The claim is then either trivial [for T($\mathfrak{F}$)-axioms], follows directly by the induction hypothesis [for rules $\&$, $\lor$, $\exists$, and C with atomic cut-formula], or is obtained by considerations similar to those presented in the proof of $\exists$-inversion. Now assume, case 2, that $(\exists x)(\forall y)\chi xy$ is the p.f. of the last inference. Then D must end in an inference of the form:

$$\frac{\Delta, (\exists x)(\forall y)\chi xy, (\forall y)\chi t_0 y}{\Delta, (\exists x)(\forall y)\chi xy}$$

$\forall$-inversion applied to D', the derivation leading to the premise of this inference, yields an I-normal T($\mathfrak{F}$)-derivation of

$$\Delta,\ (\exists x)(\forall y)\chi xy,\ \chi t_0 a_0;$$

clearly, $a_0 \notin P(t_0) \cup P(\Delta) \cup P((\exists x)(\forall y)\chi xy)$. The claim follows by induction hypothesis. Q.E.D.

(H~ denotes the part of (H~ whose derivations may contain at most k applications of H~

2.1.7 Theorem. (H~ and (QF(~k)-IR) prove the same sentences in the language L(~).

The theorem follows immediately from the fact that all functions of ~k can be introduced in (H~ and the next lemma.

2 . 1 . 8 L e m m a . If D is an 1-normal derivation of F in (H~ then there is an I-normal derivation of F in (QF(~k)-IR).

Proof The argument proceeds by induction on k. As the case k = 0 is trivial, assume k > 0 and consider, as above, an application of the/ /~ in D such that no further application occurs above it. The subderivation E determined in this way ends with the inference

A,~p0 A, ~ p a , lpa' . A, tpt

~p is a/ /~ and A contains only formulas of the same syntactic form. Indeed, the elements of A can be taken to be i n s ~ (If not, apply y-inversion.) Let Eo be the derivation of the left premise and Ea that of the right premise. These derivations are I-normal ones in (QF(~)-IR) and will be exploited below to give us an I-normal


derivation E* of A, ~pt in (QF(~x)-IR). With that we will be done: replace E by E* in D to obtain an I-normal derivation D' o f F in (H~ ~; by induction on k it follows that there is a derivation of F in (QF((~I)k_ 0-IR). But as (~l)k- 1 = ~k the claim of the theorem is established.

In addressing the main task, let us assume that lpa is of the form (Vx) (3y)zxya. From E o we obtain a term a and a derivation of

4, zcG[c]0 (1)

where c is a new parameter for D. From Ea we obtain first, by V-inversion with c, a derivation of A, -7 ~pa, (3y))~cya'; then by ~V-inversion, a derivation of A, -7)~toaoa , -7zqa~a, (3y))~cya'; finally, by 3-inversion, a derivation of

A, -7 Zto[ a, c]aoa, -7 Xtl [ao, a, c]ala, :~cz[ ao, al, a, c]a' . (2)

For simplicity's sake I took n = 2 in 3V-inversion and indicated in (2) the relevant parameters. - We would like to define a primitive recursive function f, such that we can prove in (QF(~0-IR) the sequent A, xcf(c, a)a; that would allow us to infer A, (3y)zcya and then A, (Vx) (3y)zxya. The last sequent is the conclusion of E above. The idea for the definition of such an f comes from trying to use the derivations leading to (1) and (2) in a proof of sequents of the form A, Zcz[..., m, elm for each m. Let us start doing that. Replace in the derivation leading to (2) the parameter a by 0; that yields

A, -7 Xto[0 , C]ao0, -7 ~tx [-ao, 0, c]al0, Xc~[ao, al, O, c]0'. (3)

Now replace c in the derivation leading to (1) by t o : = to[0, c] to obtain

A, zt~ a[t~ (4)

Exploiting the parameter condition we obtain from (3) by replacing ao through a[t ~

A,-qZto[O,c]a[t~ --qXtl[a[t~ Zc'~[a[t~ c]O ' . (5)

Cutting (4) and (5) yields

A, -7 ztaEa[to~ 0, c]axO, Xcz[aE t~ a . O, c]0'. (6)

Setting t~ tx [a[to~ 0, c] we continue substituting; this time we replace c in the derivation leading to (1) by t o and obtain

A, zt~176 (7)

In the derivation leading to (6) we replace c by a[t~ that gives

A, --7 Zt, [a[t~ 0, c]a[t~ Zc$[a[t~ a[t~ O, c]0'. (8)

Recalling the definition of t o we see that we can cut (7) and (8) to prove

A, gczEaE t~ air~ O, c]0'. (9)

Recognizing the pattern, one can define a function $f$ by

$$f(c, 0) = \sigma[c],$$

$$f(c, a') = \tau[f(t_0^a, a), f(t_1^a, a), a, c],$$

where $t_0^a$ is defined as $t_0[a, c]$ and $t_1^a$ as $t_1[f(t_0^a, a), a, c]$. This is a one-fold nested recursion and thus reducible to ordinary primitive recursion [see: Peter (p. 96)].


This f satisfies the desideratum formulated above, as the patient reader can readily check. Q.E.D.
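The shape of the nested recursion used in this proof can be sketched as follows. The sketch is only an illustration: `nested_recursion` is a hypothetical name, and the concrete `sigma`, `tau`, `t0`, and `t1` in the example are arbitrary toy functions, not terms extracted from an actual derivation.

```python
from functools import lru_cache
from typing import Callable


def nested_recursion(sigma: Callable[[int], int],
                     tau: Callable[[int, int, int, int], int],
                     t0: Callable[[int, int], int],
                     t1: Callable[[int, int, int], int]) -> Callable[[int, int], int]:
    """Return f with
         f(c, 0)     = sigma(c)
         f(c, a + 1) = tau(f(u0, a), f(u1, a), a, c)
       where u0 = t0(a, c) and u1 = t1(f(u0, a), a, c).

    The second recursive call depends on the value of the first one;
    that dependence is the (one-fold) nesting.
    """
    @lru_cache(maxsize=None)
    def f(c: int, a: int) -> int:
        if a == 0:
            return sigma(c)
        prev = a - 1
        u0 = t0(prev, c)
        x0 = f(u0, prev)
        u1 = t1(x0, prev, c)
        x1 = f(u1, prev)
        return tau(x0, x1, prev, c)
    return f


# Toy example (assumed for illustration only).
f = nested_recursion(sigma=lambda c: c,
                     tau=lambda x0, x1, a, c: x0 + x1 + a,
                     t0=lambda a, c: c + 1,
                     t1=lambda x0, a, c: x0)
```

Each value `f(c, a + 1)` needs two values of `f` at stage `a`, taken at arguments that are themselves computed from `c` and from an earlier value of `f`; memoization keeps the toy computation cheap but plays no role in the definition.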

Note that this result gives, in particular, a second characterization of the classes ~,, n > 2, of the G-hierarchy: the elements of those classes are exactly the provably total functions of (//2~ that follows from term-extraction applied to (QF(I~,)-IR).

2.2 Second order extensions of (fragments of) PRA

We are going to consider now second order extensions of the theories we have been investigating. The underlying language is that of second order logic with function variables f, g, h,... and function parameters u, v, w,...; it allows also the formation of 2-terms. 26 The base theories (ET,), for n>2, are second order versions of (QF(~,)-IA) with defining axioms for function(al)s and an induction schema that may contain function parameters. They also include all instances of the schema for the explicit definition of function(al)s in the form:

2y. (t.[y]) (b) = t.[b]. ED

The union of the (ET$_n$) is called (BT); these theories are easily seen to be conservative over (QF($\mathfrak{E}_n$)-IA) and (QF($\mathfrak{PR}$)-IA), respectively. Stronger theories are obtained by adding further function existence principles, for example, the axiom of choice for $\Sigma^0_1$-formulas, briefly: $\Sigma^0_1$-AC$_0$, and König's Lemma for trees of 0-1-sequences, briefly: WKL, as this principle is also called "Weak König's Lemma". 27 If we strengthen the induction principle to $\Sigma^0_1$-IA we obtain the theory (F); the system obtained from (F) by dropping WKL is denoted by (F-). Crucial theorems of analysis can be proved in (F); for example, Simpson has shown that, over (F-), the Cauchy-Peano Theorem on the existence of solutions for ordinary differential equations 28 is equivalent to WKL.

Let me first review the status and a few basic facts concerning forms of K6nig's Lemma. There is first of all the schematic formulation, where the tree is given by any second order formula. This formulation - as was pointed out by Howard - is equivalent to the full comprehension principle. Matters begin to get more delicate when we consider the abstract formulation KL.

(V f ) IT(f) & (Vx)(3y)(lh(y) = x &f(y) = 1 -*(3g)(Vx)(f(~,(x)) = 1)3.

Here T(f ) abbreviates that {x l f (x )= 1 } forms a finitely branching tree, i.e.

(Vx, y) [ f (x * y) = 1 ~ f (x) = 1] & (Vx)(3z)(Vy) [ f (x * (y ) ) = 1 ~ y < z].

26 Systems that have been obtained from a purely arithmetic T by just adding function parameters are denoted by T + 27 For n = 3 the resulting theory is (K)-introduced in my [Sieg 1985A]; it is equivalent to the theory WKL~ used in [Simpson and Smith]. Both theories were shown, independently, to be conservative over (QF(~)-IA) for//~ 28 Note that (F-) is equivalent to Friedman's RCAo and (F) to Friedman's WKL o. Simpson's result was established in his paper "Which set existence axioms are needed to prove the Cauchy- Peano theorem on ordinary differential equations?"; J. Symb. Logic 49, 361-380 (1984)


In the presence of suitably strong set existence principles, for example, the arithmetic comprehension principle, KL is equivalent to a bounded version BKL, in which a bound for the size of the immediate descendants of a node is given by a function.

(Vf g) [ T ( f g) & (Vx)(qy)(lh(y) = x & f(y) = 1)~ (3 h)(Vx)f(l~(x)) = 1)].

Here T(f, g) abbreviates

(Vx, y) [ f (x * Y) = 1 ~ f (x) = 1] & (Vx, y) [ f (x * (y)) = 1 ~ y < g(x)].

Indeed the bounding function can be taken to be the constant function with value 1. Thus we are looking at trees of 0-1 sequences. It is this special version of BKL that is called Weak Kiinig Lemma WKL. What is the relative strength of these principles? Friedman observed in an unpublished paper (written in 1969) that over (BT+S~ K6nig's lemma K L is equivalent to the full arithmetic comprehension principle/ /~ Thus we have two immediate results: (1) (BT + Z~ o + KL) is equivalent to (H~ r and thus conservative over elementary number theory (Z); and (2)(BT + Z~ + H~-IA + KL)is equivalent to (//~ and thus not conservative over elementary number theory (Z). - That W K L is weaker than KL is witnessed by the following result due to Kreise129. (3) (BT + Z~ + / / ~ - I A + WKL) is conservative over (Z).

Friedman's theory (WKLo), or in my equivalent formulation (F), was shown by Friedman to be conservative over (PRA) for //~ This result was strengthened by Harrington to: (4) (F) is conservative over (Z~ § for H~-sentences. Clearly, looking at the results (3)-(4) one may wonder whether they capture isolated phenomena or whether they are aspects of quite general connections. We will see in the next section that the latter is the case. As a matter of fact, the general connection persists to weaker theories:

2.2.1 Theorem. The theories (ET~ + Z~ + WKL) are conservative with respect to ~~~ over (QF(~n)-IA).

The basic strategy for my proofs of these conservation results is quite uniform. One starts out by embedding the formal theory into a suitable sequent calculus with additional principles; then one shows the normalizability of derivations in the calculus; and finally one eliminates the additional principles from normal derivations of a class of sequents. In the last step J-inversion plays again a crucial role. The first two steps are straightforward in our cases: the appropriate sequent calculus is the standard one for a two-sorted language with induction rule; the additional principles are Z~ and WKL. Derivations in those calculi are I-normalizable; thus it suffices to establish lemmata that allow us to eliminate the additional principles. I present details for the case of (ET~ + Z~ + WKL). You should recall that 2:~ is equivalent to QF-ACo, i.e.

(Vx) (3y)r ~ ( 3 f ) (Vx)r

where ~ is quantifier-free.

2.2.2 Elimination Lemma. Let D be an I-normal ET,-derivation of A [-7 SCHEMA], where SCHEMA stands for QF-AC o or WKL; if A contains only existential formulas, then there is an I-normal ETn-derivation E of A.

29 Presented on pp. 124-126 of Kreisel, Simpson, and Mints "The use of abstract language in elementary metamathematics: some pedagogical examples" (Lect. Notes Math. vol. 453, pp. 38-131) Berlin Heidelberg New York: Springer 1975


The Proof is carried out first for QF-ACo, then for the case of K6nig's Lemma. One argues by induction on the length of D and distinguishes cases as to the last rule applied in D.

Let me focus on the central case, when the last rule introduces an instance of 7 QF-ACo. I.e. D must end with an inference of the form

A [ 7 QF-ACo], (Vx) (3y)d?xy A [-7 QF-ACo], 7 (3f) (Vx)qbxf(x) A [ 7 QF-ACo], (Vx)(qy)r & 7 Of) (Vx)dpxf(x)

Let D 1 be the derivation of the left premise and D 2 that of the right premise. After removing via _V-inversion the initial quantifiers of the subformulas of the analyzed instance of 7 QF-ACo, we can apply the induction hypothesis to obtain I-normal ET,-derivations of (1) A, (3y)c~cy and of (2) A, 7 (Vx)c~xf(x). _3-inversion applied to (1) together with ED yields a function g, such that (3) A, ('r is ET,-provable. Now replace the parameter f in the derivation leading to (2) by g and obtain that the sequent (4) A, 7 ('r is ET,-provable. Cutting the derivations of (3) and (4) yields, after I-normalizing, the claim.

For the case ofK6nig's Lemma one argues also by induction on the length of D. And again I focus on the central case, when the last rule introduces an instance of 7 WKL, that is the formula

T(f) & (Vx) (3y) (lh(y) = x & f(y) = 1) & 7 (3g) (Vx)f(~,(x)) = 1.

(In general such an instance will contain a term r, not just a parameter f;3o but the considerations that follow now are completely analogous in that case.) There are I-normal ET,-derivations Di, i< 2, of the sequents

A [ 7 W K L ] , T ( f ) ; A [ 7 W K L ] , (Vx)Oy)(lh(y)=x&f(y)=l); and

A [ 7 WKL], 7 (3g) (u = 1. The Di are shorter than D. y-inversion and the induction hypothesis yield I-normal ET,-derivations Ei, i= 2, of the sequents

A,T(f); A,(3y)(lh(y)=c&f(y)=l); and A,Ox)f(a(x))+l

with new parameters r and u. _3-inversion yields now numerical terms t, s, and 1-normal ET.-derivations F1 and Fz of the sequents

A, lh(t[c])=c& f(t[c])=l and A,f(a(s[u])). l .

s and t may contain further parameters, but u does not occur in t. Now one notices (i) that t describes sequences of arbitrary length, all of which are in f and are not necessarily forming a branch, and (ii) that the formula f(~(s[u]))+ 1 (from the endsequent ofF2) expresses the well-foundedness o f f Thus we have according to Eo a binary tree that is well-founded and contains sequences of arbitrary length. This conflicting situation can be exploited by means of a formalized recursion theoretic observation, namely: s can be majorized by a numerical term s* in L(~.) that does not contain u, as u can be taken to be majorized by 1. 31 Now let t[s*] be the 0-1-sequence to, ..., t~.~ 1 and define the function u* by

$$u^*(n) = \begin{cases} t_n & \text{if } n < s^* \\ 0 & \text{otherwise;} \end{cases}$$

30 Officially f is a variable, not a parameter 31 By [Howard] and the well-known majorization results for the G-classes


u*(s*) is equal to t[s*], f is provably a tree according to Eo, and s* is a bound for s. Thus we have a derivation of A,f(~(s*)) ~e 1 ; by replacing in this derivation u by u* and taking into account the equation u*(s*)= t[s*] we obtain an I-normal ET,- derivation G2 of

A,f(t[s*]) :4= 1.

From F1 we obtain an I-normal ET,-derivation GI (by &-inversion and substituting s* for c) with the endsequent

A,f(t[s*]) = 1.

If we cut now G1 with G2 we obtain the desired I-normal derivation E of just A. Q.E.D.

2.2.3 Remark. _3-inversion plays obviously a decisive role in this argument. It is similarly central for the elimination of other function existence principles, e.g. the axiom of dependent choices of much stronger subsystems of analysis, or the finite axiom of choice and the number theoretic collection principle of fragments of arithmetic. 32

Friedman's result can be established now (and Harrington's in the next section); the argument is simpler than the one I gave in Sieg (1985).

2.2.4 Theorem (Friedman). The theory (F) is conservative with respect to ~~~ mulas over (PRA).

That follows directly from (a slight extension of) 2.1.2 and Lemma 2.2.2, establishing in essence that (F) is conservative with respect to//~ over (BT + S~ The argument is a straightforward adaptation of that for 2.2.2.

2.2.5 Lemma. Let $\Delta$ contain only $\Sigma^0_1$-formulas and let $\varphi$ be quantifier-free; if $D$ is an I-normal derivation of $\Delta[\neg\text{QF-AC}_0, \neg\text{WKL}],\ (\exists y)\varphi ya$ in (BT + S~ then there is an I-normal derivation $E$ of $\Delta,\ (\exists y)\varphi ya$ in (BT + S~

3 Infinitary term extraction

We turn our attention to fragments of elementary number theory (Z). The classical result for (Z), namely that its provably total functions are exactly the $\alpha$-recursive ones with $\alpha < \varepsilon_0$, is due to Kreisel. The proof that all $\alpha$-recursive functions can be introduced in (Z) employs a refinement of the classical set-theoretic procedure: $\alpha$-recursive functions are made explicit by means of finite approximations whose existence is established by $\alpha$-induction; the latter principle is also used to show that the explicitly defined functions satisfy the appropriate recursion equations. Specifically proof theoretic techniques allow one to show that the (Z)-provable $\Pi^0_2$-sentences have $\alpha$-recursive Skolem functions. Implicit in the more general considerations below is a new and brief proof of Kreisel's old result.33 In Sect. 3.1 I establish that certain second order theories are conservative extensions of fragments of arithmetic. These results, first presented in Sieg (1987), generalize and strengthen Friedman's Theorem 2.2.4. They imply, in particular, that these fragments can be conservatively extended by the quantifier-free axiom of choice; a result that will be most useful for considerations in the second part of the section, when it is shown that all nested $\alpha$-recursive functionals, $\alpha < \omega_n$, can be introduced in $(\Sigma^0_{n+1}\text{-IA})^+$. This is done by refining Schwichtenberg's argument for the fact that the $< \varepsilon_0$-recursive functionals can be introduced in (Z).

32 As to the former see Feferman and Sieg (1981), as to the latter Sieg (1985) and Sieg (1987).

33 Another elegant proof was given in [Buchholz and Wainer].

3.1 Second order extensions of (fragments o f ) (Z)

Answering the question in Sect. 2.2, I will use infinitary analogues of $\exists$-inversion, QF-AC$_0$-elimination, and WKL-elimination with ordinal bounds to prove the following theorem:

3.1.1 Main Theorem. (F,): = (BT + S~ + ql,-IA + WKL) is conservative over (Z~ + for Fl~-sentences, n > O.

91, denotes the class of prenex formulas in the language of analysis whose prefix is of length at most n and starts with an existential quantifier. - As corollaries we obtain the results of Harrington's and Kreisel's that were mentioned in Sect. 2.2, but also an immediate consequence that will be useful:

3.1.2 Corollary. (i) (F) is conservative over (F-) for H ~-sentences ; (ii) (BT + S~ Co + H~-IA + WKL) is conservative over (Z); (iii) (BT + S~ + S~ is conserva- tive over (Z~ + for Fl~-sentences.

When comparing the claim of the main theorem to claims and arguments in Sect. 2.2, we face two critical questions: (i) can one extend the considerations for Friedman's Theorem to the $(F_n)$ with $n > 1$? and (ii) can one enlarge the class of conserved sentences from $\Pi^0_2$ to $\Pi^1_1$? The second question has a simple answer: "Yes, by means of Herbrand's Theorem34!". The first question also has a simple answer: "No!". This negative answer is due to the restriction on $\exists$-inversion and the impossibility of reducing the stronger induction principles to the quantifier-free one. Formulating the obstacle in this way suggests a classical way of avoiding it: eliminate the induction principle by the $\omega$-rule and prove suitable versions of $\exists$-inversion and the elimination lemmata for the semi-formal system. This strategy leads indeed to the desired goal.

The semi-formal system $(BT)_\infty$ into which the $(F_n)$ can be embedded has not only infinitary derivations, but also infinitary terms as used in Tait (1965). The latter are necessary to obtain a form of $\exists$-inversion. Thus the language of $(BT)_\infty$ is that of (BT) extended by terms $\langle t_i \rangle$, where the subscript $i$ is always assumed to range over $\mathbb{N}$. So let me define the appropriate notion of Number- and Function-term with ordinal depth.

3.1.3 Definition

1. (i) All individual constants and parameters are N-terms of depth 1; (ii) all function constants and parameters are F-terms of depth 1.

2. (application) If $t, t_1, \ldots, t_n$ are terms of the appropriate kind and depth $|t|, |t_1|, \ldots, |t_n|$, then $t(t_1, \ldots, t_n)$ is an N-term of depth $\max(|t|, |t_1|, \ldots, |t_n|) + 1$.

3. ($\lambda$-abstraction) If $t$ is an N-term of depth $|t|$, then $\lambda x.(t)$ is an F-term of depth $|t| + 1$.

4. (sequencing) If the $t_i$, $i \in \mathbb{N}$, form a sequence of N-terms and the $t_i$ are of depth $|t_i|$, then $\langle t_i \rangle$ is an F-term of depth $\sup\{|t_i| + 1 : i \in \mathbb{N}\}$.

34 Not as formulated above, but rather concerned with the provability of (the Herbrand normal form of) an arithmetic statement; as in Schwichtenberg (p. 889).
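The depth clauses of Definition 3.1.3 can be illustrated by a small sketch. It is only a finite stand-in and not part of the formal development: the class names `Atom`, `App`, `Lam`, `Seq` are hypothetical, and an infinite sequence $\langle t_i \rangle$ is replaced by a finite tuple of members, so that the supremum becomes a maximum and all depths are natural numbers.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Atom:
    """Constant or parameter (N- or F-term); depth 1 by clause 1."""
    name: str


@dataclass(frozen=True)
class App:
    """Application t(t_1, ..., t_n), clause 2."""
    head: "Term"
    args: Tuple["Term", ...]


@dataclass(frozen=True)
class Lam:
    """Lambda-abstraction over an N-term, clause 3."""
    var: str
    body: "Term"


@dataclass(frozen=True)
class Seq:
    """Sequencing <t_i>, clause 4; ASSUMPTION: truncated to finitely many members."""
    members: Tuple["Term", ...]


Term = Union[Atom, App, Lam, Seq]


def depth(t: Term) -> int:
    """Depth of a term, following clauses 1-4 on finite data."""
    if isinstance(t, Atom):
        return 1
    if isinstance(t, App):
        return max([depth(t.head)] + [depth(a) for a in t.args]) + 1
    if isinstance(t, Lam):
        return depth(t.body) + 1
    if isinstance(t, Seq):
        # sup{|t_i| + 1} degenerates to a maximum for a finite family
        return max((depth(m) + 1 for m in t.members), default=0)
    raise TypeError(f"not a term: {t!r}")


f, a = Atom("f"), Atom("a")
fa = App(f, (a,))          # depth 2
lam = Lam("x", fa)         # depth 3
seq = Seq((a, fa))         # depth max(1 + 1, 2 + 1) = 3
```

In the infinitary setting the depth of a sequence term is in general a limit ordinal, which this finite sketch cannot exhibit.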

The calculus for $(BT)_\infty$ is obtained from the finitary one by adding the $\omega$-rule:

$$\frac{A(n)\quad \text{for all } n \in \mathbb{N}}{A(a)}$$

In addition to the axioms of the finitary calculus above we have $\langle\ \rangle$-conversion in the form:

$\langle t_i \rangle(n) = t_n$.

The ordinal theoretic measures of complexity, $|E|$, $\rho(E)$, $td(\varphi)$, $td(E)$, are defined as in Schwichtenberg (1975); their values are ordinals below $\varepsilon_0$. For details concerning the canonical system of notations for such ordinals and the effective coding of infinitary syntactic objects I refer to that paper, too. - The $(F_n)$ can be embedded into $(BT)_\infty$ in such a way that the cut rank $\rho$ of the infinitary derivation is less than or equal to $n + 1$, i.e. $\rho$ is determined solely by the complexity of the formulas in the induction schema.

3.1.4 Embedding Lemma. Let $\Gamma$ be a set of formulas and let $\varphi$ be a formula; if $D$ is a quasi-normal derivation of

F [ 7 QF-ACo, 7 3[,-IA, --1WKL], ~p

then there is a derivation $E$ in $(BT)_\infty$ of

$\Gamma[\neg\text{QF-AC}_0, \neg\text{WKL}],\ \varphi$.

Furthermore we have: $|E| < \omega \cdot 2$, $\rho(E) \le n + 1$, and $td(E) < \omega$.

The cut elimination theorem for $(BT)_\infty$ is easily established. But here I am only interested in transforming special derivations into quasi-normal ones. (Note that $\omega_0 = \omega$.)

3.1.5 Quasi-normalization. Let $n > 0$ and let $\Gamma$ be a set of formulas; if $D$ is a $(BT)_\infty$-derivation of $\Gamma$ with $|D| < \alpha < \omega \cdot 2$, $\rho(D) \le n + 1$, $td(D) < k < \omega$, then there is a quasi-normal derivation $E$ of $\Gamma$ such that $|E| \le 2_n^{\alpha} < \omega_n$ and $td(E) < 2_n^{k} < \omega$.

As before, the crucial fact for the further proof theoretic analysis is a form of $\exists$-inversion adapted to the infinitary context. Let me formulate that first, with suitable bounds for the length and term depth of the derivations.

3.1.6 $\exists$-Inversion. Let $\Delta$ contain only existential formulas and let $\varphi baf$ be quantifier-free; if $D$ is a quasi-normal derivation of $\Delta,\ (\exists x)\varphi xaf$ with $|D| < \alpha$ and $td(D) < \beta$, then there is an N-term $t$ and a quasi-normal derivation $E$ of $\Delta,\ \varphi taf$ such that $|E| < k(\alpha + 1)$ and $td(E) < \beta + k(\alpha + 1)$. ($k$ is a fixed natural number determined from the proof.)

The proof proceeds by induction on the length of $D$ and does not hide any surprises; when analyzing the $\omega$-rule one exploits the possibility of forming infinite terms. For the proof of the elimination lemmata a more specialized corollary is needed, for the very technical reason of keeping the term depth within bounds.

3.1.7 Corollary (of the proof). Let $\Delta$ contain only existential formulas and let $\varphi baf$ be quantifier-free; if $D$ is a quasi-normal derivation of $\Delta,\ (\exists x)\varphi xaf$ with $\omega \le |D| < \alpha$, $td(D) \le \beta$, and $\exists$-inferences to $\Gamma,\ (\exists x)\psi x$ only from $\Gamma,\ \psi s$ with $td(s) \le l < \omega$, then there is an N-term $t$ and a quasi-normal derivation $E$ of $\Delta,\ \varphi taf$ such that $|E| \le k(\alpha + 1)$, $td(E) \le \max(\beta, k(\alpha + 1))$, and $td(t) \le k(\alpha + 1)$. ($k$ is a fixed natural number at least as great as $l$ and is determined from the proof.)

This concludes the elementary considerations preparing the ground for the lemmata that allow us to remove instances of the axiom of choice and of König's Lemma from infinitary derivations, without an essential increase in the length and term depth of the derivations. The crucial ideas for eliminating these principles from infinitary derivations are similar to those needed in the finitary case; one just has to keep track of the ordinal complexity of derivations and terms (and extend, in particular, the majorization techniques to the present context).

3.1.8 QF-AC$_0$-Elimination. Let $\Delta$ contain only existential formulas; if $D$ is a quasi-normal derivation of $\Delta[\neg\text{QF-AC}_0]$ with $|D| \le \alpha$, $td(D) \le \beta$, and $\exists$-inferences to $\Gamma,\ (\exists x)\psi x$ only from $\Gamma,\ \psi s$ with $td(s) \le l < \omega$, then there is a quasi-normal derivation $E$ of $\Delta$ with $|E| \le \omega \cdot \alpha$ and $td(E) \le \max(\beta, \omega \cdot \alpha \cdot 2 + l)$. ($E$ still satisfies the side-conditions on $\exists$-inferences.)

The formulation of the WKL-elimination lemma is completely analogous to that for the elimination of the quantifier-free axiom of choice.

3.1.9 WKL-Elimination. Let $\Delta$ contain only existential formulas; if $D$ is a quasi-normal derivation of $\Delta[\neg\text{WKL}]$ with $|D| \le \alpha$, $td(D) \le \beta$, and $\exists$-inferences to $\Gamma,\ (\exists x)\psi x$ only from $\Gamma,\ \psi s$ with $td(s) \le l < \omega$, then there is a quasi-normal derivation $E$ of $\Delta$ with $|E| \le \omega \cdot \alpha$ and $td(E) \le \max(\beta, \omega \cdot \alpha \cdot 2 + l)$. ($E$ still satisfies the side-conditions on $\exists$-inferences.)

All ingredients for establishing the main theorem have been presented now; they just have to be brought together. We want to show for any $\Pi^1_1$-sentence $\psi$ that provability in $(F_n)$ implies provability in $(\Sigma^0_n\text{-IA})^+$.

Proof (of the main theorem). Assume, without loss of generality, that $\psi$ is an arithmetical formula containing possibly function parameters, and that $(F_n) \vdash \psi^H$, where $\psi^H$ is the Herbrand normal form of $\psi$. Then there is a quasi-normal derivation of the sequent

[--1QF-ACo, ---1WKL, ---13]n-IA], lp n .

Quasi-normalization of the corresponding infinitary derivation obtained via the embedding lemma yields a quasi-normal derivation of length less than $\omega_n$ and term depth less than $\omega$ of the sequent

$[\neg\text{QF-AC}_0, \neg\text{WKL}],\ \psi^H$.

The two elimination lemmata, used in a single inductive argument, provide us with a quasi-normal derivation of

$\psi^H$

whose length and term depth are both bounded by $\omega_n$. $\psi^H$ is of the form $(\exists x)\varphi xaf$ and we can use $\exists$-inversion to finally get an infinitary term $t$ and a quasi-normal derivation of

$\varphi taf$

whose length and term depth are also bounded by $\omega_n$. This is the final step in transforming the finitary derivation of $\psi$ into an infinitary derivation of an instance of $\psi$'s Herbrand normal form. The considerations involved in this transformation can be formalized in $(\Sigma^0_n\text{-IA})^+$. The reflection principle, for a quantifier-free formula of term depth less than $\omega_n$ and with a derivation of length and term depth less than $\omega_n$, allows us to conclude that

$(\Sigma^0_n\text{-IA})^+ \vdash \psi^H$.

Herbrand's Theorem (see footnote 34) guarantees the conclusion

$(\Sigma^0_n\text{-IA})^+ \vdash \psi$.

If $\psi$ is purely arithmetic, then it is provable in the fragment of arithmetic $(\Sigma^0_n\text{-IA})$. Q.E.D.

Complementing theme and method by topical results, we can obtain the characterization of the provably total functionals of the $(F_n)$ and, consequently, of the $(\Sigma^0_n\text{-IA})^+$. But let me first mention the characterization of the provably total functionals for two theories that prove versions of König's Lemma.35 The two theories are analysis and the theory of arithmetic properties: the provably total functionals of the former are Spector's bar-recursive functionals, those of the latter the $< \varepsilon_0$-recursive functionals.36 When claiming in the proof of the main theorem that a restricted reflection principle can be proved in $(\Sigma^0_n\text{-IA})^+$, I made implicit use of the fact that the unnested $< \omega_n$-recursive functionals can be introduced in $(\Sigma^0_n\text{-IA})^+$. In turn, the argument yields with an additional remark: if $(F_n)$ proves $(\forall x)(\exists y)Rxy$ and $R$ is primitive recursive in number and function parameters, then there is an $< \omega_n$-recursive functional $f$, such that the statement $(\forall x)Rxf(x)$ is provable in $(\Sigma^0_n\text{-IA})^+$. As in the proof of the main theorem, we get a quasi-normal derivation $D$ of $(\exists y)Ray$ with $lg(D) < \omega_n$ and $td(D) < \omega$. $\exists$-inversion yields a term $t$ with $td(t) < \alpha < \omega_n$ and an infinite derivation of $Rat$, thus of $(\forall x)Rxt$; the value of $t$ can be determined by $\alpha$-recursion depending on the parameters that occur in $t$. Thus we have a $< \omega_n$-recursive functional $f$ such that $(\forall x)Rxf(x)$ is provable in $(\Sigma^0_n\text{-IA})^+$. With Theorem 3.2.8, established in the next section, we have thus shown:

3.1.10 Theorem. The provably total functionals of $(F_n)$ and $(\Sigma^0_n\text{-IA})^+$ are exactly the unnested $< \omega_n$-recursive functionals.

3.1.11 Remark. For n => 1, the provably total functions of (S~ are determined by 3.1.10 as the elements of 91 ~, the unnested ~-recursive functions with ~ < co~. The elements of 9P, the nested ~-recursive functions, can be introduced for each ~ < co~ in (so+ 1-IA). Thus we have a proof theoretic way to show that

U co . 1} = U < COn'} ;

i.e. to reduce nested to unnested recursion. ~ denotes the ~-th class of the extended G-hierarchy, and the above classes equal ~" with ~ = co~ by results of Schwichten- berg and Wainer. 37 The provably total functions of (S~ and of (PRA) are, consequently, the unnested ~-recursive functions with ~ < coo.

35 Compare Sect. 2.2.

36 Kreisel (1951, 1952) gives the provably total functionals of $(Z)^+$. In his introduction to the Stanford Report Kreisel wrote: "Spector's schemata provide the most perspicuous description to date of the provably recursive functions and functionals of classical analysis." Spector's paper "Provably recursive functions of analysis: a consistency proof of analysis by an extension of principles formulated in current intuitionistic mathematics" was published in Vol. 5 of Proceedings of Symposia in Pure Mathematics, 1962, pp. 1-27.

37 See [Rose] and Tait (1961). The characterization of the provably total functions for fragments of arithmetic was given in Parsons (1966A).


3.2 Provably recursive functionals of fragments

We want to establish now that the nested $< \omega_n$-recursive functionals can be introduced in $(\Sigma^0_{n+1}\text{-IA})^+$ by refining Schwichtenberg's proof that all nested $< \varepsilon_0$-recursive functionals can be introduced in $(Z)^+$ and using the conservation result 3.1.2 (iii). In this way, a completely self-contained argument for this result can be given. Then we will see, by a much simpler proof, that all (unnested) $< \omega_n$-recursive functionals can be introduced in $(\Sigma^0_n\text{-IA})^+$. - The nested $\alpha$-recursive functionals are obtained from successor and projection functionals by function application, composition, primitive recursion, and nested $\alpha$-recursion.

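3.2.1 Definition. Let $\mathbf{n} = n_0, \ldots, n_p$ and $\mathbf{f} = f_0, \ldots, f_q$; a functional $f$ (of level 2) is called nested $\alpha$-recursive iff it can be obtained by means of the following schemata:

(1) (projection) $f(\mathbf{n}, \mathbf{f}) = n_i$, $i \le p$;

(2) (successor) $f(\mathbf{n}, \mathbf{f}) = n_i + 1$, $i \le p$;

(3) (function application) $f(\mathbf{n}, \mathbf{f}) = f_i(n_{j_0}, \ldots, n_{j_k})$, where $i \le q$ and $j_0, \ldots, j_k \le p$;

(4) (composition) $f(\mathbf{n}, \mathbf{f}) = g(h_0(\mathbf{n}, \mathbf{f}), \ldots, h_k(\mathbf{n}, \mathbf{f}), \lambda r_0.k_0(r_0, \mathbf{n}, \mathbf{f}), \ldots, \lambda r_l.k_l(r_l, \mathbf{n}, \mathbf{f}))$;

(5) (primitive recursion) $f(0, \mathbf{n}, \mathbf{f}) = g(\mathbf{n}, \mathbf{f})$ and $f(n + 1, \mathbf{n}, \mathbf{f}) = h(n, f(n, \mathbf{n}, \mathbf{f}), \mathbf{n}, \mathbf{f})$;

(6) (nested $\alpha$-recursion) for $n \prec \alpha$: $f(n, \mathbf{n}, \mathbf{f}) = g(n, \mathbf{n}, \lambda x.(f{\upharpoonright}n)(x, \mathbf{n}, \mathbf{f}), \mathbf{f})$, where

$$(f{\upharpoonright}n)(k, \mathbf{n}, \mathbf{f}) = \begin{cases} f(k, \mathbf{n}, \mathbf{f}) & \text{if } k \prec n, \\ 0 & \text{otherwise;} \end{cases}$$

for $n \not\prec \alpha$: $f(n, \mathbf{n}, \mathbf{f}) = 0$ (so that $f$ is totally defined).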
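Schema (6) can be made concrete by a minimal sketch. It rests on simplifying assumptions that are not in the text: the well-ordering is the natural order on the natural numbers, the parameters are suppressed, and the names `restrict`, `nested_recursion`, `g_nested`, `g_cut` are hypothetical. The point is only that the defining functional receives the restricted function and may apply it to values that are themselves obtained from it.

```python
from functools import lru_cache


def restrict(f, n):
    """(f|n)(k) = f(k) if k < n, and 0 otherwise."""
    return lambda k: f(k) if 0 <= k < n else 0


def nested_recursion(g):
    """Return f with f(n) = g(n, f|n); g may nest calls of its second argument."""
    @lru_cache(maxsize=None)
    def f(n):
        return g(n, restrict(f, n))
    return f


def g_nested(n, h):
    # nested call: h is applied to a value that h itself produced
    return 0 if n == 0 else h(h(n - 1)) + 1


def g_cut(n, h):
    # the inner argument h(n - 1) + n is >= n, so the restriction yields 0
    return 0 if n == 0 else h(h(n - 1) + n) + 1


f_nested = nested_recursion(g_nested)
f_cut = nested_recursion(g_cut)
```

The restriction guarantees that every recursive call descends in the ordering, which is what makes the definition total even when the defining functional asks for values at large arguments.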

For the main argument we need one further notion: given a functional f and sequences of arguments n, f it tells us for which arguments we need to know the values of f in order to be able to compute f(n, f).

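3.2.2 Definition. A functional $M_f$ is called a (canonical) modulus of continuity for $f$ iff for all $\mathbf{n}, \mathbf{f}, \mathbf{g}$: [$M_f(\mathbf{n}, \mathbf{f})$ is the code of a finite set of argument sequences38 for $\mathbf{f}$ and, if $\mathbf{g}$ coincides with $\mathbf{f}$ on $M_f(\mathbf{n}, \mathbf{f})$, then $f(\mathbf{n}, \mathbf{f}) = f(\mathbf{n}, \mathbf{g})$ and $M_f(\mathbf{n}, \mathbf{f}) = M_f(\mathbf{n}, \mathbf{g})$].39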

The bracketed requirement of this definition is quantifier-free, as both "finite set of argument sequences for $\mathbf{f}$" and "$\mathbf{g}$ coincides with $\mathbf{f}$ on $M_f(\mathbf{n}, \mathbf{f})$" can be expressed by quantifier-free formulas.
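The two conditions of Definition 3.2.2 can be checked on a toy example. The following is a sketch under stated assumptions: there is one number argument and one unary function argument, the modulus is represented as a Python `frozenset` rather than as a code, and the helper `with_modulus` and the sample functional `F` are hypothetical. The modulus is obtained by recording the arguments at which the function parameter is actually queried.

```python
def with_modulus(functional):
    """Return run(n, f) = (value, modulus), where the modulus is the set of
    arguments at which `functional` queried f during the evaluation."""
    def run(n, f):
        queried = []

        def traced(x):
            if x not in queried:
                queried.append(x)
            return f(x)

        value = functional(n, traced)
        return value, frozenset(queried)
    return run


def F(n, f):
    # sample level-2 functional: queries f at n and then at f(n)
    return f(f(n)) + 1


run_F = with_modulus(F)

f = lambda x: x + 1
g = lambda x: x + 1 if x in (3, 4) else 99   # agrees with f exactly on {3, 4}

value_f, modulus_f = run_F(3, f)
value_g, modulus_g = run_F(3, g)
```

Since `g` coincides with `f` on the recorded set, both the value and the modulus agree for `f` and `g`; the second agreement is the condition that footnote 39 singles out as the reason for the word "canonical".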

38 An argument sequence for $\mathbf{f}$ is of the form $\langle(\ldots), \ldots, (\ldots)\rangle$, where the first $(\ldots)$ is a tuple of arguments for $f_0$ and the last $(\ldots)$ a tuple of arguments for $f_q$; for functions $f{\upharpoonright}n$ the modulus lists by convention only (non-trivial) arguments less than $n$.

39 The last conjunctive condition $M_f(\mathbf{n}, \mathbf{f}) = M_f(\mathbf{n}, \mathbf{g})$ is needed, it seems to me, to verify in the proof of the claim below that a certain finite function is a computation. It is not given in Schwichtenberg (1977), but it is a perfectly natural condition and the reason for the parenthetical "canonical".


3.2.3 Lemma. For any nested $\alpha$-recursive functional $f$, $\alpha < \varepsilon_0$, we can define a nested $\alpha$-recursive modulus of continuity $M_f$. $f$ and $M_f$ can be introduced in a recursive extension of $(Z)^+ + (\text{QF-AC}_0)$: the recursion equations for $f$ can be proved, and $M_f$ can be shown to be a modulus of continuity for $f$.

As $(Z)^+ + (\text{QF-AC}_0)$ is a conservative extension of $(Z)^+$, the lemma holds for $(Z)^+$.

3.2.4 Theorem. All nested ~-recursive functionals can be introduced in recursive extensions of (Z) +.

Now we come to the proof of the lemma; it proceeds by induction on the build-up of nested $\alpha$-recursive functionals.

Proof (of 3.2.3). The first two cases are direct, as the value of the functionals is independent of the function parameters in $\mathbf{f}$, and $\emptyset^{\#}$ can be chosen as the value of $M_f$. For case (iii), function application, the modulus of continuity must consist of the appropriate argument sequences $\langle \ldots (n_{j_0}, \ldots, n_{j_k}) \ldots \rangle$. Now let me consider composition:

$f(\mathbf{n}, \mathbf{f}) = g(h_0(\mathbf{n}, \mathbf{f}), \ldots, h_k(\mathbf{n}, \mathbf{f}), \lambda r_0.k_0(r_0, \mathbf{n}, \mathbf{f}), \ldots, \lambda r_l.k_l(r_l, \mathbf{n}, \mathbf{f}))$

By induction hypothesis the $g, h_0, \ldots, h_k, \lambda r_0.k_0, \ldots, \lambda r_l.k_l$ have moduli of continuity and they are, together with their moduli, available in a recursive extension of $(Z)^+ + (\text{QF-AC}_0)$. For arbitrary $\mathbf{n}, \mathbf{f}$ observe that

$M_g(h_0(\mathbf{n}, \mathbf{f}), \ldots, h_k(\mathbf{n}, \mathbf{f}), \lambda r_0.k_0(r_0, \mathbf{n}, \mathbf{f}), \ldots, \lambda r_l.k_l(r_l, \mathbf{n}, \mathbf{f}))$

yields (the code of) a finite set $S$ of argument sequences $(y_j)_{j \le l}$ for the $\lambda r_j.k_j$, such that the value of $g$ is determined, given $\mathbf{n}, \mathbf{f}$; that is done uniformly in $\mathbf{n}, \mathbf{f}$. Clearly, $f$ can be explicitly defined in the formal theory; as far as $M_f$ is concerned, let it be defined as follows:

$$\bigcup^{\#}_{i \le k} M_{h_i}(\mathbf{n}, \mathbf{f}) \ \cup^{\#}\ \bigcup^{\#} \{ M_{k_j}(y_j, \mathbf{n}, \mathbf{f}) \mid (y_j)_{j \le l} \in S \ \&\ j \le l \}.$$

This is provably a modulus of continuity for $f$. - Primitive recursion is treated similarly to, but more simply than, $\alpha$-recursion. So let me present the case of $\alpha$-recursion; $f$ is given by

$f(n, \mathbf{n}, \mathbf{f}) = g(n, \mathbf{n}, \lambda x.(f{\upharpoonright}n)(x, \mathbf{n}, \mathbf{f}), \mathbf{f})$.

We can assume by induction hypothesis that $g$ and $M_g$ have been introduced in a recursive extension of $(Z)^+ + (\text{QF-AC}_0)$ and that $M_g$ is provably a modulus of continuity for $g$. By an extension of the usual method for introducing primitive recursive functions we will introduce the $\alpha$-recursive functional $f$. The idea is to find, using $M_g$, finite approximations $u$40 to $f$, such that the value of $g$ at $n$, $\mathbf{n}$, $f{\upharpoonright}n$, and $\mathbf{f}$ is determined correctly. $u$ can be found uniformly in $n, \mathbf{n}, \mathbf{f}$.

Remark. Here is the reason why the modulus of continuity is introduced at all: we want to get finite approximations to the $\alpha$-recursively defined functional; the modulus of continuity is crucial in determining the domain of that approximation.

40 The finite approximations are always given in the following way: $u = \{(a_0, \ldots), \ldots, (a_k, \ldots)\}^{\#}$, such that $u(a_i)$ is the second component of the pair with first component $a_i$; clearly, $\{a_0, \ldots, a_k\}$ is the domain of $u$.


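Definition. $u$ is called a computation for $f$ at $n, \mathbf{n}, \mathbf{f}$ iff

(i) $u$ is a function with domain $\{a_0, \ldots, a_k\}$ (it is assumed that $a_0 \prec \ldots \prec a_k = n$);

(ii) $u(a_i) = g(a_i, \mathbf{n}, (u{\upharpoonright}^{*}a_i), \mathbf{f})$ for $i \le k$, where $(u{\upharpoonright}^{*}a_i)(x) = u(x)$ if $x = a_j$ and $j < i$, and $(u{\upharpoonright}^{*}a_i)(x) = 0$ otherwise;

(iii) $\{(r)_{00} : r \in M_g(a_i, \mathbf{n}, (u{\upharpoonright}^{*}a_i), \mathbf{f})\} \cap \{x : x \prec a_i\} \subseteq \{a_0, \ldots, a_{i-1}\}$ for all $i \le k$.

The conditions (i)-(iii) are quantifier-free and do not involve $f$. Now we show in $(Z)^+ + (\text{QF-AC}_0)$ the following claim:

$(\forall \mathbf{n}, \mathbf{f})(\forall n)(\exists u)[u \text{ is a computation for } f \text{ at } n, \mathbf{n}, \mathbf{f}]$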

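The proof proceeds by $\prec_\alpha$-induction on $n$ for the $\Sigma^0_1$-formula $(\exists u)[\ldots]$ with parameters $\mathbf{n}, \mathbf{f}$. If $n = 0$, we let $u$ be the finite function with domain $\{0\}$ and $u(0) = g(0, \mathbf{n}, \mathbf{0}, \mathbf{f})$ where $\mathbf{0}$ is the identically 0 function. (i) and (ii) are clearly satisfied, as $(u{\upharpoonright}^{*}0)(x) = \mathbf{0}(x)$ for all $x$. Notice for (iii) that

$\{(r)_{00} : r \in M_g(0, \mathbf{n}, (u{\upharpoonright}^{*}0), \mathbf{f})\} = \emptyset$.

If $n = \lambda \neq 0$, we have by induction hypothesis

$(\forall x \prec \lambda)(\exists u)[u \text{ is a computation for } f \text{ at } x, \mathbf{n}, \mathbf{f}]$;

we want to find a $u$ such that $u$ is a computation for $f$ at $\lambda, \mathbf{n}, \mathbf{f}$. Notice, first of all, that by QF-AC$_0$ we obtain from the induction hypothesis a function $\sigma$, such that $(\forall x \prec \lambda)[\sigma(x) \text{ is a computation for } f \text{ at } x, \mathbf{n}, \mathbf{f}]$. Define $\sigma^*(a) = \sigma(a)(a)$ if $a \prec \lambda$ and, otherwise, $\sigma^*(a) = 0$; $\sigma^*$ yields the value of $f$ for all $a \prec \lambda$. From the modulus of continuity $M_g$ applied to $\lambda, \mathbf{n}, (\sigma^*{\upharpoonright}\lambda), \mathbf{f}$ we obtain a finite sequence of ordinals $x_0, \ldots, x_{l-1}$, such that $x_0 \prec \ldots \prec x_{l-1} \prec x_l = \lambda$; this sequence is obtained by ordering the elements of $\{(r)_{00} : r \in M_g(\lambda, \mathbf{n}, (\sigma^*{\upharpoonright}\lambda), \mathbf{f})\}$. It is for these ordinals that values of $\sigma^*$ are needed to determine the value of $g(\lambda, \mathbf{n}, (\sigma^*{\upharpoonright}\lambda), \mathbf{f})$. Those values are given by the computations $\sigma(x_i)$, $i < l$. Define the finite function $v$ by $\bigcup^{\#}_{i<l} \sigma(x_i)$; its domain consists of $\lambda_0, \ldots, \lambda_i, \ldots, \lambda_k \prec \lambda$ and contains all the $x_i$, $i < l$. Extend it to $\lambda = x_l$ by joining to it the pair $(\lambda, g(\lambda, \mathbf{n}, (v{\upharpoonright}^{*}\lambda_k), \mathbf{f}))$; call the resulting finite function $u$. Noting that $v{\upharpoonright}^{*}\lambda_k = u{\upharpoonright}^{*}\lambda_k$ and $(v{\upharpoonright}^{*}\lambda)(x) = (\sigma^*{\upharpoonright}\lambda)(x)$ for $x = x_i$, $i < l$, we can see, using the fact that we are dealing with canonical moduli, that $u$ is a computation for $f$ at $\lambda, \mathbf{n}, \mathbf{f}$. This proves the claim.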

The claim allows us to introduce in a recursive extension of $(Z)^+ + (\text{QF-AC}_0)$ a functional $\sigma$ such that

$(\forall \mathbf{n}, \mathbf{f})(\forall n)[\sigma(n, \mathbf{n}, \mathbf{f}) \text{ is a computation for } f \text{ at } n, \mathbf{n}, \mathbf{f}]$.

We define explicitly $f(n, \mathbf{n}, \mathbf{f}) = \sigma(n, \mathbf{n}, \mathbf{f})(n)$ and show that $f$, so defined, satisfies the recursion equation $f(n, \mathbf{n}, \mathbf{f}) = g(n, \mathbf{n}, \lambda x.(f{\upharpoonright}n)(x, \mathbf{n}, \mathbf{f}), \mathbf{f})$. This is easily done by transfinite induction on $n$ (along $\prec_\alpha$); one makes use of the facts that $\sigma$ provides computations and that $M_g$ is a modulus of continuity of $g$. - To complete the treatment of $\alpha$-recursion we have to introduce a modulus of continuity $M_f$ for $f$. It suffices to show that $M_f$ can be defined by $\alpha$-recursion, e.g. by

$$M_f(n, \mathbf{n}, \mathbf{f}) = E \ \cup^{\#}\ \bigcup^{\#} \{ M_f(k, \mathbf{n}, \mathbf{f}) : k \prec n \ \&\ k \in (E)_{00} \}$$

where $E$ is $M_g(n, \mathbf{n}, \lambda x.(f{\upharpoonright}n)(x, \mathbf{n}, \mathbf{f}), \mathbf{f})$. This suffices, as the above argument for the introduction of $f$ can be adapted to introduce $M_f$. What remains to be done is to prove in $(Z)^+$ that $M_f$ is indeed a modulus of continuity for $f$. That can be established by $\prec_\alpha$-induction on $n$ (and is not difficult). Q.E.D.


The considerations in the proof of the lemma are elementary, except for the appeal to a conservation result and the use of transfinite induction up to $\alpha$ for a $\Sigma^0_1$-formula. If $\alpha < \omega_n$, that principle is available in $(\Sigma^0_{n+1}\text{-IA})^+$; by Sect. 3.1 we know that $(\Sigma^0_{n+1}\text{-IA})^+ + (\text{QF-AC}_0)$ is conservative over $(\Sigma^0_{n+1}\text{-IA})^+$ for $\Pi^1_1$-sentences. Thus we have:

3.2.6 Theorem. If $f$ is a nested $\alpha$-recursive functional, $\alpha < \omega_n$, then $f$ can be introduced in $(\Sigma^0_{n+1}\text{-IA})^+$.

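Now I want to consider briefly the case of (unnested) $\alpha$-recursion and establish that every $\alpha$-recursive functional, $\alpha < \omega_n$, can be introduced in $(\Sigma^0_n\text{-IA})^+$. The notion of an $\alpha$-recursive functional is given as the nested one, except that the schema (6) of nested $\alpha$-recursion is replaced by that for (unnested) $\alpha$-recursion:

(6') $f$ is defined by (unnested) $\alpha$-recursion from $g$ and $\theta$, where $\theta$ is called an $\alpha$-predecessor functional, when the following conditions are satisfied:

$f(n, \mathbf{n}, \mathbf{f}) = g(n, (f{\upharpoonright}n)(\theta(n, \mathbf{n}, \mathbf{f}), \mathbf{n}, \mathbf{f}), \mathbf{n}, \mathbf{f})$ with

$$(f{\upharpoonright}n)(k, \mathbf{n}, \mathbf{f}) = \begin{cases} f(k, \mathbf{n}, \mathbf{f}) & \text{if } k \prec n \prec \alpha, \\ 0 & \text{otherwise} \end{cases}$$

and

$$\theta(k, \mathbf{n}, \mathbf{f}) \begin{cases} \prec k & \text{if } 0 \prec k \prec \alpha, \\ = 0 & \text{otherwise.} \end{cases}$$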

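The counting functional $s_\theta$ is defined $\alpha$-recursively by $s_\theta(0, \mathbf{n}, \mathbf{f}) = 0$ and $s_\theta(n, \mathbf{n}, \mathbf{f}) = s_\theta(\theta(n, \mathbf{n}, \mathbf{f}), \mathbf{n}, \mathbf{f}) + 1$ if $0 \prec n \prec \alpha$; thus $s_\theta$ counts the number of steps in the strictly descending sequence formed by $k$, $\theta(k, \mathbf{n}, \mathbf{f})$, $\theta(\theta(k, \mathbf{n}, \mathbf{f}), \mathbf{n}, \mathbf{f})$, ..., $0$. For functionals $f$ that are not $\alpha$-predecessor functionals we let $s_f$ be 0. In Rose (p. 59) the following lemma is formulated; assume that $f$ is defined from $g$ and $\theta$ by $\alpha$-recursion and that $\varphi$ is given by primitive recursion from $g$ and $\theta$ as follows:

$\varphi(0, \mathbf{n}, n, z, \mathbf{f}) = z$,

$\varphi(y + 1, \mathbf{n}, n, z, \mathbf{f}) = g(n, \varphi(y, \mathbf{n}, \theta(n, \mathbf{n}, \mathbf{f}), z, \mathbf{f}), \mathbf{n}, \mathbf{f})$;

then $f(n, \mathbf{n}, \mathbf{f}) = \varphi(s_\theta(n, \mathbf{n}, \mathbf{f}), \mathbf{n}, n, g(0, 0, \mathbf{n}, \mathbf{f}), \mathbf{f})$. The idea underlying this construction will be used in the proof of the lemma.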
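The reduction of unnested recursion to primitive recursion via a counting functional can be checked numerically. The sketch below makes assumptions that go beyond the text: the ordering is the natural order on the natural numbers, the parameters are dropped, and the sample choices `theta(n) = n // 2` and `g(n, v) = v + n`, as well as all function names, are hypothetical.

```python
def unnested_recursion(g, theta):
    """f(n) = g(n, (f|n)(theta(n))), computed by direct descent."""
    def f(n):
        prev = f(theta(n)) if n > 0 else 0   # (f|0)(k) = 0 for every k
        return g(n, prev)
    return f


def count_steps(theta, n):
    """Number of steps in the descending sequence n, theta(n), ..., 0."""
    steps = 0
    while n > 0:
        n = theta(n)
        steps += 1
    return steps


def phi(g, theta, y, n, z):
    """phi(0, n, z) = z and phi(y + 1, n, z) = g(n, phi(y, theta(n), z))."""
    if y == 0:
        return z
    return g(n, phi(g, theta, y - 1, theta(n), z))


def via_counting(g, theta):
    """f(n) = phi(s(n), n, g(0, 0)), with s the counting function for theta."""
    return lambda n: phi(g, theta, count_steps(theta, n), n, g(0, 0))


theta = lambda n: n // 2      # sample predecessor function: theta(n) < n for n > 0
g = lambda n, v: v + n        # sample step function

direct = unnested_recursion(g, theta)
iterated = via_counting(g, theta)
```

Both definitions unfold the same descending chain; the second one only needs recursion on the number of steps, which is the content of the quoted lemma.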

3.2.7 Lemma. For any $\alpha$-recursive functional $f$, $\alpha < \varepsilon_0$, we can define an $\alpha$-recursive counting functional $s_f$. $f$ and $s_f$ can be introduced in a recursive extension of $(Z)^+$, and $s_f$ can be proved to be a counting functional for $f$.

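Proof (by induction on the build-up of $\alpha$-recursive functionals). Only the case of $\alpha$-recursion will be treated; so let $f$ be defined from $g$ using the $\alpha$-predecessor functional $\theta$. By induction hypothesis $g$, $s_g$, $\theta$, and $s_\theta$ are introduced in a recursive extension of $(Z)^+$. Define the iterate of $\theta$ by $\theta^0(n, \mathbf{n}, \mathbf{f}) = n$ and $\theta^{l+1}(n, \mathbf{n}, \mathbf{f}) = \theta(\theta^l(n, \mathbf{n}, \mathbf{f}), \mathbf{n}, \mathbf{f})$; the notion of $\theta$-computation is parallel to that of computation used in the proof of 3.2.3.

Definition. $u$ is called a $\theta$-computation for $f$ at $n, \mathbf{n}, \mathbf{f}$ iff (i) $u$ is a function with domain

$\{\theta^l(n, \mathbf{n}, \mathbf{f}) \mid l \le s_\theta(n, \mathbf{n}, \mathbf{f})\}$

and

(ii) $u(\theta^l(n, \mathbf{n}, \mathbf{f})) = g(\theta^l(n, \mathbf{n}, \mathbf{f}), (u{\upharpoonright}^{*}\theta^l(n, \mathbf{n}, \mathbf{f}))(\theta^{l+1}(n, \mathbf{n}, \mathbf{f})), \mathbf{n}, \mathbf{f})$ holds for $l \le s_\theta(n, \mathbf{n}, \mathbf{f})$. Here

$(u{\upharpoonright}^{*}n)(x, \mathbf{n}, \mathbf{f}) = u(x, \mathbf{n}, \mathbf{f})$ for $x = \theta^j(n, \mathbf{n}, \mathbf{f})$ and $0 < j \le s_\theta(n, \mathbf{n}, \mathbf{f})$;

the value of $(u{\upharpoonright}^{*}n)(x, \mathbf{n}, \mathbf{f})$ is otherwise 0. We want to establish the claim that for every $n$ there is a $\theta$-computation $u$ for $f$ at $n, \mathbf{n}, \mathbf{f}$. However, here we can do better by giving a bound $b$ on $u$. Clearly, it suffices to give a bound on the "value" of $u$, as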

u = { < 0 , ...>,..., < 0(n),...>, <n, ...>} And u is smaller than

{(Oi(n,n,f), r - i ,n, n,~,(O,O,n,f),f)) :i<=so(n,n,f) ~ .

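Now we prove in $(Z)^+$ the claim:

$(\forall \mathbf{n}, \mathbf{f})(\forall n)(\exists u \le b)[u \text{ is a } \theta\text{-computation for } f \text{ at } n, \mathbf{n}, \mathbf{f}]$.

The proof proceeds by $\prec_\alpha$-induction on $n$ for the quantifier-free formula $(\exists u \le b)[\ldots]$ with parameters $\mathbf{n}, \mathbf{f}$. If $n = 0$, we let $u = \{(0, g(0, 0, \mathbf{n}, \mathbf{f}))\}^{\#}$. If $n \neq 0$, then $\theta(n, \mathbf{n}, \mathbf{f}) \prec n$ and we have by induction hypothesis a $\theta$-computation $v$ for $f$ at $\theta(n, \mathbf{n}, \mathbf{f}), \mathbf{n}, \mathbf{f}$. Extend $v$ by

$\{(n, g(n, (v{\upharpoonright}^{*}n)(\theta(n, \mathbf{n}, \mathbf{f})), \mathbf{n}, \mathbf{f}))\}^{\#}$.

The resulting finite function $u$ is clearly a $\theta$-computation for $f$ at $n, \mathbf{n}, \mathbf{f}$. That establishes the claim. And the claim allows us to proceed as in the nested case: we can introduce $f$, prove its recursion equation, and define $s_f$. Q.E.D.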
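The inductive construction in this proof can be mirrored by a short sketch that builds the finite function step by step. As before, this is an illustration under assumptions not made in the text: the natural order replaces the ordinal ordering, parameters are suppressed, the finite function is a Python `dict` instead of a coded set, and `theta_chain`, `theta_computation` and the sample `theta`, `g` are hypothetical names.

```python
def theta_chain(n, theta):
    """The descending sequence n, theta(n), theta(theta(n)), ..., 0."""
    chain = [n]
    while chain[-1] != 0:
        chain.append(theta(chain[-1]))
    return chain


def theta_computation(n, g, theta):
    """Build the finite function u on the chain, starting at 0 and extending
    upwards, as in the induction step of the proof."""
    u = {}
    for a in reversed(theta_chain(n, theta)):
        # value of the restricted approximation at theta(a); it is 0 at the bottom
        prev = u[theta(a)] if a != 0 else 0
        u[a] = g(a, prev)
    return u


theta = lambda n: n // 2      # sample predecessor function
g = lambda n, v: v + n        # sample step function

u = theta_computation(5, g, theta)
```

The domain of `u` has exactly one more element than the number of steps counted along the chain, and `u[n]` is the value of the recursively defined function at `n`.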

We needed in this proof only $\prec_\alpha$-induction for quantifier-free formulas; that principle is available in $(\Sigma^0_n\text{-IA})^+$. So we have:

3.2.8 Theorem. If $f$ is an $\alpha$-recursive functional, $\alpha < \omega_n$, then $f$ can be introduced in $(\Sigma^0_n\text{-IA})^+$.

4 Concluding remarks

There are hosts of broad philosophical and detailed technical issues. At their intersection lie two of the most interesting and pertinent ones. They concern - on the one hand - the efficient extraction of terms ($t$ from derivations $D$ of existential statements $(\exists y)\varphi y$) and generation of associated proofs ($D^*$ of $\varphi t$), and - on the other hand - the determination of mathematically significant bounds from proofs. Any attempt to take on the first set of issues must take into account the growing connection between foundational work in computer science and proof theory; the links are provided by formal frames that serve both as (bases for) programming languages and as theories in which parts of mathematics can be developed.41 Any attempt to take on the second issue must involve substantive mathematical work, as witnessed strikingly by [Luckhardt]. It seems most likely that experience in these domains will lead to the discovery of important structural properties of mathematical proofs.

41 See, e.g. Feferman (1990) and Schwichtenberg (1988) - also for further references to the expanding literature.

I want to make two remarks on general directions that work - more strictly within proof theory - is (or might be) taking.

(1) Investigation of strong subsystems of analysis - the traditional constructive principles have been exhausted; to get closer to (and ultimately, beyond) $(\Pi^1_2\text{-CA})$ new "constructible" objects and principles, based on their build-up, have to be found. A deepened connection to set theoretic investigations, in particular of the fine structure of L and large cardinals, is to be achieved.

(2) Investigation of weak fragments of arithmetic (or other weak theories for finite objects) - the traditional syntactic analysis of formal theories is inadequate to obtain discriminating results for theories whose class of provably total functions is properly included in $\mathcal{E}^3$; to obtain such results will require differentiations that mirror directly the "combinatorial" complexity of finite mathematical objects.

Proof theory has left its pure metamathematical shell. We can expect genuine progress: in each case, familiar techniques and considerations will (merely) be fruitful starting-points.

References

Bellin, G.: Ramsey interpreted: a parametric version of Ramsey's theorem. Contemp. Math. 106, 17-37 (1990)
Buchholz, W., Sieg, W.: A note on polynomial time computable arithmetic. Contemp. Math. 106, 51-56 (1990)
Buchholz, W., Wainer, S.: Provably computable functions and the fast growing hierarchy. Contemp. Math. 65, 179-198 (1987)
Buss, S.: Bounded arithmetic. Naples: Bibliopolis 1986
Feferman, S.: Theories of finite type related to mathematical practice. In: Handbook of Mathematical Logic, pp. 913-971. Amsterdam: North-Holland 1977
Feferman, S.: Polymorphic typed lambda-calculi in a type-free axiomatic framework. Contemp. Math. 106, 101-136 (1990)
Feferman, S., Sieg, W.: Proof theoretic equivalences between classical and constructive theories for analysis. In: Lect. Notes in Math., vol. 987, pp. 78-142. Berlin Heidelberg New York: Springer 1981
Ferreira, F.: Polynomial time computable arithmetic. Contemp. Math. 106, 137-156 (1990)
Friedman, H., Simpson, S., Smith, R.: Countable algebra and set existence axioms. Ann. Pure Appl. Logic 25, 141-181 (1983)
Girard, J.-Y.: Proof theory and logical complexity. Naples: Bibliopolis 1987
Hilbert, D., Bernays, P.: Grundlagen der Mathematik II. Berlin: Springer 1939
Howard, W.A.: Hereditarily majorizable functionals of finite type. In: Lect. Notes Math., vol. 344, pp. 454-461. Berlin Heidelberg New York: Springer 1973
Kreisel, G.: On the interpretation of non-finitist proofs I. J. Symb. Logic 16, 241-267 (1951)
Kreisel, G.: On the interpretation of non-finitist proofs II. J. Symb. Logic 17, 43-58 (1952)
Kreisel, G.: Mathematical significance of consistency proofs. J. Symb. Logic 23, 155-182 (1958)
Luckhardt, H.: Herbrand-Analysen zweier Beweise des Satzes von Roth: Polynomiale Anzahlschranken. J. Symb. Logic 54, 234-263 (1989)
Mints, G.: Quantifier-free and one-quantifier systems. J. Sov. Math. 1, 71-84 (1973)
Mints, G.: What can be done in PRA? Zap. Nauchn. Semin. Leningr. Otd. Mat. Inst. Steklova 60, 93-102 (1976)
Parsons, C.D.: Reductions of inductions to quantifier-free induction. Notices Am. Math. Soc. 13, 740 (1966)
Parsons, C.D.: Ordinal recursion in partial systems of number theory. Notices Am. Math. Soc. 13, 857-858 (1966A)
Parsons, C.D.: On a number-theoretic choice schema and its relation to induction. In: Kino, Myhill, Vesley (eds.) Intuitionism and proof theory, pp. 459-473. Amsterdam: North-Holland 1970
Parsons, C.D.: On n-quantifier-induction. J. Symb. Logic 36, 466-482 (1972)
Ritchie, R.W.: Classes of recursive functions based on Ackermann's function. Pac. J. Math. 15, 1027-1044 (1963)
Rose, H.E.: Subrecursion - functions and hierarchies. Oxford: Oxford University Press 1984
Schwichtenberg, H.: Eine Klassifikation der $\varepsilon_0$-rekursiven Funktionen. Z. Math. Logik Grundlagen Math. 17, 61-74 (1971)
Schwichtenberg, H.: Elimination of higher type levels in definitions of primitive recursive functionals by means of transfinite recursion. In: Rose, Shepherdson (eds.) Logic Colloquium '73, pp. 279-303. Amsterdam: North-Holland 1975
Schwichtenberg, H.: Proof theory: some applications of cut-elimination. In: Handbook of Mathematical Logic, pp. 867-895. Amsterdam: North-Holland 1977
Schwichtenberg, H.: LCF with realizing terms: a framework for the development and verification of programs (Manuscript, 1988)
Sieg, W.: Fragments of arithmetic. Ann. Pure Appl. Logic 28, 33-71 (1985)
Sieg, W.: Reductions of theories for analysis. In: Dorn, Weingartner (eds.) Foundations of Logic and Linguistics, pp. 199-231. New York: Plenum Press 1985A
Sieg, W.: Provably recursive functionals of theories with König's Lemma. Rend. Semin. Mat. Torino, Fascicolo speciale 1987, pp. 75-92 (1987)
Sieg, W.: Herbrand Analyses. Abstract for the European Summer Meeting of The Ass. Symbolic Logic, Berlin, 1989
Schütte, K.: Beweistheorie. Berlin Heidelberg New York: Springer 1960
Simpson, S., Smith, R.: Factorization of polynomials and $\Sigma^0_1$ induction. Ann. Pure Appl. Logic 31, 289-306 (1986)
Tait, W.W.: Nested recursion. Math. Ann. 143, 236-250 (1961)
Tait, W.W.: Infinitely long terms of transfinite type. In: Crossley, Dummett (eds.) Formal systems and recursive functions, pp. 176-185. Amsterdam: North-Holland 1965
Tait, W.W.: Normal derivability in classical logic. In: Barwise, J. (ed.) The syntax and semantics of infinitary languages. (Lect. Notes Math., vol. 72, pp. 204-236) Berlin Heidelberg New York: Springer 1968
Takeuti, G.: Proof theory, 2nd edn. Amsterdam: North-Holland 1987
Wainer, S.: A classification of the ordinal recursive functions. Arch. Math. Logik 13, 136-153 (1970)
Wilkie, A.J., Paris, J.B.: On the scheme of induction for bounded arithmetic formulas. Ann. Pure Appl. Logic 35, 261-302 (1987)