HPSG without PS?

Richard Hudson, UCL

draft: August 1995

1. Introduction

There are two ways of thinking about the structure of a simple example such as Small babies cry: in terms of phrases and the relations among their parts or in terms of the words and their relationships to one another. At a very elementary level, the two approaches could be diagrammed as in Fig. 1.

Fig. 1

The arrows will be explained shortly, but their main significance is to represent some kind of `horizontal' word-word dependency, in contrast with the vertical relationships (following the diagonal lines) in the first diagram. The approaches are based respectively on phrase-structure (PS) and dependency-structure (DS). Both diagrams show that small babies combine to form a phrase, but they show it in different ways. In the PS analysis the phrase itself is explicit, and the word-word relationship is implicit: node 4 stands for the phrase, and the lines connect 1 and 2 to 4, but not to each other. For the DS analysis this balance is reversed: the arrow shows the word-word relationship explicitly, but the resulting phrase is left implicit.
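The claim that the DS analysis leaves the phrase implicit, yet recoverable, can be illustrated with a minimal sketch. The Python below is an illustration only and belongs to neither theory's formalism: the names `head_of`, `phrase` and `phrase_text` are hypothetical, and the encoding assumes that small depends on babies and babies on cry.

```python
# Hypothetical encoding of a DS analysis of "Small babies cry".
words = ["Small", "babies", "cry"]

# Each word's index is mapped to the index of the word it depends on.
# None marks the root, which depends on nothing.
head_of = {0: 1, 1: 2, 2: None}


def phrase(head, head_of):
    """Return the sorted indices of `head` plus all its direct and indirect dependents."""
    members = {head}
    changed = True
    while changed:
        changed = False
        for dependent, its_head in head_of.items():
            if its_head in members and dependent not in members:
                members.add(dependent)
                changed = True
    return sorted(members)


def phrase_text(head):
    """Spell out the phrase headed by the word at index `head`."""
    return " ".join(words[i] for i in phrase(head, head_of))


print(phrase_text(1))  # the phrase headed by "babies"
print(phrase_text(2))  # the phrase headed by "cry"
```

On this encoding a phrase is simply a word together with everything that depends on it, so the information carried by a phrase node can be computed from the word-word arcs when it is wanted.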

The purpose of this paper is to argue that syntactic analysis which includes DS should not use PS as well; more precisely, I shall argue that this is true of most of syntax, in which constructions are headed (and involve subordination), though I shall not try to argue the same for coordination. (Indeed, I have argued elsewhere that coordination is precisely the one area of syntax where PS is appropriate; see Hudson 1990: chapter 14.)

Virtually everyone accepts DS as part of syntax, even if not by name - the notion `long-distance dependency' makes it explicit, but government, agreement, valency (alias subcategorization), and selection are all horizontal dependency relationships, and all word order rules seem to be expressible as dependencies. Similarly, most theories now recognise `grammatical relations' such as head, complement, adjunct and subject; although usually expressed in terms of a function in the larger phrase, these can all be translated easily into types of word-word dependency. As PS was originally defined by Chomsky, none of these notions was available; so there really was no alternative to PS as a basis for syntactic analysis. But now that so many dependency notions are available in most syntactic theories, it is time to ask whether we still need PS as well.

The question applies particularly urgently to Head-driven Phrase Structure Grammar (HPSG; Pollard & Sag 1994) as can be seen from the simplified version of Pollard and Sag's analysis of Kim gives Sandy Fido (p. 33) in Fig. 2.

Fig. 2

The most interesting thing about this diagram is the way the verb's structure cross-refers directly to the nouns by means of the numbers [1], [2] and [3]. These cross-references are pure DS and could be displayed equally well by means of dependency arcs. Almost equally interesting is the way in which the verb shares its class-membership, indexed as [4], with the VP and S nodes. An even simpler way to show this identity would be to collapse the nodes themselves into one. The only contribution that the phrase nodes make is to record the word-word dependencies via their `SUBCAT' slots: the top node records that the verb's subcategorization requirements have all been met (hence the empty list for SUBCAT), while the VP node shows that it still lacks one dependent, the subject. This separation of the subject from other dependents is the sole independent contribution that PS makes in this diagram; but why is it needed? Pollard and Sag argue persuasively (Chapter 6) against using the VP node in binding theory; they allow languages with free constituent order to have flat, VP-less structures (40); and in any case HPSG recognises a separate functional slot for subjects (345). It is therefore important to compare the HPSG diagram in Fig. 2 with its pure-DG equivalent in Fig. 3.

Fig. 3
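The bookkeeping performed by the phrase nodes in Fig. 2 can be mimicked in a few lines, which may help in judging how much independent information those nodes carry. The Python below is a rough sketch under stated assumptions, not Pollard and Sag's formalism: `saturate` is a hypothetical helper, SUBCAT values are simplified to lists of tagged strings ordered subject-first, and the `arcs` list is just one possible encoding of the dependency analysis.

```python
# Simplified SUBCAT list for "gives"; the tags mirror the
# cross-references [1], [2] and [3] in the HPSG diagram.
subcat_gives = ["[1] Kim", "[2] Sandy", "[3] Fido"]


def saturate(subcat, found):
    """Return what is left of `subcat` once the dependents in `found` have been combined."""
    return [item for item in subcat if item not in found]


# VP node: the two non-subject complements have been found.
vp_subcat = saturate(subcat_gives, {"[2] Sandy", "[3] Fido"})

# S node: the subject has been found as well.
s_subcat = saturate(vp_subcat, {"[1] Kim"})

# The same word-word information as (head, dependent) arcs,
# with no separate nodes for VP or S (an assumed encoding).
arcs = [("gives", "Kim"), ("gives", "Sandy"), ("gives", "Fido")]

print(vp_subcat)  # SUBCAT at the VP node
print(s_subcat)   # SUBCAT at the S node
```

In this toy version the SUBCAT values at VP and S follow from the verb's own list plus the arcs; whether anything is lost in the full theory is the question raised here.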

What empirical difference is there between these two diagrams? What does Fig. 3 lose, if anything, by not having a separate node for the sentence? Could an analysis like Fig. 3 even have positive advantages over Fig. 2? Questions like these are hardly ever raised, still less taken seriously. Pollard and Sag go further in this respect than most syntacticians by at least recognising the need to justify PS:

But for all that a theory that successfully dispenses with a notion of surface constituent structure is to be preferred (other things being equal, of course), the explanatory power of such a notion is too great for many syntacticians to be willing to relinquish it. (p. 10)

Unfortunately they do not take the discussion further; for them the `explanatory power' of PS is self-evident, as it no doubt is for most syntacticians. The evidence may be robust and overwhelming, but it should be presented and debated. A reading of the rest of Pollard and Sag's book yields very few examples of potential evidence. PS seems to play an essential role only in the following areas of syntax:

in adjunct recursion (55-6),

in some kinds of subcategorization where S and VP have to be distinguished (125),

in coordination (203),

in the analysis of internally-headed relative clauses, for which they suggest a non-headed structure with N' dominating S (233).

Apart from coordination (where, as mentioned earlier, I agree that PS is needed), the PS-based analysis in each of these areas is at least open to dispute, though the dispute may of course turn out in Pollard and Sag's favour.

The question, then, is whether a theory such as HPSG, which is so well-endowed with machinery for handling dependencies, really needs PS as well. My personal view is that PS can now be thrown away, having served its purpose as a kind of crutch in the development of sophisticated and explicit theories of syntax; but whether or not this conclusion is correct, our discipline will be all the stronger for having debated the question. The rest of the paper is a contribution to this debate in which I present, as strongly as I can, the case for doing away with PS. The basis for my case will not be simply that PS is redundant, but that it is positively harmful because it prevents us from capturing valid generalisations. My main case will rest on the solutions to two specific syntactic problems: the interaction of ordinary wh-fronting with adverb-fronting as in (1), and the phenomenon in German and Dutch called `partial-VP fronting', illustrated by (2).

(1) Tomorrow what shall we do?

(2) Blumen geben wird er seiner Frau.

Flowers give will he to-his wife. `He'll give his wife flowers.'

First, however, I must explain how a PS-free analysis might work.

2. Word Grammar

My aim as stated above is `to argue that syntactic analysis [of non-coordinate structures] which includes DS should not use PS as well'. Clearly it is impossible to prove that one PS-free analysis is better than all possible analyses that include PS, so the immediate goal is to compare two specific published theories, one with PS and the other without it, in the hope of being able to isolate this particular difference from other differences.

Fortunately there are two such theories: HPSG and Word Grammar (WG; see Hudson 1984, 1990, 1992, 1993, 1994, forthcoming; Fraser and Hudson 1992; Rosta 1994). Apart from the presence of PS in HPSG and its absence from WG, the two theories are very similar:

both are `head-driven' in the sense that constructions are sanctioned by information on the head word;

both include a rich semantic structure in parallel with the syntactic structure;

both are monostratal;

both are declarative;

both make use of inheritance in generating structures;

neither relies on tree geometry to distinguish grammatical functions;

both include contextual information about the utterance event (e.g. the identities of speaker and hearer) in the linguistic structure; and perhaps most important of all for present purposes,

both allow `structure sharing', in which a single element fills more than one structural role.

Admittedly there are some theoretical differences as well:

HPSG allows phonologically empty elements,

HPSG distinguishes complements from one another by means of the ordered SUBCAT list rather than by explicit labels such as `object'.

And not surprisingly there are disagreements in published accounts over the vocabulary of analytical categories (e.g. Pollard and Sag's `specifier' and `marker') and over the analysis of particular constructions (e.g. Hudson's analysis of determiners as pronouns and total rejection of case for English; see Hudson 1990: 268ff, 230ff; 1995a). However, these differences, both theoretical and descriptive, are only indirectly related to the question about the status of PS, so we can ignore them for present purposes.

One problem in comparing theories is to find a notation which does justice to both. The standard notation for HPSG uses either attribute-value boxes-within-boxes or trees, both of which are specific to PS, whereas DS structures are usually shown in WG by means of arrows between words whose class-membership is shown separately. To help comparison we can start by using a compromise notation which combines WG arrows with the HPSG unification-based notation, so that the information supplied by the grammar (including the lexicon) will be a partial description of the structures in which the word concerned may be used. For example, a noun normally depends on another word, to which it is connected by an arrow, and (obviously) can be used only at a node labelled `noun'; so Mary emerges from the grammar with the point of an arrow (whose shaft will eventually connect it t