
Uriagereka J. Derivations. Exploring the Dynamics of Syntax




DERIVATIONS

Derivations draws together some of the most influential work of one of the world's leading syntacticians, Juan Uriagereka. These essays provide several empirical analyses and technical solutions within the Minimalist Program. The book pursues a naturalistic take on Minimalism, explicitly connecting a variety of linguistic principles and conditions to arguably analogous laws and circumstances in nature.

The book can be seen as an argument for a computational approach to the language faculty. It presents an analysis of various central linguistic notions such as Case, agreement, obviation, rigidity, from a derivational perspective. Concrete studies are provided, covering phenomena ranging from the nature of barriers for extraction to the make-up of categories, and using data from many languages including Spanish, English and Basque.

This book will be of interest not only to the working syntactician, but also to those more generally concerned with word architecture.

Juan Uriagereka is Professor at the University of Maryland. He is the author of Rhyme and Reason and co-editor of Step by Step. He has recently been awarded the sixth Euskadi Prize for scientific research by the Basque Government.


ROUTLEDGE LEADING LINGUISTS
Series editor Carlos P. Otero

1 ESSAYS ON SYNTAX AND SEMANTICS
James Higginbotham

2 PARTITIONS AND ATOMS OF CLAUSE STRUCTURE
Subjects, agreement, case and clitics

Dominique Sportiche

3 THE SYNTAX OF SPECIFIERS AND HEADS
Collected essays of Hilda J. Koopman

Hilda J. Koopman

4 CONFIGURATIONS OF SENTENTIAL COMPLEMENTATION
Perspectives from Romance languages

Johan Rooryck

5 ESSAYS IN SYNTACTIC THEORY
Samuel David Epstein

6 ON SYNTAX AND SEMANTICS
Richard K. Larson

7 COMPARATIVE SYNTAX AND LANGUAGE ACQUISITION
Luigi Rizzi

8 MINIMALIST INVESTIGATIONS IN LINGUISTIC THEORY
Howard Lasnik

9 DERIVATIONS
Exploring the dynamics of syntax

Juan Uriagereka


DERIVATIONS

Exploring the dynamics of syntax

Juan Uriagereka

London and New York


First published 2002 by Routledge

11 New Fetter Lane, London EC4P 4EE

Simultaneously published in the USA and Canada by Routledge

29 West 35th Street, New York, NY 10001

Routledge is an imprint of the Taylor & Francis Group

© 2002 Juan Uriagereka

All rights reserved. No part of this book may be reprinted or reproduced or utilized in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Cataloging in Publication Data
Uriagereka, Juan.
Derivations : exploring the dynamics of syntax / Juan Uriagereka.
p. cm.
Includes bibliographical references.
1. Grammar, Comparative and general – Syntax. 2. Minimalist theory (Linguistics) I. Title.
P291 .U748 2002
415–dc21 2001058962

ISBN 0-415-24776-4 (Print Edition)

This edition published in the Taylor & Francis e-Library, 2005.

"To purchase your own copy of this or any of Taylor & Francis or Routledge's collection of thousands of eBooks please go to www.eBookstore.tandf.co.uk."

ISBN 0-203-99450-7 (Master e-book ISBN)


FOR ELENA, ISABEL AND THE LITTLE ONE


CONTENTS

Acknowledgments ix

1 Introduction 1

2 Conceptual matters 22

PART I Syntagmatic issues 43

3 Multiple spell-out 45

4 Cyclicity and extraction domains (with Jairo Nunes) 66

5 Minimal restrictions on Basque movements 86

6 Labels and projections: a note on the syntax of quantifiers (with Norbert Hornstein) 115

7 A note on successive cyclicity (with Juan Carlos Castillo) 136

8 Formal and substantive elegance in the Minimalist Program: on the emergence of some linguistic forms 147

PART II Paradigmatic concerns 177

9 Integrals (with Norbert Hornstein and Sara Rosen) 179


10 From being to having: questions about ontology from a Kayne/Szabolcsi syntax 192

11 Two types of small clauses: toward a syntax of theme/rheme relations (with Eduardo Raposo) 212

12 A note on rigidity 235

13 Parataxis (with Esther Torrego) 253

14 Dimensions of natural language (with Paul Pietroski) 266

15 Warps: some thoughts on categorization 288

Notes 318
Bibliography 347
Index 358


ACKNOWLEDGMENTS

Since this constitutes the bulk of what I have written about in the last few years, to do justice to everyone who has contributed to it would take me another book. My debt to my co-authors (Norbert Hornstein, Paul Pietroski, Eduardo Raposo, Sara Rosen, Esther Torrego and my former students Juan Carlos Castillo and Jairo Nunes) can hardly be expressed; not only did they make my life easier, they also made the collaborative pieces, by far, the best ones. Similarly, I cannot repay a fraction of what I got from those who have influenced me the most, in particular those mentioned in the introduction. I am especially thankful to those linguists who sat in my classes, from Maryland and several other institutions, and those who allowed me to be on their thesis committee, at their own risk. Only for them and with them, have these pages begun to make any sense. I thank Carlos Otero for his guidance and for giving me the opportunity to put these ideas together, and the editorial staff of Routledge for making it possible, especially Rosemary Morlin. Thanks also to my editorial assistants Acrisio Pires and Ilhan Cagri, as well as to Haroon and Idris Mokhtarzada, who designed the graphs in the last chapters. Finally, I more than thank my wife for having given me a daughter who is truly in a different dimension.

Chapter 2, Section 1, "Book review: Noam Chomsky, The Minimalist Program," is reprinted from Lingua 107: 267–73 (1999), with permission from Elsevier Science.

Chapter 2, Section 2, "On the emptiness of 'design' polemics," is reprinted from Natural Language and Linguistic Theory 18.4: 863–71 (2000), with permission from Kluwer Academic Publishers.

Chapter 2, Section 3, "Cutting derivational options," is reprinted from Natural Language and Linguistic Theory 19.4 (2001), with kind permission from Kluwer Academic Publishers.

Chapter 3, "Multiple spell-out," is reprinted from S. D. Epstein and N. Hornstein (eds) Working Minimalism, 251–82 (1999), with permission from MIT Press.

Chapter 4, "Cyclicity and extraction domains," with Jairo Nunes, is reprinted from Syntax 3:1: 20–43 (2000), with permission from Blackwell Publishers.


Chapter 5, "Minimal restrictions on Basque movements," is reprinted from Natural Language and Linguistic Theory 17.2: 403–44 (1999), with permission from Kluwer Academic Publishers.

Chapter 8, "Formal and substantive elegance in the Minimalist Program: on the emergence of some linguistic forms," is reprinted from C. Wilder, H.-M. Gaertner and M. Bierwisch (eds) Studia Grammatica 40: The Role of Economy Principles in Linguistic Theory, 170–204 (1996), with permission from Akademie Verlag.

Chapter 10, "From being to having: questions about ontology from a Kayne/Szabolcsi syntax," is reprinted from A. Schwegler, B. Tranel and M. Uribe-Etxebarria (eds) Romance Linguistics: Theoretical Perspectives (Current Issues in Linguistic Theory 160), 283–306 (1998), with permission from John Benjamins.

Chapter 11, "Two types of small clauses: Toward a syntax of theme/rheme relations," with Eduardo Raposo, is reprinted from A. Cardinaletti and M. T. Guasti (eds) Syntax and Semantics, Volume 28: Small Clauses, 179–206 (1995), with permission from Academic Press.

Chapter 12, "A note on rigidity," is reprinted from A. Alexiadou and C. Wilder (eds) Possessors, Predicates and Movement in the Determiner Phrase (Linguistics Today 22), 361–82 (1998), with permission from John Benjamins.

Chapter 15, "Warps: Some thoughts on categorization," is reprinted from Theoretical Linguistics 25:1, 31–73 (1999), with permission from Walter de Gruyter.


1

INTRODUCTION

This book is called Derivations for two reasons. First, the work is largely derivative, especially on the research of Noam Chomsky. To some extent that describes the work of most generative linguists, but mine derives also from work by others – notably, Howard Lasnik, my teacher, Richard Kayne and James Higginbotham – in pretty much the way a composer's resumé presents variations on great pieces by the classical masters. I have no problem playing second (or nth) fiddle to these people, among other things because this allows me to perform at a level that I would otherwise never achieve, and that is both fun and (for me at least) useful. This is all to be honest with the reader who may dislike the research of these linguists; mine will follow suit. Hopefully the correlation also works in the opposite direction.

The second reason I call the book Derivations is more technical, and inasmuch as the term is often seen as opposing "representation," it needs some explanation.

1 The notion of “representation” in linguistics and elsewhere

The word "representation" is used in at least two different senses. A technical use, which I will write in bold in this introduction, is common in linguistics; a more general one, in philosophy. Although these uses are somewhat connected, they must be kept apart in principle.

The linguistic notion of representation is inherited from the tradition of concatenation algebras that gave rise to generative grammar. A level of representation must include:

(1) Level of representation
(i) a vocabulary of symbols
(ii) a procedure to form associations among symbols
(iii) a class of acceptable formal objects that the particular level admits
(iv) a unification procedure
(v) a procedure to map different levels among themselves.

Examples of (i) are nouns or verbs at one level, consonants or vowels at another, and so on. Concatenation, or some looser association, instantiates (ii). Instances of (iii) are valid phrases, valid syllables and so on. The idea that a sentence must have a subject and a predicate, or whatever unifies words, are good examples of unification as in (iv). Finally, the traditional mapping of S-structure from D-structure in the Standard System illustrates the role of (v).

The notions so defined are called levels of representation, and not just "levels," because the way in which one level relates to the next is through a particular relation called "representation," or "rho" in shorthand. This notion is actually more familiar to linguists in its "is-a" guise, the converse of "rho." The way a level of representation is structured is through its "is-a" relations with regard to objects of the next level. For example, we can take words as output objects of a morpho-phonemic level of representation and ask how these objects relate to the next level up, that of phrases. Clearly we want to say that, in the sentence John loves Mary, for instance, loves Mary is a VP, whereas that relation is not defined for John loves. Identically, VP "rhos" or represents loves Mary. In comparable ways we would have said that Mary represents [m] [æ] [r] [i], and similarly for other relevant pieces of information in the syntactic object (e.g. a "chain," or set of phrase-markers linked via the movement transformation, could in these terms be said to "rho" its links or component parts).

A full characterization of representations is what the linguistics program is ultimately about, and the hypothesis that the language faculty is organized in terms of these particular layers of distinctive structure is obviously substantive. Indeed, it is not necessary to characterize linguistic knowledge in these particular terms. In particular, it is possible that linguistic stuff does not cohere into particular layers of structure with the characteristics just outlined. Instead of it being organized in terms of relevant notions of the form in (iii) above, unified as in (iv), and mapped to the next layer as in (v), it could well be that linguistic information is scattered around a dynamic computational procedure, with no such thing as specific objects and operations of the form in (iii) ever arising in the unified terms in (iv) – thus with no overall mapping of the sort implied in (v). Or to go to the other extreme: perhaps there is a single level of representation where all relevant formal objects coexist, without there being any sense in declaring that given objects are logically prior to others, thus making at least (v) irrelevant.

Linguists speak of "purely representational" systems whenever they are dealing with a formal model which does not have any derivational properties. A derivation, in the linguistic sense, is a finite sequence of computational steps, with a definite beginning and a definite end, which are taken one step at a time according to some pre-established rules. Derivational systems manipulate non-terminal and terminal symbols, according to concrete permissible productions, set in motion through a starting axiom (e.g. S→NP VP). Representations are established between non-terminals and strings of terminals. In this sense, a standard derivational system is partly representational, in fact heavily so if representations cohere into levels as described above, so that Level n-1 precedes Level n, and so on.
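To make the notion of a derivation concrete, here is a minimal sketch of such a rewrite system – my own toy illustration, not the formalism of any chapter in this book – with a starting axiom, permissible productions, and a finite sequence of steps; the tiny grammar and vocabulary are invented purely for the example.

import random

# Toy derivational system: non-terminals are rewritten by permissible
# productions, one step at a time, starting from the axiom S.
PRODUCTIONS = {
    "S":  [["NP", "VP"]],            # starting axiom: S -> NP VP
    "NP": [["John"], ["Mary"]],
    "VP": [["V", "NP"]],
    "V":  [["loves"], ["saw"]],
}

def step(symbols):
    # Rewrite the leftmost non-terminal; return None once only terminals remain.
    for i, sym in enumerate(symbols):
        if sym in PRODUCTIONS:
            return symbols[:i] + random.choice(PRODUCTIONS[sym]) + symbols[i + 1:]
    return None

line = ["S"]
while line is not None:
    print(" ".join(line))            # each printed line is one derivational step
    line = step(line)
# One possible run: S / NP VP / John VP / John V NP / John loves Mary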

Right away we must say that there should be no temptation to understand these dynamic metaphors ("productions set in motion," "Level n-1 precedes Level n") in a temporal sense. While such an interpretation has been advanced in the literature, it is by no means necessary: it is possible, I would even say desirable, to understand each rule application as a timeless logical process, as a matter of separating the linguistic competence of a speaker from how it is put to use in a given environment.

A purely representational system lacks all of that computational machinery, and in this sense may stand in a super-case relation with regard to a derivational system (although not all such systems do). The way to characterize a purely representational system is through formal axioms of some sort or another, which determine, in the form of admissibility conditions, which particular set-theoretic objects are recognized as valid formal objects. Of course, a derivational system is that too, but what counts as admissible in that system is what conforms to the computational pattern sketched above.

In contrast to all that technical talk of representations, the word "represent" is given a dozen or so entries in good dictionaries, the primary ones revolving around the notions of "portraying" or "embodying" something. These uses have given rise to classical philosophical concerns over the ultimate reference of symbols. For instance the name "Jack Kennedy" is said to represent the President of the US assassinated in 1963 (ideally one should now be looking at the man himself). Moreover, a slightly more exotic use of "represent" (listed by Webster's dictionary as "rare") is more pertinent to a recent philosophical tradition: "to bring before the mind, come to understand." This is the leitmotiv behind the "representational theory of mind," which in the works of Jerry Fodor and others has used the metaphor of the Turing machine and its computational representation as a model for mind in general.

Both of those philosophical uses have some bearing on linguistics. The classical understanding of "representation" is relevant to many semantic theories, for instance. This notion is entirely at right angles with everything I have to say in this book, even those chapters where reference is discussed. Only rabidly anti-mentalist theories of reference would be incompatible with the rather abstract semantic proposals in this book, and I have nothing to contribute to this topic one way or the other.

The modern philosophical understanding of "representation" is only a bit more pertinent to the ideas in the foregoing pages. Philologically, the term representation in concatenation algebras was probably borrowed from a philosophical use, although the moment that the notion "rho" is given its technical meaning, philology becomes a curiosity. Similarly, the notion "work" in physics may have its origins in British labor relations, but this matters very little to calculation in joules.

Then again, I certainly provide in these pages an account of such technical linguistic notions as "command," "Case," "agreement," or even "noun" and "verb." The philosopher concerned with the representational theory of mind may be interested in whether notions of that ilk are represented somewhere, in the sense that they are "brought (in some form) before the mind" as one "comes to understand" them (in some form, also). Of course, both derivational and representational systems in principle face this sort of question; the fact that the latter are called "representational" does not make them more or less so in philosophical terms, vis-à-vis derivational alternatives.

I ought to clarify right away that – respectable though the philosophical question may be – I do not think that linguistic symbols and procedures have too much to add to its elucidation. I say this because I doubt that VP, or the first segment [m] in Mary, or any such element, represents (in the philosophical understanding of the term) a phrase or a consonant or whatever. There is no serious sense – at least no linguistic sense I can think of – that VP or [m] (or for that matter whatever those shorthands boil down to, things like [+consonant, +nasal, etc.]) "bring before the mind" a given phrase or consonant, or one "comes to understand" those elements as a result of VP or [m]. In fact, so far as I can see those elements themselves are (they do not represent) a given phrase or a given consonant.

One can, even ought to, ask what it means for VP or [m], etc. to be linguistic objects. There is a non-trivial question about what it means for those objects to be mental, which I return to in passing in the next section. But it is not clear to me how involving the extra representational assumption does anything toward clarifying this very difficult issue. I suppose that, ultimately, objects like these reduce to energy patterns of some sort, but only because in this sense everything else does – including (literally) the shapes of our various organs or the way in which our immune system operates. Needless to say, claiming that these things are (complex) energy patterns says nothing very deep, given the gap that exists between our understanding of energy patterns and the way organs, systems, mind and so on, come out. But by the same standards, claiming that VP, [m] and so on, represent something or other, given our present level of understanding, amounts to adding a claim about linguistic reality without much empirical basis, so far as I can tell.

2 Five differences between derivational and representational systems

In this book I try to navigate mainly the derivational ocean, to find out what would count as a (minimalist) explanation from that perspective, and what general properties we expect the language faculty to have under those circumstances. Others study representational alternatives, and hopefully we will eventually find out, with the attained knowledge, how to decide on whether one alternative is superior, whether they are irreducible yet both necessary, or whether something deeper is common to both. I want to state in this introduction, though, what I take to be the main predictions of each sort of system, and what general theoretical moves favor one or the other take, as best as I understand them.

It ought to be insisted on at this point that, since there is no temporal claim being made in any of the discussion above, it would be fallacious to distinguish a dynamic derivational system from a static representational one in terms of whether or not the mind/brain acts dynamically or statically in performance. Under current understanding of all these notions (especially whether the mind/brain acts dynamically, statically, or in some mixed, or deeper way) speculations in this respect seem senseless.

As I said, derivational systems may be sub-cases of many representational systems, though this is not necessary. A purely representational system is one whose filtering conditions simply cannot be expressed as internally motivated derivational steps. Optimality Theory (OT) is one such system. Traditional (Paninian) rule-based systems work on input structures to generate, in a stepwise fashion, appropriate output structures. OT theorists question this mode of operation, arguing that many grammatical forms cannot be generated in this way with both an appropriate level of explanatory adequacy and lack of internal paradoxes for the computational system. Typically, paradoxical situations arise if computational rules R1 and R2 are logically ordered, but the only way one could generate (grammatical) object O is by applying R1, then R2 and then R1 again, to the very same object. In contrast, this sort of situation is not particularly problematic for OT systems, which work on abstract, input set-theoretic objects which get filtered by the system in terms of violable (soft) output constraints ranked in various fashions, in much the same way as standard connectionist networks proceed. This constitutes the first difference of a serious sort between derivational and representational systems.
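Purely by way of illustration of that contrast (a toy of my own, not an actual OT analysis; the constraints are loosely modeled on the familiar NoCoda/Dep/Max families and the candidate forms are invented): in an OT-style evaluation nothing is built step by step; fully formed candidates are compared against ranked violable constraints and the least offending one wins.

def no_coda(cand, base):      # penalize syllables ending in a consonant
    return sum(1 for syll in cand.split(".") if syll[-1] not in "aeiou")

def dont_insert(cand, base):  # penalize segments added relative to the input
    return max(0, len(cand.replace(".", "")) - len(base))

def dont_delete(cand, base):  # penalize segments removed relative to the input
    return max(0, len(base) - len(cand.replace(".", "")))

def winner(base, candidates, ranking):
    # Violation profiles are compared lexicographically, highest-ranked constraint first.
    return min(candidates, key=lambda c: tuple(con(c, base) for con in ranking))

# Input /pat/ with three candidate outputs: faithful, epenthetic, and truncated.
print(winner("pat", ["pat", "pa.ta", "pa"], [no_coda, dont_insert, dont_delete]))  # -> "pa"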

Note that this difference can push matters in the derivational direction also. For instance, imagine that object O must a fortiori be described as involving rule ordering, as in some sense rule R1 has the effect of destroying the context for another instance of this rule to apply, so that only R2 can apply next. If such a situation arises, the emerging ordering may not be describable in purely representational terms, as it may well be that some crucial information that was present in the course of the derivation disappears by the end of it, and is not recoverable.

In any case, both of these formal circumstances, one showing a representational bias and the other a derivational one, have to be qualified in terms of at least two things. One is the possibility that there should always be a representational residue (e.g. a copy trace) of any rule application, in which case strictly speaking no rule would entirely destroy the context for its own later application, or that of another rule, if the residue is fully informative. Second, the object O where rules R1 and R2 apply may be structured in two "cycles" C1 and C2 (i.e. C2[… C1[…]…]), so that rules R1 and R2 apply within C1, and next R1 applies again at C2. As will be clear throughout, motivating cyclic domains is a central topic of this book.

Purely derivational systems expect ultimately no stability points of the representational sort, whereby a set of representations can be adequately unified into a particular level. For example, in the transition from the Government and Binding (GB) model to minimalism, D-structure as a level of representation vanished, for empirical reasons. As a result, the system could intersperse D-structure-like objects (phrases) with S-structure-like objects (chains), in a strictly cyclic fashion. Whereas in any Extended Standard version of the system one would first complete D-structure and then go back down the phrase-marker to start generating transformations, minimalism does both things simultaneously, relying on economy principles to decide on which sort of step comes first. In that sense, minimalism is obviously less representational than GB, although standard minimalism (if there is such a thing) still retains representational residues in LF or PF, levels of representation in the sense above. Some researchers, myself included, have been exploring the possibility that this sort of residue is unnecessary too, and the entire model can be purely cyclic, with no proper levels.

We say that a derivation converges at a level of representation like PF or LF if it meets whatever legibility conditions these levels impose; otherwise it crashes. This is the principle of Full Interpretation, demanding the legibility of syntactic representations. In its most radical version, a purely derivationalist system with no levels of representation should not have a coherent notion of convergence as such, and thus not obey the legibility conditions of Full Interpretation. This is neither good nor bad a priori: it depends on whether the system itself actually generates all and only the sorts of objects that convergence/legibility is supposed to sanction. The focus of the system, and the resources, are different, however. Thus a partly representational system can generate object O and then ask: "Does O converge according to Full Interpretation (is O legible)?" The filtering effect of a negative answer is lost in a purely derivational system: the syntactic procedure itself has to come out with the grammatical output, for there is no intermediate representational point for the system to stop and evaluate the representations achieved, proceeding onwards with only the grammatical ones in terms of what is legible to the next component of the system.

We can illustrate that point in terms of various familiar notions; for instance, a chain C understood as a set of (characteristically connected) phrase-markers K and L, that is: {K, L}.

(2) [K…[…L…]…]…   Chain C = {K, L}
    movement

For a pure representationalist this is a formal object obeying certain general adequacy conditions (of command, uniformity and locality among its links, for example). In contrast, the derivationalist must take a chain as nothing but the adequate – obeying command, uniformity, locality – derivational history of a computational procedure: say, movement (again, "history" having no temporal connotations). The emphasis for the representationalist is on a (complex) symbol, whereas for a derivationalist it is on a (complex) mechanism, which constitutes the second difference of a serious sort among the systems.

At this point it is appropriate to raise a potential issue with regard to the philosophical notion of representation. A philosopher may be interested in whether representing symbols is more or less troubling than representing mechanisms. If the issue of representations is scientifically sound, a symbol would seem like an a priori candidate for a representation (in the Webster definition a symbol is "something which represents or typifies another thing, quality, etc."). In contrast, a mechanism (in the Webster sense, "the working parts of a machine collectively") need not be a priori representational, and it could be as dull – though "clever" as well – as the immune or the motor systems appear to be. For what it is worth, one can certainly give a derivational account of obviously non-symbolic systems: computational biology has scores of examples of just that sort, where talk of representations, in the philosophical sense at least, would be utterly beside the point.

It is hard to turn that into an argument for derivational systems for two reasons. First of all, it is not obvious that the philosophical issue, as presently formulated, has a real scientific status. A representationalist could rightly defend this view by saying that no one has a clear picture of what it means to be a symbol, and hence whether a symbol needs to be mentally represented any more or less than a (mental) mechanism does. And as for the fact that several computational accounts exist of processes in the natural world which invoke no representations, the representationalist could dismiss their significance to present concerns by claiming that they are just using the computational mechanism as a modeling tool, whereas a derivationalist assigns this mechanism an ontological status of some sort which, when push comes to shove, ought to be represented as much (or little) as symbols are.

Second, even for radically derivational systems which reduce everything to mechanisms, use of symbols at some level is not eliminable, at least in principle. It is for instance possible that the grammar generates some object according to its purely internal procedures, but the object in question simply cannot be interpreted by the interface components of the system. Strictly, this sort of object would not violate legibility conditions as such, as there are none in this extreme view; but it would not be intelligible. Intelligibility is an extra-syntactic notion, perhaps even a performative one – though not because of that a notion that one can shy away from. Conditions against vacuous quantification, for example, ruling out an illicit sentence like Who does John love Mary?, are very possibly of this sort; they may well have a say on whether otherwise grammatical structures, generated by standard derivational procedures, are judged acceptable by native speakers. If so, sooner or later all systems out there, be they derivational or representational, have to hit on symbols, which is the representational issue raised in these paragraphs.

The role that symbolic elements play on one's system may not decide its representational nature, but nonetheless have a bearing on the system's internal shape. At stake ought to be how the system bottoms out. Symbols are arbitrary, and one must a fortiori admit either one of two things regarding prime symbols: (i) the primes that the systemic apparatus arranges in this or that fashion are completely random and could have been other elements; or (ii) there is a sub-system that predicts the properties of those primes. Of course, if (ii) obtains, one would want to understand the nature of that sub-system, and then a slippery-slope kind of argument suggests itself: the sub-system will also have to bottom out, and the same puzzle arises all over again. Nonetheless, in principle a purely derivational system can at least pursue a type (ii) approach in meaningful ways, in much the same way that one does in a science like physics, where objects at some level are interactive processes at another – and the slippery-slope concern is basically set aside. In contrast, a representational system, by its very nature, has fewer scruples with a type (i) approach. The important thing in this general view is to establish the axioms that govern the interactions among given primes, and the exact nature of these symbols is admittedly immaterial. To be sure, this difference does not, in itself, distinguish between the two perspectives: it will depend on how reality turns out to be, whether prime symbols out there are more or less random or can be successfully modeled as having internal computational properties. In any case, in principle this constitutes a third difference between the two sorts of systems.

We have seen above how, for both theoretical and empirical reasons, a derivational system may need to have cyclic characteristics. To repeat, a system without levels of representation ought to capture whatever properties these levels were designed to code in terms of cyclic application of processes; in turn, to address otherwise serious internal paradoxes with rule ordering, a derivational system may need to fix a given order of rule-application (for instance in terms of complexity) and then whenever that order seems to be violated, invoke a new cycle. Given this cyclic nature of the system, it is worth asking, also, whether cycles cut computational complexity, and whether this is a desirable property for grammars to have. If it is, this may turn into an argument for a derivational system. Naturally, concerns regarding computational complexity (as opposed to other sorts of elegance) should be immaterial in a representational system. This is a fourth difference between the systems.

The possibility that computational complexity is relevant to the appropriate modeling of linguistic phenomena turns out to be a very difficult premise to establish in purely formal terms. Once again, one should not confuse the computational complexity that arises, normally in some mathematical limit, for a system of use (a performance matter) with whether complexity determines the design of the system of knowledge (a consideration about competence). Suppose one were to show that, as a class of alternative derivational objects that enters a computation grows, one cannot reach a computational decision about the optimality of the chosen derivation in polynomial time, or even in realistic time. Even if this were the case, it tells us little about the system of knowledge, just as problems with, say, center embedding do which make memory resources blow up. Speakers would simply use whichever chunks of structure work, not going into limiting situations, just as they avoid center embedding.

Nonetheless, there are potential empirical advantages to proceeding with the desire to avoid computational complexity. Trivially, a system without levels of representation faces no need to treat unlimitedly large representations as units where dependencies can in principle hold. In turn, if a procedure is found for breaking down representations into minimal blocks, and these turn out to be natural domains for syntactic processes to take place, then an argument for the derivational system is at hand. Conversely, if there are syntactic dependencies which are established across computationally defined cyclic domains, that would be an argument for a representational system.

Aside from the four differences I have sketched above between the two systems, all of which can be found in the specialized literature, there is a fifth difference, which to the best of my knowledge has not been seriously discussed. Consider the matter of glitches or imperfections in a system. A representational model has the formal structure of a logical edifice obeying eternal axioms; an error in such a system should constitute the end of the game. Computational systems, however, are less pristine, as familiar bugs in our personal computers readily attest. A computational glitch need not halt a program, which may find ways around it and even benefit (very rarely) from particular bugs. Once again, this is a priori neither good nor bad for modeling the linguistic system. It will depend on how it pans out, empirically. For the most part, I will have relatively little to say about this last difference, but it is something to keep in mind in principle, particularly as linguists discuss to what extent the language faculty obeys properties of design optimality, itself a purely empirical matter.

In sum, these are the serious differences existing between each sort of system:

(3) The first difference: The role of ordering
The second difference: The presence of symbols vs. procedures
The third difference: The role of prime architecture
The fourth difference: The role of computational complexity
The fifth difference: The presence of glitches

3 Symbols vs. procedures

Chomsky's derivational version of the Minimalist Program (MP) addresses theoretical and empirical worries with Deep and Surface-structures by denying the existence of these particular levels. If the grammar is an optimal interface with outside systems, only the latter will determine the actual levels of representation; this rules out the model-internal S-structure. In turn, the interface levels are supposed to be (virtually) conceptually necessary. Although it is unclear how this rules out D-structure (the lexicon could in principle introduce an interface with a conceptual system), empirical arguments exist against D-structure as a level of representation in the technical meaning above (see e.g. Chomsky 1995b: 188, reporting on an argument by Lasnik based on an example by Kevin Kearney).

Epstein and his associates have explored the possibility that no particular level of representation is part of the system – not even LF or PF. This book, among other things, considers the conditions that the model should have in order to fulfill that sort of goal. Note that failing to meet any of the five defining conditions in Section 1 would suffice for an object not to count as a level of representation. Possibly, MP does not need (v), given assumptions about interface optimality (i.e. if there is only PF and LF, and they do not feed one another). (iv) is also unnecessary if a formal unification is achieved in performance, a perfectly reasonable possibility. For example a subject and a predicate would unite to form a proposition only when (Fregean) interpretation demands arise, after LF. Of course, (i) through (iii) above constitute the substantive content of each representational bit in the system (what distinguishes, say, phonology from semantics). While discussing the representational nature of these elements is central to this book, it is nonetheless clear that meeting those conditions need not be done in terms of a level of representation: those can be just properties of whatever formal objects derivations work with.

Consider in that respect Chomsky's "bare Phrase-structure" (BPS) project, adapted from earlier work by Muysken. This thesis takes minimal and maximal projections as initial or end states of (non-vacuous) merger processes. Although they are real representational objects in the system (for instance in determining uniform chains), levels of projection are thus not primitive, and in fact they change as the derivation unfolds (e.g. X′ prior to and after merge, from a maximal to intermediate projection status). Epstein's general approach to command can be interpreted in a similar vein: while this notion is real in that it determines a host of representational properties, command from X to Y is nothing but a reflex of whether X and Y are introduced in the same derivational workspace. Whether X commands Y depends only on whether there happens to emerge a K containing Y that immediately contains X; that situation will arise only if there is no separate, as it were stable, L that X is part of.

The Multiple Spell-Out (MSO) idea, a central topic of this book, was proposed in that spirit. Assume a version of Kayne's Linear Correspondence Axiom (LCA) compatible with Chomsky's BPS:

(4) a. Base: If X commands Y then X precedes Y.
b. Induction: If X precedes Y and X dominates Z, then Z precedes Y.

In minimalist terms (4) should either follow (from "virtual" conceptual necessity, including economy considerations in this rubric) or be a consequence of interface conditions. Assume furthermore that linearization is a PF demand. (4a) can be deduced as one optimal solution to that interface need, given pre-existing merged objects (see Chapter 3). However, (4b) does not follow from anything, so the question is whether it can be eliminated. MSO suggests that it can, if Spell-out is taken to be a mere rule applying as many times as necessary up to convergence (within economy limits).
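A minimal sketch of how (4) yields a linear order may help; the code below is my own illustration, assuming binary branching, nested tuples for phrase-markers, and (for simplicity) that the left sister is the commanding one – none of which is argued for here.

from itertools import product

TREE = (("the", "boy"), ("loves", "Mary"))     # an invented phrase-marker

def leaves(node):
    return [node] if isinstance(node, str) else [x for child in node for x in leaves(child)]

def precedence(tree):
    # Base step (4a): the commanding (left) sister precedes what its sister dominates;
    # induction step (4b): everything the left sister dominates inherits that precedence.
    pairs = set()
    def walk(node):
        if isinstance(node, str):
            return
        left, right = node
        pairs.update(product(leaves(left), leaves(right)))
        walk(left)
        walk(right)
    walk(tree)
    return pairs

print(sorted(precedence(TREE)))
# The collected pairs impose the total order "the boy loves Mary" on the terminals.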

In the MSO system, natural "cascades" of structure emerge every time the derivation encounters a non-complement. Thus these units are predicted to be substantively employed, for instance in:

(5) a. A definition of "command" (relevant wherever it is employed, e.g. "distance").
b. Focus projections in PF.
c. Scope, etc. at LF.
d. CED (and other complement/non-complement) effects in the derivation.

The latter follow if Spell-out "flattens" structure, converting an array of embedded sets (a phrase-marker) into a string of items which can be read as ultimately phonetic symbols. That string is no longer a syntactic object, hence it disallows operations within it (the "impenetrability" of cascades is thus deduced; see Chapters 3, 4 and 5 on this).

Conversely, phenomena not amenable to the "cascade" pattern cannot be accounted for within the limits of the narrow computation (in a standard bottom-up fashion); for instance in:

(6) a. Anti-command effects.
b. Liaison and other across-cascade effects in PF.
c. Antecedence effects at LF.
d. Combinations of these (e.g. Weak cross-over, including anti-command and novelty).

Within this system, what are notions such as agreement (concord) or Case? Agreement may just be an "address system" to unify separate cascades (this would explain why it does not arise for head-complement relations, part of the same cascade; see Chapters 3 and 5). Case may be a mark of syntactic distinctness for identical lexical types, which would otherwise collapse into one another (see Chapter 8). Observe, for instance:

(7) DP V DP (e.g. He hit he.)

How does the grammar tell the "first" DP from the "second" in order to operate with them? By hypothesis (structure dependency) the grammar cannot use "first" and "second" notions. It may use syntactic context as token identification, but what happens prior to merge (in the numeration, a set of tokens) or after merged units collapse into PF strings (at the point of unification of separately spelled-out cascades)? There is no syntactic context in those instances, and Case may be seen as a configurational coding: e.g. "nominative" = sister to T projection, etc. Apparently this phenomenon (distinct Case valuation) only happens locally; not across separate cascades, for instance. That is expected if it is locally (within cascades) that Case values are important.

Aside from descriptive consequences, that approach has indirect consequences if coupled with assumptions about a narrow mapping into the semantics, a broad minimalistic assumption:

(8) Transparency Thesis
Semantic relations arise from computationally established syntactic relations.

Then, all other things being equal, (10) should hold (a central theme of Chapter 8):

(9) Semantic interpretation of DP dependency (binding, control, licensing) is under command.

(10) Elements marked for syntactic distinctness (e.g. through Case value) are interpreted as semantically distinct (not co-referent or co-variant), a local obviation effect.

Clearly, then, the general derivational dynamics of MSO have important representational consequences. One of the main goals of this book is to substantiate this general point.

4 Computational complexity

A different kind of argument for a derivational approach is based on the possibility that computational complexity is relevant to the appropriate modeling of linguistic phenomena. Chomsky has made that argument central to his last couple of papers. MSO shares those papers' characteristic of cutting down computational complexity, although it arises for different reasons there. For Chomsky the issue emerges technically – as a way of cyclically eliminating unwanted features – and is taken to be primitive: the desire to cut computational complexity makes the system assume a proviso about "phases" in the computation. As we saw in the previous section, for MSO the issue arose as an attempt to deduce Kayne's LCA from computational dynamics. Thus, although cyclic Spell-out does cut down computational complexity, that does not motivate its presence within the system – convergence does.

Another difference between Chomsky's system of phases and MSO is in the status of "impenetrability." Since Chomsky's phases are stipulated (in terms of complexity avoidance) he can stipulate, further, whether these are transparent domains, and if so under what circumstances (e.g. at the "edge"). But MSO cascades (in Chapters 3, 4, and 8) are theorematic, hence cannot be tampered with. Adding to or subtracting from cascades would have to be done by way of manipulating the reasoning that yields them as side-effects.

Three other chapters dealing with "impenetrability" are 4, 6 and 7, the latter two collaborations with Hornstein and Castillo. All these pieces have as a side-effect the emergence of cyclic domains with island properties (across agreeing subjects, at LF and successive-cyclically, respectively). But it is important to note that in all these instances a barrier for an operation (and hence a cycle of sorts) arises because of derivational properties, not just in order to cut derivational complexity, although of course that is a consequence as well.

5 Ordering

We have examined syntactic notions that either make no obvious representational sense (Why should specifiers, and not complements, agree? Why should arguments, and not predicates, host Case differences?) or are as natural as representational alternatives that do not arise (Why are dependencies under command, and not anti-command?). But there are also notions that either seem as natural as derivational alternatives that do not arise, or make no obvious derivational sense. Among the former, consider the second argument of a binary quantifier, as in (11e):

(11) a. John likes most children.
b. [[most children] [John likes ____]]
c. The size of the intersection of the set of children and the set of entities that John likes is larger than half the set of children.
d. most + children = set of children
e. most + John likes = set of entities that John likes
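For concreteness, (11c) is the familiar generalized-quantifier truth condition for most: the quantifier relates two sets, its internal argument (the restriction) and its external argument (the scope). A minimal sketch, with invented toy sets:

def most(restriction, scope):
    # MOST(A)(B) is true iff |A ∩ B| > |A| / 2
    return len(restriction & scope) > len(restriction) / 2

children   = {"Ann", "Bea", "Carl", "Dan"}       # the set of children, as in (11d)
john_likes = {"Ann", "Bea", "Carl", "Zoe"}       # the set of entities that John likes, as in (11e)

print(most(children, john_likes))   # True: three of the four children are entities John likes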

While it is clear what sort of derivational syntax relates most and its first argument, children, what kind of relation exists between the terms of "+" in (11e)?

As Chapter 6 shows, even if we do QR, that question has no obvious answer:

(12) [XP [DP most children] … [YP John likes t]]

Whatever we say XP and YP are, John likes t is not in construction with most or its projections. Familiar modes of argument taking are as seen in (13), which can be deduced derivationally:

(13) a. Head-complement relations (merge of head to maximal projection).

b. Head-specifier relations (merge of two maximal projections).

(13a) is used for direct, internal arguments whereas (13b) is for indirect, external arguments, in both instances as seems natural: internal arguments within head projection, external arguments without, but still merging. Semantically, we want John likes t to be the external argument (thus the specifier) of most but it is not even a sister to the determiner, let alone its specifier.

We can stipulate some new, non-derivational specification to indicate that John likes t is the external argument of most. For instance, assuming standard QR:

(14) If (at LF) quantifier X is immediately contained within Y, then Y is X's external argument.

But we should ask ourselves why (14) holds, and not (15) (or more radical alternatives):

(15) a. If (at LF) quantifier X is (immediately) dominated by Y, then [same as (14)].

b. If [same as (14)], then Y is X’s internal argument.

Representationally, (15) seems no more or less elegant than (14), which makes one wonder whether QR plus (14) is the right way to go, since nothing requires it.

A derivationalist ought to worry and try to address these serious difficulties. Ideally, this should be done in such a way that leaves no room for doubt: with the certainty that the solution is not "patch-work," with the possibility that it cannot be replicated in natural representational terms. For the latter, the best situation is based on systemic characteristics that are best expressed in purely computational terms; this should be so in such radical conditions that the output of the computational procedure cannot reproduce the input, which is like saying that some information gets lost along the way.

It is hard to find relevant examples of "loss of information," among other things because, on the average, linguistic processes are highly conservative. Familiar constraints on the recoverability of deletion operations, or what Chomsky calls the "inclusiveness" of derivations (no features can be added to a computation if they were not part of the input), can obviously be expressed in terms of some sort of Law of the Conservation of Patterns (see Chapters 8 and 15). That same law, however, normally prevents us from teasing apart a computational and a representational approach.

At the same time, there may be places where the law does not hold, just as there are points in the universe where some physical conservation laws come into question (e.g. black holes or the early universe). One such point might involve the creation of a linguistic object which is possible only after some other, definably simpler computational object has been achieved. Although in other circumstances the "simpler" object can surface, in the concrete construction in point such an object would be a mere computational state with no representational consequence.

One relevant example is provided by the analysis of binary quantificational expressions based on the notion reprojection, as discussed in Chapter 6. This is a sort of formal entity which changes its label in mid derivation, in such a way that the original label could only be recovered by positing dual representations. For a sentence like most people love children Hornstein and I argue that the unit most people – the specifier of IP early in the derivation – literally takes I′ as its own specifier at a later stage, resulting in a sort of giant DP, of which people is the complement and the rest of the structure in the sentence is the specifier. This provides a natural syntax for binary quantification, and as I mentioned turns out to also predict peculiar islands that arise late in the derivation (a relation across a reprojected binary quantifier would be akin to a relation across a non-complement in the LF component, thus generally barred). A representational version of this sort of analysis, based on the dual nature of quantifiers (they are both arguments at some level and predicates at a further, derived level), seems more unnatural, in that it must postulate some artifact to account for the lost information.

A further issue that arises with regard to rule ordering is why successive-cyclic processes exist. Suppose that for some reason of the sort discussed above, linguistic objects are structured in cycles C2 [… C1 […]…], which happen to be impenetrable. How then do we ever manage long-distance relations? The minimalist answers I am familiar with are stipulative: they either deny successive-cyclicity or they code it through some technical, "peripheral" feature which happens to drive the computation to intermediate stages. It is, of course, perfectly conceivable that the system may have come out with a different alternative: the impossibility of long-distance movement altogether. While that appears to be the case in some languages for some constructions (e.g. wh-movement), most familiar languages allow unbounded extractions.

In Chapter 7, Castillo and I suggest a very different take on this matter, based on an idea from Tree-Adjoining Grammars: in effect there is no long-distance wh-movement. What happens in unbounded movements is that the "intervening" structure gets successively merged in "tuck-in" fashion, between a moved wh-phrase and the rest of the structure. This eliminates artificial features, and raises the possibility that merger need not extend a phrase-marker at the root. Successive-cyclic effects arise in terms of the kinds of complementizers that can be successively tucked-in. If a language allows for non-identical complementizers for declaratives and questions, then a successive-cyclic effect will arise. In contrast, a language with identical complementizers for both sorts of clause types, or a situation whereby a wh-element is trying to move across another wh-element, will result in an impossible merge as the merging items collapse into one another under identity (see Chapter 8). This amounts to an island effect.

From a representational perspective, it is not entirely clear why the phenomenon of successive-cyclicity should exist, or why there should be variations among languages in this respect. Needless to say, one can code relevant properties of the shape of chains, but it is very strange that in language A wh-chains should be possible across wh-islands, while in language B only across bridge-verbs, while still in language C relevant chains should be sanctioned only when strictly local.

6 Prime architecture: relational terms

But to carry a derivational program through one cannot stop at showing how some representational notions are best seen as derivational consequences. One has to pull the trick for all such notions, or the system is still (partially) representational. The hardest problem turns out to be with such notions as V or N, which cannot be blamed on the properties of syntagmatic or "horizontal" syntax, as these are paradigmatic or "vertical" notions, which standard derivational systems are not designed to care about: in traditional derivational terms, it makes no difference whether we are manipulating V or N, or for that matter any of the ones and zeroes of a binary code. Of course, one can try to apply some horizontal syntax to such atomic-looking elements as V or N, which is what the generative semanticists did in the late 1960s. That project, in the limit, leads to the denial of autonomous syntax, making it a superfluous, basically emergent object between thought and surface linguistic representation.

For both empirical and philosophical reasons I agree with those atomists that take that project to be misguided. Empirically, it still seems to be the case, as was some thirty years ago, that sub-lexical stuff does not have the characteristic productivity, systematicity and transparency of supra-lexical combination; plus it obeys other sorts of restrictions involving the canonicity of given expressions or predictable gaps in paradigms (the notion "paradigm" makes sense only in this domain). In a deeper conceptual sense one does not really want to split the lexical atom, if only because what to do next is not altogether clear, particularly if one's view of semantics is as constrained as sketched in the Transparency Thesis above. But then derivationalists seem to be cornered into an irreducible position: they cannot explore a standard syntactic approach to sub-lexical stuff, short of falling into a version of generative semantics; yet if they do not, they hit a bedrock representational tier of the language faculty.

I am clearer about the puzzle than about its possible solutions. The one I attempt here springs from two sources, one more directly relevant than the other. I became seriously concerned with lexical relations when coming to the (I think) surprising realization that so-called relational terms cannot be confined to a handful of peculiar elements in the lexicon which behave at the same time as objects and as concepts – things like brother and the like. The usual treatment for these creatures is to assume they involve two different variables, one relational and the other referential. If the referential one is used we can then come up with expressions like the Marx brothers, which is much like the Marx guys; but if the relational variable is used, then we are allowed predicative expressions like Groucho is Chico's brother. In possessive guise both of these aspects are present at the same time, as in Groucho has a brother. The latter is important, if only because much serious syntactic work has gone, during the last decade, into this kind of syntax, after Kayne brought to the fore some interesting ideas of Anna Szabolcsi's from the early 1980s (see Chapter 10 on these issues).

The Kayne/Szabolcsi program sought, in essence, to explore the parallelisms existing between nominal and clausal structure, both syntactically and semantically. As such, parallelisms are neither here nor there; nonetheless, if carried to the limit this general idea may point toward the possibility that what looks like stable syntactic objects, nouns, in some instances at least, may encompass some basic conceptual relation. That could be anecdotal (the view that this, in the end, only happens for "relational" terms) or systematic; in the latter instance, surely it would have some relevance for the matter of a putative decomposition of core lexical spaces. After all, the "ur" lexical space is the noun (one can always think of verbs as relational, necessarily involving arguments which, in the end, bottom out as nominal). If that "ur" space can be relational, then it is not atomic, at least not necessarily so.

Working with Hornstein and Rosen, it became clear that relational terms are very common, particularly if we extend our observations in terms of a Kayne/Szabolcsi syntax to part-whole relations (the content of Chapter 9). In short, all concrete nouns (most of those in the lexicon) are either part of something or whole to something else; indeed, often these relations are possible in both directions (e.g. a city contains neighborhoods but is part of a nation). Crucially, the relevant expressions (of the sort in My car has a Ford T engine) are so
common and predictable that it would be pointless to list their relational properties as lexical idiosyncrasies.

What is more, after a serious syntactic analysis, it becomes obvious that not all logically possible "possessive" relations find an expression in syntactic terms. For instance, together with (16a) we find (16b), as the Kayne/Szabolcsi syntax would lead us to predict if appropriately extended to these sorts of expressions. But together with (17a) we do not find (17b):

(16) a. the poor neighborhoods of the city
     b. the city's poor neighborhoods

(17) a. a city of poor neighborhoods
     b. *the poor neighborhoods' city

Chapter 10 shows how this can be predicted, assuming a rich syntactic system, in terms having to do with the locality of movement (while movement of the city in (16b), from a base position to its specifier site, can be local, movement of the poor neighborhoods in (17b) is not). From the point of view of these relations being purely lexical, this gap in the paradigm makes no sense.

If syntax aptly predicts some of these phenomena, it becomes legitimate to explore them syntactically, perhaps without committing to too many hostages regarding the general details of this syntactic apparatus. In the end, what is needed comes down to a small-clause, or basic predication. From a Neo-Davidsonian perspective of the sort advocated by Higginbotham – which builds all semantic relations around basic predications – this is arguably a welcome step. Although I must admit that there is nothing traditional about these sorts of predications, which involve such nightmares as "material constitution" (e.g. this ring is/has gold) and appropriately cumbersome versions of these notions in what one may think of as "formal" terms (e.g. this organization is/has several levels). And to complicate things further, only some of the various combinations of these notions are possible; thus compare:

(18) a. A [robust [ninety-pound [calf]]]
          FORM    SUBSTANCE
     b. (#) A [ninety-pound [robust [calf]]]
              SUBSTANCE     FORM

(18b) either adds emphasis on ninety-pound or is very odd; the neutral way to say the expression is as in (18a). Similarly in other languages, where these notions manifest themselves in standard possessive guise, for instance the Spanish (19):

(19) a. una ternera de noventa libras de buena presencia
        a calf of ninety pounds of good looks
     b. (#) una ternera de buena presencia de noventa libras
        a calf of good looks of ninety pounds

This must mean that there is a certain ordering in the way in which these various relations manifest themselves in syntax, whatever that follows from.

The sort of thesis just sketched – that some apparently complex lexical items are best seen as the reflex of some standard syntax – is still a fairly traditional syntagmatic proposal. In effect, the idea comes down to taking a characteristic of a noun which is lumped together with other features as one of its paradigmatic properties, and discharging it as some piece of syntax. Chapters 11, 12 and 13, involving small-clauses/partitives, names and propositions, respectively, discuss other such examples of the relevant sort, all analyzed in terms of roughly "possessive" syntax. Of course, these moves, as such, are not entirely unlike those undertaken by generative semanticists. If the moves are constrained, however, to instances where the various dependencies involved manifest themselves in richly syntactic (meaning systematic, productive, transparent) terms, then the atomist need not be worried. He or she can defuse this potentially troubling instance by admitting that some apparently lexical relations are not that, after all; the same concession is made regarding inflectional morphology, without the atomistic thesis being affected. Where the atomist cannot yield is in the idea that syntax, somewhere, eventually stops messing with some lexical items, more or less at the point that speakers stop having strong intuitions and more or less where systematicity, productivity and transparency break down or go away.

At the same time, once a "possessive" syntax is in place, in particular, for nominal relations of the syntagmatic sort, we can ask whether that syntax does us any good for nominal relations of the paradigmatic sort, whatever residue of those we are prepared to accept. We have seen some of that already, in (18)/(19). True, these are facts of syntagmatic syntax, inasmuch as we are determining possible combinations among successive words; at the same time, where do we code the appropriate restriction? Should it not be in some paradigmatic cut of, in that instance, a noun like calf which must code "substance" dependents before it codes "formal" ones? That sort of ordering is familiar from other parts of the grammar where dependencies are organized. For example, we know that verbs code "themes" before "agents," or that auxiliary selection is not random and correlates with aspectual properties of verbs. In a sense these are all vertical cuts of language (corresponding to familiar organizations among lexical spaces, such as the one that has all events implying a state, but not conversely; or a count term implying a mass, but not vice versa). Those are the ones that, if seen in sheer syntagmatic terms, worry the atomist: if we say that an event is analyzed syntagmatically as a state plus some sort of (say, causal) function, or a count term is analyzed syntagmatically as a mass term plus some sort of (say, classifier) function, we have again started the linguistic wars. Be that as it may, though, is there any advantage to thinking that the paradigmatic characteristic of nominal stuff being, for instance, concrete, stands in a kind of predicational relation to that very nominal stuff?

7 Prime architecture: categories

We know horizontal syntax to be full of hierarchies, for verbal and nominal ontologies, thematic roles, auxiliaries and so on. Curiously, our normal syntax
works without being sensitive to them – they have to be stipulated on the side. What we generally do with these objects is blame them on some outside notion, the structure of either the world or of the extra-linguistic mind. In both my collaborations with Paul Pietroski (Chapter 14), and Chapter 15, it is suggested that ascribing responsibility for the relevant orderings in those domains is unilluminating. We also submit that, at the very least, we should ask ourselves whether that other implicational, thus hierarchical, edifice that we know we are capable of as a species – namely counting – is partly responsible for the observed hierarchies in language. The idea would be that our mind is structured in those "dimensional" terms, and we use that apparatus to construct the structure of linguistic concepts.

In this view, if it turns out that, say, events come out as formally more complex than states, or count nouns than mass terms, this would have nothing to do with the structure of the semantic correspondences to these terms. The culprit for this ordering would have to be found in the fact that, for some reason, the syntactic objects (event V vs. state V or count N vs. mass N) align themselves in the dimensional way that the whole and the natural numbers do, for instance.

Whatever the merit of this general approach, it would have one immediate consequence: most of the representational apparatus needed to code symbols would reduce to something with the formal status of "0," the basic prime we need in order to build mathematical concepts. One essentially starts with a nominal formal space corresponding to the "ur" abstract concept (with no particular specification about its lexical meaning, at least in syntax), and more complex formal spaces correspond to what amounts to topological folds on this space, which boost its dimensionality to the next level. This yields hierarchies of the sort animate > count > mass > abstract, within the nominal system. In turn verbal systems can be conceived as adding a time dimension in horizontal syntax (verbs, unlike nouns, must have a syntagmatic nature, the combination with their arguments). In particular, a verb can be seen as a dynamic function, the derivative over time of the nominal space of its theme, or internal argument. Drinking a beer, for instance, can be modeled as a kind of function that monitors the beer change in the glass until it no longer exists, which is commonly thought of as the grammatical aspect of this sort of action. In turn verbal spaces can also be "warped" into higher dimensions through the addition of arguments, yielding familiar accomplishment > achievement > activity > state hierarchies.
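
To make the "derivative" image slightly more concrete – purely as an informal gloss, with the function Q and the conditions on it assumed for the occasion rather than taken from the chapters below – suppose Q(t) measures how much of the theme (the beer in the glass) is left at time t. The drinking event can then be pictured as a condition on the rate of change of that nominal space, with the telic endpoint reached when the space is exhausted:

\[
\frac{dQ}{dt} < 0 \ \text{throughout the event}, \qquad Q(t_{\mathrm{end}}) = 0 .
\]

Nothing hangs on this particular formula; it is only meant to show the sense in which the verb operates on the dimensionality of its internal argument over time.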

What is derivational and representational in that sort of "warping" system is not easy to tell. Surely something with the notional status of "0" is still representational, but it is no longer clear whether whatever has the notional status of a derivative of a function of arguments like "0" over time need be any more representational than the computation of "command" as we saw above, or any such simple-minded computational entity. I cannot venture any speculations to decide on this interesting question simply because I lack the mathematical, computational or philosophical expertise to do so. My intention is simply to bring the matter to an appropriate level of abstraction, so that the question can be decided, or at least posed, in terms that merit the discussion.

This view of things addresses the atomism vs. decomposition conflict only in part, as it is a mixed solution. In an ideal world the atomist should rest assured that horizontal syntax does not have to have the properties of vertical syntax. True, in essence I am assuming a sort of very abstract predication "all the way down," so that whenever we find a hierarchy in language of the sort . . . A > B > C, I am in the end analyzing it as PRED(C) = B, PRED(B) = A. Although in some languages these very basic predications clearly show up in morphemic, or even lexical guise (classifiers, causativizers, etc.), their nature is significantly different from that of "normal" predicates like red (x). In this view of things, CLASSIFIER(x), for instance, when applied to whatever codes the extension of a mass term, yields a new kind of formal object: an individual, or classified extension of mass. And the point that matters to me right now is that this is not a syntagmatic entity, but a paradigmatic one. As to why that difference should translate to lack of productivity, systematicity or transparency of the relevant system, be sensitive to canonicity considerations (ultimately a matter of frequency in standardized use), or for that matter why people should have less solid intuitions about this sub-lexical stuff than supra-lexical stuff, that is anybody's guess. In turn the decompositionalist, given this general view, should also be satisfied with the fact that it allows for all the standard reasons we have to look inside lexical atoms: to capture lexical entailments (or similar relations) and model the structure of languages where given lexical properties show up as morphemes, or separate words. That sort of evidence is totally consistent with the model just sketched, and furthermore constrained by it.
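
Spelling that recipe out for the nominal hierarchy mentioned above – where CLASSIFIER is the one operator actually named in the text, and the other predicate labels are merely placeholders invented for the illustration – each step up the hierarchy is the output of one such abstract predication on the space below it:

\[
\mathrm{PRED}_1(\text{abstract}) = \text{mass}, \qquad
\mathrm{CLASSIFIER}(\text{mass}) = \text{count}, \qquad
\mathrm{PRED}_3(\text{count}) = \text{animate},
\]

which is just the schema PRED(C) = B, PRED(B) = A applied to animate > count > mass > abstract.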

As for the representational residue, there is little to say. One can blame it on some pattern of brain activity, or even the interlocking of patterns coming from the various organs that are possibly implicated in "ur" thoughts (in a dimensional system, it is not hard to imagine "talk" across dimensions of different organs without giving up a modularity thesis). Of course, this will not satisfy the philosopher who wants to nail linguists into having to admit some sort of representations, in the philosophical sense. The good news here, I suppose, is that this representation is so far removed from even familiar notions like V or N, that perhaps it should not be all that troubling. It poses a central question about what "thought" is, but not, perhaps, any deeper than that.

8 Organization of the chapters

The present book is divided into two main halves: a series of standard (syntagmatic) derivational chapters, and a series of more radical (eventually paradigmatic) derivational chapters. In addition, a conceptual, introductory section sets up those concrete parts in Chapter 2. The introductory section includes a review of the Minimalist Program that appeared in Lingua (107, 1999) and my contribution to a polemic that came out in NLLT (2000 and 2001). I think that these pieces give a substantial presentation of the general derivational approach from a conceptually broad minimalist perspective.

The syntagmatic derivations section starts (Chapter 3) with an article written
in the mid-1990s, which was published in Epstein and Hornstein (1999), where the MSO system is sketched. Although less conceptual, the paper with Jairo Nunes, which appeared in Syntax (3, 2000), does a better job as Chapter 4 at presenting the technical details of this system. An empirical application to various problems of extraction is discussed in Chapter 5, which appeared in NLLT (17, 1999). The most far-reaching, and also radical ideas within this general framework are discussed in the last paper in this section, Chapter 8, which was published in Wilder, Gaertner and Bierwisch 1996. Two other papers in the same spirit, though with slightly different implications, are collaborations with Hornstein (Chapter 6) and Castillo (Chapter 7), both of which appeared as contributions to UMDWPL (8, in 1999 and 9, in 2000, respectively); they present "reprojections" and a highly derivational account of successive cyclicity, respectively.

The paradigmatic derivations part is introduced in Chapter 9 by the collaboration with Hornstein and Rosen, which came out as a WCCFL presentation and, in an expanded fashion, in UMDWPL (2, 1994) – the version included here. Next is a conceptual, thus broader, piece that came out in Schwegler, Tranel and Uribe-Etxebarria (1998); the main ideas concerning the syntax of possession, as well as what it may entail for the larger concerns mentioned above, are sketched in this Chapter 10. Papers on names and rigidity matters (Chapter 12, which appeared in Alexiadou and Wilder 1998) and two other co-authored works on small-clauses (with Raposo, from Cardinaletti and Guasti 1995, Chapter 11) and parataxis (with Torrego, given as a paper at Georgetown and unpublished, Chapter 13 in this volume) expand on general possessive concerns. The chapters demonstrate how those issues show up in unexpected domains: e.g. the relation between a nominal and a syntactic presentation of its reference, a clause and a syntactic presentation of its truth value, or a variety of intricacies arising in the analysis of small-clauses of different kinds. Those are already notions which are normally analyzed in paradigmatic terms. We then come to the last two chapters in this section, one that appeared in Theoretical Linguistics (29, 1999), now Chapter 15, and the collaboration with Pietroski in UMDWPL (2001), Chapter 14, where it is explicitly proposed that many familiar paradigmatic cuts on various standard categories may be analyzed in roughly possessive guise, creating various dimensions of syntactic complexity.

2

CONCEPTUAL MATTERS

In this chapter a general architectural discussion of minimalism is presented from a derivational viewpoint. Section 1 is a review of the program. Sections 2 and 3 are responses which appeared in the context of a debate over the derivational take on the minimalist perspective.

1 Book review: Noam Chomsky, The Minimalist Program1

It may already be a cliché to note that Chomsky's latest work (Chomsky 1995b) is both revolutionary and frustrating. To address the latter, the book was published at break-neck speed and would have certainly benefited from more time prior to publication to correct typos (e.g. "with pied-piping" instead of "without pied-piping" on p. 234, second paragraph), "thinkos" (e.g. the claim on p. 235 that "Merge is costless for principled reasons," which confuses the type of operation with tokens of its application, with non-trivial consequences for the analysis of (169, on p. 346)), ensure consistency of style (particularly since the first three chapters show the always healthy influence of Lasnik), sort out the non sequiturs and contradictions in Chapter 4 and guarantee the reliability of the index. Despite these imperfections, the book is a work of genius and I urge readers to overlook these minor defects and persevere with it, in much the same way as an audience at a performance by a legendary artist would ignore small technical inaccuracies.

With all that out of the way, I will refrain from repeating the technical wonders of the Minimalist Program (MP), among other things because the very able work of Freidin (1997), and Zwart (1998), already does it justice, in widely accessible and very clear reviews. My interest is slightly more conceptual, which is where I think MP is truly unique. Concretely, the main focus of this review will be to compare the theory of grammar presented in MP (and its assumptions) to its predecessor GB (Government and Binding theory), with which it shares so much and from which it is so very distant.

1.1 Modularity

GB was taken to be a modular system where different sub-components of grammar (Case, Theta, Binding, etc.) conspire to limit the class of admissible
objects. This was a somewhat peculiar architecture from the point of view of Fodor's famous modularity thesis (Fodor 1983). For him, there are a variety of isolated mental modules (language faculty, visual system, etc.) connected among themselves by a central system. It is not within the spirit of strict modularity to open the gate to modules within modules (and so on), which in the limit reduces to a connectionist (anti-modular) network. But in GB, for instance the (sub-)module of Binding Theory did have further (sub-sub-)modules: Condition A and Condition B could be independently (in NP-t or pro) or jointly satisfied (in PRO), or not at all (in wh-t).

Indeed, because of the character just discussed, GB could be, and in fact is being modeled as a connectionist network, as in recent OT proposals. (The only non-connectionist aspect of OT is that, in the classical version of Prince and Smolensky (1993), summations of lower-ranked constraint violations may not outweigh higher-ranked ones; I understand that this has changed in recent proposals.) Constraints on X' structures, theta configurations, Case and Binding relations, and all the rest, do lend themselves to the OT take on language and mind. Thus, X' constraints, say, are ranked highest (depending on whether non-configurationality really exists), and others are ranked in this or the other fashion, yielding variation. The point is: the GB architecture, although sympathetic to Fodorian terminology, was in fact allowing an anti-Fodorian loophole.

Of course, a connectionist network can model just about anything; some things, though, it models better than others. If we take the input to the network to be a completely trivial (non-articulated) set-theoretic object, then it may or may not be the case that differently ranked constraints do the limiting job, in the meantime yielding variation. This is right or wrong, but not vacuous. More or less explicitly (see footnote 124 to Chapter 2 of Chomsky 1981), the GB input to the phrase-structure component was Lasnik and Kupin's (1977) sets of monostrings, themselves described from an unconstrained class of set-theoretic objects that was filtered out through admissibility conditions of various sorts. All other GB modules could be stated as admissibility conditions on those initial set-theoretic objects, and representational versions of the system assumed just that. Which is indeed, more or less reasonably, modeled in OT terms.

MP does away with any natural connectionist modeling, since the theory has no modules – only features (p. 225ff.). It could be thought that a feature is just "a tiny module," but this is really not the case. Whereas a Case module, for example, allows you to say anything you want about its internal workings (whether it invokes command, adjacency, directionality, etc.), a Case feature is a formative of the system, and can only be manipulated according to whatever laws the system as a whole is subject to. It makes no sense to claim, for instance, that (only) Case features are assigned under adjacency; on the other hand, it made perfect sense to say that a configuration of Case assignment which does not satisfy adjacency between the assigner and the assignee is ill-formed. The minimalist thinking is dramatically more limited.

MP presents a strongly inductive system, where operations work in a strictly "inside-out" fashion to produce larger objects, which in turn become even larger
through movement, and so on (p. 189). Whereas GB's sets of monostrings could be a reasonable OT input, or GEN (indeed, even the more basic sets of mere strings could), it is senseless to say that MP's Merge yields something remotely like a representation that OT constraints (of any sort) can manipulate. Nothing gets manipulated by, say, theta requirements until it is merged; no phrase can have anything to do with Case until it is moved (p. 258ff.), etc. The entire system is based on a complex interleaving of operations that invoke interactions and construct objects literally at the same time. You could, of course, model that with a connectionist network: having GEN be the output of Merge or Move – but that is trivially uninteresting.

If there is a criticism to be made to MP in all these respects, it may be that it still is (not surprisingly) too similar to its GB parent. Thus, the program still analyzes derivations and representations in terms of phi (Case and agreement) features (p. 146ff., etc.), wh-features (p. 266ff.), or categorial (N, V, D. . .) features (e.g. p. 349ff.), obviously because the corresponding phenomena were understood in terms of modules that, in some cases, went by those very names. In most instances this leads to no particular surprises, but it does involve the occasional nightmare when it comes down to such classics as the Extended Projection Principle (pp. 232, 344) or Successive Cyclic wh-movement (e.g. p. 301ff.) – for which no real analysis is provided. Unfortunately, it is unclear (judging from Chomsky 2000, or his Fall 1997 lectures) that any solutions to these problems are forthcoming.

1.2 Government

If something was the hallmark of the traditional GB model, it was the ever-present theme of government. Conceptually, government was quite significant: it served as the unifying criterion underlying all modules. It was a welcome notion too, since that sort of thing is obviously unpleasant to a connectionist: you do not really expect unifying notions that emerge in the interactions of randomly ranked constraints; why should Case constraints care about movement constraints, say? But of course, the grand unifying role was more rhetorical than anything. For X' and theta theories, only government under sisterhood ever mattered; for Bounding Theory, sisterhood plus (perhaps) exceptional government was relevant; for Case and Binding theories, clearly exceptional government, with some added complications posed by specifiers, which are technically not governed by the head; for Control theory, it was not even obvious that government made a difference at all. Calling all these notions "government" was the only thing that unified them.

MP bites the bullet, and riding on a wave that was already in motion in GB studies of the late 1980s, it flatly denies the theoretical possibility of invoking government (p. 173). This is quite an interesting move, both theoretically and conceptually.

First of all, absence of government leads to all sorts of twists and turns of MP, the most intriguing ones having to do with the treatment of exceptional Case
marking and related topics (p. 345ff.) – now by necessity a matter of LF under local conditions (see 147ff.). The door for such analyses was already opened in Chomsky (1986b), which invoked LF movement of associates to their corresponding expletives, to capture their "chain" behavior (see also p. 156). Only a small leap of faith moves us from that long-distance relation to one where no expletive awaits the moved element, and instead a mere relation of grammar (Case checking) obtains (p. 174). The process has to be checking, and not assigning Case, because the covertly moving element must carry its Case from the lexicon, or else it would never fork to PF in the appropriate guise (see p. 195).

Second, and more importantly, exactly what is it that prevents the theorist from using government? That is: "what is 'minimalist' about Chomsky's getting rid of government?" Or still in other words: could government ever come back?

If, as minimalism unfolds, it sticks to its guns, government should never come back. Government is to language like weight or motion are to heavenly bodies: a side effect. Of course, nothing has really been wasted by studying government; all that knowledge is something that any system will have to account for. The bad news, though, is equally interesting: we do not get rid of government because it did not work, but because it was ugly!

Government was, first of all, ugly in itself – awfully hard to define coherently. Second, government was theoretically ugly in that, to make it work as it was supposed to (across modules) one had to keep adding except provisos or disjunctive clauses.

If MP is right and, as Chomsky puts it (1986b: 168), "the hypotheses are more than just an artifact reflecting a mode of inquiry," there may be something deep about the sort of beauty found in the language faculty, so much so that it restricts our ways of theorizing. Which is to say that, even if it turns out that a theory of language is better off without government, MP would assume no government regardless – so the rhetoric goes. That is, the decision to eliminate government from the theory does not seem to be just the methodological result of Ockham's Razor, but rather a vague ontological intuition having to do with language being intrinsically beautiful, or "perfect." Since that is an admittedly crazy stand to take, we should make much of this speculation.

1.3 Economy

The best illustration of the bluntest move of MP is the concept of economy. One may have thought that Chomsky goes into the streamlined program he explores because this is the best available theory compatible with the data. This, however, is not obviously the case. Consider, for concreteness, one of the classic traits of GB, the Empty Category Principle (ECP) alluded to on p. 78ff., which was proposed in Lasnik and Saito (1984).

Lasnik and Saito's piece was, I think, misunderstood. They were the precursors of a single transformational process ("do whatever anywhere to anything") that massively overgenerates, to then constrain the excess baggage in terms of principled (intra-modular) requirements. Their GB logic was as impeccable as
their explanation was clear. Yet just about everyone was troubled by derivations that allowed movement "back and forth"; but why? Was that not the best theory with the fewest assumptions, with reasonable interacting theoretical elements that were there (to either make use of, or else explain away)?

Yet MP explicitly denies us the possibility of moving "back and forth" to satisfy the ECP (the central notion last resort, via the various versions of greed explored in the book, impedes any movement that is not self-serving). Why is this so? Because that is how we are going to design things from now on. Of course, it is not the best theory, if only because it has to add a stipulation to the set of axioms (greed), no matter how many fancy terms we give it. Plainly, the theory is not so economical as the previous one, but is nonetheless about a more economical object.

I am loading the dice here to make a point. There are senses in which the emerging theory in MP is better than GB, in the Ockham's Razor sense. It has, for example, fewer levels of representation (two instead of four, as from Chapter 3 onward), fewer modules (only one), less reference to non-trivial relations (perhaps just command, as in p. 254), and less of just about everything else. But, first of all, one has to be careful with this estimate. The system may only have two levels of representation, but it packs into definitions of various sorts all kinds of references to what used to be, say, D-structure (a multi-set of lexical tokens or numeration (p. 225), theta-relations in hierarchically configurational terms and one-to-one correspondence to arguments (p. 312), a mysterious requirement to check subject D-features (p. 232), and so on). Similarly, while the system may have no sub-modules, it does fundamentally distinguish among different types of features (p. 277) – so much so that relativized minimality is cued to these distinctions (297ff.). Which is all to say the obvious: it is hard to make good Ockham's Razor arguments; what you take here, you often have to give back over there.

Still, regardless of how many true stipulations are left in MP when someone figures out exactly what it is saying and exactly how it does what it claims to do, the program will probably stick to a rhetorical point. It does not matter. What you care about is not how many stipulations you need (at least, you do not any more than you do in any other science); rather, the fuss seems to be all about how "perfect" language is, how much "like a crystal" it is, or like one of those body plans that enthuse complexity theorists (e.g. the Fibonacci patterns that arise in plant morphology). These are all metaphors from Chomsky's classes, although the present book only indirectly alludes to these issues (e.g. on p. 1 or p. 161ff.; see also the first few pages of Chomsky 2000).

1.4 A “mind plan”

But of course, those considerations are not new to Chomskyan thought. Certainly, there are aspects of MP, and even more so of its follow-up (Chomsky 2000), which closely resemble Chomsky's (1955) masterpiece. These include a heavily derivational model, a highly dynamic system where chunks of structure
are merged and moved and manipulated as the derivational cycle unfolds and, in a word, the flexibility of a derivational flow that makes one seriously compare the linguistic "mind plan" with other "body plans" out there in nature, which are unmistakably flexible, dynamic and derivational (in the mathematical sense of the term; cf. the derivational models of plant or shell growth that stem from Aristid Lindenmayer's work, which is explicitly based on Chomsky's rewrite rules).

This "wormhole" between Chomsky's last work and his first is, I think, very important for two reasons. One is intrinsic to MP, and clearly noted on p. 168. What are we to make of a system that is "perfect," in the sense of MP – that is, presenting "discrete unboundedness," "plastic underdeterminacy" and "structural economy"? Perhaps Chomsky's greatest intellectual contribution so far is having the world acknowledge that there is something biological to grammar. But what is biological about those "perfect" properties? If this question is not addressed, a serious crisis lurks behind MP: how a seemingly artificial ("perfect") program is to be squared with the traditional Chomskyan naturalistic goal.

The fact that some modern biology is going in the direction that Chomsky foresaw, indeed decades ago, is not only a way out of a crisis that is at that point merely methodological (why are organisms that way, in general, if standard neo-Darwinism has nothing to say about their "perfection"), but it is also a tribute to one of the deepest glories of MP: that it presents in detail what might be a reasonable "mind plan" in a concrete organism, us.

More generally, this biological connection (albeit with the new biology) may be of use to those of us who think of ourselves as standard syntacticians. Let us not forget that, when all sound and fury is over, MP as it stands has relatively little to say about the ECP and all it was designed to achieve – not to mention scores of other issues. This is neither here nor there, but the question remains whether solutions are to be found within the technical confines of features right here or checking over there. Of course, this is just one among several possible instantiations of what looks like a very good idea, which makes syntactic representations and their interactions awfully similar, mathematically at least, to what goes on in nature. But if the latter is the case, one need not stop where MP has. Obvious questions remain. What does it mean to have movement, variation, morphology altogether? Why are syntactic objects arranged in terms of merge and its history? What are syntactic features and categories? Why does the system involve uninterpretable mechanisms? How tight is the connection to the interface systems, in what guise does it come, how many of those really exist? This is only the tip of the iceberg.

Seen from the perspective of nature's laws, those are truly fascinating questions, and the fact that we are even asking them is part of Chomsky's legacy. Indeed, the possibility that "out there" and "in here" may be so closely related – even if obviously different – is, apart from humbling, some serious cause for optimism in our search, particularly for those who resist embarking on methodologically dualist quests that treat mind as different from matter. The irony is that (organized) matter might be much closer to mind than anybody had imagined, except Chomsky.

2 On the emptiness of “design” polemics

Lappin, Levine and Johnson (2000, henceforth LLJ) are concerned about the field gravitating toward the Minimalist Program (MP) without having exhausted the possibilities of the "Government and Binding" (GB) theory.2 In this reply, I concentrate on a concrete paradigm that has resisted a GB analysis. Since LLJ attack Chomsky's system using in large part my book Rhyme and Reason (Uriagereka 1998), I will respond in my own terms. I believe that the analysis I discuss is in the spirit of precisely those general aspects of MP that LLJ find offensive, and it constitutes a good example of what they call a "staple cliché of trendy 'parascientific' chit-chat." After I present the analysis, I discuss its assumptions, and attempt to show how they are not particularly unreasonable, or profoundly different from those that any natural scientist would make. That being the case, the force behind LLJ's criticism dissipates.

Languages differ as to whether the associate (A) in expletive constructions appears before or after a participial (P) expression. The (P, A) order is found, for instance, in Spanish (1), while (A, P) is attested in English (2):

(1) Quedaron escritos tres libros.
    remained.AGR written.AGR three books

(2) There remained three books written.

Two variables seem implicated in the distribution of the orderings. One is verbal agreement. In constructions that present default V agreement in Spanish, the (P, A) order reverts to (A, P):

(3) Hay tres libros escritos.
    have.DEF three books written.AGR
    "There are three books written."

The other relevant variable appears to be participial agreement. Swedish examples illustrate:

(4) a. Det tre böcker skrivna.
       it three books written-AGR

    b. Det blev skrivet tre böcker.
       it became written.No-AGR three books

Note first that V does not agree in these instances; second, observe how in constructions that lack P agreement in Swedish, the (A, P) order reverts to (P, A) (in contrast to (2)–(3)). Generalizing:

(5) a. (P, A) in (i) V-agr, P-agr instances
        and (ii) non-V-agr, non-P-agr instances.
    b. (A, P) in (i) V-agr, non-P-agr instances
        and (ii) non-V-agr, P-agr instances.

(5ai) is exemplified by V-agreeing Romance constructions and (5aii) by standard Danish, Norwegian and non-P-agreeing Swedish, whereas English
exemplifies (5bi) and West Norwegian, default-V-agreeing Spanish, and P-agreeing Swedish exemplify (5bii).3

While that is a description of a complex state of affairs, it is hardly an explanation. The problem is in the system presupposed by (5), in which a higher-order predicate is needed to make the system intelligible. We need to range over values of parameters to state relevant generalizations:

(6) For α ranging over "+" and "−":
    a. αV-agr, αP-agr ↔ (P, A)
    b. αV-agr, −αP-agr ↔ (A, P)
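
In purely descriptive terms, (6) is nothing more than an identity check on the two agreement values. The following toy snippet – merely an illustration of the biconditional, with no claim about how the grammar actually derives it – makes that explicit:

```python
def predicted_order(v_agr: bool, p_agr: bool) -> str:
    """Generalization (6): same values for V- and P-agreement give (P, A);
    different values give (A, P)."""
    return "(P, A)" if v_agr == p_agr else "(A, P)"

# Spanish-type (V-agr, P-agr) and Danish/Norwegian-type (no agreement at all): (P, A);
# English-type (V-agr only) and P-agreeing Swedish (P-agr only): (A, P).
assert predicted_order(True, True) == "(P, A)"
assert predicted_order(False, False) == "(P, A)"
assert predicted_order(True, False) == "(A, P)"
assert predicted_order(False, True) == "(A, P)"
```

The point of what follows is precisely that this higher-order statement, natural as it is descriptively, is not something MP lets one write into the grammar.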

An example of this sort of generalization from the GB literature is Jaeggli and Safir's (1989) observation that languages have pro-drop if they have full-fledged agreement (Spanish) or no agreement at all (Chinese). But MP does not allow us to formulate such a meta-statement, as it would not be expressible with the bare representational notation that comes from the lexicon.

In large part, minimalism was designed to deal with precisely the sorts of facts above, where one particular derivation is not generally ungrammatical; rather, it is bad because an alternative derivation, in some definable sense, is better. To carry this logic through, we must make certain assumptions whose soundness I leave for the end of this note. The basic idea is discussed in Martin and Uriagereka (forthcoming):

(7) Within local derivational horizons, derivations take those steps which maximize further convergent derivational options.

(8) All else being equal, derivations involving fewest steps outrank their alternatives.

(8) is Chomsky's (1995b) idea about comparing derivations within a given reference set. (7) is a new proposal, also in the spirit of optimality criteria. Assume that "local derivational horizon" means a choice of formal alternatives for a derivationally committed structure (e.g. move vs. merge of some formal item). If we select a new substantive lexical item, relevant derivations are no longer comparable from that point on. Furthermore, observe how (7) explicitly introduces the idea of the entropy of a derivation, if we think of derivational options (i.e. whether after a given point further moves/merges are allowed) as "micro-states" of the system compatible with the whole derivational "macro-state," given the relevant horizon. A derivational decision d will allow further convergent steps in a number of n possibilities, whereas some other derivational decision d' will only allow a number m of possible continuations, where m < n. In those circumstances d induces more derivational entropy than d' does, which (7) aims at optimizing.
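
To see how (7) and (8) are meant to interact, here is a deliberately toy sketch. Everything in it – the state labels, the notion of "cost," the way convergence is encoded – is invented for the illustration; the only thing it carries over from the text is that the entropy criterion in (7) applies first and the fewest-steps criterion in (8) only breaks ties:

```python
from typing import Dict, List

Step = str

def continuations(path: List[Step], options: Dict[Step, List[Step]]) -> int:
    """Count convergent continuations of a derivational path within the (toy) horizon
    encoded in `options`; steps with no listed successors count as convergent endpoints."""
    successors = options.get(path[-1], [])
    if not successors:
        return 1
    return sum(continuations(path + [s], options) for s in successors)

def cost(step: Step) -> int:
    """Toy cost function: movement/attraction steps are costlier than mere merges."""
    return 1 if step.startswith("move") or step.startswith("attract") else 0

def choose(path: List[Step], options: Dict[Step, List[Step]]) -> Step:
    """(7) outranks (8): maximize convergent continuations first, then minimize cost."""
    return max(options[path[-1]],
               key=lambda s: (continuations(path + [s], options), -cost(s)))

# Usage, loosely mirroring the (9a)/(9b) bifurcation discussed below:
opts = {
    "participial structure built": ["merge-EXPL (9a)", "move-N (9b)"],
    "merge-EXPL (9a)": ["attract-EXPL (10a)"],                # one convergent continuation
    "move-N (9b)": ["merge-EXPL (11a)", "attract-N (11b)"],   # two convergent continuations
}
print(choose(["participial structure built"], opts))          # -> "move-N (9b)"
```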

Given this machinery, (9) sketches the first relevant bifurcation of derivational paths. If we take the derivational step in (9a), putting aside further movements (see Note 2), the resulting order will be (P, A), whereas (9b) immediately leads to the (A, P) order.4 But the question from the point of view of entropy is what happens next, within a relevant derivational horizon. We must consider in turn each of the derivational paths in (9).

(9) a. [PP EXPL [P' P[agr] N]]   (EXPL merged in the participial Spec)
    b. [PP N [P' P[agr] t]]      (N moved to the participial Spec)

Starting with (9a), observe how the expletive should block further movement of N across it (a locality effect). Let us state this in terms of Chomsky's (1995b) notion "Attract." Some further category higher up in the tree (T) can attract EXPL, as in (10a), but not N across EXPL (10b):

(10) a. [T' T [PP EXPL [P' P[agr] N]]]    (T attracts EXPL)
     b. *[T' T [PP EXPL [P' P[agr] N]]]   (T attracts N across EXPL: blocked)

Compare this situation to the one in (9b). For that structure, two derivational paths are possible:

(11) a. [TP EXPL [T' T [PP N [P' P[agr] t]]]]   (EXPL inserted in the TP Spec)
     b. [T' T [PP N [P' P[agr] t]]]             (T attracts the associate N)

In (11a) we insert the expletive in the TP Spec, whereas in (11b) T attracts the associate. These are two convergent steps, unlike (10b).5 Therefore the entropy of (9b) (the number of continuations it allows) is higher. Again, the system goes with the derivational decision that leads to this result, in compliance with (7), and thus predicting languages of the sort in (5b).

The logic of the reasoning forces us to say that, in languages of the sort in (5a), alternative derivational steps are equally entropic. Observe that, if this is indeed the case, familiar economy considerations as in (8) will predict (9a) (with no costly moves) to outrank (9b). But why should it be that in the relevant circumstances entropy gets equalized? As it turns out, languages with agreement in the V and P systems obtain this result in a way which has nothing to do with why languages with no agreement do. Let us examine this in detail.

In languages with agreement in V and P, overt expletives are generally impossible. Suppose that an element with no phonetic or semantic realization (a null expletive) simply does not exist. If so, in this language step (9a) is not an option. When the Part' domain is reached, the only options are to take or not to take step (9b). Taking it will lead to taking or not taking step (11b); not taking (9b) will lead (in principle) to again taking or not taking (9b). The derivations
are equally entropic, hence the costless one wins. That is, the winner is the one with the non-moved A, according to generalization (5ai).6
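
Continuing the toy sketch given after (8) – again with invented labels, and reusing the choose() function defined there – the situation just described is the tie-breaking case: once merging an expletive is off the table, the remaining options are equally entropic and the cheaper, movement-free derivation wins by (8):

```python
# Reuses choose() from the earlier sketch; the state labels are purely illustrative.
opts_agreeing = {
    "participial structure built": ["stay in situ", "move-N (9b)"],
    "stay in situ": ["converge"],
    "move-N (9b)": ["converge'"],
}
print(choose(["participial structure built"], opts_agreeing))   # -> "stay in situ"
```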

In contrast, in languages with no agreement, the expletive is clearly present. So how do we distinguish this case from that of languages with an expletive but some form of agreement in the participial? Observe each instance side by side:

(12) a. [T' T [PP EXPL [P' P[+agr] N]]]   (attraction of N across an agreeing expletive: blocked)
     b. [T' T [PP EXPL [P' P[-agr] N]]]   (attraction of N across a non-agreeing expletive: possible)

The very same relation between T and N which is impossible in (12a) (case (10b) repeated) becomes possible across an expletive which does not agree with a participial head (12b). The easiest way to state this contrast is to say that the agreeing expletive is visible to the system in a way that the non-agreeing expletive is not, hence only the former prevents Attraction across it.7 Of course, if (12b) is indeed an option, then the entropy of each of the alternatives in (10)/(11) gets equalized, and again the grammaticality decision must be made in terms of (8), thus correctly predicting no movement in these instances (5aii).

I would welcome a complete explanation of facts like these in non-minimalist terms; I am not familiar with any. What is even more important is that while (6) is true, it is a spurious generalization. There is no deep sense in which languages with equal values for V- or P-agreement should also have a (P, A) order, or languages with different values for these variables should have an (A, P) order. If we were told that the very opposite generalization held, nothing would shock us. But after having seen the reasoning above, the opposite state of affairs would be unexpected.

I believe that it is analyses of this sort that should be used to decide whether Chomsky's idea is productive. Of course, some non-trivial assumptions that we have made need to be discussed. For instance, why is economy "ranked" with regard to entropy? In this respect, R&R's suggestion about extending naturalistic metaphors may be more than "chit-chat," and no different from similar extensions in complexity studies more generally. In nature, (7) is comparable to the Second Law of Thermodynamics, whereas (8) has the flavor of the Slaving Principle in synergetics. All of nature obeys the second law, but within the parameters it allows, certain systems find points of stability as a consequence of various processes of local feedback. Turbulence emerges in this fashion – not at random points, but in those where a certain fluctuation exists at lower levels of energy. This is consistent with the second law, but not predicted by it; an ordering principle is required to determine why turbulence emerges here and not
there, and that principle can be metaphorically linked with the idea that costly derivations are dispreferred. What interests me, however, is the ranking: the Second Law first, the Slaving Principle within its confines. Could the ranking of (7) over (8) be related to this state of affairs?

My speculation that it can will be fully tested only when we learn more about the physical basis of language in evolution, development and performance. Then again, how different would this metaphorical extension be from the ones made in computational biology when modeling organisms in terms of optima? Fukui's (1996) original paper – from which R&R took the provocative thought that comparing derivational alternatives resembles Least Action in physics – explicitly made this same point for optimization in computer science. A concrete example that Fukui could not have anticipated, but which neatly illustrates his claim, is the analysis that West, Brown and Enquist (1997) provide of the cardiovascular system of vertebrates as a fractal, space-filling network of branching tubes, under the assumption that the energy dissipated by this transportation system is minimized. That condition is a defining property of the system – it does not follow from anything. Suppose we applied LLJ's logic for why my naturalistic take on MP is "groundless" to this celebrated example. Optimality principles in physics "are derived from deeper physical properties of the (entities) which satisfy them . . . By contrast, the MP takes economy . . . to be one of (the grammar's; here, the biological system's) defining properties." The premises are both true, but nothing follows. Would LLJ ask the scientific community to dump West, Brown and Enquist's explanation of scaling laws – the only one around in biology – because it is irreducible in the way they suggest, hence "groundless"?

That reductionism would be short-sighted. Intuitions about optima in physics appeared well before anyone had any idea as to how to deduce them from first principles, as in the current systems that LLJ allude to. This did not stop physicists from proceeding with their assumptions. Surely LLJ cannot be implying that only present-day physics, one state in an enterprise, matters, unless they are prepared to call "groundless" what was done prior to Einstein. By the same reasoning, computational biology is a perfectly grounded science, even if its optima are not based on the properties of the units that biology studies or do not reduce to the structure of quarks. And by the very same token, I fail to see why linguistics should be subject to a different standard.

Many linguists who worked seriously within GB noticed that it allowed too much power: "Do anything to anything anywhere at any time." That of course provided very interesting accounts of Empty Category Principle effects, but it became useful to limit the movement mechanism. The result was "last resort." Do something which has a purpose, in some sense. Obviously that is a new axiom, with an economy flavor to it. Coupling it with other axioms explored at the time, which invoked certain symmetries within linguistic representations, field-like conditions on locality, uniformities across various notions and levels, a new spirit emerged. Could it be that language is structured in precisely those elegant terms? Taking the idea to the status of an appropriately extreme
research question, one is led to ask: "How much sense is there in asking whether language is 'perfect'?" The answer to this is purely empirical, and whether MP succeeds as a natural extension of GB should not be decided a priori.

That means we have to, as Brecht's Galileo puts it, "bother to look." I believe that LLJ are too hasty in their conclusion that MP is "ungrounded in empirical considerations." Are the facts above not empirical? I cannot go into a lengthy presentation of the relevant literature, but without even considering works by the many established linguists who have already made significant contributions to MP, one can find a host of younger colleagues who have provided what look to me like deep accounts of Case/agreement distributions, islands, superiority effects, parasitic gaps, coordination, different binding requirements, quantifier relations, reconstruction, and control, just to name the obvious. None of these were discussed in detail in R&R (it was not the place for that), but the ones accessible in the mid-1990s were certainly included in the twenty-plus pages of widely available references. One just has to bother to look.

Although I let my other colleagues comment on the passionate, unnecessary overtones of LLJ's piece, I do want to mention that conceding that "the conceptual defects of [MP] are probably no worse in kind than earlier examples might be" is unfortunately familiar rhetoric, especially when coupled with wild claims about the field of linguistics being irrationally dominated by Chomsky's demon. Ironically, this perpetuates the myth that MP is Chomsky's toy story, and not the collegial effort of many scholars who knew the limitations GB faced, and forged ahead.

3 Cutting derivational options8

Lappin, Levine and Johnson (LLJ) categorize my first reply to their piece as something "which addresses our main objection to the MP by attempting to present a substantive defense of economy conditions in the theory of grammar." This is why I chose the paradigm I did, among other possible ones amenable to broadly minimalistic considerations: it illustrated competing derivations at play, and procedures to choose among them. I purposely tried to study, also, a case where different possible orders come out as grammatical in different languages. Any analysis of these facts should be welcome.

The "most important" critique that LLJ present of my reply aims directly at that choice of facts. This is surprising, as the Romance data I reported have been replicated elsewhere, having first been pointed out, in some form, by Luigi Burzio – some twenty years ago. The Scandinavian facts I mentioned are due to Anders Holmberg in his careful study of various Germanic languages, and have been the focus of attention of several papers recently. My rather modest contribution was putting the various observations in the form of a generalization: languages with the same value (+ or −) for verb and participial agreement exhibit participle-associate NP (⟨P, A⟩) order, while languages with distinct values for verb and participial agreement display ⟨A, P⟩ order in expletive structures.

LLJ's counterexample to my generalization is this: in some languages both orders arise, regardless of agreement. LLJ's data is as reported in (13), with their judgments, and its Hebrew translation (English has mixed agreement while Hebrew has full agreement, both relevant test-cases):

(13) a. ?There remained three players sitting.
     b. ?There remained sitting three players.

Although it is certainly true that the grammaticality contrasts I was alluding to are not always dramatic (speakers often "can get" the alternative order, but it is signaled as less preferred, marked, or just bizarre), I have not found a native speaker of English who does not detect something odd about (13b). In my experience, to the extent that examples like (13b) are acceptable in this language, they need an intonation break between sitting and the rest of the sentence, which is best if heavy in either syllabic terms (i.e. long) or phonological weight; thus my data analysis would be as in (14), instead of (13):

(14) a. ???There remained sitting three players.
     b. There remained sitting three players who had no idea what the play was.
     c. There remained sitting THREE PLAYERS.

None of these considerations about heaviness obtain for the alternative orderings of the associate and P. Several recent papers discuss the possibility that the "inverted" orders in (14) result from stylistic adjustment, not unlike Heavy NP shift. If so, the facts in (14) are no counterexample to my syntactic generalization. Appropriately idealized, (13a) is pretty good and (13b) is pretty bad.

Also, observe (15), an existential construction with be which, to start with, is the most natural in English (vis-à-vis examples with unaccusative verbs), and thus yields more robust judgments:

(15) a. There was not a single player sitting.
     b. ?*There was not sitting a single player.

(15), and the appropriately analyzed instances in (14), seem like potentially relevant examples for derivational accounts of the minimalist sort: precisely in that we have to justify why (15a) (presumably with A movement) "wins over" alternative (15b) (certainly with no such movement).

That (15b) is a reasonable alternative, in principle, is shown by the fact that the relevant order in that instance is fine, in fact preferred, in other languages. Given the sort of generalization I attempted, in a language like Hebrew – with full agreement in both the verbal and participial paradigms – my expectation would be that precisely the option where A does not move is what eliminates the alternative with movement as a grammatical route. So in a paradigm like (16), which LLJ provide, I clearly expect (16b) (where shlosha saxkanim "three players" is presumably not displaced) to derivationally outrank (16a) (where shlosha saxkanim is "wrapped around" yoshvim "sitting," the P element):

(16) a. Nisharu shlosha saxkanim yoshvim.
        remained-3pl-past three-m players-m-pl sitting-m-pl
        "There remained sitting three players."

     b. Nisharu yoshvim shlosha saxkanim.
        remained-3pl-past sitting-m-pl three-m players-m-pl
        "There remained sitting three players."

LLJ detect no preferences in (16), unlike what happens, say, in Romance. Note that Hebrew allows for post-verbal subjects, hence (16b) could be analyzed as the equivalent of the English three players remained sitting, albeit with stylistic inversion of the subject – which again is irrelevant for my purposes (no displacement of three players). Of course, the potentially troubling example for me is (16a), and the question I face is whether I am forced to analyze this string of words in structural terms which are pertinent for the generalization I posited: has A been displaced here?

The relevant structural terms are as in (17), where the main verb selects the P structure sitting three players, and subsequently three players moves:

(17) [remained [three players [sitting t]]]

However, the verb remain, or its Hebrew version nisharu, may also select for a regular nominal expression, in which case in principle two further structures could be at issue (actually four, if we also consider the possibility of inverted subjects noted above – which I now set aside):

(18) a. [remained [three [[players] sitting]]]
     b. [[remained [three players]] sitting]

In (18a) sitting is a modificational element; in (18b), a secondary predicate. All of this has some relevance for the worrisome (16a), which could have the analyses in (19):

(19) a. [Nisharu [shlosha [[saxkanim] yoshvim]]]
        remained three players sitting
     b. [[Nisharu [shlosha saxkanim]] yoshvim]
        remained three players sitting

Perhaps obviously now, neither of these structures is pertinent to my generalization.

Not all Hebrew speakers like yoshvim “sitting” as an adjectival element, although all those that I have consulted accept it as a participle. For those people, then, only (17) should be a possible analysis. Most of my informants find (16a), presumably under non-adjectival circumstances, worse than (16b). Again, the structure is recovered with postposition to the right periphery, for contrastive focus:

(20) nisharu ba-gan shlosha saxkanim yoshvim
     remain in-the-garden three players sitting


This too is irrelevant, and makes (20) similar, in significant respects, to the examples in (14).

Alongside active participles like yoshvim, Hebrew also allows for passive participles like yeshuvim, which is analogous to the English “seated” and thus easier to use as an adjective. Predictably, those speakers that find (16a) degraded accept (21) without troubles:

(21) Nisharu shlosha saxkanim yeshuvim
     remained-3pl-past three-m players-m-pl seated-m-pl
     “There remained three seated players/three players seated.”

Once more this particular structure has nothing to do with my generalization, and confirms, in its contrast to the degraded nature of (16a) in relevant guise, the overall claims I made.

The empirical challenge, thus, dissipates both for English and for Hebrew, which align with other languages as predicted. This is hardly surprising. True optionality is rather scarce; it often tells us more about our carelessness as researchers than about the nature of syntax.

It is worth mentioning, also, what LLJ propose as an analysis of what they take to be the facts (both possible {P, A} orders in all sorts of languages, contra my generalization): in the end “lexical properties of the participle (optional Case assignment?) are the main factor in determining the order of the NP associate relative to P.” The disconcerting aspect of LLJs’ critique is not so much the denial of the facts as lexical idiosyncrasies, but rather the thought that optional Case assignment – which poses too many problems for me to address in a few pages – may have anything to do with the overall analysis.

So much for the key critique. Let me turn to the interesting questions, assuming the facts. This is the main body of their response to my proposal, which they find to have four “serious problems.”

The first is presented within the veiled complaint that my formulation of the entropy condition is not precise; LLJ were able to understand it as follows: “at any given point d in a derivation D, it selects from the set O of possible operations the one which, when applied to d, produces the largest number of convergent derivational paths from d compared to other operations in O.” I could not have said it any better. What is imprecise about that is unclear. But this is not the criticism (just the general tone); the actual concern is the fact that this is not an economy condition. Of course, I never said it was. Their question is, “Why should maximizing the number of convergent continuations of a derivation produce a more economical derivation?” This reflects, I believe, a profound misconception as to how the strong minimalist thesis is supposed to work, so let me pause to address it.

There are two misunderstandings hidden in the presuppositions of that question. One is technical. MP was conceived, in part, as an attempt to limit the class of possible derivations that the P&P system allowed. Several tools were explored for that; the notion of last resort was one of them, economy conditions (in the technical sense) a different one. In either instance, by the end of the day only some of many possible derivations are convergent or optimal, and in those terms grammatical. My entropy condition – aside from being relevant to the broad conceptual concerns that LLJ were raising in their first piece – is just another way of limiting the class of possible derivations. How it works is simple: you arrive at a derivational juncture d and face, say, two options x and y. Where the P&P model would have said “take either,” MP says “choose one.” The entropy condition provides us with a method to decide: if x allows more convergent derivational continuations than y (within a defined space), then x outranks y. This is certainly not an economy condition as such (for instance in the sense that a strategy to take the fewest derivational steps is). But why should it be an economy condition? All that matters is the selection of a grammatical derivational path over its alternatives. Similarly, when Chomsky in recent work maximizes the checking of features at a given site this is not, in itself, an economy condition; but it certainly has indirect economy consequences, in that it reduces the class of grammatical derivations.
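
Purely for expository purposes – this is my paraphrase, not a component of the proposal – the decision procedure just described can be stated in pseudo-algorithmic form. Every name in the sketch below (the representation of a juncture, the continuations routine, the convergence test) is a hypothetical stand-in for whatever the grammar actually computes; the substantive point is only that the choice is local, made by counting convergent continuations within the relevant horizon.

```python
# Illustrative sketch of the entropy condition; all names are hypothetical.
# `continuations(d)` is assumed to enumerate the derivational paths available
# from d within a local horizon (e.g. until a new substantive item is drawn
# from the lexical array); `converges(path)` says whether a path converges.

def entropy(point, operation, continuations, converges):
    """Number of locally convergent continuations after applying `operation` at `point`."""
    return sum(1 for path in continuations(operation(point)) if converges(path))

def choose(point, options, continuations, converges):
    """Select the operation that leaves the most convergent continuations open."""
    return max(options, key=lambda op: entropy(point, op, continuations, converges))

# Toy usage with made-up stand-ins: at juncture d, 'move' opens two locally
# convergent continuations and 'merge an expletive' only one, so 'move' wins.
toy_paths = {"d+move": ["ok", "ok", "crash"], "d+merge-expl": ["ok", "crash"]}
move = lambda d: d + "+move"
merge_expl = lambda d: d + "+merge-expl"
best = choose("d", [move, merge_expl],
              continuations=lambda d: toy_paths.get(d, []),
              converges=lambda path: path == "ok")
print(best is move)   # True
```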

The second misunderstanding is conceptual. The strong minimalist thesis seeks to determine whether a very interesting proposal concerning the nature of the language faculty is true, and if so in what form. The idea is that language arises at the optimal interface between an autonomous syntactic organ and other mental capacities. The question then becomes empirical, concerning, in large part, the specific sense in which the interface, and how it is accessed, turn out to be optimal. In that context, an economy condition is as expected as a maximization principle of the sort I (or Chomsky, in the case of feature checking) proposed, as would be other optima which encounter familiar expressions in other sciences. Optimal design cannot be determined a priori, and the only way to go about understanding its properties is to propose various reasonable concepts and see what empirical payoff they have.

LLJ claim that “[f]rom a computational perspective [my suggestion] is precisely the opposite of an optimal system . . . Uriagereka provides no independent justification for his principle, and it is not clear to us how it encodes any recognizable concept of perfection in computational design.” I believe this touches on the crux of the matter, and is one of the reasons why LLJ seem so passionate about all of this. The field has largely ignored their concerns about globality in previous work, which they cite abundantly. This was so for two reasons, both raised by Chomsky in various places. First of all, a formal problem in some limit for a system of use has no obvious bearing on a system of knowledge. Familiar examples of this abound: e.g. center embedding poses a computational problem, so speakers avoid it. If the kinds of derivations that LLJ are concerned with are equally problematic, speakers will refrain from going into them – there is plenty of expressive space elsewhere. Second, and more importantly, it turned out to be trivial to reduce those phenomenal formal problems to nought. For example, any sort of cyclic computation entails that the class of alternative derivations at any given point is very small, thus not a mathematical explosion. In the very example I provided, I insisted on the fact that entropy is computed locally, within a derivational horizon established by lexical choice. If you get a new substantive item from the lexicon, there goes the computation of entropy; there is, thus, no computational blow-up. Thus the worry about complexity, in and of itself, seems superfluous.

In contrast, how to limit derivational choices is still an issue; not because of computational concerns, but because alternatives to grammatical options, even if convergent and intelligible, are plainly unacceptable to speakers. In that context, all that a design property of grammars (whether they obey last resort, economy, maximal feature matching, entropy, or whatever) really needs in order to be justified is to show whether it cuts derivational choices. A serious conceptual problem with my entropy suggestion (or any other such design property) would arise only if it has no formal consequence, thus no weight on deciding among derivations, to choose the grammatical output. To judge design properties in any other terms (e.g. to demand that they be, specifically, economy conditions, in the technical sense of the notion) is to have misunderstood why we are positing them to start with.

Let me continue with LLJs’ list of conceptual challenges. Their second one pertains to the fact that my “explanation requires that entropy be ranked above the smallest-number-of-steps-condition. He seeks to justify this ranking by comparing entropy to the Second Law of Thermodynamics . . . Presumably the smallest steps condition is less binding in its effects.” All of this is true, which LLJ take to be “deeply confused and based largely on a misplaced analogy.” While I have not provided, and cannot provide, an extra-linguistic justification for my analogy, I fail to see how it can be decided, a priori, whether it is misplaced. True, I believe – with Chomsky – that the specific properties of mind that we are exploring will ultimately be unified with characteristics of the “boring” universe, and how they manifest themselves in minds as a result of brain organization in evolution and development, within the constraints imposed by the physico-chemical channel; and true, I cannot justify this nor know of any theory that even attempts to do so. Nonetheless, my take on the strong minimalist thesis is in pretty much those terms, however distant they may be from a satisfying unification with other disciplines. That was the thesis in Rhyme and Reason, for conceptual reasons: once you give up a functionalist approach to optimality in grammars, what else could it be other than the familiarly optimal universe (in physics and chemistry) manifesting itself in biology? In that and other works I have tried to show that this proposition is not particularly crazy for other domains of (computational) biology, body plans and all that. To decide, from the start, that this is misplaced analogy and misunderstanding seems at best unimaginative.

It is arguably a bit more dishonest than that. LLJ ought to know that there is a connection discussed in the computational literature between information and negative entropy, and that entropy is routinely used in the life sciences, as per the advice of no less than Schrödinger. In fact, the model study that I followed for the presentation of phyllotaxis in Rhyme and Reason, namely Roger Jean’s work, explicitly discusses an account for the ordering facts one of whose factors is a characterization of entropy. I do not want to venture whether LLJ consider those extensions misplaced as well, but I do reckon that in those instances, too, nobody has been able to reduce the explanation to first principles in quantum mechanics. The measure of success has been how well the model accounts for observed properties of plant symmetry. Why should our science be subject to a higher standard?

LLJ take me, “[i]n seeking to rank economy conditions relative to each other . . . to be invoking as his model an OT hierarchy of defeasible constraints.” That claim is entirely false. My ranking has the same status within the system as that existing between, say, the choice of a minimal part of a numeration, in Chomsky’s recent terms, and within those confines the particular derivation d such that such-and-such. This is a kind of ranking that results from the design of the grammar, and as such has nothing to do with the variable rankings possible in OT, the source of linguistic variation in that model; I do not expect any variation in entropy trumping economy. Suppose I had ranked economy above entropy; what would that mean? If entropy is taken, in essence, as the opposite of information, augmenting entropy is akin to reducing committed information (leaving options open); it is sensible to find the most economical among those open options. But once you commit to the optimal option, I am not sure what sense there is in, then, seeking the option that leaves most options open. A different story, of course, is whether that particular modeling of the system is ultimately sound (for my proposal as well as in that of parts of numerations chosen prior to derivations, or any others); whether, for instance, the rationale for the ranking I have outlined follows from its own logic or, rather, is a dull consequence of how information arises in the physical universe. That question is unresolved in linguistics, as it is in plant morphology or wherever else in the universe that form matters.

The third and fourth challenges are less grandiose, and reflect a misunderstanding, whether resulting from their reading or my writing, I do not know. I do claim that, as they put it, “null expletives are not present in [languages with full agreement], and so expletive insertion is not possible.” That is a familiar position from the literature, which would for instance account for why these sorts of languages do not generally have overt T expletives (there is nothing to be had). I also claim that a grammar with those characteristics will never face a derivational decision in the entropic terms I suggest, as the continuations of the possible paths to be taken are equally rich: to move or not to move. (Alternatively, as Roger Martin suggests, in this instance there is no derivational choice and thus entropy is irrelevant.) They find this assertion to rely “on the vague nature of the entropy principle which permits Uriagereka to claim equivalence of derivational options without any obvious basis for doing so.” I think this perception stems from a misconstrual of how the entropy condition is supposed to work, as this passage suggests: “If one selects movement of A to Spec of P, then further movement of A to Spec of T . . . is possible, while on the initial in situ choice this latter derivation is ruled out.” I cannot tell why LLJ think the latter derivational step (A from the P Spec to the T Spec) is in principle ruled out (aside from its ultimate fate in this particular derivation). My condition seeks to choose the derivational path which “increase[s] the number of (locally) available convergent paths in the derivation,” [my emphasis in their own, correct, rendition of my proposal]; the path in question is used for standard movement to subject position, hence is a fortiori convergent, hence the entropy of each path is the same.

The fourth criticism grows from a fact that I myself pointed out in the footnotes: my “explanation of the ⟨P, A⟩ order for languages with no verbal or participle agreement relies crucially on the assumption that a non-agreeing expletive is invisible to movement, and so it does not block ‘long’ movement . . .” I said that a more elaborate explanation is possible if instead of taking the expletive as the blocker we take agreement to be the relevant element. Although this will add absolutely nothing to the conceptual discussion, I will sketch the alternative. (The impatient reader can skip this part.)

The relevant patterns are as in (22):

(22) a. … [+Agr]V … [+Agr]P … NP
     b. … [−Agr]V … [−Agr]P … NP
     c., d. … NP (the two mixed cases, with [+Agr] on only one of V and P)
     (In the original display, arrows mark the element that probes the associate NP.)

Suppose that the associate NP is being probed (in Chomsky’s recent sense, represented as an arrow in (22)), and only +Agr elements are active for the task. In (22c) and (22d) the active Agr heads probe the associate NP. In (22a), suppose one Agr head probes the other, as they are both active; thus something else must probe NP, say T. A similar situation arises in (22b), where both Agr heads are inactive. Next consider the computation of entropy, this time with respect to Probe/goal relations across no intervening extra Probe (i.e. locally). For (22c) and (22d) – where in the end NP moves – we want entropy to favor a movement step, which must mean that the continuations of that step are more numerous than continuations of a competing merging step. In the merging step a pleonastic is entered as an AgrP specifier. Assuming, with Chomsky, that pleonastic elements can themselves be probing heads if they have agreeing (person) features, there will be active pleonastics in (22c) and (22d), licensed in terms of association with Agr heads, either directly (22d) or after movement (22c). It is these active pleonastics that count as intervening probes, thus eliminating relations across them. The reasoning just given must mean that in neither (22a) nor (22b) should there be active pleonastics, as there are no active licensing Agr heads there. This is rather obvious for (22a), as nothing is pronounced or interpreted in subject position; for (22b) it must mean that the pleonastic involved in this case is only a PF reflex. As a consequence, in both these instances relations across either nothing or a mere PF formative are possible, and entropy gets equalized; then merge trumps move, as desired.

In the end that more or less elaborate reasoning follows, also, if one simply assumes that the sort of expletive in (22b) is a late arrival in the derivation, in the PF component, hardly a new proposal in the literature. It seems unfair to consider “a serious problem with [my] argument” a claim that suggests that if a language has some agreement (in V or in P), then it can syntactically license a bona fide pleonastic, but if it does not have any agreement whatsoever, then a pleonastic is only licensed stylistically or phonologically. That may be wrong, but I find it neither “vague” nor “ad hoc.”

LLJ compare – for I suppose postmodern purposes – the language faculty to a urinary tract. Our main point of disagreement may ultimately be this. What seems to me remarkable about the linguistic system is that it has amazingly elegant properties regardless of the utterly bizarre uses humans can put this system to (e.g. this whole discussion). In that it seems different from LLJs’ urinary tract, which might also correlate with why they share the latter with all other vertebrates, whereas the former is a bit more restricted. I read Chomsky in MP as taking a stab at that mystery.


Part I

SYNTAGMATIC ISSUES


3

MULTIPLE SPELL-OUT†

1 Deducing the base step of the LCA

A main desideratum of the Minimalist Program is reducing substantive principles to interface (or bare output) conditions, and formal principles to economy conditions. Much energy has been devoted to rethinking constraints and phenomena that appear to challenge this idea, in the process sharpening observations and descriptions. In this chapter, I attempt to reduce a version of Kayne’s (1994) Linear Correspondence Axiom (LCA).

Chomsky (1995c) already limits the LCA’s place in the grammar. Kayne’s version of the axiom is a formal condition on the shape of phrase markers. Chomsky’s (for reasons that go back to Higginbotham 1983b) is a condition that operates at Spell-out, because of PF demands. Kayne’s intuition is that a nonlinearized phrase marker is ill formed, in itself, whereas for Chomsky such an object is ill formed only at PF, hence the need to linearize it upon branching to this component. Chomsky’s version is arguably “more minimalist” in that linearization is taken to follow from bare output conditions.

The axiom has a formal and a substantive character. The formal part demands the linearization of a complex object (assembled by the Merge operation, which produces mere associations among terms). A visual image to keep in mind is a mobile by Calder. The hanging pieces relate in a fixed way, but are not linearly ordered with respect to one another; one way to linearize the mobile (e.g. so as to measure it) is to lay it on the ground. The substantive part of Kayne’s axiom does for the complex linguistic object what the ground does for the mobile: it tells us how to map the unordered set of terms into a sequence of PF slots. But even if Chomsky’s reasoning helps us deduce the formal part of the axiom (assuming that PF demands linearization), the question remains of exactly how the mapping works.

Kayne is explicit about that. Unfairly, I will adapt his ideas to Chomsky’s minimalist “bare phrase structure” (Chomsky avoids committing himself to a definition in either Chomsky 1995a or 1995c).

(1) Linear Correspondence Axiom
    a. Base step: If α commands β, then α precedes β.
    b. Induction step: If γ precedes β and γ dominates α, then α precedes β.


I will discuss each of the steps in (1) in turn, with an eye toward deducing their substantive character from either bare output or economy considerations. Consider why command should be a sufficient condition for precedence. It is best to ask this question with a formal object in mind. I will call this object a command unit (CU), for the simple reason that it emerges in a derivation through the continuous application of Merge. That is, if we merge elements to an already merged phrase marker, then we obtain a CU, as in (2a). In contrast, (2b) is not a CU, since it implies the application of Merge to different objects.

(2) a. Command unit: formed by continuous application of Merge to the same object
       {γ, {γ, {α, {α, {β…}}}}}
           γ →↑← {α, {α, {β…}}}
                     α →↑← {β…}
    b. Not a command unit: formed by discontinuous application of Merge to two separately assembled objects
       {α, {{δ, {δ, {ε…}}}, {α, {α, {β…}}}}}
           {δ, {δ, {ε…}}} →↑← {α, {α, {β…}}}
           δ →↑← {ε…}          α →↑← {β…}

Regarding CUs, the ultimate question is why, among the possible linearizations in (3), (3d) is chosen.

(3) a.–f. [The 3! = 6 possible linearizations of the terminals γ, α, and {β…} of the command unit in (2a); (3d) is the order ⟨γ, α, {β…}⟩, and (3a) is its mirror image ⟨{β…}, α, γ⟩.]

To continue with the mobile image, there are n! ways in which we can lay it on the ground, for n the number of hanging elements. Why is it that, among all the apparently reasonable permutations, the linguistic mobile collapses into a linearized sequence specifically in the order ⟨γ, α, {β…}⟩?

We may ask the question from the point of view of what syntactic relations are relevant to the terminals of the structures in (3). Concentrating on the terminals, we see that the only relation that exists between them in a CU is “I have merged with your ancestors.” We can produce an order within CUs in terms of this relation, which essentially keeps track of what has merged with what when. This is, essentially, the insight behind Epstein’s (1999) interpretation of command, which has the effect of ordering the terminal objects in (3) as follows: ⟨γ, α, {β…}⟩. If PF requirements demand that the Merge mobile collapse into a flat object, it is not unreasonable to expect that the collapse piggybacks on a previously existing relation. Indeed, minimalist assumptions lead us to expect precisely this sort of parsimony.
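
For concreteness only – the rendering is mine, and nothing hinges on it – the ordering idea can be mimicked with a one-line routine: represent a CU by its bottom-up merge sequence, and the “I have merged with your ancestors” relation orders terminals by “merged later” first. Whether that order is read off as precedence or as its mirror image is exactly the question taken up next.

```python
# Toy rendering of Epstein-style command order over a command unit; the
# representation (a bottom-up merge sequence) is an assumption for illustration.

def command_order(merge_sequence):
    """Terminals of a command unit, with later-merged elements first."""
    return list(reversed(merge_sequence))

# beta merged first, then alpha, then gamma, as in (2a):
print(command_order(["beta", "alpha", "gamma"]))   # ['gamma', 'alpha', 'beta']
```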

However, we have not yet achieved the desired results. To see this, imagine a group of people trapped inside a building, with access to a window that allows them to exit just one at a time. These people may order themselves according to some previously existing relation (e.g. age). But having found an order does not mean having decided how to leave the building. Does the youngest exit first or last – or in the middle? Likewise, a decision has to be made with regard to the ⟨γ, α, {β…}⟩ order. Exactly how do we map it to the PF order?

In minimalist terms, the question is not just how to map the collapsed ⟨γ, α, {β…}⟩ sequence to a PF order, but actually how to do it optimally. The hope is that mapping the collapsed ⟨γ, α, {β…}⟩ command order to the ⟨γ, α, {β…}⟩ PF order in (3d) is (one among) the best solution(s).

Another analogy might help clarify the intuition. Visualize a house of cards, and imagine how pulling out one crucial card makes it collapse. To a reasonable extent, the order in which the cards fall maps homomorphically to the order in which they were assembled, with higher cards landing on top, and cards placed on the left or right falling more or less in those directions (assuming no forces other than gravity). If Merge operations could be conceived as producing what amounts to a merge-wave of terminals, it is not unreasonable to expect such a wave to collapse into a linearized terminal sequence in a way that harmonizes (in the same local direction) the various wave states, thus essentially mapping the merge order into the PF linear order in a homomorphic way. This, of course, is handwaving until one establishes what such a merge-wave is, but I will not go into that here (see Martin and Uriagereka (forthcoming), on the concept of collapsed waves in syntax).

Even if we managed to collapse the merge order into the PF sequence that most directly reflects it, why have we chosen (3d) over the equally plausible (3a)? In short, why does the command relation collapse into precedence, and not the opposite? The harmonized collapse problem seems to have not one optimal solution, but two.

Three different answers are possible. First, one can attribute the choice of (3d) over (3a) to something deep; it would have to be as deep as whatever explains the forward movement of time. (I am not entirely joking here; physical properties are taken by many biologists to affect core aspects of the morphology of organisms, and Kayne (1994) speculates in this direction.)

Second, one can say (assuming that (3a) and (3d) are equally optimal solutions) that (3d) gave humans an adaptive edge of some sort, in terms of parsing or perhaps learnability. One could also imagine that a species that had chosen (3a) over (3d) as a collapsing technique might equally well have evolved a parser and an acquisition device for the relevant structures (but see Weinberg 1999).

Third, one can shrug one’s shoulders. So what if (3a) and (3d) are equally harmonic? Two equally valid solutions exist, so pick the one that does the work. (This view of the world would be very consistent with Stephen Jay Gould’s punctuated equilibrium perspective in biology; see Uriagereka 1998 and Chapter 8.) This situation is acceptable within the Minimalist Program, or for that matter within any program that seeks to understand how optimality works in nature, which cannot reasonably seek the best solution to optimality problems, but instead expects an optimal solution; often, even mathematically optimal solutions are various.

If I am ultimately on the right track, (3d) can correctly be chosen as the actual PF ordering that the system employs; that is, we should not need to state (1a) as an axiom. In a nutshell, command maps to a PF linearization convention in simple CUs (those (1a) is designed to target) because this state of affairs is optimal. I will not claim I have proven this, for I have only indicated the direction in which a demonstration could proceed, raising some obvious questions. I have little more to say about this here and will proceed on the assumption that the base step of the LCA can be turned into a theorem.

2 Deducing the induction step of the LCA

Having met the demands of the Minimalist Program by showing how part of the LCA can reduce to more justifiable conditions, we should naturally ask whether the whole LCA can be deduced this way. I know of no deduction of the sort sketched above, given standard assumptions about the model.

Nonetheless, an older model provides an intriguing way of deducing the LCA.1 For reasons that become apparent shortly, I refer to it as a dynamically split model. The origins of this outlook are discussions about successive cyclicity and whether this condition affects interpretation. Are the interpretive components accessed in successive derivational cascades? Much of this debate was abandoned the moment a single level, S-Structure, was postulated as the interface to the interpretive components. Now that S-Structure has itself been abandoned, the question is alive again. What would it mean for the system to access the interpretation split in a dynamic way?

I want to demonstrate that the simplest assumption (i.e. nothing prevents a dynamically split access to interpretation) allows the LCA’s induction step to be satisfied trivially. In effect, this would permit the deduction of (1b), albeit in a drastically changed model that neither Chomsky (1995c) nor Kayne (1994) was assuming.


One way of framing the issue is to ask how many times the rule of Spell-out should apply. If we stipulate that it applies only once, then PF and LF are accessed only once, at that point. On the other hand, liberally accessing PF and LF in successive derivational cascades entails multiple applications of Spell-out. Surely, assuming that computational steps are costly, economy considerations favor a single application of Spell-out. But are there circumstances in which a derivation is forced to spell out different chunks of structure in different steps?

One such instance might arise when a derivation involves more than one CU. As noted, CUs emerge as the derivational process unfolds, and they are trivially collapsible by means of the base step of the LCA. Now, what if only those trivially linearizable chunks of structure (e.g. (2a)) are in fact linearized? That is, what if, instead of complicating the LCA by including (1b), when we encounter a complex structure of the sort in (2b) we simply do not collapse it (thus do not linearize it), causing a derivational crash? Only two results are then logically possible: either structures like (2b) do not exist, or they are linearized in various steps, each of which involves only CUs. The first possibility is factually wrong, so we conclude that Multiple Spell-Out (MSO) is an alternative.

Before we explore whether MSO is empirically desirable, consider its possible mechanics. Bear in mind that CUs are singly spelled out – the most economical alternative. The issue, then, is what happens beyond CUs. By assumption, we have no way of collapsing them into given linearizations, so we must do the job prior to their merger, when they are still individual CUs. What we need, then, is a procedure to relate a structure that has already been spelled out to the still “active” phrase marker. Otherwise, we cannot assemble a final unified and linearized object.
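
To fix intuitions, and only for that purpose, the overall procedure can be mimicked with a toy routine. The sketch is my own illustration rather than part of the theory: it assumes that a CU is represented by its bottom-up merge sequence, that nested lists stand for separately assembled phrases, and that a spelled-out phrase re-enters the computation as an opaque, word-like unit – anticipating the conservative implementation described just below.

```python
# Toy Multiple Spell-Out: each command unit is linearized on its own (command
# order mapped to precedence, as in section 1); a separately assembled phrase
# is spelled out in its own cascade before it associates with the active CU,
# and from then on it behaves like an unanalyzable word (shown in brackets).

def spell_out(merge_sequence):
    """Linearize one command unit; nested lists are spelled out separately."""
    words = []
    for item in reversed(merge_sequence):          # later-merged elements first
        if isinstance(item, list):                 # a noncomplement built apart:
            words.append("[" + " ".join(spell_out(item)) + "]")
        else:
            words.append(item)
    return words

# "every boy thinks that [his father] hates him", with the two complex
# noncomplements given as nested (separately assembled) sequences:
print(spell_out(["him", "hates", ["father", "his"], "that", "thinks", ["boy", "every"]]))
# -> ['[every boy]', 'thinks', 'that', '[his father]', 'hates', 'him']
```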

The procedure for relating CUs can be conceived in conservative or radical terms, either solution being consistent with the program in this chapter. The conservative proposal is based on the fact that the collapsed Merge structure is no longer phrasal, after Spell-out; in essence, the phrase marker that has undergone Spell-out is like a giant lexical compound, whose syntactic terms are obviously interpretable but are not accessible to movement, ellipsis and so forth.2

The radical proposal assumes that each spelled-out CU does not even merge with the rest of the structure, the final process of interphrasal association being accomplished in the performative components.3 I will detail briefly each of these versions.

In the conservative version, the spelled-out phrase marker behaves like a word, so that it can associate with the rest of the structure; this means it must keep its label after Spell-out. Technically, if a phrase marker {α, {L, K}} collapses through Spell-out, the result is {α, ⟨L, K⟩}, which is mathematically equivalent to {α, {{L}, {L, K}}}.4 Since this object is not a syntactic object, it clearly can behave like a “frozen” compound. As a consequence, we need not add any further stipulations: the collapsing procedure of Spell-out itself results in something akin to a word.

To see how we reach this conclusion, we need to take seriously Chomsky’s (1995c) notion of syntactic object. Syntactic objects can take two forms.


(4) a. Base: A word is a syntactic object.
    b. Induction: {α, {L, K}} is a syntactic object, for L and K syntactic objects and α a label.

(4a) speaks for itself, although it is not innocent. The general instance is not too complicated: a word is an item from the lexicon. However, the Minimalist Program permits the formation of complex words, whose internal structure and structural properties are not determined by the syntax. (Indeed, the object resulting from Spell-out also qualifies as a word, in the technical sense of having a label and a structure that is inaccessible to the syntax.) (4b) is obtained through Merge and involves a labeling function that Chomsky argues is necessarily projection. What is relevant here is how a label is structurally expressed.

(5) Within a syntactic object, a label α is not a term.

(6) K is a term if and only if (a) or (b):
    a. Base: K is a phrase marker.
    b. Induction: K is a member of a member of a term.

(6a) hides no secrets. (6b) is based on the sort of object that is obtained by merging K and L: one set containing K and L, and another containing {L, K} and label α – namely, {α, {L, K}}. This whole object (a phrase marker) is a term, by (6a). Members of members of this term (L and K) are also terms, by (6b). Label α is a member of the first term, hence not a term. All of these results are as desired.

Consider next the collapse of {α, {L, K}} as {α, ⟨L, K⟩}, equivalent to {α, {{L}, {L, K}}}. By (6b), {L} and {L, K} are terms. However, {L, K} is not a syntactic object, by either (4a) or (4b). Therefore, {α, {{L}, {L, K}}} cannot be a syntactic object by (4b); if it is to be merged higher up, it can be a syntactic object only by (4a) – as a word. This is good; we want the collapsed object to be like a compound, that is, essentially a word: it has a label, and it has terms, but they are not objects accessible to the syntax.
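
The set-theoretic bookkeeping can be checked mechanically. The toy encoding below is mine and purely illustrative – it uses frozensets so that sets may contain sets, and treats any non-set as a “word” – but it suffices to verify that an object built by Merge satisfies (4b), whereas its collapsed counterpart does not.

```python
# Toy check of definitions (4)-(6) and of the Spell-out collapse; the encoding
# (frozensets, strings as words) is an illustrative assumption, not a proposal.

def fs(*xs):
    return frozenset(xs)

def is_word(x):
    return not isinstance(x, frozenset)

def is_syntactic_object(x):
    """(4a): a word; (4b): {label, {L, K}} with L and K syntactic objects."""
    if is_word(x):
        return True
    if len(x) != 2:
        return False
    sets = [t for t in x if isinstance(t, frozenset)]
    labels = [t for t in x if is_word(t)]
    return (len(sets) == 1 and len(labels) == 1 and len(sets[0]) == 2
            and all(is_syntactic_object(t) for t in sets[0]))

def merge(label, L, K):
    return fs(label, fs(L, K))                     # {label, {L, K}}

def collapse(label, L, K):
    """Spell-out: {label, {L, K}} becomes {label, {{L}, {L, K}}}, i.e. {label, <L, K>}."""
    return fs(label, fs(fs(L), fs(L, K)))

print(is_syntactic_object(merge("a", "L", "K")))      # True: licit by (4b)
print(is_syntactic_object(collapse("a", "L", "K")))   # False: word-like after collapse
```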

Note that set-theoretic notions have been taken very seriously here; for example, such notions as linearity have been expressed without any coding tricks (angled brackets, as opposed to particular sets). In essence, the discussion has revealed that generally merged structures (those that go beyond the head-complement relation) are fundamentally nonlinear, to the point that linearizing them literally destroys their phrasal base. This conclusion lends some credibility to Chomsky’s conjectures that (a) Merge produces a completely basic and merely associative set-theoretic object, with no internal ordering, and (b) only if collapsed into a flat structure can this unordered object be interpreted at PF.

Though the current notation does the job, the appropriate results can be achieved regardless of the notation. Various positions can be taken, the most radical having been mentioned already. In the version that ships spelled-out phrase markers to performance, one must assume a procedure by which already processed (henceforth, “cashed out”) phrase markers find their way “back” to their interpretation site. Plausibly, this is the role agreement plays in the grammar.

It is interesting to note that, according to present assumptions, MSO applies to noncomplements (which are not part of CUs). Similarly, agreement does not manifest itself in complements, which makes it reasonable to suppose that what agreement does is “glue together” separate derivational cascades that are split at Spell-out, the way an address links two separate computers.

In either version of MSO, we have now deduced (1b), which stipulates that the elements dominated by γ in a CU precede whatever γ precedes. That γ should precede or be preceded by the other elements in its CU was shown in Section 1. The fact that the elements dominated by γ act as γ does within its CU is a direct consequence of the fact that γ has been spelled out separately from the CU it is attached to, in a different derivational cascade. The elements dominated by γ cannot interact with those that γ interacts with, in the “mother” CU. Thus, their place in the structure is as frozen under γ’s dominance as would be the place of the members of a compound, the syllables of a word, or worse still, elements that have already “gone to performance.”5

I should point out one final, important assumption I am making. The situation we have been considering can be schematized as in (7). But what prevents a projection like the one in (8)?

(7) [XP [YP [Y′ Y …]] [X′ X …]]   (YP is spelled out and then merged; X projects)

(8) [YP [Y′ Y …] [X′ X …]]   (the spelled-out Y′ itself projects)

In (8), it is the spelled-out category Y′ that projects a YP. This results in forcing the linearization of X’s projection prior to that of Y’s, contrary to fact.

The problem is somewhat familiar. In systems with a single Spell-out, both Kayne and Chomsky must resort to ad hoc solutions to avoid this sort of undesired result involving specifiers. Kayne eliminates the distinction between adjuncts and specifiers,6 and Chomsky defines command in a peculiar way: only for heads and maximal projections, although intermediate projections must be “taken into account” as well.7

Within the conservative implementation of MSO, (8) can be prevented if only lexical items project. Again, MSO is designed to collapse a phrase marker into a compound of sorts. Yet this “word” cannot be seen as an item that projects any further; it can merge with something else, but it can never be the item that supports further lexical dependencies. This might relate to some of Chomsky’s (2000) conjectures regarding a fundamental asymmetry indirectly involved in the labeling of the Merge function; in particular, it may be that Merge (like Move) implies a form of Attract, where certain properties of one of the merging items are met by the other. It is conceivable that properties relevant to Attract are “active” only in lexical items within the lexical array, or numeration, that leads to a derivation, and not in words formed in the course of the derivation. This would include collapsed units of the sort discussed here, but it may extend as well to complex predicate formation, which is typically capped off after it takes place (there is no complex complex-predicate formation, and so on).8 At any rate, the price to pay for unequivocal attachment of spelled-out noncomplements is to have two (perhaps not unreasonable) notions of terminals: lexicon-born ones and derived ones.

Under the radically performative interpretation of MSO, there is a trivial reason why a spelled-out chunk of structure should not project: it is gone from the syntax. The price to pay for equivocal attachment of spelled-out noncomplements is, as noted earlier, the agreement of these elements with corresponding heads.

3 Some predictions for derivations

I have essentially shown how the base step of the LCA may follow from economy, and how the induction step may follow from a minimalist architecture that makes central use of MSO, thus yielding dynamically bifurcated access to interpretive components. Given the central position it accords CUs, this architecture makes certain predictions. In a nutshell, command is important because it is only within CUs that syntactic terms “communicate” with each other, in a derivational cascade.

To get a taste of this sort of prediction, consider Chomsky’s (1995c) notion of distance, which is sensitive to command. The reason for involving command in the characterization of distance is empirical and concerns superiority effects of the following sort:

(9) a. who t saw what
    b. *what did who see t
    c. which professor t saw which student
    d. which student did which professor see t

Chomsky’s account capitalizes on the simple fact that the competing wh-elements (who, what, which) stand in a command relation in (9a,b), but clearly not in (9c,d), as (10a) and (10b) show, respectively.9

(10) a. [C [who … [saw what]]]
     b. [C [[which professor] … [saw [which student]]]]


Thus, he defines distance in terms of the following proviso:

(11) Only if α commands β can α be closer to a higher γ than β is.

This is the case in (10a): the target C is closer to who than to what. Crucially, though, in (10b) the target C is as close to the which in which professor as it is to the which in which student; these positions being equidistant from C, both movements in (9c) and (9d) are allowed, as desired. Needless to say, this solution works. But why should this be? Why is command relevant?

In MSO terms, the explanation is direct. The two wh-elements in (10a) belong to the same derivational cascade, since they are assembled through Merge into the same CU. This is not true of the two wh-phrases in (10b); in particular, which professor and which student are assembled in different CUs and hence do not compete within the same derivational space (I return below to how the wh-features in each instance are even accessible). The fact that the phrases are equally close to the target C is thus expected, being architecturally true, and need not be stated in a definition of distance.

It might seem that this idea does not carry through to the radically performative interpretation of MSO; but in fact it does. Even if which professor in (10) is in some sense gone from the syntactic computation, the relevant (here, wh-) feature that is attracted to the periphery of the clause stays accessible, again for reasons that I return to shortly.

The general architectural reasoning that determines what information is and is not gone from the computation extends to classical restrictions on extraction domains, which must be complements.10 The contrast in (12) is extremely problematic for the Minimalist Program.

(12) a. […X […t…]]
        e.g. who did you see [a critic of t]
     b. [[…t…] X…]
        e.g. *who did [a critic of t] see you

The problem is that whatever licenses (12a) in terms of the Minimal Link Condition or last resort should also license (12b); so what is wrong with the latter? A minimalist should not simply translate the observation in (12) into a new principle; such a principle must again fall within the general desiderata of the program – and thus reduce to economy or bare output conditions. I know of no minimalist way of explaining the contrast in (12).11

But now consider the problem from the MSO perspective. A complement is very different from any other dependent of a head in that the elements a complement dominates are within the same CU as the “governing” head, whereas this is not true for the elements a noncomplement dominates. As a result, extraction from a complement can occur within the same derivational cascade, whereas extraction from a noncomplement cannot, given my assumptions. Basically, the following paradox arises. If a noncomplement is spelled out independently from its head, any extraction from the noncomplement will involve material from something that is not even a syntactic object (or, more radically, not even there); thus, it should be as hard as extracting part of a compound (or worse). On the other hand, if the noncomplement is not spelled out in order to allow extraction from it, then it will not be possible to collapse its elements, always assuming that the only procedure for linearization is the command–precedence correspondence that economy considerations sanction.

Of course, one might now wonder how such simple structures as (13) can ever be generated, with movement of a complex wh-phrase.

(13) [[which professor] [did you say [t left]]]

If, for the collapse of a complex noncomplement’s elements to be sanctioned, they must be spelled out before they merge with the rest of the phrase marker, how can movement of noncomplements exist? Should such elements not be pronounced where they are spelled out?

(14) *[did you say [[which professor] left]]

The answer to this puzzle relates to the pending question of wh-feature accessibility in spelled-out phrases. I address both matters in the following section.

4 General predictions for the interpretive components

The dynamically split model that MSO involves produces derivational cascades, each of which reaches the interpretive components in its own derivational life. If this model is correct, we should see some evidence of the relevant dynamics.

Let us start with PF matters. The first sort of prediction that comes to mind relates to work by Cinque (1993), which goes back to Chomsky’s (1972) observations on focus “projections.” Generally speaking, the focus that manifests itself on a (complement) “right branch” may project higher up in the phrase marker, whereas this is not the case for the focus that manifests itself on a (noncomplement) “left branch.” For instance, consider (15).

(15) a. Michaelangelo painted THOSE FRESCOES
     b. MICHAELANGELO painted those frescoes

(15a) can answer several questions: “What did Michaelangelo paint?”, “What did Michaelangelo do?”, and even “What happened?” In contrast, (15b) can only answer the question “Who painted those frescoes?” Why?

The architecture discussed here is very consistent with the asymmetry, regardless of the ultimate nature of focal “projection” or spreading (about which I will say nothing). The main contribution that MSO can make to the matter is evident: for this model, focus can only spread within a CU – that is, through a “right branch.” Spreading up a “left branch” would involve moving across two different CUs and hence would be an instance of a “cross-dimensional” communication between different elements.12

There are other phonological domains that conform to this picture, predicting that a pause or a parenthetical phrase will sound natural between subject and predicate, for instance, or between any phrase and its adjuncts.


(16) a. Natural: Michaelangelo . . . painted those frescoes
        Unnatural or emphatic: Michaelangelo painted . . . those frescoes

     b. Natural: Michaelangelo painted those frescoes . . . in Florence
        Unnatural or emphatic, or different in interpretation: Michaelangelo painted . . . those frescoes in Florence

The same results are obtained by replacing the dots in (16) with standard fillers like you know, I’m told, or I’ve heard (see Selkirk 1984).

There are interesting complications, too. For example, Kaisse (1985) and Nespor and Vogel (1986) suggest that functional items phonologically associate to the lexical head they govern. Consider the examples in (17), from Lebeaux (1996), where underlined items are phonologically phrased together.

(17) a. John may have seen Mary
     b. the picture of Mary

These sorts of paradigms are compatible with the MSO proposal, although something else must be responsible for the cliticization (note, within a given CU).

One can think of harder cases. A particularly difficult one from Galician is mentioned in Uriagereka (1988a).

(18) vimo-los pallasos chegar
     saw.we-the clowns arrive
     “We saw the clowns arrive.”

In this language, determiners surprisingly cliticize to previous, often thematically nonrelated heads; in (18), for instance, the determiner introducing the embedded subject attaches to the verb that takes as internal argument the reduced clause that this subject is part of. The sort of analysis I have given elsewhere (Uriagereka 1988a, 1996), whereby the determiner syntactically moves to the position shown in (18), is contrary to expectations, given my present account of the paradigm in (12). Otero (1996) gives reasons to believe that the determiner cliticization cannot be syntactic, but is instead a late morphophonological process; if my present analysis is correct, then Otero’s general conclusion must also be correct.

Otero’s suggestion is in fact compatible with the general MSO architecture, so long as prosodic phrasing is allowed to take place after Spell-out, naturally enough in terms of the edges of “adjacent” (or successively spelled out) cascades of structure (see Otero (1996: 316); Lasnik (1995); and more generally Bobaljik (1995) and references cited there). This is straightforward for the conservative version of MSO, but is possible as well in the radical version, so long as prosody is a unifying mechanism in performance (in the same league as agreement, in the sense above).

In fact, the radical version of MSO can rather naturally account for the variant of (18) given in (19).

(19) *vimo-los pallasos chegaren (but OK: vimos os pallasos chegaren)
     saw.we-the clowns arrive.they
     “We saw the clowns arrive.”


A minor change in the form of (18) – introducing agreement in the infinitival, the -en morpheme after chegar “arrive” – makes the determiner cliticization impossible. This can be explained if the cliticization is a form of agreement, in which case the subject of the embedded clause in (19) is forced to agree with two elements at once: the inflected infinitival and the matrix verb. If agreement is indeed an address, as is expected in the radical version of MSO, the kind of duplicity in (19) is unwanted; see the Agreement Criterion below.13

Needless to say, this bird’s-eye view of the problem does little justice to the complex issues involved in prosodic phrasing, not to mention liaison, phrasal stress, pausing, and other related topics. My only intention has been to point out what is perhaps already obvious: within the MSO system, “left branches” should be natural bifurcation points for PF processes, if the present architecture is correct. At the same time, if we find “communication” across the outputs of derivational cascades, the natural thing to do is attribute it to performative (at any rate, post-Spell-out) representations, plausibly under cascade adjacency.

Similar issues arise for the LF component, where immediate predictions can be made and rather interesting problems again arise. The general prediction should by now be rather obvious: CUs are natural domains for LF phenomena. This is true for a variety of processes (binding of different sorts, obviation, scopal interactions, negative polarity licensing); it is indeed much harder to find instances of LF processes that do not involve command than otherwise. (Though there are such instances, to which I return.) More importantly, we must observe that CUs are just a subcase of the situations where command emerges.

A problematic instance is patent in a now familiar structure, (20).

(20) [M … J … [N [L … H …] [K …]]]

Although J and H are not part of the same CU (the latter is part of a CU dominated by L), J commands H. Empirically, we want the relation in (20) to hold in cases of antecedence, where J tries to be H’s antecedent (every boy thinks that [his father] hates him). The question is, if J and H are in different “syntactic dimensions” – after L is spelled out – how can J ever relate to H?

The logic of the system forces an answer that is worth pursuing: there are aspects of the notion of antecedence that are irreducibly nonderivational. This might mean that antecedence is a semantic or pragmatic notion; either way, we are pushing it out of the architecture seen thus far – at least in part. The hedge is needed because we still want antecedence to be sensitive to command, even if it does not hold within the CUs that determine derivational cascades. As it turns out, the dynamically split system has a bearing on this as well.

Essentially, we want to be able to say that J in (20) can be the antecedent of H, but H cannot antecede anything within K. The radical and the conservative versions of the proposal deal with this matter differently, as follows.

For the conservative view, recall that although L in (20) is not a syntactic object after Spell-out, it does have internal structure, and its information is not lost. To see this in detail, suppose that before Spell-out, L had the internal structure of his father, that is, {his, {his, father}}. After Spell-out, the structure becomes {his, ⟨his, father⟩}, equivalent to {his, {{his}, {his, father}}}. By (6b), we can identify {his} and {his, father} as terms (and see Note 4). This is an important fact because, although the linearized object is not a syntactic object, it contains terms, which the system can identify if not operate with: they do not constitute a licit structure.14 The point is, if the relation of antecedence is based on the identification of a term like {his}, even the linearized structure does the job, inaccessible as it is to any further syntactic operation. (This highlights, I believe, the fact that accessibility is not the same as interpretability.)
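
Again purely by way of illustration – the encoding is the same toy one used earlier and forms no part of the proposal – one can verify mechanically that {his} and {his, father} remain identifiable as terms, in the sense of (6), inside the collapsed object:

```python
# Terms, in the sense of (6), of the collapsed {his, {{his}, {his, father}}};
# same frozenset encoding as in the earlier sketch (an illustrative assumption).

def fs(*xs):
    return frozenset(xs)

def terms(phrase_marker):
    """(6a): the phrase marker itself; (6b): members of members of a term."""
    result = {phrase_marker}
    if isinstance(phrase_marker, frozenset):
        for member in phrase_marker:
            if isinstance(member, frozenset):
                for sub in member:
                    result |= terms(sub)
    return result

collapsed = fs("his", fs(fs("his"), fs("his", "father")))
print(fs("his") in terms(collapsed))             # True: {his} is a term
print(fs("his", "father") in terms(collapsed))   # True: {his, father} is a term
```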

But consider the converse situation, where in spite of H’s being a term in (21), it cannot be the antecedent of L or K.15

(21) [A configuration parallel to (20) in which H, though a term inside the spelled-out noncomplement, lies below the visible top of that structure, with J and K in the still-active part of the phrase marker.]

This suggests that α’s antecedent be characterized as in (22).

(22) Where α is a term in a derivational cascade D, a term β is α’s antecedent only if β has accessed interpretation in D.

This determination of α’s antecedent is derivational, unlike the notion term in (6), which is neutral with respect to whether it is characterized derivationally or not. It is worth noting that (22) is suggested as part of a definition, not of antecedence, but of antecedent of α.16 The formal aspects of the notion in (22) are compatible with the system presented thus far, but its substantive character – that only a term that accesses LF in D can be an antecedent of the terms in D – does not follow from the architecture, at least in its conservative shape.

Under the radical version of the MSO architecture, the internal structure of the spelled-out phrase is not relevant, since in this instance the phrasal architecture of the syntactic object need not be destroyed (what guarantees inaccessibility is the fact that the phrase has been sent to performance). In turn, this version has an intriguing way of justifying (22).


As noted earlier, a problem for the performative approach is how to associate cashed-out structures to the positions where they make the intended sense. Antecedence as a process may be intricately related to this association problem. Simply put, the system ships structure X to the interpretive components; later, it comes up with structure Y, within which X must meaningfully find its place. This presupposes an addressing technique, so that X “knows” where in Y it belongs; by hypothesis, agreement is the relevant technique. It is natural, then, that X as a whole should seek a place within a part of Y. Now consider (22), and let α be a term within an active structure Y, and β an already cashed-out term (either X itself or part of it). Why should β be the antecedent of α only if β accesses interpretation in Y’s derivational cascade?

To answer this question, observe, first of all, that the performative version of MSO makes slightly more sense if the system works “top-down” than if it works “bottom-up.” Chomsky (2000) discusses this type of system and correctly points out that it is perfectly reasonable within present assumptions; Drury (1998) develops one such alternative. The only point that is relevant here is whether a series of noncomplements are sent to performance starting from the root of the phrase marker or from its foot. The logic of the system always forces a given noncomplement to access performance prior to the CU it associates with. Suppose this “top-down” behavior is generalized, so that the first noncomplement after the root of the phrase marker is shipped to performance first, the second noncomplement next, and so on, until finally the remaining structure is cashed out.

With regard to antecedence, then, (22) amounts to this: for β to be α's antecedent, β must have been sent to performance before α – in fact, in a derivational cascade that is "live" through the address mechanism of agreement. In other words, antecedence presupposes agreement, which is very consistent with the well-known diachronic fact that agreement systems are grammaticalizations of antecedence/pronoun relations (see Barlow and Fergusson 1988).

The intuition is that agreement is merely a pointer between two phrase markers, one that is gone from the system, and one that is still active in syntactic terms. Material within the cashed-out phrase marker is "out of sight"; the system only sees the unit as a whole for conceptual-intentional reasons (as the label that hooks up the agreement mechanism), and perhaps the phonological edges (under adjacency among cascades) for articulatory-perceptual reasons. Consequently, just as prosodic adjustments can take place only in the visible edges of the cashed-out material, so antecedence can be established only via the visible top of the phrase that establishes agreement with the syntactically active phrase (cf. (21)).

The fact that the variable bound by the antecedent can be inside a cashed-out noncomplement (as in (20)) is perfectly reasonable if this variable serves no syntactic purpose vis-à-vis the antecedent. Differently put, whereas the syntactically active part of the structure needs to know where the antecedent is, it does not need to know precisely where the variable is, so long as it is interpretable within the cashed-out structure. This makes sense. The antecedent is a unique element that determines the referential or quantificational properties of an expression; in contrast, the variable (a) is not unique (many semantic variables can be associated with a given antecedent) and (b) serves no purpose beyond its own direct association to some particular predicate – it determines nothing for other parts of the structure.

In sum, if the radical MSO view is correct, antecedence is a semantic process that is paratactically instantiated through the transderivational phenomenon of agreement: the antecedent must syntactically agree (abstractly, of course; the agreement may or may not be realized morphologically) with the structure that contains its associated variable. There is no structural restriction on the variable,17 although semantically it will be a successful variable only if it happens to match up with the agreement features of its antecedent.

5 How noncomplements can move

Consider next a question we left pending: why (13), repeated here, is perfect.

(23) [[which professor] [did you say [t left]]]

Let us first evaluate this example from the perspective of the radical version of MSO. Strictly speaking, the phrase which professor is never directly connected to the structure above it – not even to the predicate left. Rather, it agrees with the relevant connecting points, which are presumably occupied by some categorial placeholder [D] (much in the spirit of ideas that go back to Lebeaux 1988). It is [D] that receives a θ-role, moves to a Case-checking position, and eventually ends up in the wh-site – which must mean that [D] hosts thematic, Case, and wh-information, at least. It is thus not that surprising, from this perspective, that the wh-feature should be accessible to the system even after the spelling out of which professor (wherever it takes place), since what stays accessible is not an element within which professor, but an element the [D] category carries all along, which eventually matches up with the appropriate features of which professor, as in (24).

(24) [which professor]i … [[D]i [you say [[D]i left]]]

An immediate question is why the "minitext" in (24) is not pronounced as follows:

(25) [[D]i [you say [[D]i left]]] … [which professor]i

Reasonably, though, this relates again to the phenomenon of antecedence, and in particular the familiarity/novelty condition; in speech, information that sets up a discourse comes before old or anaphoric information (see Hoffman 1996 for essentially this idea).

A second question relates to the Condition on Extraction Domain (CED) effect account. Why can (12b) not now be salvaged as in (26a)? Strictly, (26a) cannot be linearized, since the subject of see you is too complex. But suppose we proceed in two steps, as in (26b).


(26) a. [who]i … [[a critic of [D]i] see you]
     b. [who]j … [a critic of [D]j]i … [[D]i see you]

There is a minimal, yet important, difference between (26b) and (20), where we do want J to relate to H as its antecedent. Whereas there is a relation of grammar that tries to connect who and [D] (the equivalent of its trace) in (26b), no relation of grammar connects an antecedent J to a bound variable H in (20). In other words, we want the long-distance relation in (26b) to be akin to movement, but clearly not in (20). But how is long-distance movement captured in the radical version of MSO, if we allow associations like the one in (24), where which professor has never been inside the skeletal phrase?

The key is the [D] element, which moves to the relevant sites and associates via agreement to whatever phrase has been cashed out. In (26b), the [D] inside a critic of must associate to who (alternatively, if of who is a complement of critic, this element moves out directly – but see Note 14). Now, how does who associate to the matrix C position? If who associates inside a critic of, then it does not associate in the matrix C; conversely, if who associates in the matrix C, as a question operator, then it cannot associate inside a critic of. The only way an element like who can grammatically relate to two or more positions at once – that is, to θ-, Case, or wh-positions – is if all these positions are syntactically connected, in which case it is the [D] element that moves through them and eventually associates to who. This, of course, is what happens in the perfect (12a), repeated here.

(27) [who]i … [[D]i [you see [a critic of [D]i]]]

Here again, agreement uniqueness is at play, as I speculatively suggested regarding the ungrammaticality of (19). This important point can be stated explicitly as follows:

(28) Agreement Criterion
     A phrase α that determines agreement in a phrase β cannot at the same time determine agreement in a phrase γ.

This criterion is tantamount to saying that agreement is a rigidly unique address. It may well be that (28) follows from deeper information-theoretic matters, but I will not pursue that possibility here.18

The conservative version of MSO can also account successfully for (23), although (naturally) with assumptions that do not bear on agreement considerations and instead introduce very different operational mechanics. As before, the issue is to somehow have access to which professor, even though this phrase must also be sent to Spell-out if it attaches as a noncomplement. This statement looks contradictory; but whether it is or not depends on exactly how the details of movement are assumed to work.

Consider whether the two steps involved in movement – copying some material and then merging it – must immediately feed one another, within a given derivation. Suppose we assume that move is a collection of operations, as several researchers have recently argued (see, e.g. Kitahara 1994; Nunes 1995). Thus, movement of a complex phrase marker may proceed in several steps – for example, as in (29).

(29) a. Copy one of two independently merged phrases

b. Spell out the lower copy as trace

c. Merge the trace

d. Merge the higher copy (possibly in a separate derivation)

The key is the "in parallel" strategy implicit in (29a,b); the rest of the steps are straightforward. So let us see whether those initial steps can be justified.

Technically, what takes place in (29a,b) is, at least at first sight, the same as what takes place in the formation of a phrase marker as in (30).

(30) a. Numeration: {the, a, man, saw, woman, …}
     b. [the man] and [saw [a woman]], assembled in separate derivational spaces:
        {the, {the, man}}     {saw, {saw, {a, {a, woman}}}}

Prior to merging [the man] and [saw [a woman]], the system must assemble them in separate, completely parallel derivational spaces; there is no way of avoiding this, assuming Merge and standard phrasal properties of DPs and VPs. (29a) capitalizes on this possibility; instead of copying lexical items from the numeration, as in (30), in (29a) the system copies the items from the assembled phrase marker.

In turn, (29b) employs the option of deleting phonetic material, thus making it unavailable for PF interpretation. It is reasonable to ask why this step is involved, but the question is no different from that posed by Nunes (1999), concerning why the copy of K in (31) is not pronounced when K is moved.

(31) a. [K … […K…] …]
     b. [K … […[Ø]…] …]

Why is who did you see not pronounced who did you see who? After all, if movement is copying plus deletion, why is deletion necessary, particularly at PF?

Nunes's answer capitalizes on the LCA, by assuming that identical copies are indeed identical. Hence, Kayne's linearization question has no solution; for instance, in the above example does who command, or is it commanded by, you? It depends on which who we are talking about. One is tempted to treat each of these as a token of a lexical type, but they are not; each who (other than the lexically inserted occurrence) emerges as a result of mere derivational dynamics. Then there is no solution unless, Nunes reasons, the system deletes one of the copies on its way to PF (the place where linearization is required in Chomsky's system). If only one copy of who is left, the linearization answer is trivial: in standard terms, the remaining copy precedes whatever it commands.19

(29b) has the same justification as Nunes's copy deletion. Note that if the system does not spell out the lower copy of L as a trace, when it reaches the stage represented in (29d), it will not be able to determine whether L commands or is commanded by all other elements in the phrase marker, and thus this object will not collapse into a valid PF realization.

In effect, then, there is a way to keep something like which professor accessible even if it starts its derivational life as a subject, by making a copy of it in advance and having that copy be the one that merges in the ultimate Spell-out site, the other(s) being spelled out as trace(s). A question remains, however: why can this procedure not provide a gambit for escaping the ungrammaticality of CED effects? For example, what prevents the following grammatical derivation of (26)?

1 Assemble see you and a critic of who.
2 Copy who in parallel.
3 Realize the first copy of who as a trace.
4 Merge a critic of t to see you and all the way up to the C projection.
5 Attach the stored copy of who as the specifier of C.

A way to prevent this unwanted derivation capitalizes on the desire to limit the globality of computational operations, as argued for in Chomsky (2000) and references cited there (see also Chapter 4). Step 2 in the derivation is clearly very global: at the point of merging who, the system must know that this element will be attracted further up in the phrase marker – in a completely different derivational cascade. Crucially, the system cannot wait until the matrix C appears in order to make a copy (in a parallel derivational space) of who, thereby making the already attached copy of who silent (i.e. a trace); in particular, it cannot simply go back to the site of who and add the instruction to delete after it has abandoned the "cycle" a critic of who, since that operation would be countercyclic. It is literally when the "lower" who attaches that the system must know to take it as a trace (cf. (29)), which entails that the system must have access to the C that attracts who, even when it has not yet left the numeration.

Let us again consider all the relevant examples side by side (CUs are boxed, trace copies are parenthesized).

(32) a. [which professor]  [C you see a critic of (which professor)]
     b. [which professor]  [C you say (which professor) left]
     c. *[which professor]  [C a critic of (which professor) see you]
     (the boxes marking CUs in the original are rendered here as paired brackets)

(32a) is straightforward. Before the movement of which professor, the sentence involves a single CU, within which C trivially attracts the necessary wh-feature; after which professor pied-pipes along with this wh-feature, a new CU emerges, which is of no particular interest. (32b) and (32c) are more complicated, since they involve two CUs prior to wh-movement. The issue is how they differ.

We saw earlier that the derivation in (32c) cannot proceed cyclically if it is allowed to go all the way up to the CP level, then to return to the lower which professor and delete it. Rather, at the point when which professor attaches, the system must know that it is being (overtly) attracted by C and hence must copy it in parallel and attach the initial copy as a trace. The same is true of which professor in (32b), but there C and (the entire phrase) which professor are at least part of the same CU, whereas in (32c) (the entire phrase) which professor is part of the CU of a critic of which professor, and C is not. This must be the key: as expected, only elements within the same CU can relate.

But then, is (33) still not a problem?

(33) a critic of which professor saw you

(34) [a critic of which professor]  [C (a critic of which professor) see you]

The difference between (32c) and (33)/(34) is this. In the former, the system must decide to copy which professor as a trace while in the CU of a critic of which professor. In the latter, it is not which professor but a critic of which professor that is copied as a trace; hence, the system can reach the copying decision while in the CU where C is merged – that is, locally. At the same time, deletion of a critic of which professor is a cyclic process if we define the "cycle" within the confines of a CU that has not been abandoned.

To make matters explicit, I state the following principle:

(35) Principle of Strict Cyclicity
     All syntactic operations take place within the derivational cycles of CUs.

In other words, the cascades of derivational activity that we have seen all along are responsible for limiting the class of activities the system engages in, in purely operational terms. Cross-cascade relations of any sort – Attract, Move, backtracking for deletion purposes, or presumably any others – are strictly forbidden by (35); a derivation that violates (35) is immediately canceled.
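As a schematic illustration of (35) – my own sketch, not the chapter's formalism, with CUs modeled crudely as sets of positions – an operation is licit only if everything it relates sits in the CU currently being computed, and reaching back into an abandoned cascade cancels the derivation:

def licit(positions, active_cu, abandoned_cus):
    # an operation may only relate positions within the active CU
    if any(p in cu for cu in abandoned_cus for p in positions):
        return False                      # cross-cascade relation: derivation canceled
    return all(p in active_cu for p in positions)

# (32c): C tries to relate to 'which professor', buried in an abandoned cascade
abandoned = [{"which professor"}]
active = {"C", "a critic of which professor", "see", "you"}
print(licit({"C", "which professor"}, active, abandoned))              # False
# (33)/(34): the whole subject is copied while still within the active CU
print(licit({"C", "a critic of which professor"}, active, abandoned))  # True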

6 Beyond derivations

In both versions of MSO, CUs are crucial. This is explicitly encoded in (35), for the conservative view, and is trivially true in the radical view, where only CUs exist in competence grammar – unification of CUs being left for performance. If the present model is anywhere near right, this fact can be used as a wedge to separate various sorts of phenomena; essentially, cyclic ones are syntactic, whereas noncyclic ones are paratactic, or perhaps not syntactic at all. We looked at two of the former: cliticization across CUs in the PF component, and the establishment of antecedent relations in the LF component. Even though these phenomena were suggested not to be strictly derivational, the derivational results were taken to importantly limit the class of possible relations involved in each instance – adjacency of cascades for PF, "top" of CUs for LF – as if syntax carved the path interpretation must blindly follow. But are there situations in which syntax leaves no significant imprint on representational shapes?

Presumably that would happen, within present assumptions, whenever a systematic phenomenon simply does not care about command, or even exhibits anticommand behavior. Weak crossover may well be one such instance. I want to suggest the possibility of analyzing a typical weak crossover effect, as in (36b), as a violation of the condition on novelty/familiarity, which I take to be pragmatic.

(36) a. His friend knocked on the door. A man came in.
     b. his friend killed a man

The familiar his cannot antecede the novel a man in (36b) any more than it can in (36a). This is so, of course, only if we adopt the null hypothesis that the novelty or familiarity of a given file is assumed not just across separate sentences (36a), but also intrasententially (36b). (This must be the case, virtually by definition, for the radical version of MSO, for which each separate CU is a text.)


In order to extend this sort of analysis to the examples in (37), we must postulate that the operators which trigger weak crossover involve an existence predicate of the sort postulated by Klima and Kuroda in the 1960s (see Chomsky 1964), which gives them a characteristic indefinite or existential character.20

(37) a. his friend killed everyone
     b. who did his friend kill

That is, the logical form of everyone must be as coded in its morphology: every x, one (x). Something similar must be said about who; this would be consistent with the morphological shape of such elements in East Asian languages (see, e.g. Kim 1991; Watanabe 1992; and references cited there). Then the existence element will induce a novelty effect with regard to the familiar pronoun, as desired.

The point I am trying to establish is simple. Postsyntactic machinery may be needed to account for some familiar phenomena. The fact that they involve LF representations in the Government-Binding model does not necessarily force us to treat them as LF phenomena in the present system – so long as we treat them somehow (see Chapter 8). I strongly suspect that Condition C of the Binding Theory is another such phenomenon – as are, more generally, matters pertaining to long-distance processes that are extremely difficult to capture in purely syntactic terms (e.g. some kinds of anaphora, unbounded ellipsis under parallelism).

7 Conclusions

The system I have programmatically sketched in this chapter is much more dynamically derivational than its alternative in Chomsky (1995c) (although it approximates the one in Chomsky (2000), to the point of being conceptually indistinguishable). That the system is derivational, and that it is dynamically (or cyclically) so, are both interesting ideas in their own right, with a variety of consequences for locality and the class of representations the architecture allows. Curiously, one consequence (best illustrated in Weinberg 1999) is that the gap between competence and performance is partly bridged, radically so in one version of the program. This has a repercussion for competence: it provides a rationale for the existence of agreement.


4

CYCLICITY AND EXTRACTION DOMAINS†

with Jairo Nunes

1 Introduction

If something distinguishes the Minimalist Program of Chomsky (1995b, 2000) from other models within the principles-and-parameters framework, it is the assumption that the language faculty is an optimal solution to legibility conditions imposed by external systems. Under this perspective, a main desideratum of the program is to derive substantive principles from interface ("bare output") conditions, and formal principles from economy conditions. It is thus natural that part of the minimalist agenda is devoted to reevaluating the theoretical apparatus developed within the principles-and-parameters framework, with the goal of explaining on more solid conceptual grounds the wealth of empirical material uncovered in past decades. This chapter takes some steps toward this goal by deriving Condition-on-Extraction-Domains (CED) effects (in the sense of Huang 1982) in consonance with these general minimalist guidelines.

Within the principles-and-parameters framework, the CED is generally assumed to be a government-based locality condition that restricts movement operations (see Huang 1982 and Chomsky 1986a, for instance). But once the notion of government is abandoned in the Minimalist Program, as it involves nonlocal relations (see Chomsky 1995b: Chapter 3), the data that were accounted for in terms of the CED call for a more principled analysis.

Some of the relevant data regarding the CED are illustrated in examples (1)–(3). Example (1) shows that regular extraction out of a subject or an adjunct yields unacceptable results; (2) shows that parasitic gap constructions structurally analogous to (1) are much more acceptable; finally, (3) shows that if the licit parasitic gaps of (2) are further embedded within a CED island such as an adjunct clause, unacceptable results arise again (see Kayne 1984; Contreras 1984; Chomsky 1986a).

(1) a. *[CP [which politician]i [C′ did+Q [IP [pictures of ti] upset the voters]]]

    b. *[CP [which paper]i [C′ did+Q [IP you read Don Quixote [PP before filing ti]]]]

(2) a. [CP [which politician]i [C′ did+Q [IP [pictures of pgi] upset ti]]]
    b. [CP [which paper]i [C′ did+Q [IP you read ti [PP before filing pgi]]]]


(3) a. *[CP [which politician]i [C′ did+Q [IP you criticize ti [PP before [pictures of pgi] upset the voters]]]]

    b. *[CP [which book]i [C′ did+Q [IP you finally read ti [PP after leaving the bookstore [PP without finding pgi]]]]]

Thus far, the major locality condition explored in the Minimalist Program is the Minimal Link Condition stated in (4) (see Chomsky 1995b: 311).

(4) Minimal Link Condition
    K attracts α only if there is no β, β closer to K than α, such that K attracts β.

The unacceptability of (5a), for instance, is taken to follow from a Minimal Link Condition violation: at the derivational step represented in (5b), the interrogative complementizer Q should have attracted the closest wh-element who, instead of attracting the more distant what.

(5) a. *[I wonder [CP whati [C′ Q [IP who [VP bought ti]]]]]
    b. [CP Q [IP who [VP bought what]]]
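The effect of (4) on (5) can be pictured with a minimal Python sketch (my own illustration; "closer to K" is modeled crudely as depth of embedding below the attracting head, which is an assumption of the sketch rather than the chapter's definition):

def attract(candidates):
    # candidates: (wh_element, depth) pairs; K may only attract the closest one
    return min(candidates, key=lambda c: c[1])[0] if candidates else None

# (5b): [CP Q [IP who [VP bought what]]] -- 'who' is closer to Q than 'what'
print(attract([("who", 1), ("what", 2)]))   # 'who'; attracting 'what' violates (4)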

The Minimal Link Condition is in consonance with the general economy considerations underlying minimalism, in that it reduces the search space for computations, thereby reducing ("operative") computational complexity. However, it has nothing to say about CED effects such as the ones illustrated in (1)–(3). In (1a), for instance, there is no wh-element other than which politician that Q could have attracted.

In this chapter we argue, first, that CED effects arise when a syntactic object that is required at a given derivational step has become inaccessible to the computational system at a previous derivational stage; and second, that the contrasts between (1) and (2), on the one hand, and between (2) and (3), on the other, are due to their different derivational histories. These results arise as by-products of two independent lines of research on the role of Kayne's (1994) Linear Correspondence Axiom (LCA) in the minimalist framework: the Multiple Spell-Out system of Chapter 3, which derives the induction step of the LCA by eliminating the unmotivated stipulation that Spell-out must apply only once, and Nunes's (1995, 1998) version of the copy theory of movement, which permits instances of sideward movement (i.e. movement between two unconnected syntactic objects) if the LCA is satisfied.

The chapter is organized as follows. In Section 2, we show how the standard CED effects illustrated in (1) can be accounted for within the Multiple Spell-Out theory proposed in the previous chapter. In Section 3, we show that sideward movement allows constrained instances of movement from CED islands, resulting in parasitic gap constructions such as (2). In Section 4, we provide an account of the unacceptability of constructions such as (3) by reducing the computational complexity associated with sideward movement in terms of Chomsky's (2000) cyclic access to subarrays. Finally, a brief conclusion is presented in Section 5.


2 Basic CED effects

Any account of the CED has to make a principled distinction between complements and noncomplements (see Cattell 1976 for early, very useful discussion). Kayne's (1994) LCA has the desired effect: a given head can be directly linearized with respect to the lexical items within its complement, but not with respect to the lexical items within its subject or adjunct. The reason is trivial. Consider the phrase-marker in (6), for instance (irrelevant details omitted).

(6) [VP [DP the man] [V′ [V′ remained [AP proud of her]] [PP after that fact]]]
    (remained, proud, of, her appear in boldface in the original)

It is a simple fact about the Merge operation that only the terminal elements in boldface in (6) can be assembled without ever abandoning a single derivational workspace; by contrast, the terminal elements under DP and PP must first be assembled in a separate derivational space before being connected to the rest.

One can capitalize on this derivational fact in various ways. Let us recast Kayne's (1994) LCA in terms of Chomsky's (1995b) bare phrase-structure and simplify its definition by eliminating the recursive step, as formulated in (7).1

(7) Linear Correspondence Axiom
    A lexical item α precedes a lexical item β iff α asymmetrically c-commands β.

Clearly, all the terminals in boldface in (6) stand in valid precedence relations, according to (7). The question is how they can establish precedence relations with the terminals within DP and PP, if the LCA is as simple as (7).

Chapter 3 suggests an answer, by taking the number of applications of the rule of Spell-out to be determined by standard economy considerations, and not by the unmotivated stipulation that Spell-out must apply only once. Here we will focus our attention on cases where multiple applications of Spell-out are triggered by linearization considerations (see Chapter 5 for other cases and further discussion). The reasoning goes as follows. Let us refer to the operation that maps a phrase structure into a linear order of terminals in accordance with the LCA in (7) as Linearize.2 Under the standard assumption that phrasal syntactic objects are not legitimate objects at the PF level, Linearize can be viewed as an operation imposed on the phonological component by legibility requirements of the Articulatory-Perceptual interface, as essentially argued by Higginbotham (1983b). If this is so and if the LCA is as simple as (7), the computational system should not ship complex structures such as (6) to the phonological component by means of the Spell-out operation, because Linearize would not be able to determine precedence relations among all the lexical items. Assuming that failure to yield a total order among lexical items leads to an ill-formed derivation, the system is forced to employ multiple applications of Spell-out, targeting chunks of structure that Linearize can operate with.

Under this view, the elements in subject and adjunct position in (6) can be linearized with regard to the rest of the structure in accordance with (7) in the following way: (i) the DP and the PP are spelled out separately and, in the phonological component, their lexical items are linearized internal to them; and (ii) the DP and the PP are later "plugged in" where they belong in the whole structure. We assume that the label of a given structure provides the "address" for the appropriate plugging in, in both the phonological and the interpretive components.3 That is, applied to the syntactic object K = {γ, {α, β}}, with label γ and constituents α and β (see Chomsky 1995b: Chapter 4), Spell-out ships {α, β} to the phonological and interpretive components, leaving K only with its label. Since the label encodes the relevant pieces of information that allow a category to undergo syntactic operations, K itself is still accessible to the computational system, despite the fact that its constituent parts are, in a sense, gone; thus, for instance, K can move and is visible to linearization when the whole structure is spelled out. Another way to put it is to say that once the constituent parts of K are gone, the computational system treats it as a lexical item. In order to facilitate keeping track of the computations in the following discussion, we use the notation K′ = [γ ⟨α, β⟩] to represent K after it has been spelled out.
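The "collapse to label, plug in later" idea can be sketched in Python (my own toy illustration, not the chapter's formalism; the linearization routine here is a naive left-to-right flattening standing in for the LCA-driven Linearize, and the class and function names are mine):

class SO:
    # a syntactic object K = {label, {left, right}}, bare-phrase-structure style
    def __init__(self, label, left=None, right=None):
        self.label, self.left, self.right = label, left, right
        self.stored = None                       # linearized material, once shipped

    def is_atom(self):
        return self.left is None or self.stored is not None

def merge(a, b, label):
    return SO(label, a, b)

def linearize(k):
    if k.is_atom():
        return [k.label]
    return linearize(k.left) + linearize(k.right)

def spell_out(k):
    # ship k's contents to the phonological component; keep only the label
    k.stored = linearize(k)
    k.left = k.right = None
    return k

# K = [upset [the voters]], L = [pictures [of [which politician]]]
K = merge(SO("upset"), merge(SO("the"), SO("voters"), "the"), "upset")
L = merge(SO("pictures"),
          merge(SO("of"), merge(SO("which"), SO("politician"), "which"), "of"),
          "pictures")
spell_out(L)                 # L's constituents are gone from the syntax proper
vP = merge(L, K, "v")        # but L, reduced to its label, can still be merged
print(linearize(vP))         # ['pictures', 'upset', 'the', 'voters']
print(L.stored)              # ['pictures', 'of', 'which', 'politician'], to be
                             # plugged back in at PF via the label's "address"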

An interesting consequence of this proposal is that Multiple Spell-Out of separate derivational cascades derives Cattell's (1976) original observation that only complements are transparent to movement. When Spell-out applies to the subject DP in (6), for instance, the computational system no longer has access to its constituents and, therefore, no element can be extracted out of it. Let us consider a concrete case, by examining the relevant details of the derivation of (8), after the stage where the structures K and L in (9) have been assembled by successive applications of Merge.

(8) *Which politician did pictures of upset the voters?

(9) a. K = [vP upset the voters]
    b. L = [pictures of which politician]

If the LCA is as simple as in (7), the complex syntactic object resulting from the merger of K and L in (9) would not be linearizable, because the constituents of K would not enter into a c-command relation with the constituents of L. The computational system then applies Spell-out to L, allowing its constituents to be linearized in the phonological component, and merges the spelled-out structure L′ with K, as illustrated in (10).4


(10) [vP [pictures ⟨pictures, of, which, politician⟩] [v′ v [VP upset the voters]]]

Further computations involve the merger of did and movement of L′ to [Spec, TP]. Assuming Chomsky's (1995b: Chapter 3) copy theory of movement, this amounts to saying that the computational system copies L′ and merges it with the assembled structure, yielding the structure in (11) (the deletion of the lower copy in the phonological component is discussed in Section 3).

(11) [TP [pictures ⟨pictures, of, which, politician⟩] [T′ did [vP [pictures ⟨pictures, of, which, politician⟩] [v′ upset the voters]]]]

In the next steps, the interrogative complementizer Q merges with TP and did adjoins to it, yielding (12).

(12) [CP did+Q [TP [pictures ⟨pictures, of, which, politician⟩] [T′ did [vP [pictures ⟨pictures, of, which, politician⟩] [v′ upset the voters]]]]]

In (12), there is no element that can check the strong wh-feature of Q. Crucially, the wh-element of either copy of L′ = [pictures ⟨pictures, of, which, politician⟩] became unavailable to the computational system after L was spelled out. The derivation therefore crashes. Under this view, there is no way for the computational system to yield the sentence in (8) if derivations unfold in a strictly cyclic fashion, as we are assuming. To put it in more general terms, extraction out of a subject is prohibited because, at the relevant derivational point, there is literally no syntactic object within the subject that could be copied.
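The accessibility point just made can be pictured in a small sketch (my own illustration, with freely invented class and attribute names and a flat, non-binary representation for brevity): once the noncomplement has been spelled out, only the unit itself remains visible, so a wh-term buried inside it can no longer be found by the attracting head.

class SO:
    def __init__(self, label, daughters=(), wh=False):
        self.label, self.daughters, self.wh = label, list(daughters), wh
        self.spelled_out = False
    def __repr__(self):
        return self.label

def accessible_terms(k):
    # terms the computational system can still operate with
    out = [k]
    if not k.spelled_out:           # a spelled-out phrase contributes only itself
        for d in k.daughters:
            out += accessible_terms(d)
    return out

def attract_wh(root):
    return next((t for t in accessible_terms(root) if t.wh), None)

wh_phrase = SO("which politician", [SO("which", wh=True), SO("politician")], wh=True)
subject = SO("pictures of which politician", [SO("pictures"), SO("of"), wh_phrase])
subject.spelled_out = True          # shipped before merging, as in (10)
clause = SO("C", [subject, SO("upset"), SO("the voters")])
print(attract_wh(clause))           # None: no accessible wh-term, so (8) crashes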

Similar considerations apply to the sentence in (13), which illustrates the impossibility of "extraction" out of an adjunct clause.

(13) *Which paper did you read Don Quixote before filing?

Assume for concreteness that the temporal adjunct clause of (13) is adjoined to vP. Once K and L in (14) have been assembled, Spell-out must apply to L, before K and L merge; otherwise, the lexical items of K could not be linearized with respect to the lexical items of L. After L is spelled out as L′, it merges with K, yielding (15). In the phonological component, Linearize applies to the lexical items of L′ and the resulting sequence will be later plugged in the appropriate place, after the whole structure is spelled out. The linear order between the lexical items of L and the lexical items of K will then be (indirectly) determined by whatever fixes the order of adjuncts in the grammar.5

(14) a. K = [vP you read Don Quixote]
     b. L = [PP before PRO filing which paper]


(15) [vP [vP you read Don Quixote] [before ⟨before, PRO, filing, which, paper⟩]]

What is relevant for our current discussion is that after the (simplified) structure in (16) is formed, there is no wh-element available to check the strong wh-feature of Q and the derivation crashes; in particular, which paper is no longer accessible to the computational system at the step where it should be copied to check the strong feature of Q. As before, the sentence in (13) is underivable through the cyclic derivation outlined in (14)–(16).

(16) [CP did+Q [TP you [vP [vP read Don Quixote] [before ⟨before, PRO, filing, which, paper⟩]]]]

Finally, let us consider (17a). Structures like (17a) have recently been taken to show that cyclicity cannot be violated. If movement of who to [Spec, CP] were allowed to proceed prior to the movement of α to the subject position, (17a) should pattern like (17b), where who is extracted from within the object, contrary to fact. If cyclicity is inviolable, so the argument goes, who in (17a) must have moved from within the subject, yielding a CED effect (see Chomsky 1995b: 328; Kitahara 1997: 33).

(17) a. *whoi was [α a picture of ti]k taken tk by Bill
     b. whoi did Bill take [α a picture of ti]

A closer examination of this reasoning, however, reveals that it only goes through in a system that takes traces to be grammatical primitives. If the trace of α in (17a) is simply a copy of α, as shown in (18), the copy of who inside the object should in principle be able to move to [Spec, CP], incorrectly yielding an acceptable result. Crucially, the copy of who within the subject does not c-command the copy within the object and no intervention effect should arise.

(18) [CP Q [TP [α a picture of who] was taken [α a picture of who] by Bill]]

Before we discuss how the system we have been exploring, which assumes the copy theory of movement, is able to account for the unacceptability of (17a), let us first consider the derivation of (19), where no wh-movement is involved.

(19) Some pictures of John were taken by Bill.

In (20), the computational system makes a copy of some pictures of John, spells it out and merges the spelled-out copy with K, forming the object in (21).

(20) a. K = [TP were [VP taken [some pictures of John] by Bill]]
     b. L = [some ⟨some, pictures, of, John⟩]

(21) [TP [some ⟨some, pictures, of, John⟩] [T′ were [VP taken [some pictures of John] by Bill]]]


Under reasonable assumptions regarding chain uniformity, the elements in subject and object positions in (21) cannot constitute a chain because they are simply different kinds of syntactic objects (a label and a phrasal syntactic object). Assume for the moment that lack of chain formation in (21) leads to a derivational crash (see next section for further discussion). Given the perfect acceptability of (19), an alternative route must be available.

Recall that under the Multiple Spell-Out approach, the number of applications of Spell-out is determined by economy. Thus, complements in general do not need to be spelled out in separate derivational cascades because they can be linearized within the derivational cascade involving the subcategorizing verb – that is, a single application of Spell-out can linearize both the verb and its complement. In the case of (21), however, a licit chain can only arise if the NP in the object position has been independently spelled out, so that the two copies can constitute a chain. This leads us to conclude that convergence demands may force Spell-out to apply to complements, as well.

That being so, the question then is whether the object is spelled out in (20a) before copying takes place or only after the structure in (21) has been assembled. Again, we may find the answer in economy: if Spell-out applies to some pictures of John before it is copied, the copies will be already spelled out and no applications of Spell-out will be further required for the copies.6 The derivation of (19) therefore proceeds along the lines of (22): the NP is spelled out before being copied in (22a) and its copy merges with the whole structure, as shown in (22b); the two copies of the NP can then form a licit chain and the derivation converges.

(22) a. [TP were [VP taken [some ⟨some, pictures, of, John⟩] by Bill]]
     b. [TP [some ⟨some, pictures, of, John⟩] [T′ were [VP taken [some ⟨some, pictures, of, John⟩] by Bill]]]

Returning to (17a), its derivation proceeds in a cyclic fashion along the same lines, yielding the (simplified) structure in (23). Once the stage in (23) is reached, no possible continuation results in a convergent derivation: the strong wh-feature of Q must be checked and neither copy of who is accessible to the computational system. The approach we have been exploring here is therefore able to account for the unacceptability of (17a), while still adhering to the view that traces are simply copies and not grammatical formatives.

(23) [CP was+Q [TP [a ⟨a, picture, of, who⟩] [VP taken [a ⟨a, picture, of, who⟩] by Bill]]]

To summarize, CED effects arise when a given syntactic object K that would be needed for computations at a derivational stage Dn has been spelled out at a derivational stage Di prior to Dn, thereby becoming inaccessible to the computational system after Di. Under this view, the CED is not a primitive condition on movement operations; it rather presents itself as a natural consequence in a derivational system that obeys strict cyclicity and takes general economy considerations to determine the number of applications of Spell-out.7


The question that we now face is how to explain the complex behavior of parasitic gap constructions with respect to the CED, as seen in the introduction, if the deduction of the CED developed above is correct. This is the topic of the next sections. Notice, for instance, that we cannot simply assume that parasitic gap constructions bypass some condition X that regular extractions obey; in fact, we are suggesting that there is no particular condition X to prevent extraction and, therefore, no way to bypass it either. Before going into the analysis proper, we briefly review Nunes's (1995, 1998) analysis of parasitic gaps in terms of sideward movement, which provides us with the relevant ingredients to address the issue of CED effects in parasitic gap constructions.

3 Sideward movement and CED effects

With the incorporation of the copy theory into the Minimalist Program, Move has been conceived of as a complex operation encompassing: (i) a suboperation of copying; (ii) a suboperation of merger; (iii) a procedure identifying copies as chains; and (iv) a suboperation deleting traces (lower copies) for PF purposes (see Chomsky 1995b: 250). Nunes (1995, 1998) develops an alternative version of the copy theory of movement with two main distinctive features.

First, his theory takes deletion of traces in the phonological component to be prompted by linearization considerations. Take the structure in (24b), for instance, which is based on the (simplified) initial numeration N in (24a) and arises after John moves to the subject position.

(24) a. N = {arrested1, John1, was1}
     b. [Johni [was [arrested Johni]]]

The two occurrences of John in (24b) are nondistinct copies (henceforth represented by superscripted indices) in the sense that both of them arise from the same item within N in (24a). If nondistinct copies are truly "the same" for purposes of linearization, (24b) cannot be mapped into a linear order.8 Given that the verb was, for instance, asymmetrically c-commands the lower copy of John and is asymmetrically c-commanded by the higher copy, the LCA should require that was precede and be preceded by John, violating the asymmetry condition on linear orders (if α precedes β, it must be the case that β does not precede α). The attempted linearization of (24b) also violates the irreflexivity condition on linear orders (if α precedes β, it must be the case that α ≠ β); since the upper copy of John asymmetrically c-commands the lower one, John would be required to precede itself. Simply put, deletion of traces in the phonological component is forced upon a given chain CH in order for the structure containing CH to be linearized.9
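The point can be checked mechanically with a minimal sketch (my own illustration; the precedence pairs below are simply the statements the LCA derives from (24b) when the two copies of John count as the same element):

def is_linear_order(prec):
    # a strict linear order must be irreflexive and asymmetric
    return all(a != b and (b, a) not in prec for (a, b) in prec)

# [John was [arrested John]], with both Johns counting as the same item
with_trace = {("John", "was"), ("John", "arrested"),   # higher copy precedes these
              ("was", "John"), ("arrested", "John"),   # ...which precede the lower copy
              ("John", "John")}                        # higher copy precedes lower copy
print(is_linear_order(with_trace))        # False: asymmetry and irreflexivity fail

# after deletion of the trace: [John was arrested]
after_deletion = {("John", "was"), ("John", "arrested"), ("was", "arrested")}
print(is_linear_order(after_deletion))    # True: a linear order now exists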

The second distinctive feature of Nunes's (1995, 1998) version of the copy theory, which is crucial for the following discussion, is that Move is not taken to be a primitive operation of the computational system; it is rather analyzed as the mere reflex of the interaction among the independent operations described in (i)–(iv) above. In particular, this system allows constrained instances of sideward movement, where the computational system copies a given constituent α of a syntactic object K and merges α with a syntactic object L, which has been independently assembled and is unconnected to K, as illustrated in (25).10

(25) a. [K … αi …]      αi [L …]         (Copy)
     b. [K … αi …]      [M αi [L …]]     (Merge)
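Schematically, the copy-and-merge step in (25) can be mimicked as follows (my own sketch under simplified assumptions: syntactic objects are plain nested lists, and the function names are mine):

def merge(a, b):
    return [a, b]                              # syntactic objects as nested lists

def find_copy(obj, target):
    # return a copy of the first sub-object of obj matching target
    if obj == target:
        return list(target)
    if isinstance(obj, list):
        for daughter in obj:
            found = find_copy(daughter, target)
            if found is not None:
                return found
    return None

K = merge("reading", ["which", "paper"])       # [reading [which paper]]
L = "file"                                     # assembled in a separate workspace
alpha = find_copy(K, ["which", "paper"])       # Copy
M = merge(L, alpha)                            # Merge: sideward movement
print(K)    # ['reading', ['which', 'paper']]  (K itself is unaffected)
print(M)    # ['file', ['which', 'paper']]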

Let us consider how a parasitic gap construction such as (26a) can be derived under a sideward movement analysis, assuming that its initial numeration is the one given in (26b) (irrelevant items were omitted).

(26) a. Which paper did John file after reading?
     b. N = {which1, paper1, did1, John1, PRO1, Q1, file1, after1, reading1, v2, C1}

(27) shows the step after the numeration N in (26b) has been reduced to N′ and K has been assembled. Following Munn (1994) and Hornstein (2001), we assume that what Chomsky (1986a) took to be null operator movement in parasitic gap constructions is actually movement of a syntactic object built from the lexical items of the numeration. From the perspective we are exploring, that amounts to saying that the computational system spells out which paper in (27b), makes a copy of the spelled-out object (see Note 6), and merges it with K to check whatever feature is involved in successive cyclic A′-movement, yielding L in (28a). The computational system then selects the preposition after and merges with L, forming the PP in (28b).

(27) a. N′ = {which0, paper0, did1, John1, PRO0, Q1, file1, after1, reading0, v1, C0}
     b. K = [CP C PRO reading [which paper]]

(28) a. L = [CP [which ⟨which, paper⟩]i C PRO reading [which ⟨which, paper⟩]i]
     b. M = [PP after [CP [which ⟨which, paper⟩]i C PRO reading [which ⟨which, paper⟩]i]]

Consider now the stage after file is selected from the numeration, as shown in (29). Following Chomsky (2000), we assume that the selectional/thematic properties of file must be checked under Merge. However, possible continuations of the derivational step in (29) that merge file with the remaining elements of the reduced numeration N′ in (27a) do not lead to a convergent derivation; under standard assumptions, John should not be able to enter into a θ-relation with both file and the remaining light verb, or check both the accusative Case associated with the light verb and the nominative Case associated with did. Once lexical insertion leads to crashing, the system must resort to (sideward) movement, copying which paper from L and merging it with file, as shown in (30).11 The wh-copy in (30b) may then "mind its own business" within derivational workspace P, independently of the other copies inside M. This is the essence of the account of parasitic gaps in terms of sideward movement.

(29) a. M = [PP after [CP [which ⟨which, paper⟩]i C PRO reading [which ⟨which, paper⟩]i]]
     b. O = file

(30) a. M = [PP after [CP [which ⟨which, paper⟩]i C PRO reading [which ⟨which, paper⟩]i]]
     b. P = [VP file [which ⟨which, paper⟩]i]

It is important to note that sideward movement of [which ⟨which, paper⟩] in (29)–(30) was possible because M had not been spelled out; hence, the computational system had access not only to M itself, but also to the constituents of M. The situation changes in subsequent derivational steps. As discussed in Section 2, a complex adjunct must be spelled out before it merges with a given syntactic object; hence, the computational system spells out M as M′ in (31a) and merges M′ with the matrix vP, as represented in (31b).

(31) a. M′ = [after ⟨after, [which ⟨which, paper⟩]i, C, PRO, reading, [which ⟨which, paper⟩]i⟩]
     b. [vP [VP John file [which ⟨which, paper⟩]i] [after ⟨after, [which ⟨which, paper⟩]i, C, PRO, reading, [which ⟨which, paper⟩]i⟩]]

Further computations involve lexical insertion of the remaining items of the numeration and movement of John and did, resulting in the (simplified) structure represented in (32).

(32) [CP did+Q [IP John [vP [vP file [which ⟨which, paper⟩]i] [after ⟨after, [which ⟨which, paper⟩]i, C, PRO, reading, [which ⟨which, paper⟩]i⟩]]]]

The copies of [which ⟨which, paper⟩] inside the adjunct clause in (32) are not available for copying, because the whole adjunct clause has already been spelled out; however, the copy in the object of file is still available to the computational system and, therefore, it can move to check the strong wh-feature of Q, yielding the (simplified) structure in (33), where the copies are numbered for ease of reference.


(33) [CP [which ⟨which, paper⟩]1 [C′ did+Q [TP John [T′ T [vP [vP file [which ⟨which, paper⟩]2] [after ⟨after, [which ⟨which, paper⟩]3, C, PRO, reading, [which ⟨which, paper⟩]4⟩]]]]]]

Let us now focus on the computations related to the deletion of wh-traces of (33) in the phonological component. As discussed before, the presence of multiple nondistinct copies prevents linearization. In the phonological component, the trace of the wh-chain within M is then deleted before Linearize applies to M to yield M′, as shown in (34).

(34) M′ = [after ⟨after, [which ⟨which, paper⟩]3, C, PRO, reading, [which ⟨which, paper⟩]4⟩]

After Spell-out applies to the whole structure in (33) and the previously spelled-out material is appropriately plugged in, two wh-chains should be further identified for trace deletion to take place: the "regular" chain CH1 = (copy1, copy2) and the "parasitic" chain CH2 = (copy1, copy3).12 Identification of CH1 is trivial because copy1 clearly c-commands copy2; hence, deletion of copy2 is without problems. Identification of CH2 is less obvious, because M is no longer a phrase-structure after being linearized. However, if c-command is obtained by the composition of the elementary relations of sisterhood and containment, as proposed by Chomsky (2000: 31) (see also Epstein 1999), copy1 does c-command copy3 in (33), because the sister of copy1, namely C′, ends up containing copy3 after the linearized material of M is properly plugged in.13 The phonological component then deletes copy3, yielding (35). Finally, Linearize applies to (35) and the PF output associated with (26a) is derived.14

(35) [CP [which ⟨which, paper⟩]1 did+Q [IP John [vP [vP file [which ⟨which, paper⟩]2] [after ⟨after, [which ⟨which, paper⟩]3, C, PRO, reading, [which ⟨which, paper⟩]4⟩]]]]
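The composition of sisterhood and containment that licenses the chain (copy1, copy3) can be pictured with a toy sketch (my own illustration; the node names and the skeletal tree below are schematic stand-ins for the relevant parts of (33), not the chapter's representation):

class Node:
    def __init__(self, name, children=()):
        self.name, self.children, self.parent = name, list(children), None
        for c in self.children:
            c.parent = self

def contains(x, y):
    return x is y or any(contains(c, y) for c in x.children)

def sister(x):
    if x.parent is None:
        return None
    return next((c for c in x.parent.children if c is not x), None)

def c_commands(x, y):
    # X c-commands Y iff the sister of X contains Y (sisterhood composed with containment)
    s = sister(x)
    return s is not None and contains(s, y)

copy3 = Node("which-paper-3")
adjunct = Node("after-reading", [copy3])       # the plugged-in linearized adjunct
vP = Node("vP", [Node("file"), adjunct])
c_bar = Node("C'", [Node("did+Q"), vP])
copy1 = Node("which-paper-1")
cp = Node("CP", [copy1, c_bar])

print(c_commands(copy1, copy3))   # True: copy1's sister C' ends up containing copy3
print(c_commands(copy3, copy1))   # False: so (copy1, copy3) is identifiable as a chain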

Assuming that derivations proceed in such a strictly cyclic fashion, the contrast between unacceptable constructions involving "extraction" from within an adjunct island such as (13) and parasitic gap constructions such as (26a), therefore, follows from their different derivational histories. In the unacceptable case, the clausal adjunct has already been spelled out and its constituents are no longer available for copying at the derivational step where Last Resort would license the required copying (see Section 2). In the acceptable parasitic gap constructions, on the other hand, a legitimate instance of copying takes place before the clausal adjunct is spelled out (see (29)–(30)); that is, sideward movement, if appropriately constrained by Last Resort, provides a kind of escape hatch for movement from within adjuncts.15

Similar considerations apply to parasitic gaps inside subjects. Let us consider the derivation of (36a), for instance, which starts with the numeration N in (36b).

(36) a. Which politician did pictures of upset?
     b. N = {which1, politician1, did1, pictures1, of1, upset1, Q1, v1}

Suppose that after the derivational step in (37) is reached, K and L merge. No convergent result would then arise, because there would be no element in the numeration N′ in (37a) to receive the external θ-role assigned by the light verb to be later introduced; in addition, if either K or the wh-phrase within K moved to [Spec, vP], they would be involved in more than one θ-relation within the same derivational workspace, leading to a violation of the θ-Criterion.16

(37) a. N′ = {which0, politician0, did1, pictures0, of0, upset0, Q1, v1}
     b. K = [pictures of [which politician]]
     c. L = upset

The computational system may instead spell out the wh-phrase, make a copy of the spelled-out object, and merge it with upset (an instance of sideward movement), as shown in (38). Each copy of which politician in (38) will now participate in a θ-relation, but in a different derivational workspace, as in (30).

(38) a. K = [pictures of [which ⟨which, politician⟩]i]
     b. M = [upset [which ⟨which, politician⟩]i]

In the next steps, the light verb is selected from the numeration N′ in (37a) and merges with M in (38b), and the resulting structure merges with K after K is spelled out, yielding the (simplified) structure in (39). Further computations then involve merger and movement of did, and movement of the spelled-out subject to [Spec, TP], forming the (simplified) structure in (40).

(39) [vP [pictures ⟨pictures, of, [which ⟨which, politician⟩]i⟩] [v′ upset [which ⟨which, politician⟩]i]]

(40) [CP did+Q [TP [pictures ⟨pictures, of, [which ⟨which, politician⟩]i⟩]k T [vP [pictures ⟨pictures, of, [which ⟨which, politician⟩]i⟩]k [v′ upset [which ⟨which, politician⟩]i]]]]

Among the three copies of which politician represented in (40), only the one in the object position of upset is available for copying; the other two became inaccessible after K in (37) was spelled out. The computational system then makes a copy of the accessible wh-element and merges it with the structure in (40), allowing Q to have its strong feature checked and finally yielding the structure in (41).

(41) [CP [which ⟨which, politician⟩]1 [C′ did+Q [TP [pictures ⟨pictures, of, [which ⟨which, politician⟩]2⟩]k [T′ T [vP [pictures ⟨pictures, of, [which ⟨which, politician⟩]3⟩]k [v′ upset [which ⟨which, politician⟩]4]]]]]]

In the phonological component, deletion of the trace of the chain involving [Spec, TP] and [Spec, vP] in (41) ends up deleting copy3, because copy3 sits within [Spec, vP]. As for the other wh-copies, since copy1 c-commands both copy2 and copy4 after the linearized material is plugged in (see discussion above), the chains CH1 = (copy1, copy2) and CH2 = (copy1, copy4) can be identified and their traces are deleted, yielding (42) below.17 (42) is then linearized and surfaces as (36a). Again, an apparent extraction from within a subject was only possible because Last Resort licensed sideward movement before the computational system spelled out the would-be subject.

(42) [CP [which ⟨which, politician⟩]1 did+Q [TP [pictures ⟨pictures, of, [which ⟨which, politician⟩]2⟩]k T [vP [pictures ⟨pictures, of, [which ⟨which, politician⟩]3⟩]k [v′ upset [which ⟨which, politician⟩]4]]]]

Although sideward movement may permit circumvention of CED islands in the cases discussed above, its output is constrained by linearization, like any standard instance of upward movement. That is, the same linearization considerations that trigger deletion of traces are responsible for ruling out unwanted instances of sideward movement (see Nunes 1995, 1998 for discussion). Take the derivation sketched in (43)–(45), for instance, where every paper is spelled out and undergoes sideward movement from K to L. As is, the final structure in (44) cannot be linearized: given that the two instances of every paper are nondistinct, the preposition after, for instance, is subject to the contradictory requirement that it should precede and be preceded by every paper. In the cases discussed thus far, this kind of problem is remedied by trace deletion (deletion of lower chain links). However, trace deletion is inapplicable in (44); given that the two copies do not enter into a c-command relation, they cannot be identified


as a chain.18 Thus, there is no convergent result arising from (44) and the parasitic gap construction in (45) is correctly ruled out.

(43) a. K = [PP after reading [every ⟨every, paper⟩]i]
     b. L = [VP filed [every ⟨every, paper⟩]i]

(44) [TP John [vP [vP filed [every ⟨every, paper⟩]i] [after ⟨after, reading, [every ⟨every, paper⟩]i⟩]]]

(45) *John filed every paper without reading.

To sum up, the analysis explored above is very much in consonance with minimalist guidelines in that it attempts to deduce construction-specific properties from general bare output conditions (more precisely, PF linearization); it limits the search space for deletion of copies (it can only happen within a c-command path), and it does not resort to the non-interface level of S-Structure to rule out (45), like standard GB analyses do (see Chomsky 1982, for instance).19 With respect to the main topic of this chapter, the lack of CED effects in acceptable parasitic gaps is argued to follow from the fact that Last Resort may license sideward movement from within a complex category XP, before XP is spelled out and its constituents become inaccessible to the Copy operation. In the next section, we will see that when parasitic gap constructions do exhibit CED effects, this is due to general properties of the system's design, which strives to reduce computational complexity.

4 Sideward movement and cyclic access to the numeration

Let us finally examine the unacceptable parasitic gap constructions in (46), which illustrate the fact that parasitic gaps are not completely immune to CED effects.

(46) a. *Which book did you finally read after leaving the bookstore without finding?

     b. *Which politician did you criticize before pictures of upset the voters?

Under one derivational route, the explanation for the unacceptability of the sentences in (46) is straightforward. The PP adjunct headed by without in (46a), for instance, must be spelled out before merging with the vP related to leaving, as represented in the simplified structure in (47a) below; hence, the constituents of this PP adjunct are not accessible to the computational system and sideward movement of which book from K to L is impossible. Likewise, sideward movement of which politician from X in (48a) to Y in (48b) cannot take place because the subject in (48a) has been spelled out and its constituent terms are inaccessible for copying; hence, the unacceptability of (46b).

(47) a. K = [leaving the bookstore [without ⟨without, PRO, finding, which, book⟩]]

     b. L = read


(48) a. X = [IP [pictures ⟨pictures, of, which, politician⟩] upset the voters]
     b. Y = criticize

This account of the unacceptability of the parasitic gap constructions in (46) has crucially assumed that the computation proceeds from a "subordinated" to a "subordinating" derivational workspace; in all the cases discussed so far, sideward movement has proceeded from within an adjunct or subject to the object position of a subordinating verb. This assumption is by no means innocent. In principle, the computational system could also allow sideward movement to proceed from a "subordinating" to a "subordinated" derivational workspace, while still adhering to cyclicity. Suppose, for instance, that we assemble the matrix VP of (46a), before building the VP headed by finding, as represented in (49).

(49) a. K = [read [which book]]
     b. L = finding

Given the stage in (49), which book could undergo sideward movement from K to L, and M in (50b) would be formed (irrelevant details omitted). Further computations after M was spelled out and merged with K would then yield the (simplified) structure in (51).

(50) a. K = [read [which ⟨which, book⟩]i]
     b. M = [after PRO leaving the bookstore [without ⟨without, PRO, finding, [which ⟨which, book⟩]i⟩]]

(51) [CP did+Q [TP you [T′ T [vP [vP read [which ⟨which, book⟩]i] [after ⟨after, PRO, leaving, the, bookstore, [without ⟨without, PRO, finding, [which ⟨which, book⟩]i⟩]⟩]]]]]

The relevant aspect of (51) is that, although the wh-copy inside PP is not accessible to the computational system, the wh-copy in the object position of read is. It could then move to check the strong feature of Q and deletion of the lower wh-copies would yield the (simplified) structure in (52), which should surface as (46a).


(52) [CP [which ⟨which, book⟩]i did+Q [TP you [vP [vP read [which ⟨which, book⟩]i] [after ⟨after, PRO, leaving, the, bookstore, [without ⟨without, PRO, finding, [which ⟨which, book⟩]i⟩]⟩]]]]

Thus, if sideward movement were allowed to proceed along the lines of (49)–(50), where a given constituent moves from a derivational workspace W1 to a derivational workspace W2 that will end up being embedded under W1, there should never be any CED effect in parasitic gap constructions and we would incorrectly predict that (46a) should be acceptable.

Similar considerations apply to the alternative derivation of (46b) sketched in (53)–(56) below. In (53)–(54), which politician moves from the object position of criticize to the complement position of the preposition; further (cyclic) computations then yield the (simplified) structure in (55), in which the wh-copy in the matrix object position is still accessible to the computational system, thus being able to move and check the strong feature of Q. After this movement takes place, the whole structure is spelled out and the lower copies of which politician are deleted in the phonological component, as shown in (56). The derivation outlined in (53)–(56) therefore incorrectly rules in the unacceptable parasitic gap in (46b).

(53) a. X = [criticize [which politician]]
     b. Y = of

(54) a. X = [criticize [which ⟨which politician⟩]i]
     b. Z = [of [which ⟨which politician⟩]i]

(55) [CP did+Q [TP you [T′ T [vP [vP criticize [which ⟨which, politician⟩]i] [before ⟨before, [pictures ⟨pictures, of, [which ⟨which, politician⟩]i⟩], upset, the, voters⟩]]]]]

(56) [CP [which ⟨which, politician⟩]i did+Q [TP you [vP [vP criticize [which ⟨which, politician⟩]i] [before ⟨before, [pictures ⟨pictures, of, [which ⟨which, politician⟩]i⟩], upset, the, voters⟩]]]]

The generalization that arises from the discussion above is that sideward movement from a derivational workspace W1 to a derivational workspace W2 yields licit results just in case W1 will be embedded in W2 at some derivational step. In


the undesirable derivations sketched in (49)–(52) and (53)–(56), sideward movement has proceeded from the "matrix derivational workspace" to a subordinated one. Obviously, the question is how this generalization can be derived from independent considerations.

Abstractly, the problem we face here is no different from the one posed by economy computations involving expletive insertion in pairs such as (57), originally noted by Alec Marantz and Juan Romero. The two sentences in (57) share the same initial numeration; thus, if the computational system had access to the whole numeration, economy should favor insertion of there at the point where the structure in (58) has been assembled, incorrectly ruling out the derivation of the acceptable sentence in (57b).

(57) a. The fact is that there is someone in the room.
     b. There is the fact that someone is in the room.

(58) [is someone in the room]

Addressing this and other similar issues, Chomsky (2000) proposes that rather than working with the numeration as a whole, the computational system actually works with subarrays of the numeration, each containing one instance of either a complementizer or a light verb. Furthermore, according to Chomsky's 2000 proposal, when a new subarray SAi is selected, the vP or CP previously assembled based on subarray SAk becomes frozen in the sense that no more checking or thematic relations may take place within it. Returning to the possibilities in (57), at the point where (58) is assembled, competition between insertion of there and movement of someone arises only if the active subarray feeding the derivation has an occurrence of the expletive; if it does not, as is the case of (57b), movement is the only option and the expletive is inserted later on, when another subarray is selected.
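As a purely illustrative aside (the function next_step and the representation of subarrays as plain sets are inventions of the sketch, and are of course gross simplifications of the proposal), the logic of this economy computation can be put as follows: the choice between inserting there and moving someone is made relative to the active subarray, not to the whole numeration.

```python
# A minimal sketch, assuming subarrays are simple sets of lexical items;
# it only illustrates how the there/movement competition is relativized
# to the active subarray, as in the discussion of (57)-(58).

def next_step(active_subarray):
    """Decide what happens once [is someone in the room] is assembled."""
    if "there" in active_subarray:
        return "insert there"      # insertion pre-empts the costlier Move
    return "move someone"          # no expletive available: Move is the
                                   # only option given this subarray

# (57a): the embedded clause is built from a subarray containing 'there'
print(next_step({"there", "is", "someone", "in", "the", "room"}))
# -> insert there

# (57b): the embedded subarray lacks 'there'; 'someone' raises, and the
# expletive only enters the derivation later, from another subarray
print(next_step({"is", "someone", "in", "the", "room"}))
# -> move someone
```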

This strongly derivational approach has the relevant components for a principled account of why sideward movement must proceed from embedded to embedding contexts. If the computational system had access to the whole numeration, the derivation of the parasitic gap constructions in (46), for instance, could proceed either along the lines of (47) and (48) or along the lines of (49)–(52) and (53)–(56), yielding an undesirable result because the latter incorrectly predict that the sentences in (46) are acceptable. However, if the computational system works with one subarray at a time and if syntactic objects already assembled become frozen when a new subarray is selected, the unwanted derivations outlined in (49)–(52) and (53)–(56) are correctly excluded. Let us consider the details. Assuming that numerations should be structured in terms of subarrays, the derivation in (49)–(52) should start with the numeration in (59) below, which contains the subarrays A–F, each determined by a light verb or a complementizer.

(59) N = {{A Q1, did1},
          {B you1, finally1, v1, read1, which1, book1, after1},
          {C C1, T1},
          {D PRO1, v1, leaving1, the1, bookstore1, without1},
          {E C1, T1},
          {F PRO1, v1, finding1}}

The derivational step in (49), repeated here in (60), which would permit the undesirable instances of sideward movement, is actually illicit because it accesses a new subarray before it has used up the lexical items of the active subarray. More specifically, the derivational stage in (60) improperly accesses subarrays B and F of (59).20

(60) a. K = [read [which book]]
     b. L = finding

Similarly, the step in (53), repeated here in (62), illicitly activates subarrays B and D of (61), which is the structured numeration that underlies the derivation in (53)–(56).

(61) N = {{A Q1, did1},
          {B you1, v1, criticize1, which1, politician1, before1},
          {C C1, T1},
          {D pictures1, of1, v1, upset1, the1, voters1}}

(62) a. X = [criticize [which politician]]
     b. Y = of

The problem with the derivations outlined in (49)–(52) and (53)–(56), therefore, is not the instances of sideward movement themselves, but rather the derivational steps that should allow them. By contrast, lexical access in the derivational routes sketched in (47) and (48), repeated below in (64) and (66), may proceed in a cyclic fashion from the structured numerations in (63) and (65), respectively, without improperly activating more than one subarray at a time. However, as discussed above, sideward movement of which book in (64) or which politician in (66) is impossible because these elements have already been spelled out and are not accessible to the computational system.

(63) N = {{A Q1, did1},
          {B you1, finally1, v1, read1, after1},
          {C C1, T1},
          {D PRO1, v1, leaving1, the1, bookstore1, without1},
          {E C1, T1},
          {F PRO1, v1, finding1, which1, book1}}

(64) a. K = [CP C [TP PRO T [vP [vP leaving+v the bookstore] [without ⟨without, C, PRO, T, finding+v, which, book⟩]]]]
     b. L = read

(65) N = {{A Q1, did1},
          {B you1, v1, criticize1, before1},
          {C C1, T1},
          {D pictures1, of1, which1, politician1, v1, upset1, the1, voters1}}


(66) a. X = [CP C [TP [pictures ⟨pictures, of, which, politician⟩] T [vP [pictures ⟨pictures, of, which, politician⟩] [v′ upset+v the voters]]]]
     b. Y = criticize

The analysis of CED effects in parasitic gap constructions developed here can therefore be understood as providing evidence for a strongly derivational system, where even lexical access proceeds in a cyclic fashion.21
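Again purely as an expository device (the encoding of (59) as a list of sets and the function licit_step are invented for the sketch), cyclic lexical access amounts to a simple well-formedness check on derivational stages: no stage may draw on two different subarrays at once, which is exactly what excludes the step in (60).

```python
# A minimal sketch of cyclic access to a structured numeration such as (59);
# each subarray contains its own tokens, so repeated names like C, T, PRO
# and v below stand for distinct tokens of those items.

N59 = [
    {"Q", "did"},                                                  # A
    {"you", "finally", "v", "read", "which", "book", "after"},     # B
    {"C", "T"},                                                    # C
    {"PRO", "v", "leaving", "the", "bookstore", "without"},        # D
    {"C", "T"},                                                    # E
    {"PRO", "v", "finding"},                                       # F
]

def licit_step(items_in_use, numeration):
    """A derivational stage may only draw on one subarray at a time."""
    touched = [i for i, sub in enumerate(numeration) if items_in_use & sub]
    return len(touched) <= 1

print(licit_step({"read", "which", "book"}, N59))             # True: B only
print(licit_step({"read", "which", "book", "finding"}, N59))  # False: B and F,
                                                              # the illicit (60)
```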

5 Conclusion

This chapter has attempted to provide a minimalist analysis of classical extraction domains, in terms of derivational dynamics in a cyclic system. The main lines of research which provide a solution to the relevant kind of islands are (i) a computational system with multiple applications of Spell-out; and (ii) a decomposition of the Move operation into its constituent parts, taking seriously the idea that separate copies are real objects and can be manipulated in separate derivational workspaces (sideward movement).

Extraction domains are opaque because, after Spell-out, the constituent terms of a given chunk of structure, while interpretable, are no longer accessible to the rest of the derivation. At the same time, this opacity can be bypassed if an extra copy of the moving term manages to arise before the structure containing it is spelled out, something that the system in principle allows. However, this possibility is severely limited by other computational considerations. For example, Last Resort imposes that the extra copy be legitimated, which separates instances where this copy is made with no purpose other than escaping an island (a CED effect) from instances where the copy is made in order to satisfy a θ-relation (a parasitic gap construction). In the second case, the crucial copy can be legitimated prior to the Spell-out of the would-be island, thus resulting in a grammatical structure. Moreover, we have shown how sideward movement can only proceed, as it were, forward within the derivational history. That result is straightforwardly achieved in a radically derivational system, where the very access to the initial lexical array is done in a strictly cyclic fashion.

Although we find these results rather interesting, we do not want to finish without pointing out some of our concerns, as topics for further research. Our whole analysis relies on the assumptions that copies are real, and as such can be manipulated as bona fide terms within the derivation. If so, it is perplexing that, for the purposes of linearization, different copies count as one, which drives a good part of the logic of the chapter. Of course, we can make this be the case by stipulating a definition of identity, as we have (token in the numeration as opposed to occurrence in the derivation); but we do not know why that definition holds. Second, it is fundamental for the account of island effects that spelled-out chunks be inaccessible to computation. However, chain identification can proceed across spelled-out portions, also in a rather surprising way. Once again, we can make things work by making c-command insensitive to anything other than the notion of containment; but we do not know why that


should be, or why c-command should hold, to start with, of chains. Finally, it should be noted that cyclic access to the numeration is key in order to keep the proper order of operations; we have no idea why the relevant derivational cycles should be the ones we have assumed, following Chomsky (2000). All we can say with regard to all these questions is that we have suspended our disbelief, just to see how far the system can proceed within assumptions that are familiar.


5

MINIMAL RESTRICTIONS ON BASQUE MOVEMENTS†

1 Introduction

The Minimalist Program has no general account of islands. In part, this is because the system is designed in such a streamlined fashion – and with the assumption that computational mechanisms exist to meet the requirements of external interfaces – that little room is left for the apparently ad hoc considerations involved in formulating island conditions. In other words, nobody finds it elegant to speak of some category or another creating a barrier for movement, let alone removing such a barrier when necessary.

In recent years, several attempts have been made to address at least some island restrictions within the Minimalist Program, with various degrees of success. In this chapter, I use some of the results discussed in Chapter 3, whereby it is argued that island effects arise as a consequence of a dynamic derivational system in which Spell-out – just as any other rule in the system – applies as many times as is necessary for a derivation to converge. In short, a structure to which Spell-out has applied becomes opaque for syntactic computation, thus turning into an island.

Within these premises, this chapter studies a problematic paradigm from Basque syntactic studies, restrictions on question formation. The phenomenon has gained much attention throughout the last century because it deals with an ordering limitation in a language that looks otherwise rather free with respect to the position of verbal dependents; curiously, a question word must be left-adjacent to the verb. In present terms, an analysis is possible which has some rather important consequences for minimalism.

2 Wh-movement in Basque

2.1 Basic Basque syntax1

Basque is an underlyingly SOV, generally head-last language. It exhibits overt case morphology, the main cases being ergative ((e)k), absolutive (∅), dative ((r)i), and genitive (ko/ren):

(1) Jonek Mireni Getxoko ogia bidali dio.
    J.-E M.-D G.-G bread-the/a-A sent 3-have-3-3
    "Jon has sent Miren bread from Getxo."


Nominal dependents get genitive (G) case; indirect objects, dative (D) case; base subjects, ergative (E) case; and base objects (and derived subjects of unaccusatives), absolutive (A) case, in a typical ergative Case system.

Regardless of aspectual demands, the majority of Basque verbs are followed by an auxiliary which encodes subject, object and indirect object agreement (shown as numbers for person in glosses). Their sequential order is generally ⟨A(bsolutive).AUXILIARY.D(ative).E(rgative)⟩, and agreement association to arguments is standard: absolutive for monadic predicates, absolutive plus ergative for diadic ones, and an extra dative for triadic ones (as in (1)) as well as causative constructions (as in (3) below). Auxiliary selection is standard, too:

(2) a. Jon etorri da.
       J.-A arrived is-3
       "Jon has arrived."
    b. Jonek Miren maite du.
       J.-E M.-A love 3-have-3
       "John has loved Mary."
    c. Aizkolariak lan egin du.
       lumberjack-the/a-E work make 3-have-3
       "The lumberjack has worked."

Unaccusatives select a form of izan "be"; transitives, a form of ukan "have"; intransitives (unergatives) select a form of ukan as well, exhibiting agreement with two arguments (the absolutive one being a default, third person singular).2

Reasonably, as a result of this rich agreement system, Basque allows pro-drop in all three major argument positions:3

(3) a. Jaun kuntiak zezen bati Bereterretxe harrapaerazi zion.
       Mr. count-the-E bull one-D B.-A hit-make 3-have-3-3
       "The count has made a bull hit Bereterretxe."
    b. pro pro pro harrapaerazi zion.
       "He has made it hit him."

To a large extent, pro-drop may also be a major source of the apparent "free word order" of Basque. Such a view would be very consistent with the fact that, unlike verbs, nouns (which lack the pro-drop system) are in fact quite rigid in the linear ordering of their dependents:

(4) a. Hargaineko sorgin zaharrak
       H.-G witch old-pl.
       "The old witches from Hargain"
    b. * zaharrak sorgin Hargaineko
    c. * Hargaineko zaharrak sorgin
    d. * sorgin zaharrak Hargaineko
    e. * sorgin Hargaineko zaharrak
    f. * zaharrak Hargaineko sorgin


In contrast, major sentential constituents can appear in just about any order (with information-theoretic consequences). Witness the alternatives to (2b):

(5) a. Miren maite du Jonek.
    b. Maite du Jonek Miren.
    c. Miren Jonek maite du.
    d. Maite du Miren Jonek.
    e. Jonek maite du Miren.

It is then plausible that all the examples above involve right- and left-dislocations of phrases which "double" a pro element (all of this is meant pre-theoretically at this point). I will assume the representations in (6) in order to capture the variations in (5):

(6) a. [pro Miren maite du] Jonek.
    b. [[pro pro maite du] Jonek] Miren.
    c. Miren [Jonek pro maite du].
    d. [[pro pro maite du] Miren] Jonek.
    e. [Jonek pro maite du] Miren.

Despite the ordering possibilities in (5), when wh-movement has occurred, the wh-phrase must be left-adjacent to the main verb. Thus, (7a/b) are possible, but (7c/d) are not:4

(7) a. Zer bidali dio (Jonek) (Mireni)?
       what-A sent 3-have-3-3 J.-E M.-D
       "What has Jon sent Miren?"
    b. (Jonek) (Mireni) zer bidali dio?
    c. * Zer Mireni bidali dio (Jonek)?
    d. * (Mireni) Zer Jonek bidali dio?

2.2 A plausible analysis and some problems

The fact in (7) has received a considerable amount of attention since it was first systematically discussed in Altube (1929). Within current assumptions, the standard account is that (7) exhibits a Verb second (V2) effect (Ortiz de Urbina 1989). An element like zer "what" in (7a) occupies the Spec of CP, and a verb like maite occupies the head of C, arguably guaranteeing the observed adjacency. From this perspective, (7c/d) would be the equivalent of the impossible English sentence *what to Mary has John sent?

It is not my intention to question the technical details of the V2 analysis, but rather to offer an alternative within minimalism. For the sake of completeness, I should mention two difficulties the V2 approach must deal with, and which make the relevant analysis somewhat unnatural.

The first problem is that Basque complementizers appear clause-finally:5

(8) Mirenek [[Jon etorri de]la] esan du
    M.-E Jon-A arrived is-3-that said 3-have-3
    "Miren has said that Jon has arrived."


No adjacency between the (rightmost) head of C and its (leftmost) specifier can be guaranteed under these circumstances. Ortiz de Urbina is thus forced to assume that, contrary to appearances, Basque complementizers are first and the observed post-verbal element is like a clitic. As evidence for his proposal, he claims that some Basque complementizers, e.g. nola, are clause initial:6

(9) Mirenek [nola [Jon etorri de] n] esan du
    M.-E how J.-A arrived 3-is-if say 3-have-3
    "Miren has said how Jon has arrived."

Perhaps a more plausible analysis for (9) is to treat nola as a (pseudofactive) element that occupies the Spec of CP, while the C head -n (underlined in (9)) occupies the appropriate rightmost site. Apart from preserving the generality of head positions in the language, this analysis would allow us to account for the contrasts in (10) involving long-distance wh-movement:7

(10) a. Nor esan du etorri dela Mirenek?
        who-A said 3-have-3 arrived 3-is-that M.-E
        "Who has Miren said has arrived?"
     b. * Nor esan du nola etorri den Mirenek?
        who-A said 3-have-3 how arrived 3-is-if M.-E

In spite of observing the required adjacency between the moved wh-phrase and the verb, (10b) is out, unlike (10a). The facts could be nicely accounted for if the "escape hatch" of the lower CP, i.e. Spec of CP, is open in (10a) but is filled by nola in (10b).8 While it is true that Ortiz de Urbina could make the same claim within his V2 analysis (nola, in fact, occupying the Spec of CP), my point is that nola constitutes no independent evidence to conclude that Basque complementizer heads are clause initial.

A second difficulty for the V2 analysis comes from the fact that, unlike standard V2, the Basque phenomenon under discussion is not restricted to the root, in two different senses:

(11) a. Ez dakit zer (*Jonek) bidali dion.
        not know-1 what-A J.-E sent 3-have-3-3-if
        "I don't know what Jon has sent."
     b. Nork esan du ardoa bidali diola?
        who-E said 3-have-3 wine-(*the)-A sent 3-have-3-3-that
        "Who has he/she said has sent (*the) wine?"

(11a) shows the effect in the complement of a question verb, whose associated wh-element (zer "what") must be adjacent to the embedded verb (bidali "sent"). (11b) demonstrates that even bridge verbs – whose associated C structure is only used as an "escape hatch" – induce the relevant adjacency effect, in the following way. Even though a definite reading of ardoa "the wine" is possible in (11b), an indefinite reading is not. This definiteness effect can be accounted for if we assume that the embedded object is left-dislocated to the intermediate periphery (between the verb of saying and the embedded clause)


when this object is definite, but such dislocation is not possible when the object receives an indefinite interpretation. If so, the indefinite object in (11b) occupies its normal preV position, and the ungrammaticality of the example (with a definite object) falls into the pattern seen before: ardoa breaks the adjacency between the embedded verb and the moving question word.

These are not standard V2 effects. In fact, the V2 literature uses contexts like (11a) to test for the absence of V2 behavior (in languages where the phenomenon shows up in embedded clauses, though not with question verbs). As for (11b), the cyclic effect seen there is nothing like normal V2, which typically only affects the ultimate landing site of the wh-phrase. Ortiz de Urbina is well aware of all these issues, and thus compares the phenomena in (11) to the Spanish facts discussed by Torrego (1984):

(12) No sé qué (*Juan) envió.
     not know.1 what J. sent.3/past
     "I don't know what Juan sent."

The comparison is well taken, but it is less clear that either set of facts (the Basque or the Spanish ones) can be analyzed as traditional V2. More importantly, it falls short of an explanation, insightful though the correlation clearly is.

Laka (1990) argues convincingly that Basque does have standard V2 effects involving auxiliary movement in negative or emphatic contexts:9

(13) a. Miren ez du Jonek maite!
        M.-A not 3-have-3 J.-E love
        "Miren (is who) Jon hasn't loved!"
     b. Arantza (ba) du Jonek maite!
        A.-A indeed 3-have-3 J.-E love
        "Arantza (is who) Jon indeed has loved!"
     c. Nor ez du Jonek maite?
        who-A not 3-have-3 J.-E love
        "Who (is who) Jon hasn't loved?"

Laka shows that this negative, emphatic construction exhibits the archetypical V2 pattern. The verb is displaced only in the matrix (contra what we saw in (11)), and only the auxiliary element appears in second position (contra what we have seen in all examples above, where the main verb and the auxiliary, in that order, appear adjacent to the wh-word). I include this comment on Laka's observation to clarify that I am not claiming absence of V2 effects for Basque. Rather, standard V2 is not clearly at issue for the cases in which a wh-element must be adjacent to the verb (nor, for that matter, is V2 obviously at issue in the Spanish counterparts).

In what follows I present an analysis of the Basque facts which extends to the sort of Spanish facts reported in (12), in effect denying that the relevant phenomenon is a V2 effect.


3 An alternative analysis

3.1 Toward a different proposal

The general view in this chapter was first outlined in Laka and Uriagereka (1987; hereafter, L&U), within Chomsky's (1986a) Barriers framework. I will recast the analysis in minimalist terms, but for presentational purposes I sketch it in the pre-minimalist model assumed in L&U.

The major intuition of the analysis is that the observed adjacency between the moved wh-phrase (in the Spec of CP) and the main verb is a PF phenomenon. In other words, even though the elements in question are string-wise adjacent, they are not necessarily structurally adjacent. The main factor that conspires to yield this surface phenomenon is the fact that Basque is a pro-drop language. Hence, for sentences like ez dakit zer bidali dion "I don't know what he sent to him/her," it may well be that one or more null categories intervene between the wh-phrase zer and the verb bidali, roughly as follows:

(14) [Ez dakit [zer [… pro … bidali dion]]]
     not know-1 what-A sent 3-have-3-3-if

(14) is pronounced with PF adjacency between zer and bidali, but in fact several elements occur between these two items.

If this view of the (superficial) adjacency is correct, then the adjacency is not really what needs to be explained. That is to say, in a language in which every argument of the verb can be realized by pro, PF adjacency between the wh-element and the verb is not surprising. Rather, the problem is the alternative to (14) in (15), where the relevant pros are replaced with overt constituents. Assuming, for now at least, that the only difference between a normal argument and pro is their pronunciation (the structures being otherwise identical), why is (15) not as good as (14)?

(15) *[Ez dakit [zer [Jonek/Mireni … bidali dion]]]
     not know-1 what-A J.-E M.-D sent 3-have-3-3-if

L&U propose a characterization of the notion barrier – based on Fukui and Speas (1987) – the effect of which is to make an XP with an overt specifier a barrier, while leaving XPs whose specifier is unexpressed transparent to movement.10 I take this intuition as a guiding analysis, but rather than restating the facts as in L&U, I will attempt to provide an explanation for why the morphological "heaviness" of specifiers should trigger barrierhood. Before moving to the analysis, however, there is an extension of the data which argues for the adequacy of stating the basic generalization in terms of barriers.

3.2 Extending the data

How can we distinguish the V2 analysis from a barriers approach? One possibility is in terms of hypothetical examples in which the PF adjacency is broken. The V2 analysis predicts that such examples should not exist in a language in


which the C Spec and head are both expanded to the left (as is Basque in Ortiz de Urbina's hypothesis). In contrast, the barriers approach leaves room for that possibility, all other things being equal. In particular, this sort of example should exist if no barrier (technically, a category with a lexical specifier) is crossed.

There are some examples of the relevant sort, for instance:

(16) Zergatik zaldunak herensugea hil zuen?
     why knight-the-E dragon-the-A killed 3-had-3
     "Why has the knight killed the dragon?"

Instances like this were noted in Mitxelena (1981); Ortiz de Urbina mentions several, comparing them to similar Spanish instances noted by Torrego (1984):

(17) Por qué el caballero mató al dragón?
     for what the knight killed.3/past to-the dragon
     "Why did the knight kill the dragon?"

The correlation with Spanish is reasonable, but we must understand why wh-phrases like zergatik/por qué "why" do not induce the alleged V2 effect. (18) is telling:

(18) a. Zergatik (Jonek) esan du garagardoa edango duela?
        why J.-E say 3-have-3 beer drink-fut. 3-have-3-that
        "Why has Jon said that he will drink beer?"
     b. Por qué (Juan) dice que beberá cerveza?
        for what J. say.3 that drink.will.3 beer
        "Why does John say that he will drink beer?"

Examples of this sort were introduced in Uriagereka (1988b) to illustrate the following property. (18a, b) are both grammatical with or without the intervening matrix subject; but when the subject is overt (e.g. "John") only a matrix reading is possible for the wh-phrase, which thus asks for John's reason for saying what he said rather than his reason for drinking beer (a reading which is possible when the intervening subject is not pronounced, in both Basque and Spanish).

A barriers explanation for these contrasts is rather natural, if we assume, as is plausible, that wh-phrases like why are IP adjuncts. As an adjunct to IP in an unembedded context, movement of why to the C projection never crosses any barrier, even if IP takes the overt subject as its specifier; IP excludes the adjunct. However, if the IP which why modifies is embedded, then why must a fortiori cross the matrix IP were it to raise to the matrix C projection. If the matrix IP has a lexical specifier, it will be a barrier for movement of why from the lower clause; on the other hand, if the matrix IP has a null specifier, it is by hypothesis transparent to movement of why from the embedded clause. In this way, we predict the impossibility of why modifying the embedded clause when the matrix IP has an overt specifier.


It is unclear what the V2 analysis could say with respect to these facts. First, examples like (18) show that elements like why, when carefully examined, do in fact exhibit the alleged V2 effect (in this instance, a reading that is otherwise possible is prevented in the domain that concerns us). But then, how can one account for (16) or (17)? Given the possible readings in (18), the V2 approach is forced into a rather ad hoc analysis to capture the difference between (16)/(17) on the one hand and (18) on the other.11

L&U also note that (19) is marginally possible:12

(19) ? Nor horregatik etorriko litzake?
     who-A because.of.this come-Asp 3-have-3
     "Who would come because of this?"

The sharp contrast in grammaticality between (19) and the paradigmatic ungrammatical instances seen so far is hard to explain in V2 terms, where nothing should separate the moved wh-phrase in the CP Spec from the (hypothetically) moved verb in C. In contrast, for a barriers analysis, it does matter whether the intervening material is sufficient to create a barrier. This is arguably not the case for pure adjuncts like horregatik, a "because" type element which, as an adjunct to IP, does not serve as the specifier to any category and thus will not be a barrier. The theory predicts that in these instances no barriers would be crossed upon moving over horregatik.

Other types of adjuncts behave differently from causal (pure) adjuncts. For example, temporal adjuncts block movement:

(20) Nor (*orduan) etorriko litzake?
     who-A then come-Asp 3-have-3
     "Who would come then?"

This again follows, under the assumption that temporal adjuncts are specifiers rather than adjuncts. Of course, agreement in this instance is not obviously overt, and hence the claim about overt specifiers is necessarily more abstract in this case. Nonetheless, there is significant evidence in the literature that intermediate temporal and aspectual phrases may involve temporal specifiers in the appropriate sense, even in English (see, e.g. Thompson 1996).

This treatment of some adjuncts is also very much in line with the fact in (21), which the theory predicts as well:

(21) Noiz (*zaldunak) hil zuen herensugea?
     when knight-the-E killed 3-have-3 dragon-the-A
     "When has the knight killed the dragon?"

Unlike a true adjunct (16)/(17), noiz "when" is sensitive to the presence of an intervening subject. This suggests that temporal adverbs are within IP, as specifiers of some internal projection(s). Again, it is unclear how the traditional V2 analysis can capture the subtleties of these behaviors.


3.3 A difficulty

The approach we have seen accounts for the absence of barriers between the moved wh-element and its trace. It does not matter what the relevant categories that take specifiers are – so long as the specifiers are overt (instead of pro), the theory predicts an immediate barrier. Thus, observe:

(22) [Zer [pro (*Mireni) t bidali dio]]
     what-A M.-D sent 3-have-3-3
     "What has he/she sent to Miren?"

As we saw for subjects, intervening indirect objects also cause problems for wh-movement. By hypothesis, this must mean that indirect objects also serve as proper specifiers to some category – now a standard assumption to make, which (22) offers evidence for if the present analysis is on track.

Nonetheless, we must extend the claim even further to give a full account of the facts, since indeed no lexical specifier can intervene between the moved wh-phrase and the trace, nor between this trace and the verb. Notice that, at this point, it is not at all obvious why the latter statement should be true.

Of course, claiming that direct objects also serve as specifiers to some category is no more controversial than making the claim for indirect objects.13 The puzzling difficulty, though, is that a direct object creates a barrier for subject movement:

(23) [Nork [t [(*ogia) bidali dio]]]
     who-E bread-the-A sent 3-has-3-3
     "Who has sent (him/her) the bread?"

Why should the presumably lower specifier occupied by the object ever matter for the movement of the subject?

A solution to this puzzle comes from denying that (23) is how subject movement proceeds. If the subject had to extract from a position lower than that of the object in its specifier, then we would expect the intervention effect:

(24) [Nork [… [(*ogia) [t … bidali dio]]]]

(24) is reminiscent of the analysis that Rizzi (1990: 62 and ff.) provides for Italian subject extraction.14 According to this view, the standard subject position, where it checks Case, is not a possible extraction site, essentially because of the "heaviness" of Agr. Under this perspective, subject extraction is forced into a roundabout movement, i.e. from inside the VP (details to follow).

The L&U analysis predicts why the subject should not be able to extract from its Case position: the subject itself serves as the full lexical specifier of IP, paradoxically inducing the very barrier that it must move over.15 Of course, if the subject extracts from its VP internal position, it will be sensitive to any other intervening lexical specifiers. This view of subject extraction captures the adjacency between the trace of the moved element and the verb, just as the initial proposal guarantees the adjacency between the trace and its antecedent wh-phrase. At this point, we have a full, though still mere, description of the facts:


wh-movement from VP internal positions is sensitive to the presence of overt specifiers, which block it, unlike their null counterparts.

3.4 Beyond a description

As a first step in understanding the description above, there are two important questions to ask. First, can we provide principled motivation for the cross-linguistic differences in extraction? Specifically, the Germanic languages typically allow movement over a fully specified IP:

(25) I wonder what [John sent t]

In contrast, the pro-drop Romance languages are not so easily analyzed, and lead us to proposals along the lines of Rizzi's in which the "heaviness" of Agr is responsible for different extraction patterns: Romance languages (and Basque, extending the claims to all agreeing heads) have extra barriers where Germanic ones do not. But this observation concerning morphological differences is not an explanation; we must recast the intuition as a precise analysis. In part we have. We could define barriers as categories of a certain morphological weight, as in Uriagereka (1988a). But such an approach is unsatisfactory if we adopt minimalist assumptions, in particular the idea that mechanisms of the computational system must be seen to follow from interface conditions or from conceptual necessity. In this case, if we define barriers as categories of a certain morphological weight, we are forced to ask: why does "heaviness" of agreement make such a difference in the emergence of barriers?

The second question to ask, to transcend our description, concerns new problems posed by the present approach: how does the trace of the wh-element in a non-Case position become visible (or how does it check Case)? Similarly, if wh-movement across a category with a lexical specifier is impossible in the relevant languages, why is A-movement possible in the simplest of sentences?

(26) [Jonek [Mireni [ogia [t t t bidali dio]]]]
     J.-E M.-D bread-the-A sent 3-have-3-3
     "Jon has sent Miren the bread."

To make their barriers analysis work, L&U were forced to adopt the now minimalist assumption that arguments move to Case specifiers. But if such a view is correct, then some of the movements in (26) will have to cross overt specifiers, unless some alternative route is found. In the remainder of this chapter, I shall address these issues within the Minimalist Program.16

4 Barriers in the minimalist system

4.1 Minimalist assumptions

The Minimalist Program attempts to derive all restrictions on derivations from either bare output conditions (natural interactions with extra-syntactic


components) or virtually conceptually necessary properties of the linguistic system. Within the latter rubric, Chomsky (1995b) includes economy, understood as a fundamental property of the natural world. Economy manifests itself in a variety of ways, both in the design of the system and its structural units, and in the computation of derivations and their alternatives.

For example, the very definition of a proper chain presupposes that it involves shortest links (the Minimal Link Condition, MLC). Such a convergence condition is an instance of economy in the computation of derivations: a chain link which is not the shortest possible is not valid and thus the derivation does not converge. One way Chomsky (1995b: 297) seeks to formulate the MLC is by viewing movement, the mechanism which generates chains, as a consequence of Attract. That is, a given phrase marker K with feature F attracts the nearest available feature F′ which meets the demands of F. Movement of phrase marker α containing F′ is a side-effect; although F only really attracts F′, α must for some reason take a "free ride" with F′. As such, movement is taken to be rather like (so-called) pied-piping of a preposition when its object moves.

Apart from economy in its structural design, the system is taken to manifest a kind of internal computational economy. Thus, given alternative derivations starting in the same array of lexical items, the derivation involving fewest steps eliminates the alternatives. In this instance, we crucially choose among competing, convergent candidates.

These two sorts of explanations (conditions on convergence and optimality races) are the only kinds of accounts that minimalism favors.17 This means we only allow ourselves to propose a restriction of the sort "Movement across a lexical specifier is impossible" if an account cannot be otherwise derived from conditions on convergence or optimality races. In other words, such an ad hoc restriction is clearly neither a criterion for ranking derivations nor an obviously natural condition on convergence, and therefore does not meet the demands of minimalism in its purest form.

4.2 The essential elements for an analysis

It would seem, then, that we are left without any room for expressing barrier-type restrictions. However, if we take seriously the idea that movement is triggered by Attract, it is possible to blame the system's dynamics for the observed effects, vis-à-vis movement across lexical specifiers.

Given the assumption that a primary mechanism of the computational system is the attraction of some feature F′ to a feature F, we are in a position to recognize that Move α is an ancillary operation triggered by Attract. In other words, while the bare requirement of the computational system is simply the displacement of F′ to F – and such movement of a feature would be characteristic of computations in the "covert" syntax, as in (27a) – cases of "overt" movement are arguably related to the morphological structure of the category α which contains the attracted feature. Under this view, α is not a well-formed PF object once F′ moves to F. Hence, while Attract strips F′ away from α, Move brings α


back to the domain where F′ has landed (27b), thus allowing for the repair of the morphological integrity of the lexical association (α, F′):

(27) a. Attract: [K H(K) [L … [α … F′ …] …]], with F′ raising to H(K)
     b. Move: [KP α [H(K)+F′ [L … t …]]], with α raising to the specifier of KP

I suggest that this analysis of movement, as generalized "pied-piping," provides a way to account for why overt specifiers induce barriers while a similar null element (pro) does not. In order to understand the connection between barriers and overt specifiers, it is necessary to make explicit certain assumptions involved in viewing movement as an ancillary operation.

To begin with, as outlined above, the ancillary movement of α is a consequence of the morphological integrity of the (α, F′) relation; notice, however, that proposing ancillary movement to establish configurational closeness between α and F′ does not, in itself, provide a precise statement about how the morphological integrity of α is re-established. To that end, suppose there is a morphological operation which repairs the (α, F′) relation:

(28) [KP α [H(K)+F′ [L … t …]]], where the morphological relation (α, F′) is repaired within KP

Such an operation makes sense if the reason α moves to the same domain as the attracted F′ is to repair some damage to the lexical integrity of α.

Proposing the operation in (28) implicates two additional assumptions. First, assuming that morphological operations proceed as the derivation unfolds, the pair (α, F′), as well as the configuration whose checking domain these elements occupy (KP in (28)), will have to be shipped to the morphological component at the point (28) applies. Suppose that what it means to be "shipped to the morphological component" is in fact a function of Spell-out, much as in Chomsky's (1995b) original proposal. Thus, the morphological repair of (α, F′) in (28) requires that the minimal structure containing α and F′, i.e. KP, is spelled out at that point in the derivation.

But given the assumption that the morphological repair implicates Spell-out, we must ask what happens in the derivation when KP is indeed spelled out. If


we take the original intuition behind Spell-out at face value, then once the PF and LF features contained in KP are stripped away and shipped to the respective interfaces, KP is no longer available to overt syntactic computation given that it has, derivationally speaking, gone to PF and LF:

(29) [KP α [H(K)+F′ [L … t …]]] is spelled out (sent to PF and to LF), while the derivation continues merging

Then we can reach two conclusions about specifiers. They are created due to a morphological association between an attracted feature and the category it was stripped away from. The morphological repair of (α, F′), an operation invoked after the Spell-out of the category containing α and F′, creates a sort of "giant compound" which is no longer transparent to any further syntactic operation.18

Before moving directly to the connection between overt specifiers and barriers, let us first be completely explicit about the details of the above assumptions. First of all, the suggestion in (29) is an example of a "dynamically split" model of the sort argued for in Chapter 3, which goes back to early work by Bresnan, Jackendoff, Lasnik, and others. For empirical and theoretical reasons that are beyond the scope of this discussion, it is both possible and desirable to access Spell-out in successive derivational cascades as the derivation unfolds (see Chomsky 2000).19

Allowing Spell-out to apply more than once is, in fact, the null hypothesis, if we take Spell-out to be a mere derivational rule rather than a level of representation (such as S-structure). Nevertheless, assuming that Spell-out has cost, economy alone favors a tendency for the rule to apply as infrequently as possible. This must mean that the morphological repair strategy in (28) is a condition for convergence. If without it the derivation crashes, it is irrelevant to ask how costly an alternative derivation with a single Spell-out would be – there is no such alternative.

A second important assumption concerns the cyclicity of the system, feeding morphology cyclically as the derivation proceeds. The assumption is nothing but a version of the Distributed Morphology proposal in Halle and Marantz (1993), needed in this instance so as not to allow us to proceed derivationally beyond a projection of X only to come back to X and readjust it as the derivation maps to PF (though see Section 5, when Lexical-relatedness is brought to bear on these issues). In other words, the morphological repair strategy in (28) must be as much a "syntactic" (cyclic) process as movement or deletion.


A third (set of) assumption(s) relates to the fact that we separate those languages in which a lexical specifier blocks movement from those languages in which it does not. Following Rizzi's (1990) suggestion, we should recognize this distinction as another consequence of the independently needed pro-drop parameter, extended to any functional category capable of licensing pro. In brief, only "heavy" categories which support pro go into Spell-out upon hitting a lexical specifier. This of course raises the question of why "light" categories pose no such problems. Given the logic of the system, the answer must be that the morphological repair in (28) only affects "heavy" categories, or alternatively (and perhaps relatedly) that "light" categories involve their specifier in terms of direct selectional requirements, instead of invoking the morphological repair above.20 If so, a light X will not force an application of Spell-out, which entails that if a category α moves to the specifier of X in such a language, it will not induce a "barrier" effect.

Given the system outlined above, we are now in a position to address the central question at hand. The reason why a category specified by pro does not constitute a barrier is direct under one last assumption: pro is a feature. If pro is simply F, an F with no related morpho-phonological content, then the attraction of F to some head will not induce the ancillary operation (27b), no application of Spell-out is necessary, and consequently no barrier emerges.

Notice that this analysis of pro has an attractive consequence vis-à-vis minimalism: given minimalist assumptions, there is no obvious room in the theory for a pro specifier.21 However, under the analysis proposed here, we can see that pro is not actually a specifier in the configurational sense. Rather, in a language with a heavy feature, specifiers are created as a consequence of the morphological repair operation, and since there is no need for any morphological repair for a would-be pro specifier, we need not complicate the system by trying to motivate the existence of pro specifiers. But even if pro cannot exist as a specifier, it can exist as the attracted F.

Needless to say, whether this account is plausible depends on the soundness and naturalness of the four assumptions just mentioned: Multiple Spell-Out, Distributed (cyclic) Morphology, heavy categories (which license pro) requiring a given morphological repair when involving overt specifiers, and pro understood as a feature. The first two are natural extensions of a model understood to be a dynamically derivational system. The other two are more specific assumptions concerning the pro-drop parameter; one is an extension of the tradition of pro as a pronominal feature on a functional category (where the subject is somehow encoded through agreement);22 the other is an extension of the proposal that movement is an epiphenomenon, i.e. movement emerges as the consequence of various sub-operations, as discussed in Chomsky (1995b: Chapter 4).

In sum, the analysis explains why overt specifiers, in languages with agreement, induce a barrier to movement: the morphological integrity of the lexical items forces the Spell-out of the structure they are specifiers of, subsequently rendering that structure unavailable to further syntactic computation (15). In contrast, corresponding structures without an overt specifier (14) do not induce


Spell-out (because of economy), and thus such structures are transparent to movement. Languages without agreement are immune to these matters, hence allowing movement across lexical specifiers (25).
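For illustration only (the predicate induces_barrier and its boolean arguments are inventions of the sketch; the actual proposal is of course stated over morphological repair and Spell-out, not over flags), the typology just summarized reduces to a two-way interaction between the "heaviness" of the attracting head and the overtness of what ends up in its specifier:

```python
# A minimal sketch, assuming a toy two-parameter model of the discussion
# around (14), (15) and (25): heavy (pro-licensing) agreement plus an overt
# specifier forces early Spell-out, and early Spell-out means opacity.

def induces_barrier(head_is_heavy, specifier_is_overt):
    """Does this phrase become opaque for later extraction?"""
    needs_repair = head_is_heavy and specifier_is_overt
    # morphological repair forces Spell-out, and a spelled-out phrase
    # is no longer transparent to syntactic computation
    return needs_repair

# Basque/Spanish-type IP with an overt subject: a barrier, as in (15)
print(induces_barrier(head_is_heavy=True, specifier_is_overt=True))   # True
# the same IP with pro (a bare feature, nothing to repair), as in (14)
print(induces_barrier(head_is_heavy=True, specifier_is_overt=False))  # False
# English-type light Infl: no barrier even with an overt subject, as in (25)
print(induces_barrier(head_is_heavy=False, specifier_is_overt=True))  # False
```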

4.3 Typological distributions

Typologically, the following picture emerges. (I set aside whatever triggers Move independently of Attract – Extended Projection Principle effects – and concentrate on pro-drop.) Possible parameters involved in the Attract process arise with respect to the sub-labels of the attracting head and the attracted feature itself. The attracting head may or may not be heavy; the former case forces overt movement (assuming that movement which is unnecessary by PF procrastinates). The attracted feature may or may not be pronounced, the former case forcing the ancillary operation that leads to morphological repair, and hence early Spell-out. Given this view, we can think of a heavy attracting head – in relevant languages – as nothing but an Agr element added as a feature of the attracting head to the lexical array. An unpronounced attracted feature is just pro.

Why does Agr license pro? Because a heavy attracting head must attract a feature F, and pro is nothing but a possible (in fact, null) manifestation of F. Note that, from this perspective, we should not find instances in which pro is generally obligatory, without ever presenting a corresponding overt version with a full (pronominal) specifier. This accounts for the otherwise peculiar fact that although there are languages with overt but not null pronouns (English, French), and both overt and null pronouns (Basque, Spanish), there do not seem to be languages with null but not overt pronouns.

Interestingly, there is no logical necessity for pro to be licensed through Agr: a language may exist with no heavy target for pro, and this element may survive in its base-generated position as an unattracted feature with no PF reflex. This is presumably at the core of the null categories witnessed in East Asian languages like Japanese, which do not have agreement correlates:

(30) a. [Osamu-ga [Keiko-o [aisiteru]]]
        Osamu-S Keiko-o love
        "Osamu loves Keiko."
     b. [pro [pro [aisiteru]]]
        "He/she loves him/her."

Jaeggli and Safir (1989) speak of two major distributions for pro: languages with rich agreement have it, and so do languages with no agreement. This generalization fits nicely in the present picture, the relevant parameter being whether pro is attracted out of VP or whether it instead remains within VP, as a reviewer aptly suggests.

Apart from their typological significance, Japanese facts are also important with respect to the issue of barrierhood: we predict that a lexical specifier in the sort of language where pro is not "licensed" morphologically does not induce a


barrier, in spite of the fact that this specifier appears in free variation with pro (30a/b). Japanese topicalization argues for this result:

(31) [Keiko-o [Osamu-ga [t [aisiteru]]]]
     Keiko-o Osamu-S love
     "Keiko, Osamu loves."

Importantly, Keiko-o can topicalize over Osamu-ga, thus presumably over an IP which may (and does in this instance) take a lexical specifier.

5 Issues on A-movement

5.1 What is involved in a cycle?

As we have seen, overt specifiers in languages with heavy agreement will induce barriers to movement. But if the analysis is correct, then we encounter the following problem: these specifiers will themselves create barriers for one another. In other words, in simple sentences like (26), which involve multiple movements to the various specifier positions, movement should be blocked. This section provides a solution to this puzzle.

There is an easy and a difficult part to the solution. The easy part is the logic of this proposition: we have to somehow prevent the application of morphological operation (28) when dealing with A-movement across "the next" specifier. If (28) does not apply, the system will not create the "giant compound" in (29), and consequently no barrier will be induced. As a result, A-movement can proceed over the relevant specifier as it would over the corresponding pro, according to fact. The difficult part, of course, is explaining why the logic of that crucial proposition should in fact hold.

The answer lies at the core of the A/A′ distinction. Providing a principled motivation for such a distinction has always been difficult, but it is more so within minimalism, where notations are forced to follow from bare output conditions. Chomsky's (1995b) suggestion for the relevant split is the notion L(exical)-relatedness: A-positions are L-related, A′ positions are not. I return shortly to a minimalist characterization of L-relatedness, but assuming it for now, let us see how it bears on the present puzzle. To begin, I assume (32):

(32) The L-relatedness Lemma
     Morphology treats a domain of L-relatedness as part of a cycle.

The motivation for viewing (32) as a lemma will be made explicit in the next section. For now, assume that (32) is true. If so, morphological operation (28) will not apply, all other things being equal, when local A-movement is taking place, simply because it does not have to. Let us consider this in some detail.

The intuition behind the dynamically split model of Multiple Spell-Out is that the system spells out in a cyclic fashion – and only in a cyclic fashion. From this perspective, the PF/LF split is architecturally cyclic. Interpretation (at both the PF and LF interfaces) occurs cyclically through the course of the derivation. In


this sense, the grammar is heavily derivational. Each cycle is spelled out in turn, until the last cycle goes by. Why are only cycles spelled out? Because nothing else needs to, hence will not in an optimal system.

This approach places a heavy burden on motivating the notion "cycle." The only new idea (vis-à-vis cyclic systems of the 1970s) proposed in Chapter 3 is that, given reasonable structural restrictions, the cycle is an emergent property of derivations.

For example, if Kayne's (1994) LCA holds only of units of structure where Merge has applied exhaustively, and thus where command holds completely, then each such maximal unit will constitute a cycle. Let us see why:23

(33) a. Command unit (exhaustive application of Merge to the same object):
        {s, {s, {t …}}}
        s →↑← {t …}
        said    that …
     b. Not a command unit (non-exhaustive application of Merge, to two separately assembled objects):
        {s, {{t, {t, {c …}}}, {s, {s, {t …}}}}}
        {t, {t, {c …}}} →↑← {s, {s, {t …}}}
        t →↑← {c …}          s →↑← {t …}
        the    critic …       said    that …

If we assemble [the critic …] said that … in (33b) prior to the spelling out of the critic, there will not be a way of linearizing the resulting structure, and the derivation will crash. The alternative derivation spells out the critic, and then merges the resulting "giant compound." The point is only this. As a result of a simple convergence condition, a cycle has suddenly emerged under a maximal command unit.
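Once more as a toy illustration (the Workspace class and its methods are invented for the sketch; nothing in the proposal is literally stated in these terms), the emergence of the cycle in (33) can be mimicked by letting Merge keep extending one command unit and refusing to combine two separately assembled, still-unspelled objects:

```python
# A minimal sketch of the command-unit reasoning around (33); a separately
# assembled phrase can only be merged in once it has been spelled out,
# i.e. once it behaves like a single frozen item.

class Workspace:
    """One derivational cascade: Merge keeps targeting the same object."""
    def __init__(self, head):
        self.structure = head
        self.spelled_out = False

    def merge(self, item):
        if isinstance(item, Workspace) and not item.spelled_out:
            # (33b): two separately assembled objects cannot be linearized
            raise ValueError("spell out the separately assembled phrase first")
        part = item.structure if isinstance(item, Workspace) else item
        self.structure = (part, self.structure)
        return self

    def spell_out(self):
        self.spelled_out = True   # the phrase is now a "giant compound"
        return self

main = Workspace("...").merge("that").merge("said")       # said that ... (33a)
subject = Workspace("...").merge("critic").merge("the")   # the critic ...
# main.merge(subject)            # would raise: no way to linearize, (33b)
main.merge(subject.spell_out())  # licit: the critic is spelled out first
```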

But as illustrated in this chapter, not just maximal command units force early Spell-out, again under reasonable assumptions. For the data under investigation here, the issue has been morphological operation (28), invoked in languages with heavy agreement after a process of feature attraction. In these cases, the emergent cycle appears to be the result of a combination of heavy morphology and the transformation which strips a feature from a category, thus forcing a subsequent repair. In all of these instances, failure to Spell-out results in a derivation that crashes at PF, and thus is not a valid competitor in the optimality race. All other things being equal, however, Spell-out would not apply if it need not apply (the derivation with less Spell-out wins).

Now (32) becomes significant. If morphology can access (for repair) successively L-related specifiers in a single cycle, there will not be any need for the system to incur the cost of (a cycle of) Spell-out; as a consequence, everything within an L-related domain will be syntactically transparent for the purposes of further movement. As noted earlier, this is the easy part of the reasoning. The hard part is justifying (32) as a lemma.

5.2 L-relatedness

In order to explore the role of L-relatedness in this context, consider the problem in (34), noted independently by Alec Marantz and Juan Romero:


(34) a. [A ballroom was _ [where there was a monk arrested]]
     b. [There was a ballroom [where a monk was _ arrested]]

These two examples start in the same lexical arrays, which Chomsky (1995b: Chapter 4) calls "numerations." Interestingly, (34b) involves a local movement of a monk even though there is an option of inserting there, as in (34a). This is an important example because Chomsky argues for other similar cases, as in (35), that the derivation with insertion outranks the one with movement:

(35) a. [there was believed [_ to be a monk arrested]]
     b. *[there was believed [a monk to be _ arrested]]

Note that the evaluation which ranks (35a) over (35b) is reached cyclically. In other words, even though movement is invoked in both derivations (there in (35a), a monk in (35b)), the key difference is that at the point in the derivation when a monk is moved in (35b), there could have been inserted, and thus the derivation which does not invoke movement wins the optimality race then and there. (Technically, optimality races are computed with regards to the "derivational horizon" which remains after committing to the same partial numerations.) But if the option of insertion should rule out the derivation with movement, then why does (34b) not lose to (34a)?

It is easy to see that the problem would not arise if we manage to divide the derivations of the problematic examples in (34) into separate sub-derivations, i.e. separate cycles, so that each evaluation of economy is based on different partial numerations. That is, suppose the partial numeration involved in computing the embedded clause in (34a) were (36a), while the one for the embedded clause in (34b) were (36b):

(36) a. {there, was, a, monk, arrested, …}
     b. {was, a, monk, arrested, …}

Given this approach, there would be nowhere to compare the derivations of (34a–b): the two could not possibly have the same derivational horizon because they do not start in the same (partial) array.

The result in (36) can be ensured as in Chomsky (2000), by simply adding a proviso telling the derivation how to access the numeration. Importantly, we do not want to lose the analysis of the facts in (35), which means that however we refine accessing a numeration, there must be a unique access for (35) (unlike what we saw for (34)/(36)). So we assert (37):

(37) The minimal set of lexical items that result in a convergent structure constitutes a partial access to the numeration.

(34) is dealt with by assuming a separate access to the numeration for the embedded clause. In contrast, a sub-derivation including only the elements which form *a monk to be arrested produces an illicit partial object (in this instance, in terms of Case). Thus, the array of items {to, be, a, monk, arrested} cannot be accessed on their own, which forces the two derivations in (35) into the same optimality race. Of course, this problem does not arise for (36), where each derivation contains the independently convergent sub-derivations of . . . there was a monk arrested and . . . a monk was arrested.24

One important consequence of this cyclic access to the numeration is that, as a result of it, a kind of "natural domain" has emerged in the system. Suppose the notion L-relatedness, and by extension the A/A′ distinction, is understood in terms of a cyclically determined derivational space. In other words, we may as well use (37) to define L-relatedness:

(38) L-relatedness
     A cyclically accessed sub-numeration defines a domain of L-relatedness.

Though of course not necessary, (38) is very plausible. The basic idea is that the minimal set of objects which stand in a relation of mutual syntactic dependency has a special status in the grammar. If thematic relations associated to a given verb are syntactic dependents, then such relations will immediately be relevant. At the same time, we know of interesting mismatches between thematic and Case structures, precisely in domains like the one involved in (35). In those instances, L-relatedness will go beyond theta-relatedness, in an expected direction. Quite simply, then, A-positions can be defined as Chomsky (1995b: Chapter 3) suggests:

(39) A-position
     XP is in an A-position if L-related to an argument-taking unit.

Once (39) is assumed, and shown to be both necessary and plausible, the difficult part of motivating the L-relatedness Lemma is in place: (32) follows from the trivial interaction of (37) and the conditions under which the system invokes Spell-out. L-relatedness is a direct consequence of a mechanism to access numerations in a cyclic fashion (37), which as we saw is independently needed (on empirical grounds). In turn, the system only goes into (early) Spell-out, hence closing off a cycle, to meet convergence requirements (e.g. morphological operation (28) or for maximal command units as in (33)). Therefore, given that the cycle emerges as a result of derivational dynamics, there simply is no cycle while L-related objects are involved. Not surprisingly, cyclic access to the numeration determines the domain of cyclic application of syntactic operations (e.g. early Spell-out and access to morphology).25

I should add a note of caution. Elements in C, at least in English and Basque, cannot be L-related to V, or everything proposed here will reduce to vacuity. However, even though C cannot be L-related to V in English and Basque, there is nothing in principle which requires such a condition. In fact, in various languages, elements related to the C system do behave as A-positions, as has been independently noted by many. The issue, then, is to parameterize the system, a very interesting matter that I turn to next.


6 Case matters

Assuming the analysis as presented, immediate questions arise about various details concerning Case assignment.

6.1 The Case of pro

It may seem, given what we have assumed about pro, that languages with and without this element involve very different sorts of Case relations. After all, the assumption is that pro is nothing but a neutralized category/head/feature, which starts its derivational life as a sister to a verbal projection, and ends up being attracted to a head H where its Case/agreement is checked as a sub-label of H. In contrast, regular arguments are potentially cumbersome phrases which raise through movement, ending up as specifiers, i.e. obviously not simple heads.

This puzzle, however, is only apparent. The driving force behind the system is, in all instances, feature attraction. From this perspective, pro, as nothing more than a feature, is the ideal solution for a system which is driven by the attraction of one feature to another. It is only when an argument is not pro (and needs to engage in a pre-Spell-out process) that attract induces the ancillary mechanisms, including Move, which ultimately induce a barrier. But despite the possible occurrence of the ancillary mechanisms, feature checking takes place at the head level, and therefore is identical across languages; the parametric choices arise with respect to the earliness of this checking and the necessity of the ancillary processes.

This state of affairs has a curious consequence: if a derivation could choose between pro and a lexical alternative, use of pro entails much less derivational cost. Generally, this choice is not there; for example, when a lexical NP is chosen from the lexical array, there is no issue of "replacing" it with pro – these would be two totally different derivations. However, it is conceivable that a derivation with pro and one with an overt pronoun compete with one another.26

All other things being equal, the system should prefer the derivation with pro. Only if processes which are inexpressible through pro – such as emphasis with a PF correlate – are relevant to the derivation would pro not be chosen, inasmuch as it would yield a non-convergent derivation. Of course, this fits nicely in the tendency we see in pro-drop languages to avoid overt pronouns, except for purposes of emphasis.

6.2 The Case of t

Consider next how the wh-trace gets its Case checked. The question goes well beyond the present analysis. How does any wh-element check Case? Notice that we cannot simply say that the wh-phrase moves through the specifier of some Case position, and then moves again to its final destination, for the following reason. In the present version of minimalism, Case is not checked in a specifier of anything; movement to a specifier is ancillary to attraction to the corresponding head. Suppose then that the wh-feature moves through the checking domain of some Case position, and then moves again. But this is not trivial either: featural movement involves head-adjunction, a morphological operation that freezes the resulting structure, much in the same way that we saw for entire phrase-markers that undergo a repair strategy at Spell-out. Once the head and its adjuncts have been shipped to morphology, we cannot just excorporate the wh-feature leaving the rest of the word behind. That operation simply cannot be stated in standard minimalist terms.

There are a couple of conceivable solutions that languages can try to "solve" this structural difficulty. In certain languages, the site that checks Case (for example, a v element encoding Accusative) may end up moving all the way up to C, where it can check the wh-feature it previously incorporated as a sub-label upon Case-checking the wh-feature. This would amount to saying that, in such a language, C is L-related to the V/T projections, and may well be a valid strategy for wh-movement.

Returning to Basque, where as I said we have to assume C is not L-related to the V/T projection, a different strategy must be operating. The logic of the analysis entails that the wh-feature and the Case-feature of some lexical item are not contained in the same bag of features. There are reasons to suppose that this is the general instance. Semantically, a wh-question has three main parts: a quantifier, a variable and a predicate. Wh-movement is concerned with the quantificational part, whereas Case is concerned either with the predicate or the variable, depending on exactly what the role of Case is in the grammar. We may then follow the suggestion, which Chomsky (1964) attributes to Klima, that these semantic distinctions correspond directly to the actual syntactic form of wh-elements. Basically, who is shorthand for which one person, and what for which one thing, and so on. If so, attracting the wh-feature and attracting the Case-feature can be seen as attracting two different feature bags. This does not quite completely address the issue, though, since one may ask why only the wh-feature demands a repair strategy analogous to that in (28) in its C specifier, the position where phonetic realization takes place. In other words, the question is still why the Case feature, in this instance, is not also in demand of such a repair, hence forcing phonetic realization in the site of Case assignment.

The latter question is one all minimalist analyses face, if they assume feature attraction. A plausible answer is this: the ancillary operation and corresponding repair strategy takes place only when necessary, i.e. when the attracted feature has morphological content. Intuitively, only an affixal attracted feature constitutes a morphological missing part of wherever it originated (the wh-phrase). A phonetically null feature may be attracted without affecting the morphological integrity of its source, and hence does not require the source to pied-pipe in search of the missing feature:

(40) a. [[wh [C]] [you [[D [v]] [saw who]]]]        (wh- is morphological, D- is null)
     b. [who [wh [C]] [you [[D [v]] [saw t]]]]


C attracts the wh-feature and v attracts the D feature of who, which is possible if they are in two different bags (40a). Assuming wh- is morphological, it will trigger the ancillary wh-movement in (40b); this is not the case (given the present hypothesis) if the D feature is not morphological.

A question to then ask is how the moved who in (40b) has its Case feature checked if it never moves through a Case-checking site. Note that when who moves up, its D feature has already been attracted to the v domain; hence, it is not who that moves to C, but who minus the appropriate Case-checking feature. This would not be a possibility were it not true that the Case-checking feature is null in a language like English, trivially materializing in the v specifier without a morphological realization. If so, we expect different facts in other languages. There should be some where wh- is not morphological, but D is. In such languages, we should find an OV order, but not overt wh-movement, as in Japanese. We may also find languages where neither feature is morphological, in which case we expect VO order and no overt wh-movement. Chinese may illustrate this. Finally, we ought to find languages where both features are morphological, but derivations in such languages would immediately encounter the difficulty posed in this section. How can related features be checked in two different categories, with morphological realization in both? An immediate possibility that comes to mind is languages that resort to resumptive pronouns (e.g. Swedish), which may solve the difficulty by, in effect, splitting the operator-variable relation almost entirely. Alternatively, other languages might exploit the possibility of an L-related C-projection, and hence Case checking from this unexpected site, external to IP. It is worth exploring whether Chamorro may constitute one such language.

6.3 The Case of Chamorro

Chamorro exhibits wh-agreement with traces (Chung 1994: 9):

(41) Hafa pära u-fa'tinas si Juan t?
     what fut wh-OB.Agr-make Juan
     "What is Juan going to make?"

Why should agreement be sensitive to wh-traces? Chung (1994: 11) provides a functionalist analysis: assuming the grammar needs to indicate that there is an unbound wh-trace in some specified domain, Case serves to narrow the possibilities of the location of the trace within that domain. Within minimalism, however, even if we agreed with this conclusion, we could not use it as the cause of the phenomenon since lexical arrays are insensitive to interpretive demands. But we can turn the idea around, supposing that something like Chung's conclusion is obtained as a consequence of a purely formal process. In a language in which both Case and wh-features are morphological, they must combine the attracting forces observed in (40a), directly resulting in local wh-agreement.

There is more. We saw that a language with those hypothetical characteristics must involve overt checking of both sets of relevant features (with the consequent repair strategies involving specifiers). The only way this will be possible in Chamorro is if V is L-related outside IP. This correlates nicely with Chung's observation that Chamorro is VSO even in embedded clauses, thus, irrespective of V2 considerations. We may take this systematic order as an indication of verb movement past the subject position, for which the hypothesized L-relatedness "out of IP" is desirable as a motivation.27

Finally, consider what is perhaps the most surprising fact about Chamorro: wh-agreement is exhibited long distance:

(42) Hafa ma’a’a ñao-ña i palao’an [t pära u-fa’nu’i si nana-ña t]?what wh-OBL.afraid-Agr the girl fut wh-OBJ.Agr-show

mother-Agr“What is the girl afraid to show her mother?”

Note, in particular, that the matrix verb exhibits wh-agreement with the moved element. Remarkably, also, the wh-agreement does not exhibit the Case morphology of the moved item (here, OBJ), but instead, as Chung notes (p. 14), the morphology appearing in the main verb corresponds to the Case of the embedded clause (here, OBL). This suggests that the scenario we predict does hold: the moved wh-phrase agrees, as it proceeds through the intermediate C Spec, with the C head, this agreement signaling both Case and wh-feature checking. Subsequently, the C head incorporates into the V that takes it as an argument. Because of this, the morphology on the matrix verb is not that of the wh-item, but is instead that of the CP headed by C.28

6.4 A residual Case

There is a final wrinkle which the Basque data pose, as does, for that matter, Rizzi's (1990) analysis of subject extraction in Italian. It is one thing to say that object or indirect object traces are derived as in (40). It is quite another thing to say that subject traces are obtained that way. If subject wh-traces are directly attracted from their VP internal position, what occupies the subject position in satisfaction of the Extended Projection Principle? We cannot really answer pro, as Rizzi in fact did, because pro is by hypothesis just another feature, and it seems unlikely that the Extended Projection Principle is ultimately a featural requirement. This suggests either that there is no such thing as an Extended Projection Principle (rather, the "EPP" should be seen as parametric), or else some null topic is satisfying the requirement in pro-drop languages. I will not decide among these possibilities here.29

7 Extensions and speculations

It is fair to ask whether the analysis proposed here systematically extends beyond Basque. Likewise, we should consider whether structures other than verbal arguments are sensitive to the sort of locality considerations discussed in this chapter.


7.1 wh-movement in Hungarian

An instance in the literature with similar descriptive properties to those of Basque wh-movement can be seen in Hungarian. Thus:

(43) Kit (*Janos) t szeret?
     who-ACC J.-NOM like-3
     "Who does he/Janos like?"

Kit "who" is fine when left-adjacent to the main verb szeret "like-3"; when the adjacency is broken, the result is bad.

Many works have discussed this effect, some of them explicitly comparing it to the Basque data.30 Many of the descriptive generalizations that we have reached concerning Basque would seem to hold in Hungarian as well.

For instance, Hungarian exhibits agreement (and associated pro-drop), not just with subjects, but also with objects, just like Basque. Consider this example adapted from Brody (1990):

(44) Nem utalom
     not hate-1-3
     "I don't hate him."

The reader may have noted that I have not glossed the examples in (43) with object agreement markers. I have taken this practice from the literature, since it is generally assumed that true agreement in Hungarian is with definite, third-person objects (as in (44)). Quantificational and indefinite elements do not exhibit this overt agreement morphology. Thus, we compare, e.g. szereti "he likes definite" to szeret "he likes indefinite." However, we may decide, based on the theoretical motivation of extending the analysis developed here to Hungarian, that Hungarian involves a form of (abstract) agreement even for indefinite objects.

Given agreement projections in Hungarian, we predict specifiers to induce potential barriers. Note, though, that Hungarian uses the indefinite verbal form for wh-traces, as (43) shows. If this indefinite (object) form did not involve agreement, the present analysis would predict that a subject wh-phrase need not be adjacent to a verb when a direct object intervenes:

(45) [Wh [t Object Verb]]

The only motivation we had to rule out the analogous Basque example in (23) was to deny an extraction of precisely this sort.

In the Rizzi-style analysis in (24) – schematically repeated in (46) – subject extraction is across an object, thus sensitive to morphological detail along the lines explored in this chapter:

(46) [wh [… [Object [t Verb] …]]]

But then we had to ask which languages proceed in this roundabout fashion, and, again following Rizzi, we suggested that these are the pro-drop languages, refining (46) to (47):

(47) [wh [pro [Object [t Verb]]]]


It then follows that, since examples of the form in (45) are bad, Hungarian must involve extraction as in (47), entailing a pro subject even when the pronounced object is "indefinite" (thus selecting the indefinite agreement pattern).

More generally, if agreement were not involved in indefinites in Hungarian, we would expect questions of the form in (48):

(48) *[wh [indefinite subject [t Verb]]]

But these are generally impossible. In a nutshell, there does not seem to be any wh-/verb adjacency which depends on whether intermediate material is (in)definite, a situation which would, in present terms, force a similar agreement across the board, regardless of morphological manifestation. (We reached a similar conclusion about apparent adjuncts in Basque (see (20)), where adjuncts are treated as specifiers regardless of overt agreement.)

Of course, it could also be that the Hungarian phenomenon is just different from the Basque one. There are reasons for and against this conclusion. On the similarity side we find that, over and above the core adjacency facts with wh-movement, Hungarian exhibits exceptions of the sort seen in Section 3.2:

(49) Miert Janos ment haza?
     why J.-NOM go-past-3 home
     "Why did Janos go home?"

Actually, Kiss (1987: 60) notes that miert "why" is ambiguous between a VP (reason) reading and IP (cause) reading. It is only in the IP (cause) reading that sentences like (49) are acceptable, which is what we predicted for similar instances in Basque and Spanish given that IP adjuncts are excluded by IP.

Notably, also, the whole paradigm studied in this chapter extends to focalization in both languages. Thus, compare these focalized instances ((50a) in Hungarian and (50b) in Basque):

(50) a. PETER (*Marit) szereti.
        P.-NOM M.-ACC like-3-3
        "Peter likes Mari."
     b. PERUK (*Miren) atsegin du.
        P.-E M.-A like 3-have-3
        "Peru likes Miren."

There are several reasons to suppose, with Brody (1990), that focalization involves a category below C. Thus, consider (51):

(51) Ez dakit zergatik HONI eman behar diozun.
     not know-1 why THIS-D give must 3-have-3-2-if
     "I don't know why TO THIS ONE you must give it."

In this embedded question, zergatik "why" is presumably in the Spec of C, and crucially the focused phrase is after the wh-word (and thus still satisfies the adjacency requirement with the verb). This fact extends to Hungarian, where (49) is naturally interpreted with focus on Janos. Now, for our purposes, it is not necessary that the target of wh-movement and the target of focalization be the same – so long as movement is involved in both instances and sensitive to barrier considerations of the sort discussed before.

Examples like (49) or (51), involving both a wh-phrase and a focalized phrase, can only be constructed with an adjunct in C. Once again, the reason for this is that only with an adjunct does the wh-phrase not need to be adjacent to the verb, as we saw in (16). The fact that Basque and Hungarian pattern alike suggests the convenience and desirability of a unified treatment.

On the other hand, there are also important differences between the two languages. For example, long distance wh-movement in Hungarian looks more like the process witnessed in Chamorro than anything seen in Basque. Compare:

(52) a. Janos irta hogy jon.
        J.-NOM write-past-3-3 that come-3
        "Janos wrote that he would come."
     b. Janos, akit mondtak hogy jon.
        J.-NOM who-ACC say-past-3 that come-3
        "Janos, whom they said would come."

(52a) shows a regular verb with a complement clause exhibiting object agreement with this clause. When wh-movement out of the embedded clause takes place, the matrix verb does not exhibit definite agreement (Kiss 1987 and ff.).31

This is reminiscent of (42), and may find a similar explanation. The moved wh-phrase agrees in wh-features with the intermediate C. However, unlike in (42), C in this instance (hogy) does not incorporate to the main verb mondtak "say-past-3." Nonetheless, the agreement between the moved wh-phrase and C suffices to turn the embedded CP into a "not-definite" phrase, thus forcing indefinite agreement in the matrix verb. The question of course is why the moved wh-phrase invokes agreement with the intermediate C. By parity of reasoning (with the analysis of Chamorro), agreement involving the intermediate C in (52) may plausibly be because the V projection is L-related "outside IP."

The possibility that C is L-related to the V projection in Hungarian, but not in Basque, may also account for the following contrasts:

(53) a. Mireni, nork eman dio zer?
        Miren-D who-E give 3-have-3 what-A
        "To Miren, who gave what?"
     b. Marinak, ki mit adott/mit ki adott?
        M.-D who-NOM what-ACC give/what-ACC who-NOM give
        "To Mari, who gave what?"

Basque does not generally allow multiple wh-questions in pre-IP position, unlike Hungarian (where, as (53b) shows, no ordering obtains among the moved wh-elements). It may be that Hungarian (as Slavic and Eastern Romance languages that allow multiple moved wh-questions) tolerates (53b) only through a C L-related to V.

If the L-relatedness of C to V is relevant in instances like (52b) and (53b), it is very likely relevant for examples of the format in (48) as well, and may be argued to be responsible even for (50a), contrary to what we expect in the Basque example in (50b). But I must leave any detailed discussion of this claim for future research.

7.2 wh-islands

Just as we plausibly encounter transparent movement across pro in languages other than Basque, a version of the process may be witnessed in other constructions. Of particular interest are wh-islands, which Chomsky (1995b: 295) suggests should reduce, in minimalist terms, to the MLC (which he partially retracts from in Chomsky 2000). That is unlikely for three reasons.

First, wh-island violations like (54a) are nowhere near as bad as MLC violations like (54b):

(54) a. ?*[What do you wonder [why [John bought t]]]
     b. *[John is likely [that it seems [t to be smart]]]

Second, although it is arguably true that the matrix C in (54a) violates the MLC (it attracts the wh-feature of what over that of why), it is equally true that the wh-island effect arises even when the attracting feature moving over a wh-phrase is not obviously a wh-feature (Lasnik and Saito 1992):

(55) ?*[this car [I wonder [why [John bought t]]]]

And third, the wh-island effect is known to be, somehow, parameterized (Rizzi 1982). Thus, compare (54a) to the perfect Spanish sentence in (56), of the sort discussed by Torrego (1984):

(56) ¿A quién no sabes por qué dio Juan t un beso?
     to whom not know-2 for what gave-3/past J. a kiss
     "Who do you wonder why John gave a kiss?"

It is implausible that the MLC is not violated in (56), if it is violated in (54a), and it is hard to see how to parameterize this.

One needs to argue that, despite appearances, (54a) is not an MLC violation, on a par with (55), which clearly does not violate the MLC. One possible way around the difficulty is in terms of the notion "closeness," as defined in Chomsky (1995b: 299). For Chomsky, β is closer to X than α only if β commands α. It is conceivable that attracted wh-features never enter into command relations with one another if they are within categorial D elements:

(57) … [CP [DP [D… Wh…]…] [DP [D…Wh…]…]]…

Depending on the complexity of DP, the upper D commands the lower – but wh, the feature within D, does not obviously command anything.

Second, we should try to argue that wh-island violations arise when the CP that a wh-phrase lexically specifies is forced to undergo partial Spell-out. This would be the case, according to our general reasoning, if C is morphologically heavy and attracts a wh-feature, with a consequent ancillary operation for overt wh-phrases which involves the Spec of C. It is the ensuing repair strategy that would freeze the CP. As a consequence, nothing should be able to move over the fully specified CP, be it a wh-phrase (as in (54a)) or any other category (as in (55)).

As for those languages which do not invoke a wh-island violation in instances like (54a) or (55), one possibility is that, by analogy with (25), a wh-phrase in the Spec of C is there for a selectional requirement of the sort involved in the Extended Projection Principle (whatever that is), and not for reasons of feature attraction. This allows (in principle) for more than one type of movement to C which may be descriptively adequate, particularly given multiple wh-movement of the sort in (53b).

A more interesting possibility arises when considering the Galician (58b), which is much better than its Spanish counterpart in (12), and even its Galician version in (58a):

(58) a. Non sei qué (*ti ou eu) lle enviamos.
        not know.1 what you or I him send.past-ind.1-PL
        "I don't know what you or I sent him."
     b. Non sei qué (ti ou eu) lle enviemos.
        not know.1 what you or I him send.pres-subj.1-PL
        "I don't know what (you or I) to send him."

(58b) is relevant because the embedded clause involves both clear overt agreement and wh-movement, possibly over an overt specifier; hence, it should be out according to everything I have said, yet it is fine. The key, though, appears to be in whether the "intervening" agreement is indicative or subjunctive, a fact that is known to extend to many other Romance languages. It is reasonable to suppose that (58b) is good because the subjunctive verb is L-related to the embedded C, thus somehow extending the cycle to the next domain up.32 If this is true, one could apply the L-relatedness Lemma, and thus not go into early Spell-out upon hitting the specifier of IP. More generally, languages may use this sort of strategy, perhaps also involving L-relatedness between the embedded C and the matrix V, to avoid certain wh-island violations. A case in point may be the C incorporation hypothesized for Chamorro in (42).

8 Conclusions

wh-movement in Basque serves to test both the Barriers framework and the Minimalist Program, in both cases with good results. The analysis has shown that Basque behaves as expected, and provides interesting clues concerning restrictions on wh-movement across a lexical specifier in languages with rich agreement. It is here that the Minimalist Program seems superior. Under the assumption that intervening lexical specifiers are moved for ancillary reasons (the real computational mechanism driving syntax being the process of feature attraction), their blocking effect is not surprising. Technically, the result follows from a morphological repair strategy which must proceed cyclically, forcing the system into early Spell-out. The proposed mechanisms do not block A-movement. Assuming that domains of Lexical-relatedness belong to the same cycle, the grammar does not need to invoke early Spell-out. This distinguishes A-movement from the less bounded wh-movement. The analysis confirms the view of the derivational system as dynamically split (in principle feeding PF and LF in successive derivational cascades, up to convergence and within optimality considerations); it also confirms the central role of morphological processes within the system at large.


6

LABELS AND PROJECTIONS

A note on the syntax of quantifiers†

with Norbert Hornstein

1 Introduction

Determiners can be conceived as predicates of sets (see Larson and Segal 1995: Chapter 8). Thus a quantifier like all or some in (1) can be thought of as a relation between the set of whales and the set of mammals, as shown below, where B = {x | x is a whale} and A = {x | x is a mammal}.

(1) a. All whales are mammals.
    a′. ALL (B) (A) iff B is a subset of A
    b. Some whales are mammals.
    b′. Some (B) (A) iff the intersection of A and B is non-empty

Natural language determiners are conservative (see Keenan and Stavi 1986). A determiner Det is conservative if for any sets B and A that are its arguments the semantic value of "Det(B)(A)" is the same as the semantic value of "Det(B)(A∩B)."

The conservativity of natural language determiners can be seen by considering the examples in (1). (1a) is true iff the set of whales is a subset of the set of mammals, iff the set of whales is a subset of the set of mammals that are whales. Similarly (1b) is true iff the intersection of the set of whales and mammals is not empty, iff the intersection of the set of whales with the intersection of the set of whales that are mammals is non-empty.

Natural language determiners come in two varieties: strong determiners like every and weak ones like some (we are setting aside adverbs of quantification). It is not surprising that weak determiners should be conservative, as they are intersective. This means that the truth of a proposition introduced by a weak determiner only relies on properties of elements in the intersection of the two sets. This is illustrated by considering (1b) once more.

If some whales are mammals is true then so is some mammals are whales. Thus, for a weak determiner Det, "Det(B)(A)" is true iff the set B∩A has some particular property. Note that intersecting B and A yields the same members as first intersecting B and A and then intersecting this whole thing with B again, i.e. B∩A = (B∩A)∩B. So here (B∩A) = ((B∩A)∩B). As this is the conservativity property, it should be now clear why intersective determiners are conservative (which is not to say that weak determiners cannot be treated as binary in the sense to be discussed below, see Section 9).


In short, given that all that counts semantically for weak determiners is keyed to common members of the evaluated sets, and given that intersection simply yields common members, then intersecting the same sets again and again cannot have an effect on the outcome. Also, intersection is obviously symmetric. So not only do repeated intersections of the same sets yield no advantage to single intersections, but the order of the intersections makes no difference either. A∩B is the same as B∩A and ((A∩B)∩C) is the same as ((B∩C)∩A) and ((C∩A)∩B), etc.

None of this extends to strong determiners. They too are conservative. However, the arguments of a strong determiner are not generally "interchangeable." Thus, all mammals are whales is clearly not equivalent to (1a). In the case of strong determiners, it is critical that one distinguish its arguments and order them in some way, "Det(B, A)." Indeed, the observation that strong determiners are conservative only follows if this ordering is respected. Thus, in (1a), whales is B and mammals is A. All whales are mammals is true iff all whales are mammals that are whales is true. However, all whales are mammals is not equivalent to all mammals are mammals that are whales nor to all whales that are mammals are whales. The order of the arguments makes a difference and this reflects the fact that strong quantifiers are not intersective.
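To make the set-theoretic point concrete, consider a minimal sketch in Python, assuming invented extensions for the two sets (the particular members stand in for the whales and the mammals and have no empirical status). It models ALL and SOME as relations between a restrictor B and a scope A, checks conservativity for both, and shows that only the intersective SOME is insensitive to the order of its arguments:

    # Determiners modeled as relations between a restrictor set B and a scope set A.
    def ALL(B, A):
        return B <= A            # ALL(B)(A) iff B is a subset of A

    def SOME(B, A):
        return bool(B & A)       # SOME(B)(A) iff the intersection of B and A is non-empty

    # Invented extensions, for illustration only.
    whales = {"willy", "moby"}
    mammals = {"willy", "moby", "rex"}

    # Conservativity: Det(B)(A) has the same value as Det(B)(A intersected with B).
    for det in (ALL, SOME):
        assert det(whales, mammals) == det(whales, mammals & whales)

    # Symmetry: SOME(B)(A) equals SOME(A)(B), but ALL is order-sensitive.
    print(SOME(whales, mammals), SOME(mammals, whales))   # True True
    print(ALL(whales, mammals), ALL(mammals, whales))     # True False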

One of our central concerns in this chapter is understanding where that ordering comes from. The position we will be taking is simple: the ordering arises as a structural fact about how predicates and arguments are composed in natural languages.

It is well known that verbs distinguish internal from external arguments. External arguments are thematically dependent on the compositional structure of the verb plus the object. This explains, for instance, why there are so many V+Object idioms but no Subject+V idioms. If determiners are relations that take predicates as arguments, then it makes sense to think that the mechanisms for combining these predicates with their arguments should be the same as those the grammar exploits elsewhere, perhaps at some "higher order" of structuring. In effect, we should expect to find the internal/external argument distinction in such cases.

Assume that this is indeed correct. What is the internal argument of a strong determiner and what its external argument? The first question has a ready answer. The internal argument is the NP that the determiner takes as complement. In (1a), whales is the internal argument of all, in effect, its object.

What is harder to clarify is how the external argument is determined. In VPs, the external argument is a verbal specifier, which permits a compositional assignment of a θ-role to the expression in this position. Then by parity of reasoning, are mammals in (1a) should be (at whatever level this is relevant) the specifier of all. The question at issue is whether there is some direct way of getting this result, given the standard resources that grammatical theory makes available.

In what follows we explore the following option. The structural distinction between internal and external arguments is related to labeling procedures, as it is these that allow one to structurally distinguish complements from specifiers. From this perspective, and given usual assumptions, it is not true that a VP like are mammals is the specifier of a determiner like all. Rather, VP is the complement of I, which takes the projection of all as its specifier. However, we could ask the question whether this state of affairs holds throughout the entire derivation.

Could it be that at some point in a derivation structures, in effect, "reproject"? Could a determiner like all take either VP or some projection containing it (say, I′) as its specifier?

These questions presuppose that labels are not indelible. This makes sense if labels are the consequence of relations established throughout the derivation. Seen this way, labels are analogous to "bar" levels or "relations of grammar" such as subject and object: real and substantive, but not primitive, and indeed capable of changing as the derivation unfolds. From this derivational perspective, it is entirely natural for "reprojections" to arise.

2 Move, project and reproject

Assume a bare phrase structure (BPS) approach to phrasal constituency, and consider a phrase in which an expression α, α maximal, moves and targets a category K (i.e. a traditional "substitution" operation). Chomsky (1995b) argues that K must project to yield structure (2a). In contrast, the structure in (2b) (where α projects) is taken to be ill-formed; consider why.

(2) a. [K α [K K0 … […α…] …]]
    b. [α α [K K0 … […α…] …]]

There are at least two ways to rule out (2b). First, one can render it illicit in terms of chain uniformity. If the lower copy of α is maximal, then the upper copy is not in (2b), given a functional approach to bar levels. By projecting, α becomes non-maximal and the chain of α fails to be uniform with respect to bar level. If chains must be uniform then (2b) is illicit.

A second way of preventing (2b) is in terms of Greed, Attract and checking domains. For Chomsky (1995b) movement occurs when some feature of α is attracted by some feature of K that needs checking for the derivation to converge. If α projects, however, then α is no longer in the checking domain of K0. This would prevent the feature that did the attracting from being checked. Thus moving α would be illicit as it violates Greed.

Regardless of whether either of these approaches is correct, we would like to consider the possibility that (2b) is legitimate; we will systematically call it a "reprojection" of (2a). There is some benefit, we believe, in allowing this. In what follows we (i) outline what that benefit is, (ii) spell out the details of "reprojection," (iii) consider some of its various implications, returning in particular to matters of chain uniformity, and (iv) emphasize that reprojection only applies to a class of determiners.

3 The argument structure of DPs

We have observed in the introduction that natural language determiners are conservative, and that the conservativity of strong determiners relies on being able to impose an order on a determiner's arguments. Specifically, it relies on distinguishing a determiner's "internal argument" (the B argument in (1a′)) from an "external argument" (the A argument in (1a′)). None of this applies to weak determiners as they are intersective and, for semantic purposes, can be treated as unary, i.e. taking but a single argument. We further observed that the internal/external argument distinction has proven to be empirically useful in distinguishing the arguments of verbal predicates. It would be optimal if what we observe in the case of verbs could be somehow extended to determiners. (The analogy between determiners and verbs is illuminatingly discussed in Larson and Segal (1995); our argument here is extensively based on their observations.)

The internal/external argument distinction finds its natural home in θ-theory. The internal argument of a verb is a sister of the verb head, the internal θ-role being "assigned" to "DP1" in a configuration like (3):

(3) [V V DP1]

In contrast, the external θ-role is "compositionally" assigned by the verb plus the internal argument. It is assigned to "DP2" in configurations roughly as in (4).

(4) [V(P) DP2 [V(′) DP1]]

Observe that, given these configurations, standard θ-role assignment is locally confined to the projection of V. The internal argument is in the immediate projection of the head while the external argument is in a more remote projection of the head.

Of course, determiners do not involve standard arguments in the θ-theoretic sense (they do not exhibit lexical selection requirements, are not sensitive to thematic hierarchies, etc.). Nonetheless, suppose that, structurally at least, the semantic arguments of a determiner are also established in the local fashion mentioned above; i.e. in the projection of the head that takes those arguments. What does this imply for strong determiners?

Recall that in a sentence like (1a), repeated here as (5a), whales is the internal argument while mammals is the external one. The standard structure of this sentence at LF is (5b), all whales having been moved from its underlying θ-position to Spec IP, in order to check various features.

(5) a. All whales are mammals. (= (1a))
    b. [IP [DP All whales] [I′ are [SC t mammals]]]

From (5b) it is clear that whales is the internal argument of all – it is the immediate sister of the determiner. However, what is less clear (structurally) is what the external argument is.

To make determiners parallel to verbs, we would like all to take as its external argument (are) mammals. However, all is in no structural position to do so if we assume that external arguments are established within the domain of heads that take them as such. Plainly, in (5b), all is in the domain of I0 where what we want is for (are) mammals to be in the domain of all.

Interestingly, this impasse can be resolved (while holding faithfully to the parallel with verbs) by assuming that the DP projects its label in (5b). The structure would be as in (6).

(6) [DP [D′ All whales] [IP are [SC t mammals]]]

(6) is isomorphic to (4). The external argument of all′ in (6) is the I′ (now an IP) are mammals. The internal argument is whales. The assumption that arguments are established within the domain of the predicate that takes them, in this case all, is also retained.

The only important difference between the DP and the VP case is that the internal/external structural difference is determined under Merge for Vs, but under Move for strong Ds. (This might also relate to why the VP projection plausibly employs a v-shell, whereas it is not obvious that the DP reprojection does; the addition of v does not affect our discussion.)

The same feature checking seen above forces strong DPs out of their base VP positions at LF. A sentence like (7a) would have the LF (7b).

(7) a. John likes every book.
    b. [IP Johni I0 [DP [D′ every book]j [VP ti likes tj]]]

If every book failed to raise out of the VP and reproject, it would not find an external argument to which it could locally relate. Recall that inside the VP structure the DP must get a θ-role. This requires that likes project, but this reasoning does not affect the move to the outer, Case related, verbal specifier. By the time every book moves there, likes has established its θ-relations with every book and John within its projection. This allows every to reproject and meet its particular requirements vis-à-vis its external argument after moving.

4 The general form of reprojection

Determiners do not leave their base configuration in order to meet their semantic demands. Nonetheless, an element which has moved for some bona fide syntactic reason (Case, agreement, etc.) can assume the structural properties necessary to obtain an appropriate interpretation. The relevant representations are interpretable in our examples precisely in terms of reprojection.

Reprojection is not a transformation, or for that matter a computational procedure. It is a representational coding of the particular requirements that different dependents of a syntactic object happen to assume. For example, a quantificational element may have moved to, thus creating, the Spec of an inflectional element for Case reasons. Once there, what is the label of the category dominating the moved quantifier? The label qua Case is a projection of the inflectional element; however, qua the quantifier itself, the label should rather be taken as a projection of the quantifier. Assuming in terms of Greed that the movement-inducing Case relation is relevant before the other relation, there is good sense in saying that the second label is a reprojection.

Viewed that way, reprojection is a matter of derivational perspective. If the derivation demands that the syntactic properties of some category X be accessible to the system, then X is taken to label (that is, type) the construction. If later on, the system needs to interpret some aspects of a category Y, sister to X, then Y is taken to label/type the construction. In our core instances, when the system needs to access I, then the relevant construct is typed as IP; but when the system needs to access the DP that has been attracted to I instead, and crucially the appropriate configuration is met, then the whole thing is typed as a DP. This will have important consequences, as we attempt to show below.

There is a derivational ordering implicit in our previous comments which is intentional. The labeling of a phrase-marker may change from narrow syntactic representations (8a) to the final LF ones (8b), even if its constituent structure remains constant. This has non-trivial side-effects.

(8) a. [XP [DP D YP] X′]   (narrow syntax)
    b. [DP [D′ D YP] XP]   (LF, after reprojection)

For example, we do not want (8b) to hold at the point of Spell-out, or we would incorrectly predict, in terms of Kayne's (1994) LCA, that a strong determiner does not linearize in its normal subject position. (That is, the specifier of a structure precedes the rest of this structure, for both Kayne (1994) and Chomsky (1995b). After reprojection, the I′ turns out to be the specifier of D, but we clearly do not want to say that I′ precedes D.) Intuitively, at the point of Spell-out there is no reason why the system should interpret labels as in (8b), since it is only semantic demands on the determiner that entail the reprojection – but we must show how the system guarantees this.

It is also worth pointing out that the structural imbalance of (8a) (where DP is a "capped off" projection) is shifted in (8b) (where XP is "capped off"). Characteristically, "capped off projections" induce Huang's Condition on Extraction Domain (CED) effects, preventing transformations across (see Chapters 3 and 4). If representational changes as in (9) really occur, we expect there to be contexts which do not induce islands in narrow syntax, but do at LF:

(9) a. Narrow syntax: [ZP … [WP … [XP [DP D YP] [X′ … t …]]]]   (relation across no barriers)
    b. LF component: [ZP … [WP … [DP [D′ D YP] [XP … t …]]]]   (relation across a barrier)


We may also observe that reprojection might be impossible in some contexts. Adjunction of D to X forces the creation of a category segment, represented in between inverted commas in (10) below. Plausibly, category segments must be labeled in terms of the category they are segments of, and thus cannot assume the label of the item that induced the projection (i.e. must have label X, not D). If so, reprojection would be banned in adjunction sites, since as it were it would not be a "clean" reprojection.

(10) a. Adjunction: D adjoined to X creates a category segment, "X′", dominating D′ (= D YP) and X′
     b. Failed reprojection: the resulting node cannot instead be labeled as a projection of D (marked "?")

Reprojection may also affect the details of "reconstruction." The question is: "Is it or is it not the same to reconstruct a chain through the clean path implied in regular structures as to do it through the somewhat altered path arising in reprojection?"

In sum, if it exists, besides giving us a syntax for generalized quantification, reprojection should have plenty of structural effects. It should induce LF islands (Case A), which emerge only in contexts where the labels involved belong to bona fide categories, not segments (Case B), and perhaps affect specific reconstructions (Case C). In what follows we consider data that suggest that each of these possibilities actually arises.

5 Quantifier induced islands (Case A)

Honcoop (1998) presents a convincing summary of a part of the literature devoted to contexts which are transparent with regards to some, though not all quantificational dependencies that involve operator-variable expressions. For instance:

(11) a. *Nobody gave every child a red cent.
     b. Nobody gave two children a red cent.
     c. What did nobody give every child?

There are two separate issues involved in these quantifier induced (QI) islands: (i) the constructions that test them, and (ii) the quantifiers that provoke them. Consider each in turn.

QI islands manifest themselves in "split" constructions, which characteristically involve a quantificational expression that attempts to establish a relation with another element, typically an indefinite. A canonical instance of a split construction is the relation between a downward entailing quantifier and a negative polarity item (NPI), as in (11) above. Other such constructions analyzed in the rapidly growing literature include the "what . . . for" split of Germanic and Slavic languages, partial wh-movement and possibly also some instances of multiple questions.

Split constructions cannot take place across certain weak islands, induced by some quantifiers. Clearly, weak quantifiers in their standard use do not induce these islands. However, strong quantifiers exhibit a somewhat strange behavior. Most induce the relevant islands, but some do not. In particular, names, definite descriptions, demonstratives and kind-denoting plurals allow split constructions across them:

(12) a. Nobody gave Billy a red cent.
     b. Nobody gave the child a red cent.
     c. Nobody gave that child a red cent.
     d. Nobody gives children a red cent.

These all contrast with strong quantifiers as in (11a), yet pattern with them, and against weak quantifiers, in triggering a "definiteness effect":

(13) a. *There is every child here.
     b. There are two children here.
     c. *There is/are Billy, the/that child, children here.

Honcoop acknowledges this much, and discusses some possible semantic ways in which the right cut can be made. In our approach, the issue is one of reprojection, and our challenge is to demonstrate, first, that only a subset of strong quantifiers reproject; and second that, in spite of this, all strong quantifiers exhibit a definiteness effect as in (13).

Before we return to that second goal and to a detailed presentation of why islands should arise in our system, we would like to point out that it is natural that names, definite descriptions, demonstratives, and kind-denoting plurals should not reproject. Remember that reprojection is needed only when we have a binary quantifier (involving internal and external arguments). The relevant question, then, is whether the elements we are now considering should be taken as binary. The issue hardly arises for names, at least in the classical view. As for definite and demonstrative descriptions, in most instances there is not much reason to make an internal/external argument distinction; witness:

(14) a. The/these men are mammals.
     b. The/these mammals are men.

Whenever (14a) is appropriate and true so is (14b). But this is the distinguishing mark of an intersective determiner. As noted earlier, intersective determiners can be treated as unary, which suggests that definite descriptions and demonstratives are (or at least can be treated as) unary. If so, they should not have to invoke an external argument, and therefore the syntax of reprojection. Then whatever accounts for why split constructions are impossible across reprojection should be irrelevant when names and articles are involved (see Pietroski 1999 for much related discussion). Presumably the same can be said about kind-denoting expressions, although we will not argue for this now.

6 The emergence of LF islands (an account of Case A)

Whatever account one proposes for QI islands, it must ensure that the basic facts in (11) are preserved:

(15) [… X … [Q … Y …]]   (R a relation between X and Y)

The relation R between X and Y cannot be absolutely prevented. R must be possible if either Q is unary or R takes place overtly (even if Q is binary). This relativity has made these facts reasonable candidates for semantic analyses of the sort Honcoop reviews and presents. Our syntactic analysis also addresses the relativity head on.

Consider the chains involved in reprojected structures:

(16) Quantifier prior to and after reprojection
     a. [XP [DP D YP] [X′ … DP …]]
     b. [DP [D′ D YP] [XP … DP …]]

In (16a) DP has moved, subsequently reprojecting and thus turning into D′ (16b). As a consequence, a chain involving D′ and DP links is not uniform, in the sense of Chomsky (1995b). The uniform chain would involve different occurrences of identical DP links, which do exist in (16b). However, the upper link includes the lower link. Formally, the top object is {D, {D′, XP}} (for D the label and D′ and XP the constituents), and XP includes the lower DP (formally: {X, {X, {…DP…}}}). That is thought to be an impossible chain, where links (that is, category occurrences of the moved item) do not stand in a command relation. However, prior to reprojection the chain in question would certainly be appropriately identified. In (16a) a uniform chain can be easily observed, involving the upper and lower DP copies in a command relation.

Chain identification, thus, must plausibly take place prior to reprojection. In turn, that very process of chain identification, assuming it involves some component of grammar, will create an island for processes that take place afterwards. To see this, compare once again the two types of syntax we are working with, enriched so as to include some neighboring context (we use the notation "XP|DP" to indicate the "reprojection of XP as DP"):



(17) [Tree diagrams, intransitive vs. transitive Q: in (17a) the moved DP (D YP) sits in [Spec, XP] and no reprojection takes place; in (17b) XP reprojects as DP (“XP|DP”). In both, the lower DP occurrence sits in W's context, and a higher Z (in ZP) provides the neighboring context; in (17b) the reprojected XP is circled as the interpretive cascade.]

For a chain to be identified, the task the grammar faces is that of singling out anobject involving the upper and lower occurrences of DP, technically{{DP,XP},{W,DP}}, where the phrasal context identifies each occurrence. Theminimal amount of structure the grammar needs to operate on, in order to allowsaid identification, is XP, which contains both occurrences (both phrasal contexts)entering into the chain relation. We have circled that area in (17b), intending todenote the creation of a sort of “cascade” of derivational processes in this way.

Following a similar idea for the overt syntax discussed in Chapter 3, weassume that “cascades” are interpretive cycles, which gives meaning to the ideathat the system has “identified a given chain.” This is crucial, if the parts of asyntactic unit so interpreted become inaccessible to the computational system.The intuition is this: a structure X that has been “cashed out” for interpretationis, in a real sense, gone from computation. The terms of X surely must still beinterpretable, but they are literally inaccessible to the syntactic engine.

Covert operations across cascades are thus predicted to be impossible. Forinstance, suppose Z in (17b) were trying to syntactically relate to any elementwithin the cashed out XP|DP (as in split constructions). There would be no wayof stating such a relation, and as a result either the derivation would crash (if therelation is needed for convergence higher up in the phrase-marker) or else aninterpretation based on the impossible relation would simply not obtain, and therelevant meaning would be unavailable.
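Again merely as an expository device (the bookkeeping below is invented, not a piece of the theory), the effect of cashing out a cascade can be mimicked in Python: once a chunk is spelled out, later operations simply cannot see its terms.

    spelled_out = set()
    contents = {"XP|DP": {"every-man", "NPI-inside"}}   # terms of the reprojected chunk

    def cash_out(chunk):
        # Interpretive cascade: the chunk is sent to interpretation and sealed off.
        spelled_out.add(chunk)

    def can_relate(probe, target):
        # A later (covert) relation fails if the target sits inside a cashed-out chunk.
        return all(target not in contents[chunk] for chunk in spelled_out)

    cash_out("XP|DP")                               # chain identified, cascade cashed out
    print(can_relate("Z", "NPI-inside"))            # False: no split relation into the cascade
    print(can_relate("Z", "material-higher-up"))    # True: relations outside it remain possible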

QI islands all reduce to the situation just described. For instance, let Z in(17b) be an NPI licenser, and suppose it is trying to relate down to an NPI,across XP|DP – as would be the case if the specifier of XP were taken by astrong quantifier. Trivially, now, the split relation in question would not obtain.

Why do intransitive determiners not create the same interpretive cascades?They do not have to. Contrast (17a) with the already discussed (17b). Thesystem does not have to create an interpretive cascade here, since the chain{{DP,XP},{W,DP}} can wait until further up in the phrase-marker to be identi-fied. This is because no reprojection obtains with the intransitive element, andthus the chain is not forced into an “immediate identification” before the possi-bility of a uniform chain is destroyed after reprojection. Differently put: the



system in this instance does not need to go into an interpretive cascade thatseparates a chunk of structure from the rest; if so, then any syntactic connectionis still entirely possible in the undivided object, as desired.

Slightly more precisely, it should be economy that prevents an interpretivecascade when it is not needed (it is cheaper to have fewer cascades than more).If so, access to LF interpretation should be somewhat akin to morphology appli-cation in the other side of the grammar (see Chapter 5), both operations with acost that the system minimizes.

Note also that, for these ideas to go through, the system must be heavilyderivational and entirely cyclic. The computation cannot be allowed to waituntil after all LF (in particular, “split”) processes have taken place to, as it were,“come back” and identify chains. If that were possible, there would be no inter-pretive cascades, and thus no associated LF islands. Such a system, also, wouldneed further conceptual assumptions which seem hard to motivate in the Mini-malist Program. The simplest proposal, thus, is also empirically adequate.

It might seem as if, in our derivational model, chain identification must beseen as prior to certain LF processes. That would be contradictory, for presum-ably chain identification is the last bona fide LF function prior to handing struc-tures over to the interpretive components of the intentional system. However, ifwe follow the derivation through as it proceeds upwards in the tree, we will seethat chain identification never precedes any other LF process. Indeed, our pro-posal about LF islands relies on the fact that chain identification caps off acertain cascade of structure, making its parts inaccessible for further operations.Of course, further up in the phrase-marker it is possible for operations to con-tinue, so long as they need not access the parts of the cashed out structures –that is the whole logic behind a cyclic system.

7 Incorporated quantifiers (Case B)

We noted in Section 4 how adjunction, as involved for instance in incorporation,should create problems for our hypothesized reprojections. Kim (1998) observesthat, in Korean, certain underlying objects can appear case-less, which he attrib-utes to Noun-incorporation. Importantly, weak and strong determiners fare dif-ferently in this respect:

(18) a. Nwukwu wassni?
        someone came
     b. *Motun salam wassni?
        All men came

And most crucially for our typology, names and definite descriptions align withweak, not strong determiners:

(19) a. Con wassni?
        John came
     b. Ku salam wassni?
        the men came


At the very least, we can say that Kim’s observation follows closely the onesreported by Honcoop, in terms of the typology of items involved in each.However, Kim’s facts have nothing to do with islands. We find this ratherimportant, since it suggests that the proper treatment of these effects must gowell beyond their island-inducing qualities.

Once Kim has unearthed the Korean facts, it does not seem particularly diffi-cult to replicate them in English, in other instances that would seem to involvenoun-incorporation, or at any rate adjunction to N (which is what matters forour purposes):

(20) He is a Stalin / children-as-a-group / two-people-gabbing hater.
     (cf. *most-children-gabbing / *every-child)

Surely other restrictions arise in compound formation (for instance involving articles, thus preventing *the/some-child-hater regardless of the weak/strong distinction), but otherwise it would seem as if Kim's observations generalize.

8 Clean reprojections (an account of Case B and a consequence)

As we saw, it makes sense to say that adjunction structures cannot reproject, ifthey are not “clean,” involving complex segments as opposed to entire cat-egories. Suppose a transitive determiner were to incorporate in Korean. Subse-quently, if the incorporated phrase may not project, the derivation will result ina convergent representation with no semantic intelligibility, and no alternativederivation. In contrast, intransitive determiners face no such problem, and sincethey do not need the relevant reprojection, their interpretation is directly pos-sible.

That part of the analysis is straightforward enough, but we would like toextend it to less obvious instances involving existential constructions of the sortin (13) above. Observe that if every must reproject in (13), and if there preventsevery man from so doing, then we have a simple account for why strong DPslike those are barred from being “associates” to an expletive.

Chomsky (1995b) argues that associates (or at any rate, their crucial cat-egorial features) adjoin to the pleonastic at LF. By parity of reasoning withthe analysis of the Korean facts, the adjunction in question should preventan associate from reprojecting, since the reprojection would not be clean inthis instance either. This is fine if the associate is a unary quantifier, but againresults in semantic deviance if it is a binary quantifier, thus the definitenesseffect.

There is a wrinkle (or a prediction anyway). The key to our account of thedefiniteness effect is not whether a determiner is strong, but rather whether it istransitive, that is binary. This fares well with strong quantifiers, barred in the


contexts in question. But it raises an issue about names, definite descriptions,demonstratives, generics, and other such elements which, according to our tests,are (or at least can be) intransitive, that is unary. Are they possible or imposs-ible in existential constructions?

There are well-known so-called exceptions to the definiteness effectsthat typically involve precisely the sorts of elements that, according to ourexplanation, should be syntactically possible in those contexts, if they are indeedunary:

(21) a. Who can play Hamlet? Well, there's Olivier . . .
     b. What can we use for a prop? There's the table . . .
     c. There's this crazy guy in your office screaming.
     d. These days, there's crooks/the prototypical crook all over the place.

These contexts are obviously exemplary, presentational, or quasi definitional,which clearly affects the definiteness effect. Nonetheless, it is rather clear thatstrong quantifiers proper are not welcome here, even in these prototypical cir-cumstances:

(22) a. Who can play Hamlet? #Oh, there's everyone . . .
     b. What can we use for a prop? #There's really most things in the gym . . .
     c. #There's/are most crazy guys in your office screaming.
     d. #These days, there's all crooks all over the place.

In turn, true existentials (controlling for the prototypical contexts just men-tioned) are bad if they involve names, definite descriptions, demonstratives, orvarious generic expressions:

(23) a. #There's Olivier on stage.
     b. #There's the table you got me for a prop on stage.
     c. #There's this door [speaker points] in my house.
     d. #There's crooks/the prototypical crook in jail.

Still, nothing in the logic of our account prevents a definiteness effect here. Allwe have asserted is that, when the determiner associate is binary, it cannotreproject, resulting in an uninterpretable representation. When the determinerassociate is unary, it should be able to reproject, but something else may be pre-venting it from appearing in these existential contexts.

That must be the presuppositional character of names, definite descriptions,demonstratives, and generics, which makes them incompatible with theintrinsically non-presuppositional character of existential statements. Whatis wrong with the examples in (23), in our view, has nothing to do with thepresence of the pleonastic there; the issue is an expression that forces a non-presuppositional subject. This is consistent with the fact that the examples in(21), where the context is not existential, but exemplary, presentational or moregenerally prototypical and permits the unary elements in point. At the sametime, those very examples still forbid transitive determiners, as (22) shows. (This


presupposes a theory of the presuppositionality of names which is differentfrom whatever is involved in their LF position.)

9 Reconstruction (Case C)

To conclude our analysis of the factual consequences of reprojection, we wantto discuss reconstruction. The general reasoning deployed above deduces oneaspect of Diesing’s (1992) mapping hypothesis. Her proposal is (in part) thatstrong DPs must be outside the VP shell by LF to be properly interpreted.Diesing relates this to Heim’s (1982) tripartite structure for the proposition. Themapping principle provides an algorithm for taking an LF phrase marker andconverting it into a Heimian 3-part proposition. We believe we obtain a similarstructural conclusion without additional stipulations.

In order to reason this out, we have to make certain commitments about reconstruction. We will not have anything to say about A'-reconstruction here, but consider A-reconstruction. Suppose that, given an A-chain CH = {{DP,XP},{W,DP}}, reconstruction is interpretation of DP at any of its occurrences other than the pronounced one (typically, the highest). For example, if DP is pronounced in a Case position, reconstruction will be interpretation of its θ-occurrence, inside VP.

Let us point out the following interpretive corollary:

(24) A D chain must be interpreted in the occurrence that reprojects.

(24) follows from the semantics of the determiners that reproject. If we were toreconstruct a determiner that reprojects, we simply would not be able to meetits semantic demands, the whole reason it reprojected to begin with.

If in contrast D does not reproject, there is no reason why it should notreconstruct. This is in spite of the fact that what carries binary determiners outof the VP shell, such as Case, is also carrying unary determiners. It does notmatter, if the unary determiner does not engage in any reprojection. As a con-sequence, in principle any of the occurrences of a DP chain headed by a unarydeterminer can be the source of interpretation.

In large part, that predicts the Diesing effect, at least as interpreted in Horn-stein (1995a). Diesing was speaking of weak vs. strong determiners, and as wesaw the distinction for us is rather unary (generally weak) vs. binary (generallystrong, except for names, definite and demonstrative descriptions, kind plurals),but we believe this can be accommodated.

Thus, as we saw for our treatment of the definiteness effect, names, definiteand demonstrative descriptions, and arguably kind plurals, can all be seen asintrinsically presuppositional, regardless of the shape of their syntactic support.If so, there is no obvious reason why the syntax should map these elements inVP external position.

An interesting question, also, is whether binary DP subjects are interpreted in their θ-occurrence or rather in their Case occurrence. It is not easy to decide


this semantically, since in both instances the DP is “outside VP,” in the very specific sense that it could take as its first argument the NP complement, and as its second argument either the “rest” of the VP (when in its θ-occurrence) or else the I' (when in its Case occurrence).

The logic of our proposal forces an answer, given the facts. If binary DP subjects were allowed to be interpreted in their θ-occurrence, why should they reproject early, thus inducing an LF island? The reason we had for an early bleeding of the derivation into LF was because, otherwise, if we waited after reprojection, we would not be able to identify the chain that induced the reprojection under command and uniformity conditions. However, it is not clear that a DP subject in its θ-occurrence would have to be involved in any chain identification other than the one its trivial occurrence already provides. After all, who cares if the Case occurrence is not interpreted? It is uninterpretable to begin with.

That leaves us with a mystery. Why can it not be the case that the binary DPsubject is interpreted low? (Notice, in contrast, that binary DP objects do nothave that option, since that would not yield an interpretation, missing one of thearguments.) We do not want that possibility, but nothing that we have said pre-vents it as an option. Perhaps this relates to the Extended Projection Principlefor subjects, but we will not venture any speculations.

10 Quantifier raising (more details and a consequence of Case C)

The difficulty just posed has a surprising consequence for quantifier interac-tions. Consider (25):

(25) Many of the senators read every law proposal.

If this sentence is ambiguous between narrow and wide scope readings for everylaw proposal, then there must be a rule of Quantifier Raising, independent ofthe configurations that standard grammatical relations (Case, agreement) other-wise achieve. The problem is that many of the senators is, as per the reasoningwe reached immediately above, frozen in the subject site. The moment this istrue, then the only other known way that every law proposal will be interpretedhigher up in the structure is if some process promotes it there. That process,however, cannot be a standard rule of grammar motivated by some other syn-tactic considerations, under the assumption that the surface structure in (25) hasalready undergone the action of all such rules.

That conclusion can be confirmed by trapping a quantifier inside anotherwhich, in addition, reprojects:

(26) a. I don't think that many of the senators (*ever) read every law proposal.
     b. What don't you think that many of the senators (*ever) read?

The impossibility of the polarity item tells us that many of induces an island, as


expected if it is a strong binary quantifier. This must mean that many of repro-jects, crucially after having sent the structure that properly contains its chain toearly interpretation. Importantly, we have now placed another quantifier inobject position which may take scope outside of many of. In spite of the factthat the quantifier would not seem to involve overt movement, it is behaving asif it were (like what in (26b)). This is intriguing, because we seem to havecontradictory requirements: sending the reprojected structure to interpretationis what creates an island higher up in the structure, thus blocking the split con-struction. Yet the quantifier seems capable of escaping this island.

We can avoid the paradox if things proceed as in (27), where parenthesizedelements represent traces of movement. (27a) is the source structure. (27a) and(b) may apply in that order or the inverse, depending on one’s assumptionsabout overt A-movement (an issue that is immaterial now). Matters start beinginteresting after (27d), which takes place in the LF component. Note that QRhas to carry every outside the scope of many of. It is not enough to piggyback onother grammatical sites of every because the logic of the analysis forces theupper occurrence of many of to undergo interpretation, thus yielding the neces-sary island. Very significantly, also, QR must take place before chains are iden-tified, or else it would be as impossible as the split construction. Reprojectionsare presumably automatic after chains are identified (we do not code the repro-jection of every not to clog the picture).

(27) a. θ-relations established:
        [many of [V every]]
     b. Accusative Case checked:
        [every [many of [V (every)]]]
     c. Nominative Case checked:
        [many of [every [(many of) [V (every)]]]]
     d. QR of every to wide-scope site:
        [every [many of [(every) [(many of) [V (every)]]]]]
     e. Chain of many of identified:
        [every [many of [(every) [(many of) [V (every)]]]]]
     f. Reprojection of many of:
        [every [many ofi [(every) [(many ofi) [V (every)]]]]]   X|DP (island)

This order of things is surprising, and suggests that either QR is an overt process(Kayne 1997) or else that what is involved in split constructions is a post-LFinterpretive phenomenon. Either way, what is important to note is that syntacticprocesses like QR and wh-movement do not seem to be sensitive to QI islands,which clearly puts them in a different component. Our approach appears to becompatible with either early QR or late split constructions, however one furtherconsideration may help us decide.

Given the way we reasoned out the induction of LF islands through interpretive cascades, after reprojection there should be no way that a reprojected determiner can syntactically connect to its own trace in the θ-position (the chain would either not be uniform or not satisfy its regular command conditions):


(28) a. Move prior to reprojection: [XP [DP D YP] [X' … t … ]]
     b. After reprojection: [DP [XP … t … ] [D' D YP]]   (NO CONNECTION between the reprojected determiner and t)

This must mean that the relation between the determiner and the θ-position is not a syntactic relation, but rather involves direct binding from the determiner to the variable in the θ-position.

There is another canonical instance where disconnected constituents find eachother, presumably in terms of the independently needed notion of “antecedence”:

(29) a. *Which person did you hire him after meeting t?
     b. Which person did you hire t after meeting him?

(29a) is a typical CED effect. However, as (29b) directly shows, the question operator can, from its commanding position, bind into the adjunct clause. Whatever prevents the chain relation in the first example does not prevent the corresponding antecedence relation in the second (see Chapters 3 and 4). Similarly, a binary quantifier may bind the variable in the θ-position, even if it does not form a chain with it. Of course, it must be the head D that binds the variable, since only it commands the variable, given the syntax in (28b).

One cannot say, however, that split relations are also of this sort, preciselybecause they cannot take place in the conditions just described. So if post-LFprocesses involve “antecedence,” then split relations may not, at least notmerely. This pretty much entails that, in fact, there are not two options for thestate of affairs implied in (26), and instead QR takes place prior to the LFcomponent. If so, it would be immune to interpretive islands that arise afterearly chain interpretation prior to reprojection.

11 A note on unary determiners

To this point our main focus has been on binary determiners, which, we have argued, require their arguments to be ordered. This ordering has been treated as relatively similar to θ-role assignment and, we have assumed, must be done within the domain of the determiner. This assumption is what drives reprojection:

(30) D orders P if P is in the domain of D.

Until now, we have said nothing about how to interpret unary determinersexcept to observe that being intersective the arguments of such a determiner



require no specific ordering. Given this, we can maintain that in contrast tobinary determiners, unary ones do not specify their arguments at all. What thenis the syntactic relation between a unary D and the NP it binds?

We can answer that by strengthening (30) as (31):

(31) D orders P if and only if P is in the domain of D.

(31) and the assumption that unary Ds do not order their arguments result inthe conclusion that in the weak reading of (32) many does not take men as anargument.

Plausibly then many is a D adjoined to the nominal whose variable it binds:

(32) a. many men left
     b. [NP [D many] [NP men]]

Adjunction of the sort proposed in (32b) has a nice correlate, under the assump-tion that adjunction of X to Y does not alter fundamental structural relations:

(33) a. [Z Y … ]
     b. [Z [“Y” X Y] … ]   (X adjoined to Y)

That is, if “. . .” is the command domain of Y in (33a), “. . .” is still the commanddomain of X adjoined to Y (33b). This can be independently shown, given thefact that a verb raised to Tense, for instance, behaves as if it were in situ withregards to its arguments (the intuition behind Baker’s (1988) GovernmentTransparency Corollary). Then in (32a) many can command left and bind it, aswell as men, in virtue of being adjoined to the latter.

That permits us to provide a standard semantics for examples like (32a)which end up looking roughly as (34).

(34) many x: [(men x) (left x)]

Observe that this comes very close to analyzing the binding powers of unary quantifiers as akin to those of adverbs of quantification such as occasionally or sometimes. Whether that means treating them adjectivally (as in the classic Milsark 1977 analysis), or in some other way, still consistent with the geometry in (32b) (as in Herburger 1997), is immaterial to us.

There is one important point that this proposal turns on. We are distinguish-ing the binding powers from the argument taking powers of a determiner. Alldeterminers, unary and binary, bind arguments and “close off” variables inpredicates. However, only binary determiners must order these arguments.Binding per se does not require reprojection (or even projection). Whatdoes require reprojection is ordering the arguments (akin to thematic struc-turing). Binary determiners order their predicate arguments and so must repro-ject. Unary ones do not so order their arguments and so require nothing ofthe sort.



12 More data and a comparison with a semantic account

An account like Honcoop’s, or any related proposal based solely on the islandeffects observed throughout this chapter, will of course come short of extendingto the incorporation facts discussed in Sections 7 and 8. It is worth exploringwhether other facts can distinguish between our syntactic account and a seman-tic approach.

The logic of a proposal like Honcoop’s, or others alluded to in Honcoop’sthesis (including Szabolcsi and Zwart 1993, which started the series) goes likethis. A complex expression (A, B) cannot be split across an “island” domain Dbecause of some crucial semantic property of D. For example, for Honcoop Acannot relate to B across D because D disallows, more generally, all instances of“dynamic anaphora.” Thus compare:

(35) a. Every child brought a bike. *It got parked in the garden.
     b. A child brought a bike. It got parked in the garden.

A bike cannot hook up to the pronoun it in (35a), in the way it can in (35b); theonly difference between the sentences is in terms of their being introduced byevery as opposed to a, and we can give an explanation for these binding effectsin standard dynamic binding terms. The question is, can that be the generalexplanation for the contrast in (36)?

(36) a. *Nobody thought every child brought a damn bike.
     b. Nobody thought a child brought a damn bike.

Certainly, Honcoop’s account does work for this particular example, where therelevant dynamic binding is between nobody and the NPI, across a domainintroduced by either every or a. And of course that is compatible with every-thing we have said, which could be seen as nothing but the syntax for thosesemantics. However, is there a way to tease the two accounts apart?

One possibility may be to construct an example which satisfies dynamicbinding properties while still disallowing split constructions. Consider in thisrespect (37):

(37) a. Every child can bring a bike. It should, however, not be left in thegarden.

b. *Nobody thinks that every child can bring a damn bike.

It is known that the dynamic binding effect in (35) gets attenuated in modal contexts, as seen in (37a). Nevertheless, a context of that sort has apparently no attenuating effect on QI islands, as (37b) shows. This is unexpected in Honcoop's terms.

At the same time, consider (38):

(38) A child didn’t bring a bike. *It didn’t get parked in the garden.

Dynamic binding predicts the impossible sequence in (38). Negation acts as a blocker, just as binary quantifiers do; hence it is expected to produce a QI island


effect. This is not testable with negative polarity items (since they would getlicensed by the intervening negation itself). But negation is known to blockother split constructions. This is unexpected for us, unless we can show that –somehow – negation forces reprojection. We will not attempt this now.

13 Conclusions and further questions

This paper has explored a syntax for binary quantification, in terms of a straightforward possibility that bare phrase-structures allow within the Minimalist Program. In essence, independently moved, binary quantifiers are allowed to reproject after meeting whatever syntactic requirements carried them to wherever they are. As a result, the semantic arguments of a quantifier are also its natural syntactic dependents, in terms of domains of the sort already needed to account for the specifications of θ-theory.

The proposal has immediate consequences for various LF processes, includ-ing the possibility of undergoing chain identification or reconstruction. In anutshell, reprojected structures drastically affect the shape of the LF phrase-marker, which in turn disallows certain crucial properties that normally obtain,for instance under command.

Assuming chains must be uniform as they are identified by the system, thereprojection of structures will necessarily follow chain identification. This chainidentification will result, inasmuch as it involves the minimal phrase-marker Xthat includes all links of the chain-to-be-identified, in the emergence of a barrierfor LF processes that involve material internal to X and some other elementoutside X.

Right there we see direct differences between QI islands and more standardstructural islands. A sentence headed by a wh-phrase, for instance, is an islandper se. QI islands, however, emerge because the system is forced to go into earlyinterpretation, or is not able to identify a chain and give an appropriate inter-pretation to a binary quantifier at the same time. This entails that not just the“specifier” branch is a QI island, but indeed the whole structure of dependentsof the quantifier is. Thus, observe that (39b) is no better than (39a):

(39) a. Nobody gave every [critic] [a present/*a red cent]
     b. Nobody gave every [[critic of the movie/*a damn movie]] [a book to read]

(39a) is what we saw throughout; but in (39b) we placed the NPI inside the first argument of the binary determiner every. After reprojecting it, that first argument is in effect the complement of the determiner, and thus would seem to be in a transparent path with regards to the NPI licenser – but licensing fails. In our terms it should, and it does because it is not just the specifier branch that becomes an island, but the whole reprojected DP, since it is the entire object that must go to early interpretation in order to identify its chain prior to reprojection.

It is that very logic that forced us to consider a rule of QR in the previous


section, and indeed one that surprisingly takes place prior to split processes at LF, since quantifiers appear to be able to move to their scope-taking positions regardless of whether they do it across other binary quantifiers. We wanted to salvage a treatment of QR restricted to LF, by placing split processes in a post-LF component, but we could not, since we need that post-LF component for antecedence, which is presumably what allows a determiner to relate to its own θ-position inside the very island that it induces.

We compared our approach to a semantic one, and although it is quite pos-sible that the two could go hand in hand, we have found some data that suggestotherwise. In particular, we have seen how in certain constructions quantifierscan induce LF islands, yet still allow dynamic binding across. This is reminiscentof what we have just said about antecedence: it would appear that bona fidebinding is just not sensitive to LF islands, and instead it is other processes thatare: split constructions (though not QR). On the other hand we have seen thatnegation both induces an LF island and disallows dynamic binding across – yetis not an obvious binary quantifier.

One last concern we have that should be more systematically explored iswhat to do with strong quantifiers that do not have the trivial form of an article,such as at least one but no more than three children (Keenan 1987). This poses aserious question for us: what does it mean to reproject at least one but no morethan? Many answers come to mind (Boolean phrases, incorporation, parallelstructures), but they should be seriously studied. Related to this is the matter ofadverbs of quantification, which we have set aside, and hopefully also the diffi-culty posed by negation.


7

A NOTE ON SUCCESSIVE CYCLICITY†

with Juan Carlos Castillo

1 Introduction

Most analyses of successive cyclic movement under the Minimalist Programhave centered around the notion of a wh-feature in the embedded Comp. Wesuggest that this feature is spurious, given that it has no semantic import and itsonly purpose is to trigger the very movement it tries to explain. We make use ofRichards’s tucking-in, and extend it from Move to Merge. This analysis, inspiredby the Tree-Adjoining Grammar approach, allows us to trigger the effect of suc-cessive cyclic movement by postulating one single application of Move, and thenmerging elements from higher clauses below the position of the moved wh.Issues of clausal typing, typology of cyclic movement and wh-islands are alsoexplored, in connection with ideas about phrase structure that apply independ-ently to other constructions.

We will assume that successive cyclicity holds of movement transformations.The questions we try to answer are (i) why this is the case, and (ii) how it can becaptured.

Successive cyclicity was initially required in order to implement movementout of bounding nodes (Chomsky 1977b). The intuition is still alive in Chomsky(2000): phases (substitutes for bounding nodes) are impenetrable except to theiredge. If a category C manages to relate to the edge of a phase, then C may beable to relate further up in the phrase-marker. Why do edges have this privi-leged status and exactly how is the mechanism supposed to work? About thefirst question, Chomsky has little to say.1 About the second, he proposes a tech-nical solution. Partial derivations have access to a special set of peripheral (P)features, which happen to be active when successive cyclicity is necessary, andhappen to be checked at the edge of phases. Naturally, why any of this holds isjust what we would like to understand.

A different alternative exists in the literature which, we believe, has a moreplausible take on successive cyclicity. It is due to Kroch and Joshi (1985), and isbased on the formalism of a Tree Adjoining Grammar (TAG). Their idea isvery simple. A sentence like (1a) starts in a structure like (1b):

(1) a. I wonder who John thinks that Mary loves.
    b. [[Who]j [IP Mary loves tj]]


In their grammatical formalism, one can literally split who from the rest of thetree in (1b), and insert “under” it an entirely separate sub-tree John thinks that,as follows (avoiding technical details):

(2) a. tree splitting:[Who]j & [IP Mary loves tj]

b. additional sub-tree:[John thinks that IP]

c. tree pasting, 1st step:[John thinks that [IP Mary loves tj]]

d. tree pasting, 2nd step:[[Who]j [John thinks that [IP Mary loves tj]]]

(2d) is the relevant chunk of structure that interests us here. One where “succes-sive cyclic movement” has taken place. The basic intuition of this approach isthat bona fide movement is always local, as in (1b). What carries a movedphrase further and further out in the tree is not movement, but tree splitting.
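As a rough expository sketch only (this is not Kroch and Joshi's actual TAG machinery; the Python rendering and its names are invented here), the steps in (2) amount to a simple manipulation of nested pairs:

    base = ("who", ("IP", "Mary loves t"))      # (1b): [[Who]j [IP Mary loves tj]]

    wh, ip = base                               # (2a) tree splitting
    aux = ("John thinks that", ip)              # (2b)-(2c) the separate sub-tree pasted over IP
    result = (wh, aux)                          # (2d) [[Who]j [John thinks that [IP Mary loves tj]]]

    print(result)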

One can dismiss this sort of account in Bare Phrase Structure (Chomsky1995a, 1995b). If one does not have trees at all in one’s representational reper-toire, but familiar sets of terms, then it is not clear how to execute the tree split-ting and pasting operations.2 However, we believe that this is an overly narrowperspective, and that a generous reading of this proposal can be made by way ofextending a device proposed in Richards (1997), tacitly assumed in Chomsky(2000): so-called tucking-in. We discuss the details of this possibility next.

2 Tucking-in

As Chomsky (2000) points out, two natural conditions come to mind whenthinking of how the constituents of a tree should be merged. Let us start with(3), where Y is merged to X, which projects:

(3) a. tree notation: [X Y X]      b. official BPS notation: K = {X, {X, Y}}

Suppose we want to now merge Z to X. Where do we add it? At the root of Kor at the leaves? The first of these two possibilities would satisfy the ExtensionCondition in (4), whereas the second would satisfy an equally natural conditiondemanding Local Association, as in (5):

(4) Extension Condition
    Always extend the phrase-marker as operations take place in it.

(5) Local Association
    Always associate locally to given lexical terms.

The effect of (4) is to make K grow at the top, whereas the effect of (5) is tomake K grow at the leaves. Neither of these is preferable a priori, and it is quite



possible that the empirical differences are equally insignificant, as Drury (1998)shows in some detail.

Chomsky stipulates no Local Association to the head of an already assembledphrase-marker. We shall return to this. What matters to us now is that Chomskydoes allow associations not only as in (6), but also as in (7), as is natural:

(6) Root Association
    a. tree notation: [X W [X Z [X Y X]]]   (W merged at the root)
    b. official BPS notation: {X, {W, {X, {Z, {X, {X, Y}}}}}}   (prior to the 3rd merge: {X, {Z, {X, {X, Y}}}})

(7) Leaf Association
    a. tree notation: [X Z [X W [X Y X]]]   (W tucked in below Z)
    b. official BPS notation: {X, {Z, {X, {W, {X, {X, Y}}}}}}   (prior to the 3rd merge: {X, {Z, {X, {X, Y}}}})

(7) is Richards’s tucking-in process.
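The two association sites can be made concrete with a toy Merge in Python (ours, purely for illustration; projection is simplified to “the target's label wins,” and the tucking-in line just rebuilds the object by hand, since all that matters here is where W ends up):

    def merge(alpha, beta):
        # Simplified Merge: beta projects, yielding {label(beta), {alpha, beta}}.
        label = beta[0] if isinstance(beta, tuple) else beta
        return (label, (alpha, beta))

    k = merge("Y", "X")                              # (3): K = {X, {X, Y}}
    after_z = merge("Z", k)                          # {X, {Z, {X, {X, Y}}}}

    root_association = merge("W", after_z)           # (6): {X, {W, {X, {Z, {X, {X, Y}}}}}}
    leaf_association = ("X", ("Z", merge("W", k)))   # (7): {X, {Z, {X, {W, {X, {X, Y}}}}}}

    print(root_association)
    print(leaf_association)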

3 Applying tucking-in to successive cyclicity

Generally, tucking-in has been used for processes of movement, as in Richards(1997). It is not obvious, however, that anything would prevent tucking in of amerged sub-phrase-marker. In other words, W in (7a) may be there because ofMove or, equally plausibly, because of Merge. Suppose this is the case.

Then we can recast the TAG analysis in present terms. To get the gist of theidea, Z in (7a) will be who in (2a), W in (7a) will be John thinks that, and therest of the structure in (7a) will be Mary loves tj in (2a). However, things cannotbe that simple or we would get (8):

(8) [CP whoj [C' W [C' C [IP Mary loves tj]]]], with W = John thinks that merged (tucked in) below whoj


One could of course argue for (8) as the structure of long-distance questions,but we are not ready to do that. Consider one argument against this view.

Of course, the logic of what we are saying in this chapter extends to instances of successive cyclic A-movement, assuming it exists (but see Castillo, Drury and Grohmann 1999; Epstein and Seely 2002). Relevant structures would be as in (9), where tucking-in seems likely:

(9) [IP Johnj [I' W [I' to [VP tj love Mary]]]], with W = seems to be likely [?] merged (tucked in) below Johnj

An immediate problem is what counts as the complement of (be) likely, whichwe have marked as a question mark in (9) (a related problem arises in (8),where we put that as a non-obvious complement of think). More seriously,though, observe that a question can proceed from the domain of seems to belikely:

(10) a. Why does John seem to be likely to love Mary?
     b. Because John always SEEMS something or other, which ends up never being the case.

However, according to the representation in (9), the structure that seem is partof is a “left branch.” Extraction of why out of this structure should be a viola-tion of Huang’s Condition on Extraction Domains (CED) (see Chapters 3 and 4for a minimalist account). This suggests that the structure in (9) cannot be right.

The successful derivation that we have in mind proceeds rather differently.We start with the structure in (11). The embedded clause is assembled, includ-ing an interrogative Comp, whose wh-feature must be checked overtly. Whomoves to [Spec, CP], and checks the feature. Up to this point, the TAGapproach and our analysis are the same. Notice that both analyses capture theintuition that the question “who does John like?” underlies the more complex“who does Bill think John likes?”



(11) [CP who [C' C[+wh] [IP John [I' I [VP John [V' likes who]]]]]]

From this point on, our analysis is different from the TAG one. First, there is notree-splitting operation. The effect of this operation is achieved by merginglexical items one-by-one below the C-projection.

The derivation proceeds as follows. First, we merge that, which targets IP, a maximal projection, as expected (12a). Because that is minimal, and it selects for IP, it projects a CP-projection (more on this in Section 4). Next, we merge the matrix verb thinks, which again selects for a non-interrogative Comp. The verb targets CP and merges with it. Selection is satisfied and the verb projects a VP (12b). Next, we merge the subject of thinks, Mary. The DP needs a θ-role from the verb, and thus the verb projects a higher VP-node, which creates the appropriate Spec-head configuration for θ-role assignment (12c).

(12) a. [CP whoj [C' C[+wh] [CP that [IP John likes tj]]]]
     b. [CP whoj [C' C[+wh] [VP thinks [CP that [IP John likes tj]]]]]
     c. [CP whoj [C' C[+wh] [VP Mary [V' thinks [CP that [IP John likes tj]]]]]]


For ease of exposition we have not bothered to specify a v projection, or the Tprojection in the intermediate clause; but those too would have to be tucked in,in the obvious fashion – with corresponding movements of John to [Spec, TP],and so on. Essentially, an entire derivation would have to be run in terms oftucked-in separate mergers and moves.

Notice that each instance of Merge is motivated by the usual requirements:selectional restrictions of the item that is targeted (C merges with IP, V with CP,and so on) theta-relations (Mary is theta-marked by thinks, etc.) or checkingconfigurations (movement of Mary to [Spec, IP] for Case/EPP reasons, etc.).Nothing special must be said, except that the merger does not occur at the rootof the tree, but rather in an arbitrary point in its spine.
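The derivation in (11)-(12) can be summarized with a small Python sketch (ours, for illustration only; selection, Case and feature checking are not modeled, just where each tucked-in item lands relative to who and C[+wh]):

    # (11): who has moved to [Spec, CP] of the embedded clause.
    clause = ("CP", ("who", ("C'", ("C[+wh]", ("IP", "John likes t")))))

    def tuck_in(tree, item, label):
        # Keep who and C[+wh] at the top; grow the structure below them.
        cp, (who, (cbar, (c, below))) = tree
        return (cp, (who, (cbar, (c, (label, (item, below))))))

    step_a = tuck_in(clause, "that", "CP")     # (12a)
    step_b = tuck_in(step_a, "thinks", "VP")   # (12b)
    step_c = tuck_in(step_b, "Mary", "VP")     # (12c)

    print(step_c)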

4 Basic technical implementation

We are assuming, first, that there is a feature responsible for local wh-movement. This wh-feature is what causes the wh-element to move to the frontof the local sub-tree it is a part of. Crucially, for us the relevant feature is notwaiting at the end of a derivational cycle, but is part of the first cycle. Whether itis already in the wh-element (as suggested in Chomsky 2000) or is rather in thefirst Comp in the derivation, is immaterial for our purposes. What matters isthat the local movement must proceed, or else the rest of the account falls apart.

Second, the question arises as to how the grammar “knows” when not to con-tinue moving the wh-element. For instance, why does the tucking-in not apply in(13a) at the point illustrated in (13b).

(13) a. Mary wonders who John likes.
     b. [whoj C[+wh] John likes tj]

Remember that we said that every instance of Merge must satisfy selectionalrequirements of the item merged. When faced with a structure like (13b), averb that selects for a wh-complement, like wonder, has no choice but to merge



at the root, because this is the only place where its selectional requirementis met.

What prevents a mistaken tucking-in? Evidently, tucking-in is just a possibil-ity that Move and Merge have. Hence, whatever can be moved/merged, accord-ing to whatever governs those processes, can also be tucked in. In the exampleabove, we were just dealing with separate words. But surely entire phrases couldbe tucked in as well, for instance, the man instead of Mary in (12):

(14) [CP whoj [C' C[+wh] [VP [DP the man] [V' thinks [CP that [IP John likes tj]]]]]]

What could not be tucked in is Peter thinks, for instance, since that is not a validphrase (it lacks a complement, hence is impossible in BPS terms). This generalline of reasoning has consequences for our account of islands, as we nowproceed to show.

5 wh-islands

Observe the structure in (15):

(15) [CP whoj [C' W [C' C[+wh] [IP Mary loves tj]]]], with W = John wonders why Bill said that merged (tucked in) below whoj

If we could tuck in John wonders why Bill said that, then it is not clear why (15)should be an island violation, as it is. (This is another argument against doingthings as in (8).) We can prevent that if the complementizer that requires an IPcomplement, which it does not have in (15).

That still raises the question of why one could not tuck in Mary loves tj as thecomplement of that. The unwanted derivation would be extremely bizarre. It



would go all the way up to (15), but then it would literally move the IP Maryloves tj to the complement of that, as illustrated in (16).

(16) [The structure of (15) assembled all the way up (John wonders why Bill said that …), with the IP Mary loves tj (IPk) then moved into the complement position of that, leaving a trace tk behind.]

One way to rule this out is by having basic theta-relations satisfied in terms of Merge, not Move, as in Hale and Keyser (1993) and assumed throughout in Chomsky's program. The derivation just outlined would violate the implicit condition.

One might be tempted to try (17) instead:

(17) a. Separate whoj from Mary loves tj

     b. Assemble John wonders why Bill said that
     c. Merge Mary loves tj as a complement to that
     d. Merge whoj to John wonders why Bill said that Mary loves tj

The problem is in (17a). Our proposal is not separating whoj from Mary loves tj. Once those objects have been assembled, that structure is indelible. Only a mirage may make one suppose that one is literally separating anything while tucking in. In fact, tucking in simply adds more structure to an already existing chunk, and hence the impossible island violation never arises this way.

Another stab at the bad sentence would be as shown in (18).

(18) a. Start with whoj Mary loves tj

     b. Tuck in, successively, that, said, Bill, why, wonders, John (and corresponding functional categories)



This case is trickier, given that (19) is an intermediate derivational stage:

(19) [CP whoj [C' C[+wh] [IP Bill said that Mary loves tj]]]

If we are allowed now to insert why, we are doomed. Successive insertion ofwonder will license why, and then nothing else would prevent whoj from risinghigher, thereby licensing the unwanted island.

To prevent that, let us look at step 4, prior to the insertion of why:

(20) [The structure at step 4, prior to the insertion of why: whoj sits in [Spec, CP] above C[+wh]; the material tucked in so far (1st tucking-in: that, 2nd: said, 3rd: Bill) yields [VP Bill [V' said [CP that [IP Mary loves tj]]]] below, and the newly merged C[+wh] forms the problematic constituent {C, C'}, marked in boldface in the original diagram.]

The intuition to pursue is that there is a problem in the intermediate constituent {C, C'}, marked with boldface in (20), where two elements have exactly the same label: C. Of course, in BPS terms there is no such thing as C'; its label is really C. The element does have some internal constituent structure: {C, IP}. But suppose the grammar has no access, at least for category identification purposes, to the contents of a category; rather, it just sees the label. Then there is no way the grammar has of telling apart the two objects in the constituent set {C, C'}; the relevant set can thus not be formed (it is not a set of two – by assumption, identical – elements), and merger of further elements to C is then impossible if they are to be thought of as integrated in the phrase-marker in (20).3
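A toy rendering of the label-clash point (ours, and of course a drastic simplification): if category identification sees only labels, the would-be two-member constituent {C, C'} collapses into a singleton.

    existing_c_bar = "C"     # the already assembled C' is visible only through its label, C
    tucked_in_c = "C"        # the newly tucked-in complementizer likewise bears the label C

    constituent = {existing_c_bar, tucked_in_c}   # the intended set {C, C'}
    print(constituent, len(constituent))          # {'C'} 1 -- not a set of two elements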

Another way of putting what we are saying is that clausal typing is, to someextent at least, a consequence of long-distance relations. Why grammars shouldovertly type clauses is far from obvious, but if our reasoning is correct, if theydid not, there would be no way of allowing tucking in, at least in the crucialspots that are relevant for long-distance movements, which involve several Ccategories.



Apart from giving us some insight into clausal typing, the suggestion abovepredicts the following possibilities:

(21) a. typing + wh
     b. no typing + wh
     c. retyping in conflict cases

(21a) is the situation just seen for English, yielding the obvious results: possiblelong-distance wh-movement when no two C’s clash, but impossible long-distance wh-movement when two C’s clash (that is, a wh-island). (21b) may bepossible in a language, unless wh-typing is necessary for some independentreasons (e.g. wh-selection). In this language we expect no kind of long-distancewh-movement, as mere tucking-in would result in a C clash. Madurese is anarguable instance, if Davies’s (2000) claim that it does not have long distancewh-movement holds true, and indeed no clausal typing is ascertained in the lan-guage. Finally, (21c) would arise if a given language has an extra mechanism forre-typing identically typed merged C’s, of the sort in (20). If this possibilityexists, the language in question would allow movement across a wh-island.Perhaps familiar instances from Romance, Slavic and others fall in the right cat-egory (again, pending some serious research on the putative retyping mechan-isms). In sum, we suspect that all these possibilities exist, but our point now isjust to observe the logic.

Note, to conclude, that we are forced to say that all instances of movementacross wh-islands involve the same sort of C, or else the account just proposedwould fail. This may be significant for instances of long-distance topicalization,for example, which is known to be sensitive to wh-islands, at least in some lan-guages.

6 Other islands

We are forced to say that islands fall into two broad categories: those that arisebecause of the sort of situation outlined in the previous section (C clash), or elseislands that arise because of impossible configurations or derivational routes. Acouple of instances of the latter sort are worth mentioning.

For example, if Chomsky (2000), and more generally Chapters 3 and 5, areright in devising a highly dynamic system with Multiple Spell-Out, whateverisland properties arise from this system will be relevant for us, rather trivially. Itis true that with our proposal much of the need for Chomsky’s specific phases,and some of their properties (in particular, edge-targeting and transparency) isobviated. Nonetheless, it is possible that other emerging islands arise in thesystem, for instance “cascades” for subjects or adjuncts. If movement is imposs-ible from those domains of Spell-out, that result will immediately carry throughto our system.

More generally, it is conceivable that some islands will arise because we aretucking in wrongly, for instance as suggested at the beginning of the previoussection. Tucking in has no properties, other than those of Merge and Move, but


those operations do have rather specific properties which limit the array of pos-sible outputs.

7 Conclusions

Our main goal in this chapter was to re-present the TAG analysis of successivecyclic wh-movement by making it fit into the Minimalist machinery. Wh-phrasesmove to the first Comp in the derivation. From this point on, we concluded thatthe technical details of the tree-adjoining mechanism need to be changed tocomply with BPS requirements. Instead of building a whole subtree for thematrix clause, the lexical items are merged one by one, by tucking them inunderneath the original Comp. In the process of the derivation, there will becertain points at which two Comps are next to each other. If the computationalsystem does not have a way to distinguish them, an island violation arises.

Since our analysis assumes that wh-phrases move only once in the course ofthe derivation, the wh-feature that triggers this movement (be it on the wh itselfor in Comp) is all that is needed to derive successive cyclic movement. Thisaccount does not need to stipulate additional features whose only purpose is toallow the derivation to proceed later on.

Further research is needed to account for the usual phenomena involved insuccessive cyclic movement, such as that-trace effects, Complementizer agree-ment (both of the Celto-Germanic and the Chamorro type, at least), inversionin embedded and matrix clauses, the argument-adjunct asymmetry regardinglong distance extraction, and of course reconstruction effects involving inter-mediate C domains.


8

FORMAL AND SUBSTANTIVE ELEGANCE IN THE MINIMALIST PROGRAM

On the emergence of some linguistic forms†

1 General considerations

The surprising fact about minimalism, in my view, is not that we seek economy,but that we actually find it. Biological evolution, to begin with, does not explainit, if seen in the realistic light that Gould (1991: 59–60) provides:

Those characteristics [such as vision] that we share with other closelyrelated species are most likely to be conventional adaptations. Butattributes unique to our species are likely to be exaptations.1 . . . As anobvious prime candidate, consider . . . human language. The adaptation-ist and Darwinian tradition has long advocated a gradualistic continu-ationism . . . Noam Chomsky, on the other hand, has long advocated aposition corresponding to the claim that language is an exaptation ofbrain structure. . . . The traits that Chomsky (1986b) attributes to lan-guage – universality of the generative grammar, lack of ontogeny, . . .highly peculiar and decidedly non-optimal structure, formal analogy toother attributes, including our unique numerical faculty with its conceptof discrete infinity – fit far more easily with an exaptive, rather than anadaptive, explanation. [My emphasis.]

Of course, one must be careful about what is meant by “non-optimal structure.”The structure of language is not functionally optimal, as garden paths show

for parsing structure, and effability considerations (the grammar allows us to sayless than we otherwise could) for producing structure. Lightfoot (1995) stressesthis aspect of Gould’s view, in the familiar spirit of Chomsky’s work. Thenagain, the issue arises as to whether the structure of language is non-optimal aswell, as the prevailing rhetoric of the 1980s presumed. The view at that time wasthat finding non-optimal structures is an excellent way of defending the speci-ficity of the linguistic system as a biological exaptation (hence as a natural,independent phenomenon of mind in the strongest sense, with little or no con-nection to communication processes and all that). However, Chomsky (1986b)already showed that the linguistic practice was far removed from this rhetoric.Thus, the working details of this piece of research showed an example of opti-mality in syntax, as exemplified by the notion of Movement “as a last resort.”


The piece also assumed categorial “symmetry,” a research program which hadby then been developed into the X�-theory.

And in fact the book made significant use of the research strategy of elimin-ating redundancy within the model, a trait of Chomskyan linguistics whichmakes its practice closer to that of the physical sciences, rather than to the exer-cise of evolutionary biology.

In the 1990s, Chomsky explicitly asks “. . . to what extent [standard] principlesthemselves can be reduced to deeper and natural properties of computation. Towhat extent, that is, is language ‘perfect’, relying on natural optimality con-ditions and very simple relations?” If the empirical answer is: “to some extent,”(realistic) biological evolution invoking exaptations (see Note 1) will havenothing to say about it. In class lectures (Spring 1995), Chomsky partlyaddresses this issue when comparing linguistics to another special science,chemistry, rather than to standard biology. Chemistry faces the question of howunstructured matter assumes organized form. Linguistics deals with howunstructured features are organized into syntagms and paradigms. The set ofprinciples that is responsible for determining linguistic structure is, to someextent, comparable to the set of principles with parallel chemical effects. Andalthough this is admittedly pushing it to a point which Chomsky might only veryspeculatively ever raise, it is imaginable that there is a deep level at which thesesets of principles, which determine certain pockets of regularity, can be relatedin ways that go beyond the metaphorical. Recent work on molecular biology(where, for instance, García Bellido (1994) literally talks about “phonology andinflection in genetics,” “molecular sentences” or “syntactic analyses”) suggeststhat the connection might not be totally impossible, or implausible.

Be that as it may, it is good to keep different sorts of economy in perspective. On the one hand, analyses in terms of what we may call static elegance have been very much part of the tradition of linguistics, and helped determine “best theories” or “learnable languages.” But on the other hand, minimalism makes fundamental use of the notion of what we may call dynamic elegance, when deciding among alternative derivations that converge on the basis of their least action. Fukui (1996) argues that this is an instance of the classic problem of calculus of variations, as is found in various areas of mechanics, electromagnetism, quantum physics, and so forth (see Stevens 1995). Given various paths that, say, a ray of light may take from an initial point 0 to a final point f, the path that light actually takes is determined in terms of it being the one involving least action. Likewise, given various derivational paths that may be invoked when going from a point 0 in a derivation to its final point f, the path the derivation takes is determined in terms of its fewest computational steps. This behavior of the linguistic system does not follow from its static elegance (bare output conditions). All competing derivations converge, and in that sense are equally elegant. In fact, there is a global character to dynamic elegance that we do not witness in static elegance. This is a consideration arising in calculus problems, where the underlying issue is the examination of a set of variations and a way of deciding on the adequacy of the solutions they provide.2 While there is a univocal solution to the problem “Is this structure grammatical?” (yes or no), there is no univocal solution to the problem “Is the derivation of this structure grammatical?” Whether this solution is valid depends on whether that solution is better; what is one bad solution for a structural context may be the best we have for an alternative context.3
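
To picture the global character of this comparison, purely as an illustration and not as a claim about the actual computational system, one can code the reference-set logic directly in Python; the toy derivations, their step counts and the convergence test below are all invented for the example.

    # Dynamic elegance as a comparison over a reference set: among derivations
    # that converge, only the one(s) involving fewest computational steps survive.
    # Cancelled or crashed derivations never enter the comparison.
    derivations = [
        {"name": "D1", "steps": 7, "converges": True},
        {"name": "D2", "steps": 5, "converges": True},
        {"name": "D3", "steps": 4, "converges": False},   # crashes; not a competitor
    ]

    reference_set = [d for d in derivations if d["converges"]]
    fewest = min(d["steps"] for d in reference_set)
    optimal = [d["name"] for d in reference_set if d["steps"] == fewest]
    print(optimal)   # ['D2'] - a verdict no inspection of a single derivation yields

The only point of the sketch is that the verdict is global: it is defined over the set of competitors, not over any one of them.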

The questions we face are not new. The substantive properties of water molecules, for instance, tell us little about the macroscopic behavior of a creek, and the whirlpools of turbulence it forms. Examples of this sort abound, involving global properties of systems of some sort. To use a fashionable term, these are emergent patterns that, for some reason, obtain of complex systems, in unclear circumstances. There is much talk these days about systems being comparable at some level, the behavior of some helping in the understanding of the behavior of others (see Meinzer 1994). I am not really sure what this means, though, particularly if we are supposed to be talking (at least in some central instances) about systems whose behavior is predictably unpredictable. More to the point, I have not seen any proposal explaining any linguistic principle in terms of any so-called principles of self-organization. Even Fukui’s interesting proposal that linguistic derivations involve the calculus of variations falls short of establishing what the relevant Lagrangian would be.4 Nonetheless, whether questions pertaining to the dynamic elegance of the system will be expressible in familiar terms is something we cannot determine a priori.

Consider, in that respect, Chomsky’s treatment of the Last Resort Condition (LRC) as a derivational requirement. A derivation that does not meet the LRC when generating a chain is impossible, and it is cancelled at the point of violation, without ever reaching a convergent or even a crashed LF. One may ask why the LRC should hold of derivations, and Chomsky (1995b) speculates that conditions of this sort reduce the range of alternatives to calculate a solution from. The system does not even include in the reference set of derivations that compete for optimality those which are cancelled somewhere along the way, because they failed to meet the LRC. This reduction of possible variations of the system recalls Haken’s (1983) Slaving Principle. Unstable modes in a system influence and determine stable modes, which are thereby eliminated at certain thresholds.5 A new structure emerges which results from unstable modes serving as ordering devices, which partially determine the behavior of a system as a whole. We may thus think of the LRC as ordering a region within the macroscopic behavior of a derivation. That is, at any given stage t in a derivation there is a potentially very large derivational horizon ht to contemplate, if the system proceeds according to standard derivational laws. This derivational horizon is dynamic, in that the system could in principle proceed forward in various ways. But suppose that certain regions in the derivational horizon (those corresponding to domains of morphological checking) are less stable than others. In particular, let it be true that a strong feature is a “virus” that the computational system must eliminate, immunizing the derivation from it as fast as possible.6 There is a sense in which this area within the derivational horizon provides a point of instability (as derivational complexity increases), which according to the Slaving Principle should enslave other competing modes that lead to other horizons. That is, while derivations that move to “immunize” an intruder feature are attracted to a point of instability, derivations whose movement is “idle” (or existing just to reach an interpretation) are not attracted to a slaving vortex. Thus, the derivational system does not even try derivations that conform to less unstable modes, any more than, given a particular vortex within a water creek, turbulence will have a high probability of emerging (as fluid velocity increases) anywhere other than within the confines determined by that very vortex.7

Frankly, I am not trying to give this speculation so much as an explanation for the LRC, as an example of the sort of approach that we may attempt. It has advantages and disadvantages. The advantages are that invoking this sort of line, we cannot be accused of gradualistic continuism (the LRC holds of derivational systems because, in this way, they are easier to use in speech production and processing, thereby giving us an argument that elegance within language is a result of an adaptation) or of Platonism (the LRC holds as a defining property of derivations because the nature of these processes is purely formal, and that is just how they work, thereby giving us an argument that language is a mathematical object). But the disadvantages should be obvious.

First, we run the risk of applying the latest complexity tool to whatever condition we may want to derive; we might even be tempted to bend the linguistic theory beyond recognition, for it to fit with the outside principle. However, it would be surprising if traditional linguistic principles all of a sudden start falling into place within those principles which are beginning to emerge in the emerging science of complexity. More likely, we will find nothing but similarities to use more or less metaphorically – and probably because many things resemble many other things out there. Second, we may be escaping some form of Platonism to fall into yet another form, albeit at a larger scale (“reality is a mathematical object”). Why the LRC should follow from the Slaving Principle is something that should worry us. Fortunately, here we are in the same boat as everyone else within the special sciences, and perhaps we should not worry about this any more or less than biologists should worry about whether the Slaving Principle played any role in the emergence of life, say.

So I suppose the moral is the usual one. We have to keep doing our dirty work, assuming that the end of syntax is nowhere nearer than the end of chemistry or biology is. However, doing work now involves two different, legitimate jobs. One is the usual analytic one, perhaps revamped with new tools. The other is a bit more abstract. Suppose we find phenomenon P, and characterize it in terms of conditions x, y, z. Have we found x, y, z properties of the linguistic system? The sorts of issues raised in this section allow for the possibility that phenomenon P may be an emergent property – like the spiral pattern in a snail shell – which arises dynamically from various systemic interactions. We may even have ways of coherently characterizing the properties of phenomenon P directly, by way of some function using x, y, z as variables – just as we can calculate the snail shell as a given logarithmic function. And yet, minimalism invites the suspicion that phenomenon P may not, in itself, instantiate any axiomatic property of the linguistic system – any more than a logarithmic function per se instantiates any axiomatic property of growth in the biological system.8 And to make life interesting, telling gold from glitter will not be nice or easy.

In what follows, I raise this sort of question with respect to the phenomenon of obviation – which I believe is partly emergent. It involves two structural relations, and one (perhaps more) interpretive relation. Let us call the latter Relation R. As for the former, full referring expressions (names and definite descriptions) involve only command: α must be obviative with respect to β only if β commands α. In contrast, pronominal elements involve command and locality. So this is perhaps a candidate for phenomenon P above, obeying conditions x (R), y (command), and z (locality). These sorts of conditions are articulated into a theoretical whole in various places that I return to, but most notably in Chomsky (1981) and (1986b). I want to propose, instead, that each of the relations involved in obviation, and the corresponding structural correlates that they involve, follow from different properties of the language faculty. If so, seeking a unified theory of obviation is akin to seeking a theory of shell patterning.

2 Command paths and Multiple Spell-out

Let me start by summarizing the proposal in Chapter 3, where it is shown that command paths emerging in the PF and LF components are a result of Multiple Spell-out. This is important in itself: Spell-out being just a rule (not a level), it should be able to apply anywhere, anytime. But the proposal is of relevance to us now because it makes command relations natural emergent properties of the system, thus not something that has to be re-stated at the PF or the LF components, even if playing a role there as well.

To begin with, Chapter 3 deduces Kayne’s LCA:9

(1) α precedes β iff: (a) α commands β; or (b) γ commands β and γ dominates α.

Throughout, I assume that command is a direct reflex of merger (M), adapted from Epstein (1999):

(2) α commands β iff α merges with γ, γ reflexively dominating β.10

Then command is the only relation which is derivationally defined (via M) for a set of heads within a derivational block. Compare (3a) and (3b). The boxes in (3) define “derivational blocks” or monotonic applications of M.11 Given any two categories X and Y within a derivational block, either X and Y are merged, or otherwise X is merged with (the projection of) a category which Y has merged with, etc.:

(3) [Structures for (3a) and (3b): (3a) is a single derivational block, built up by monotonic applications of M; (3b) contains two derivational blocks, a separately assembled complex phrase merged with another complex phrase.]


Call the object that results from monotonic applications of M a “command unit.”12 Command units emerge within a system with usual assumptions.
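
A minimal sketch, in Python, of how command can be read off the history of M in the spirit of (2); the node representation and the labels are invented for the illustration, and nothing here is meant as a claim about actual data structures of the language faculty.

    # Command as a reflex of merger: when a bare head merges with a category,
    # it commands every head that category (reflexively) dominates.
    class Node:
        def __init__(self, label, left=None, right=None):
            self.label, self.left, self.right = label, left, right
        def heads(self):
            if self.left is None:                       # a bare head
                return [self.label]
            return self.left.heads() + self.right.heads()

    def merge(a, b, label):
        return Node(label, a, b)

    def command_pairs(node, out=None):
        out = [] if out is None else out
        if node.left is not None:
            if node.left.left is None:                  # left sister is a bare head
                out += [(node.left.label, h) for h in node.right.heads()]
            if node.right.left is None:                 # right sister is a bare head
                out += [(node.right.label, h) for h in node.left.heads()]
            command_pairs(node.left, out)
            command_pairs(node.right, out)
        return out

    # One derivational block, built monotonically: gamma merges with delta,
    # beta with the result, and alpha with that result.
    unit = merge(Node("alpha"),
                 merge(Node("beta"),
                       merge(Node("gamma"), Node("delta"), "gamma"), "beta"),
                 "alpha")
    print(command_pairs(unit))
    # [('alpha', 'beta'), ('alpha', 'gamma'), ('alpha', 'delta'),
    #  ('beta', 'gamma'), ('beta', 'delta'), ('gamma', 'delta'), ('delta', 'gamma')]
    # (sister heads command each other; the last pair reflects that)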

To deduce the base step of the LCA in (1a), note that it encodes two formal and one substantive requirement. The latter is that unordered structural relations map to precedence ones. Chomsky (1995b) assumes plausibly that this follows from bare output conditions on PF (contra Kayne 1994). The first formal requirement expressed through (1a) is that the sequence of PF heads should be structured in terms of already existing structural relations; the second formal requirement, that the PF sequence should map specifically to familiar precedence relations.13 Neither requirement follows from the substantive assumption, but they are optimal routes to take. First, command is the only deduced structural relation that holds of heads which are structured within “command units”. If the LCA attempts a mapping from already existing relations among heads, only command could be relevant without any additional cost.14 Second, mapping the command sequence ⟨α, β, γ, …⟩ to a sequence of PF timing slots ⟨1, 2, 3, …⟩ in a bijective fashion is optimal. For a two-dimensional space, mapping the first sequence to the x axis and the second sequence to the y axis, the relevant function is the trivial x = y; alternatives to x = y involve more operational symbols.15 Therefore the base-step of (1) follows from piggy-backing on M’s derivational history, given dynamic economy.

Chapter 3 then proceeds to deduce the induction step in (1b) in a less direct fashion, since there is no trivial way of deducing the fact that domination is involved in this step. The proposal is that (1b) is a result of applying Spell-out multiply, each time according to the base step in (1a). Assuming an LC Theorem (the base in (1a) being deduced), only command units can be linearized by the application of a transformational procedure L at Spell-out. L is an operation L(c) = p, mapping command units c to intermediate PF sequences p (see Note 15), and removing phrasal boundaries from c representations:

(4) { �,{ �,{ �,{ �,{ ,{ , }}}}}}→ { �, � �, �, , � }

The output of L is a labeled word sequence, with no constituent structure, hence, not a phrase-marker. It is still a labeled object, though, and in that sense akin to a frozen expression whose internal structure is inaccessible to the computational system, but can nonetheless be part of a phrase-marker. We can then proceed to merge it further, which will yield a new command unit for L to linearize. This is Multiple Spell-Out, which makes (1b) true, but derived.
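
The flattening itself, and the trivial x = y mapping onto PF slots mentioned above, can be sketched in the same illustrative spirit; the nested representation of a command unit is, again, just a stand-in.

    # L(c) = p: strip constituent structure from a command unit, keeping only the
    # label and the command-ordered sequence of heads; then assign PF timing slots
    # 1..n to that sequence via the trivial function x = y.
    command_unit = ("alpha", ("beta", ("gamma", "delta")))   # cf. (4)

    def spell_out(cu):
        if isinstance(cu, str):
            return [cu]
        head, rest = cu
        return [head] + spell_out(rest)

    p = spell_out(command_unit)                 # ['alpha', 'beta', 'gamma', 'delta']
    label = p[0]                                # the object remains labeled ...
    pf_slots = {h: i for i, h in enumerate(p, start=1)}   # ... but is now just a sequence
    print(label, p, pf_slots)
    # alpha ['alpha', 'beta', 'gamma', 'delta'] {'alpha': 1, 'beta': 2, 'gamma': 3, 'delta': 4}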

A complex phrase-marker involving several derivational blocks must be spelled-out prior to M, or the derivation will crash at PF (once (1b) has no axiomatic status). But note, crucially, that partially spelled-out objects must be mapped to LF prior to the phrasal flattening induced by L. Thus generalized merger (of the sort in (3b)) has to produce two objects, a linearized one for PF, and a hierarchically unordered one for LF. This entails a dynamically bifurcated model, of the sort sketched by Bresnan (1971), Jackendoff (1972) and Lasnik (1972, 1976: appendix), and recently revamped by Lebeaux (1991). The PF and LF components are not mapped in a single static step, but through a cascade of derivational events that end up converging (or not) on final representations.

This property of the system is central, given that a dynamic bifurcated model must have incremental characteristics (so as to assemble partial structures until final representations are achieved). The ultimate assembling operation has to be substitution, as in Chomsky (1995b: Chapter 3). The spelled-out object serves the purpose of Chomsky’s designated substitution target 0, for it has a label (allowing substitution of the right sort), but no phrasal structure. Structure is gained only by substituting into the partial p and l representations, already “stored” in the LF/PF components upon previous instances of Spell-out:16

(5) a. Standard merger and Spell-out of first derivational block:

b. Standard merger and Spell-out of second derivational block:

c. Generalized merger of first command unit and second command unit:

[Diagrams for (5a–c): each derivational block is spelled out by L into a partial representation; substitution S(p1, p2) and S(l1, l2) then assemble the final PF and the final LF results, respectively.]

Note that PF processes must happen within intermediate p representations, or they will not be constructive processes. For instance, determination of a focus projection (as in Cinque 1993), or comparable phonological processes of the sort discussed in Bresnan (1971), must take place prior to the integration of ⟨p1, p2, p3, … pn⟩ sequences into a final phonetic representation. Afterwards the representation will be in the interpretive A/P components. We can in principle talk about representational units within this phonetic object, but these have to be justified on the basis of substantive bare (A/P) properties. It is unclear what such property a focus projection, say, could follow from. In contrast, a focus projection path is a natural object to expect in partial representations p. In fact, Cinque’s characterization is essentially in terms of command (right) branches, as the dynamic bifurcated model correctly predicts.17

Comparable issues arise for the LF branch. Just as PF processes must happen within intermediate p representations, LF processes must happen within intermediate l representations – or they will not be constructive processes. Then if only partial representations l can be the locus of LF constructive (i.e. formal) processes, it follows that only command units could be the locus of LF universals.18 Again, we could in principle talk about representational units within the final semantic representation, but these would need substantive (I/C) justification.

In sum, command is indirectly relevant for LF (or PF) units because these are dynamically mapped from partial representations where only command holds among word-level units.19 If this proposal is correct, it does not just happen to be the case that LF dependencies are sensitive to command (as opposed to other imaginable relations). Matters could not have been otherwise. Relation R, the substantive relation involved in obviation, instantiates a sub-case of this general property of LF.

Of course, we could define other relations, but whatever step we take in this direction has serious consequences, given these (rather crucial) assumptions:

(6) Assumption about the Inclusiveness of the LF Component
    All LF representations are the expression of lexical relations.

(7) Assumption about the Uniformity of the LF Component
    All LF processes are observable in some form prior to Spell-out.

Given (6) and (7), simply stipulating a specifically covert relation like “is-in-the-same-phrase-marker-as” will be non-inclusive, since this hypothetical relation would be determining representations which are not the expression of lexical features. Suppose, then, that we want to deduce the relation in question from some constructive procedure. Saying that it is a procedure specific to the LF component would be non-uniform, since the procedure in question would not be observable in the overt component. This means the only possibility left is that “is-in-the-same-phrase-marker-as” follows from the pre-Spell-out procedures. If the architecture I have discussed is anywhere near right, such procedures never yield a global relation of the sort we are considering; rather, they are limited to command relations (“is-in-the-same-derivational-block-as” dependencies).

I hasten to add that the semantic component, after LF, may in fact be able to invoke such global relations as “is-in-the-same-phrase-marker-as.” But, ultimately, that is a matter of performance: whatever goes on after the levels of LF and PF which interface with the sensory-motor and intentional-conceptual components. Given the present architecture, if we find that phenomenon P is sensitive to a structural relation that cannot be deduced from syntactic procedures (in the narrow sense invoked here), this is an immediate indication that P is not narrowly syntactic.

That possibly is the case with Weak Cross-over, of the sort illustrated in (8):

(8) a. His son likes everyone.
    b. Who does his son like?
    c. His son likes a man.


There is a characteristic interpretive limitation in sentences of this sort: the pronoun cannot co-vary with everyone, who or a man (in an indefinite reading). Crucially, though, there is no command relation between either of these elements and the pronoun, and hence by hypothesis whatever interpretive relation is relevant in these instances cannot be narrowly syntactic (i.e. takes place after LF).20

Conversely, that P is best characterized by a relation which can be deduced on syntactic grounds makes it possibly, but not necessarily, syntactic. For instance, in light of (9), we may assert (10) (see May 1977):

(9) a. John is believed by his mother to be a genius.
    b. His mother believes that John is a genius.
    c. Everyone is believed by someone to be a genius.
    d. Someone believes that everyone is a genius.

(10) If quantifier α commands quantifier β at LF, then α has scope over β.

Now, even if (10) indeed obtains, it may be possible for it to result from post-LF conditions.21 This is all to say that the fact that command defines obviation does not in itself make obviation an immediate part of syntax. I will argue that at least local obviation is syntactic, but the point cannot be conceded a priori.

3 Locality matters

Consider next locality. The first obvious question is why given elements should care about how far away other elements are. For relations involving movement, we have an answer, the Minimal Link Condition. This poses the same sorts of theoretical questions that the Last Resort Condition did, in terms of whether its violation leads to a derivational cancellation or rather a crash, and either way, why locality should be part of the phenomenon of grammar. It is worth stressing that, when we say locality in minimalism, we mean something quite specific:

(11) Given a command unit including ⟨α …, β …, γ …, δ …, ε …⟩, and where MinD(X) = {α, β, …} and MinD(Y) = {δ, ε, …}, γ is closer to the elements in MinD(X) than the elements in MinD(Y) are, but (i) γ is not closer to any of the elements in MinD(X) than to any other such element in MinD(X) and (ii) none of the elements in MinD(Y) is closer to γ than any other element in MinD(Y).

Distance is relevant only within command units, which we expect given the present architecture. But locality voids distance: elements within the same minimal domain are equally far as targets or equally close as sources of movement to or from an element that is trying to move or be moved.
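
As a toy rendering of (11), with invented items, domains and rankings, the point is simply that distance can only distinguish positions belonging to different minimal domains:

    # Locality voids distance: two elements of the same minimal domain are never
    # at different distances from an outside probe or target.
    minimal_domains = {"MinD(X)": {"alpha", "beta"}, "MinD(Y)": {"delta", "epsilon"}}
    height = {"MinD(X)": 2, "MinD(Y)": 1}        # MinD(X) sits higher in the command unit

    def domain_of(item):
        return next(d for d, members in minimal_domains.items() if item in members)

    def closer_to_a_higher_target(a, b):
        da, db = domain_of(a), domain_of(b)
        if da == db:
            return None                          # equidistant: same minimal domain
        return a if height[da] > height[db] else b

    print(closer_to_a_higher_target("delta", "epsilon"))   # None - locality voids distance
    print(closer_to_a_higher_target("beta", "delta"))      # 'beta' - different domains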

Assuming the state of affairs in (11),22 I am concerned with the fact that elements standing in a “super-local” relation of lexical (L) relatedness do not interfere with one another for distance purposes. Curiously, various phenomena seem to be sensitive to the notion of L-relatedness, as in (12):

(12) α and β are L-related iff a projection of α is in the minimal domain of β.

(13) a. Absence of distance (as in (11)).
     b. Head movement.
     c. Word formation (as in Hale and Keyser 1993).
     d. θ-role assignment (as in Chomsky 1995b).
     e. A-movement (as in Chomsky 1995b).
     f. Distribution of Case features (see below).
     g. Local anaphoric binding (see below).
     h. Local obviation (see below).

(13b) obeys the head movement constraint.23 (13c) (words cannot be derivationally formed without L-relatedness of all their constituents to a pivotal head) adds to (13b) a limit on successive head movement. It only happens twice, still creating a word. (13d) is true for role assignment (verbs are L-related to their internal argument(s), and to the specifier of the v-projection, after the verb raises to the v shell). As for (13e), consider (14) below.

(14) a. [Tree: S in Spec-TP and O in an outer Spec of vP, with V adjoined to v and v to T, and traces tV, tO, tS in their base positions – each A-movement stays within a domain of L-relatedness.]
     b. [Tree: O in Spec-TP, with the subject remaining inside vP – a movement of O that falls within no domain of L-relatedness.]


While (14a) involves A-movements which are never across domains of L-relatedness (O moves within the domain of V’s L-relatedness, S, within the domain of v’s L-relatedness), the movement of O in (14b) is not within the domain of L-relatedness of any category, and is impossible. This should bear on the locality implied in the distribution of Case features (13f), although there is the issue altogether of what it means to have uninterpretable Case features that moved determiner phrases must check (see Section 5). In turn, (13g) can be illustrated as in (15) (to ensure local binding, I use the Spanish clitic se):

(15) a. Él se destruyó.
        he self destroyed
     b. *Él se destruyó fotos.
        he self destroyed photos
        (“He destroyed photos of himself.”)

Finally, local obviation (13h) arises in the same contexts:

(16) a. He destroyed him.
     b. He destroyed photos of him.

Whereas (16a) is obviative (the reference of he and him differs), (16b) is not.

To the extent that all of the phenomena in (13) involve L-relatedness, we may think of this notion as determining a dimension of some sort, within which certain structural properties are conserved.24 There are plausible candidates for conservation in most of these examples. (13a) conserves locality itself; (13b), the symmetry of head relations; (13c), lexical integrity; (13d), lexical dependencies; (13e), (A-)chain integrity. The last three, though, are less obvious. (13g) might be a sub-case of (13b) (as in the literature partly summarized in Cole and Sung 1994) or a sub-case, essentially, of (13d) (as in Reinhart and Reuland 1993, though see Section 7). Why (13f) should be conserving Case relations is hard to understand, if we do not know what Case is. And as for (13h), we are essentially clueless. In what follows, I shall try to relate the mystery in (13h) to that in (13f), by way of a detour.


4 Clitic combinations

The Spanish paradigm in (17) is idealized from Perlmutter (1971) (see more generally Wanner 1987). (The basic sentence is always: he showed y to z (in the mirror) and judgments are my own.)

(17) Possible argument clitic combinations:25

     Dat      Acc        Dat      Acc        Acc      Dat        Acc      Dat

a.  *me       me/nos    *te       me/nos    *me       me/nos    *te       me/nos
     I.sg     I.sg/pl    II.sg    I.sg/pl    I.sg     I.sg/pl    II.sg    I.sg/pl
    *nos      me/nos    *os       me/nos    *nos      me/nos    *os       me/nos
     I.pl     I.sg/pl    II.pl    I.sg/pl    I.pl     I.sg/pl    II.pl    I.sg/pl

b.  *me       te/os     *te       te/os     *me       te/os     *te       te/os
     I.sg     II.sg/pl   II.sg    II.sg/pl   I.sg     II.sg/pl   II.sg    II.sg/pl
    *nos      te/os     *os       te/os     *nos      te/os     *os       te/os
     I.pl     II.sg/pl   II.pl    II.sg/pl   I.pl     II.sg/pl   II.pl    II.sg/pl

c.  *le(s)    lo(s)                          lo(s)    *le(s)
     III(pl)  III(pl)                        III(pl)   III(pl)

d.   me       lo(s)      te       lo(s)     *me      *le(s)      te       le(s)
     I.sg     III(pl)    II.sg    III(pl)    I.sg     III.pl     II.sg    III(pl)
     nos      lo(s)      os       lo(s)     *nos      le(s)     *os       le(s)
     I.pl     III(pl)    II.pl    III(pl)    I.pl     III.pl     II.pl    III(pl)

e.  *le(s)    me        *le(s)    te        *lo(s)    me        *lo(s)    te
     III(pl)  I.sg       III(pl)  II.sg      III(pl)  I.sg       III(pl)  II.sg
    *le(s)    nos       *le(s)    os        *lo(s)    nos       *lo(s)    os
     III(pl)  I.pl       III(pl)  II.pl      III(pl)  I.pl       III(pl)  II.pl

The first surprising fact about this paradigm is that so few clitic combinations are possible. Some ungrammatical combinations may follow from the distinction argued for in Uriagereka (1995b) between strong [+s] and weak [−s] special clitics – I/II clitics being [+s], and III clitics, definite determiners in nature, being [−s]. Assuming that analysis, combinations of the form ⟨[−s], [+s]⟩ are impossible in a language like Spanish as a result of syntactic restrictions on movement and clitic placement, irrelevant now. That eliminates (17e). In turn, Uriagereka (1988a: 369 and ff.) provided an account of why in Spanish the clitic ordering ⟨Accusative, Dative⟩ is not possible (along the lines of why *I gave the book him is out in English). Supposing this too, the paradigm would reduce to (18) below. And surely there are ways in which to reduce some of the examples in (18) to violations of Chomsky’s Binding Condition B. However, the reduced paradigm in (18) shows that this cannot be the whole story:


(18)

     Dat      Acc        Dat      Acc

a.  *me       me/nos    *te       me/nos
     I.sg     I.sg/pl    II.sg    I.sg/pl
    *nos      me/nos    *os       me/nos
     I.pl     I.sg/pl    II.pl    I.sg/pl

b.  *me       te/os     *te       te/os
     I.sg     II.sg/pl   II.sg    II.sg/pl
    *nos      te/os     *os       te/os
     I.pl     II.sg/pl   II.pl    II.sg/pl

c.  *le(s)    lo(s)
     III(pl)  III(pl)

d.   me       lo(s)      te       lo(s)
     I.sg     III(pl)    II.sg    III(pl)
     nos      lo(s)      os       lo(s)
     I.pl     III(pl)    II.pl    III(pl)

In the shaded examples, the reference of the two clitics is necessarily different, given their lexical interpretation. So while a binding theoretic account exists for the rest of the paradigm, it clearly does not generalize.

To try a different take on these problems, consider restrictions as in (19):

(19) a. the lions / * the lionses
     b. the yankees / * the yankeeses
     c. the scissors / * the scissorses
     d. the bus / the buses

Why can lionses not denote a plurality of pluralities of lions? Or yankeeses denote the plurality of teams which carry the name and/or description (the) yankees? Or why is scissorses not the way to denote two or more simple objects?

Apparently morphology is to blame. When a plurality marker appears in lions, yankees and scissors (unlike in bus), a second one is not tolerated. This restriction is known to hold only for inflections, not for derivational affixes:

(20) a. re-attach, re-reattach, re-rereattach . . .
     b. sub-optimal, sub-suboptimal, sub-subsuboptimal . . .
     c. transformation, transformationalize, transformationalization . . .

It is at first not clear what minimalism can say about (19). Certainly, an inflectional feature must be checked (and erased if uninterpretable) in the course of the derivation, unlike a derivational feature. Assuming this, though, consider the Spanish (21c) below, which is analogous to (19a), but more conspicuous. First, note that in (21a) one of the two plural features must be uninterpretable. The data in (21c), which do not involve plurality markers on the nominals, suggest that the interpretable feature is the one in the determiner.26 Hence we assume that la “the” has the interpretable feature against which the plurality feature of leonas “lions” is checked:


(21) a. [[[la] s] [[leona] s]]
        the [+pl] lion [+pl]
     b. * [[[la] s] [[[leona] s] as]]
        the [+pl] lion [+pl] [+pl]
     c. la Ridruejo, Sartorius, Habsburg . . .
        “The female who is a member of the Ridruejo, Sartorius and Habsburg families.”
     d. las Ridruejo, Sartorius, Habsburg . . .
        “The females who are all members of the Ridruejo, Sartorius and Habsburg families” or
        “the females who are each a Ridruejo, a Sartorius and a Habsburg.”

The question is why the checking in (21b) cannot proceed. Note that the problem is not the fact that we have two uninterpretable features to check against the, since the’s feature is interpretable, and hence does not erase upon checking an uninterpretable feature that moves to its checking domain. Nor is the problem that the most deeply embedded [+pl] feature would not be in the checking domain of the interpretable feature, particularly if at LF each uninterpretable feature can move (together in a feature bundle or separately) to the relevant target. Likewise, it does not seem that one of the features is a closer source for attraction than the other one is: both are within the same local domain.

To avoid this impasse, consider the definition of Minimal Domain:

(22) Definition of Minimal Domain:
     For α, a feature-matrix or a head #X#, CH a chain (α, t) or (the trivial chain) α:
     (i)   MAX(α) is the smallest maximal projection dominating α.
     (ii)  The domain D(CH) of CH is the set of features dominated by MAX(α) that are distinct from and do not contain α or t.
     (iii) The minimal domain MIN(D(CH)) of CH is the smallest subset K of D(CH) such that for any x belonging to D(CH), some y belonging to K dominates x.

There is a crucial modification in (22) to the standard notion of Minimal Domain, which I have italicized. We must grant this extension in the current version of minimalism because, otherwise, matrices of formal features which move by themselves at LF will never be part of a minimal domain. The extension is trivial. Checking is about features, not about categories. Checking domains should be about features and not about categories.27 However, now consider the following fundamental question. How does the grammar know one feature from another? Two identical features should be taken to be only one feature once they are lumped into a set – and a minimal domain is nothing but a set. This is to say that if the two [+pl] features in (21b) reach the same checking domain (which is a sub-set of a minimal domain), they will be identified as only one feature in that set, and hence only one checking will be possible. This leads to a direct crash, perhaps even to a derivational cancellation.28
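
The set-theoretic point can be made with the most ordinary of data structures; the feature representations below are only mnemonic, and the sketch is illustrative rather than an implementation of the grammar.

    # Two tokens of an identical feature, once gathered into a set (and a checking
    # domain is nothing but a set), are no longer two things: only one checking
    # relation can be established.
    checking_domain = set()
    checking_domain.add(("+pl",))     # the [+pl] feature of the first plural affix
    checking_domain.add(("+pl",))     # a second, identical [+pl] feature
    print(len(checking_domain))       # 1 - the grammar has no way to tell them apart

    # Bags marked for distinctness, by contrast, survive as two members:
    checking_domain = {("+pl", "accusative"), ("+pl", "nominative")}
    print(len(checking_domain))       # 2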

A similar analysis can be given to the bad examples in (18), if we are able to show that the crucial feature that motivates clitic movement is identical in two different clitics which, following Chomsky (1995b), seek checking within the checking domain set of the same hosting head. For all of the (a) and (b) examples in (18) we can say that the relevant feature is [+s], whose substantive content is “pertaining-to-the-pragmatic-axis” (that is, I and II). For the (c) examples the relevant feature is [−s], whose substantive content is that of a definite article. Of course, the clitics do differ in phonological shape, but this is irrelevant. PF features never enter into syntactic computations. It may also be suggested that even if all III clitics have the same substantive character, they differ in the uninterpretable feature of Case. Below, I propose that it is because arguments in the same checking domain are in fact non-distinct that they must be marked with Case.29 Finally, why are I and II not distinguished? The grammar distinguishes the fact that these are speech-oriented clitics, which I code through feature [+s]. What I do not think to be the case is that the grammar codes differences between I and II, even if pragmatics does.30

If these ideas are on the right track, then not surprisingly, when a combination of [+s] and [−s] features is at issue, the result is perfect, as in (18d). Each feature is indeed identified as distinct in the checking domain.
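
A sketch of that calculation over the reduced paradigm in (18), simply restating the [+s]/[−s] assignments given above; everything else is invented notation.

    # [+s]: speech-oriented (I/II) clitics; [-s]: determiner-like (III) clitics.
    # Two clitics whose relevant feature is identical collapse within the checking
    # domain of their shared host, and the result crashes at LF.
    s_value = {"me": "+s", "te": "+s", "nos": "+s", "os": "+s",
               "le": "-s", "les": "-s", "lo": "-s", "los": "-s",
               "la": "-s", "las": "-s"}

    def cluster_survives(dative, accusative):
        return len({s_value[dative], s_value[accusative]}) == 2

    print(cluster_survives("me", "lo"))   # True  - (18d)
    print(cluster_survives("le", "lo"))   # False - (18c)
    print(cluster_survives("te", "me"))   # False - (18a, b)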

One may be tempted to favor a semantic analysis, but the examples in (23) – corresponding to the bad sequences in (18a)/(18b) – do not make it advisable.31

(23) a. Me mostró a mí/nosotros.             Te mostró a mí/nosotros.
        me showed to me/us                   you showed to me/us
        “He showed me to me/us.”             “He showed you to me/us.”
        Nos mostró a mí/nosotros.            Os mostró a mí/nosotros.
        us showed to me/us                   you.pl showed to me/us
        “He showed us to me/us.”             “He showed you guys to me/us.”
     b. Me mostró a ti/vosotros.             Te mostró a ti/vosotros.
        me showed to you/you.pl              you showed to you/you.pl
        “He showed me to you/you guys.”      “He showed you to you/you guys.”
        Nos mostró a ti/vosotros.            Os mostró a ti/vosotros.
        us showed to you/you.pl              you.pl showed to you/you.pl
        “He showed us to you/you guys.”      “He showed you guys to you/you guys.”

The fact that some (as opposed to none) of these combinations arise with full pronouns indicates that there is no semantic problem with (18). But the fact that all of the examples are grammatical is even more interesting. It suggests that a Condition B treatment of some of the ungrammatical examples in (18) is on the wrong track. This is welcome, because otherwise we would be ruling out some of those examples twice, through Condition B, and through a failure in checking.


One other example is significant. Compare (18c), now repeated, to (24b):

(24) a. *Le lo mostró.              b. Se lo mostró.
         him him showed                se him showed
                                       “He showed him to him.”

This example usually gets what I believe to be a misleading analysis. It is glossed as (24a), claiming that se in (24b) is just a phonological variant of le. But why le should get a phonological variant in just this instance is never discussed. Likewise, the surprising fact in (25d) usually goes unnoticed:

(25) a. Aquí se encarcela hasta a uno mismo.
        here se(impersonal) jail even to one same
        “Here, one even jails oneself.”
     b. *Aquí se se encarcela.
        here se(impersonal) se(reflexive) jail
     c. Aquí se envía dinero a los familiares.
        here se send money to the relatives
        “Here, one sends money to one’s relatives.”
     d. *Aquí se se lo envía.
        here se(impersonal) se(“DATIVE”) it send

Bouchard (1984) discussed paradigms in which incompatibilities of the sort in (25b) arise. Impersonal se cannot co-occur with reflexive se. This has led various authors to treat all instances of se as being the same se (see Raposo and Uriagereka 1996 for discussion and references). Let us suppose this is correct. Whatever indirect object se is, it cannot co-occur with impersonal se (25d). Thus it looks as if this “dative” se is se after all.32 If so, we immediately see why (24b) is grammatical, assuming (see Section 6) that se, unlike le, does not have a feature [−s], thus being compatible with the [−s] lo.33

We shall discuss in Section 8 how (24b) comes to mean something similar to what (24a) would mean. (24b) is as good a structure as the grammar of Spanish generates when trying to express the meaning implied in (24a). It is plausible that other Romance languages avoid the sequence le lo in other ways, thus for instance creating merged clitics like the Galician-Portuguese llo/lho, a single clitic coding the meaning of lle and (l)o, or the Italian glielo, which collapses le and lo. Why the language would need to do this is what matters for our purposes. Two clitics with the same [−s] feature hosted in the same head lead to an LF crash.


5 Uninterpretable features

Recall now the English paradigm in (26)–(30) (where ! marks obviation):

(26) a. *I like me b. …you c. …him d. …us e. …them

(27) a. you like me b. *…you c. …him d. …us e. …them

(28) a. he likes me b. …you c. !…him d. …us e. …them

(29) a. we like me b. …you c. …him d. *…us e. …them

(30) a. they like me b. …you c. …him d. …us e. !…them

If the approach in the previous section is correct for (18), can we get it to handle these familiar facts?34 In Chomsky’s (1995b) system, the Case features of a direct object move to the domain of T at LF, where they are checked against the sublabel V of T (which is created upon the independent movement of V’s features to T).35 The point is: at some stage in the derivation the Case features of the subject are in the checking domain of T (its specifier), and later on the Case features of the object are in the checking domain of T (adjoined to it).36 How do we keep track of those features, if they are identical and included in a set?

We may argue that the grammar does not get “fooled” because the object Case features are [+accusative], while the subject Case features are [−accusative]. Yet, there is something peculiar in this reasoning. Why are there different Case features for elements within the same domain? Because we need to distinguish subject Case checking from object Case checking. And why do we need to distinguish those? Because both subjects and objects have Case features! In a nutshell, we are positing a distinct Case feature with the sole purpose that an object or a subject moves to a configuration that gets rid of it.

The reason that Chomsky (1995b) considers Case an uninterpretable feature is empirical. Argumental movement by LF itself would also be forced even if the Case features were interpretable, so long as the elements that host the Case feature at LF have themselves an uninterpretable feature. However, consider (31):

(31) * The man seems that t is smart.

The man moves to check the matrix D feature in T, after having checked and erased Case in the embedded clause. The only thing that goes wrong in (31) is that the matrix “Case hosting” feature is not checked, because the regular Case feature in the man has been erased downstairs; but that holds only if Case is uninterpretable, for otherwise Case could be checked several times.

However, while these mechanics work, they are unsatisfactory. The real question was and still is why there should be uninterpretable Case features in the grammar. Note that the existence of uninterpretable features per se is not particularly troubling. Take, for instance, uninterpretable D features of various sorts. They code a dependency between an interpretable D feature in a determiner or tense and some associate. Why there should be such long-distance dependencies is an issue, but once this is assumed, it is perhaps even natural that we should code them through this particular device. What is harder to see is what is at issue in Case checking, since both the source and the target of the movement involve uninterpretable features. Is this an irreducible quirk?

Consider Case features in some detail. Unlike intrinsic features (like D or wh-), Case features are relational. A D feature, for instance, has a value “+” which needs no further validation and allows the D feature to appear in certain checking domains. Case is different. If a Case feature has value “accusative,” that value needs to be checked, specifically, against a “Case hosting” feature “I-check-accusative.” In fact, Chomsky argues that checking an “accusative” feature with a feature “I-check-nominative” leads to a feature mismatch, and an immediate derivational cancellation. But the very notion of “feature mismatch” presupposes not just feature checking, but furthermore feature matching.

Why can we not say just that, in the same way that some feature in T attracts, specifically, a D feature (and not a wh- one, say), some other feature in T attracts, specifically, an accusative feature (and not a nominative one)? We cannot, because that denies the phenomenon of Case altogether. Accusative or nominative are not features, but values of a particular sort of feature, even if these values are relationally determined, through matching. It must be emphasized that matching is a grammatical process, unlike checking, which is just a non-unified phenomenon. Under certain checking configurations (given domains as in (22)), features may stand in checking relations. When they do, certain things may happen; for instance, erasure ensues if a given feature is uninterpretable. Matching, in contrast, is a derivational process that sanctions certain values of features; if this process fails, the derivation is canceled. No derivation is canceled for a failure in checking; there are no such failures – what fails is the resulting representation.

It is thus important to understand what happens when matching succeeds:

(32) Checking Convention
     When a relational feature [R-F] is attracted to match a feature [F-R], the FF-bag containing the attracted feature is R-marked.

A simple look at an actual morphological paradigm shows that (32) is accurate:

(33) Spanish pronominal clitics: Their features and their morphology:

               Acc       Dat              Acc   Dat      sg   pl     ms   fm
     sg   I    me        me        I   m  e     e        ∅    s      o    a
          II   te        te        II  t  e     e
          III  lo/la     le        III l  ∅     e
     pl   I    nos       nos       I   n  o     o
          II   os        os        II  ∅  o     o
          III  los/las   les       III l  ∅     e

The key here is this: there is no unified morphological realization for an accusative or dative feature. For I/II-sg, it is e. For I/II-pl, it is o. For III-sg/pl, it is ∅ for the accusative and e for the dative. Paradigms of this sort are quite common, and tell us something simple and well known. We need complex features. We must allow for features like [I, sg, acc] or [III, sg, ms, acc].37 Or differently put: the value accusative or dative is a value for a whole bag of formal features, as the Checking Convention predicts.

But even if (33) indicates that things are this way, why should they be so? I suggest that the Checking Convention allows featural distinction, for FF-bags which typically end up in the same set. That is, assume that all person, number, gender features are interpretable, in accordance with Chomsky’s (1995b) analysis. Now imagine a situation (within a given minimal domain) where the features in each argument have the same values. This should be very common, once we set aside Case. Most statements about the world involve III subjects and III objects. Then it is natural for grammars to have a purely formal device to mark FF-bags for distinctness. At that point, no issue ever arises, regardless of specific values.
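
Schematically, and only schematically (the representation of FF-bags below is invented), the Checking Convention then works as a distinctness-marking device:

    # (32), pictured: matching a relational Case feature erases it but R-marks the
    # whole FF-bag, so two otherwise identical bags end up formally distinct.
    def check_case(ff_bag, case_value):
        marked = dict(ff_bag)
        marked.pop("Case", None)          # the uninterpretable feature is erased
        marked["R-mark"] = case_value     # ... and the bag is marked with the value
        return marked

    subject = {"person": "III", "num": "sg", "gen": "ms", "Case": "unvalued"}
    obj     = {"person": "III", "num": "sg", "gen": "ms", "Case": "unvalued"}
    print(subject == obj)                 # True: without Case, the bags are non-distinct
    print(check_case(subject, "nominative") == check_case(obj, "accusative"))   # False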

6 Obviation revisited

The question is then what effect this grammatical fact has on speaker interpretation, a part of performance within minimalism. We have a feature “I-check-accusative” associated to v, and it has the effect of erasing the Case feature in him, and correspondingly “accusative”-marking the FF-bag that contained the erased Case feature. I suggest that we bite the bullet and relate the mystery of Case to the mystery of local obviation, in the following way: “accusative”-marked FF-bags are disjoint from “nominative”-marked FF-bags.38

Chomsky (1995b) assumes, essentially, that FF-bags carry referential features (see Chapter 10), and in that respect it is not surprising that bags marked for distinctness should be interpreted differently as well. I assume with Postal (1966) that pronouns are hidden definite descriptions; him has the import of the one. With Higginbotham (1988), I take definite descriptions to invoke a context variable whose range is confined by the speaker. Then the one is roughly, though more accurately, the one I have in mind. In sum, (34a) has the informal logical form in (34b):

(34) a. He likes him.
     b. [the x: one(x) & X(x)] [the y: one(y) & Y(y)] x likes y

And the key factor is this: can context variables X and Y have the same value? If not, the two unique ones invoked will come out distinct: one is a unique one salient in context X, the other one a unique one salient at context Y. On the other hand, if X = Y, then the two ones will be the same, all other things being equal. My specific claim, then, can be technically expressed as follows:

(35) Transparency Condition
     In the absence of a more specific indication to proceed otherwise, where FF-bags α and β are grammatically distinct, the speaker confines the range of α’s context variable differently from the range of β’s context variable.
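
As a final illustrative sketch, the Transparency Condition can be pictured as a default assignment of context variables, overridable by an explicit demand for sameness of the sort selv will be seen to impose below; the representation of “contexts” is invented.

    # Default: grammatically distinct FF-bags receive distinct context variables,
    # hence (all else equal) distinct referents, as in (34b). An explicit
    # 'same-as-before' indication overrides the default.
    def assign_contexts(expressions, same_as_before=None):
        same_as_before = same_as_before or {}
        contexts = {}
        for i, e in enumerate(expressions):
            if e in same_as_before:
                contexts[e] = contexts[same_as_before[e]]   # sameness explicitly demanded
            else:
                contexts[e] = "C%d" % i                     # default: a fresh context
        return contexts

    print(assign_contexts(["he", "him"]))
    # {'he': 'C0', 'him': 'C1'} - distinct contexts, hence obviation by default
    print(assign_contexts(["Peter", "Anne", "hende selv"],
                          same_as_before={"hende selv": "Anne"}))
    # {'Peter': 'C0', 'Anne': 'C1', 'hende selv': 'C1'} - as in (38a) below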


I do not purport to be claiming that (35) follows from anything, only that it is more natural than traditional obviation conditions, starting with Lasnik’s (1976) Disjoint Reference Rule. Why should pronouns have to be locally obviative? Within minimalism, that is a fair question to ask. (35) answers it thus: because they are grammatically distinct, the most transparent mapping to their semantics is also in terms of semantic distinctness. I emphasize also that, from this point of view, the phenomenon of obviation is taken to be speaker-dependent, thus invoking a mode of presentation from a perspective. This, though, is not crucial to my proposal, which is more modest than understanding the detailed implications of the obvious mapping in (34). I simply want to know why local obviation is there, to start with. If I am right, it is the semantic price of grammatical distinctness through Case.

Consider next some potential problems that the rest of the examples in (26)–(30) may raise. First, Case marking also distinguishes you and he in he likes you, or you and I in I like you, raising the question of why this should be so if these are already distinct pronouns. (36) poses a related question:

(36) a. * He has arrived.
     b. * He to arrive was a mistake.

While he does not have to be different from any other argument here, to think that it should not involve Case – when it does otherwise – is to force the grammar into two sorts of paradigms for arguments, Case-marked and Case-less ones. This is more costly than having a single paradigm where all noun-phrases are Case marked. We do predict, however, that when a single argument is at stake (unaccusative constructions), grammars should not ascribe much significance to what form of Case is employed. This is what we find. Where English employs a subject Case in (36a), Basque employs an object Case in this same situation. Mixed paradigms also abound.39 Similarly, we may say that the you/he and you/I combinations, and others, are still Case marked because that is the simplest decision for the grammar to take, and it has no semantic consequences.40

The sequences ⟨I, I⟩, ⟨II, II⟩, or ⟨III, III⟩ in (26)–(30) now follow. The Case mark will force either ungrammatical or uninterpretable results for the first two sequences (depending on whether checking proceeds), and given the fact that II cannot be disjoint from II or I from I. As for III, the feature will yield grammatical and interpretable results, albeit obviative ones.

Why local domains where Case is checked should correlate with the domains where local obviation obtains is also entirely expected. They are aspects of the same phenomenon. This means, first of all, that (37), and similar examples, should face the same sorts of problems that the paradigm we have just seen does:

(37) a. ! A friend of mine likes a friend of mine.
     b. * The best friend of mine likes the best friend of mine.
     c. ! John likes John.
     d. ! John likes him.
     e. ! He likes John.


Certainly, all of these expressions are obviative, and (37b) is either ungrammatical or uninterpretable, assuming we are forcing two expressions singling out the same unique individual not to co-refer. As for why the last three examples should be obviative, the matter is trivial if only FF-features are relevant in the syntactic computation, as assumed by Chomsky.41

Now we must also concern ourselves with why the Case marker could not save all the examples in (18) – with the exceptions of combinations of the same personal feature, uninterpretable under disjointness. I deal with this below.

7 Two types of anaphora

But first, we must deal with coreference. Given my analysis, coreference should be a marked instance for arguments within the same local domain. Typical local anaphoric instances support this intuition, given Pica’s (1987) insight that they are bimorphemic. A statement of self V-ing is generally possible only if the second argument carries an added morpheme like self or same (see Safir 1992). Then we may say that the syntax distinguishes each argument in the way we saw above, by Case-marking. In turn, the semantics is now dealing with two formal facts with an apparently contradictory interpretive import. Case differentiation normally entails difference, while the anaphoric element is instantiating some sort of statement about sameness. This need not be a problem, though we must separate two instances.

Consider the expression of anaphoricity by way of a regular pronoun followed by a morpheme expressing contextual sameness, as in the Danish (38):42

(38) a. Peter fortalte Anne om hende selv / *ham selv.
        Peter told Anne about her self / him self
     b. [the x: Peter(x) & X(x)] [the y: Anne(y) & Y(y)] [the z: one(z) & same-as-before(z)] x told y about z

I take names not to pose any particular problems as definite descriptions, once they involve a context variable,43 so Peter is akin to the Peter I have in mind (see Chapter 12). In turn, I take selv to determine sameness of contextual confinement. When attached to hende (= the one), selv makes the context variable predicated of the variable of the definite description be the same context variable as a previously introduced one. This is not just any variable, though, but the closest one.44 As a result, (38a) is ungrammatical when the pronoun ham (which normally would hook up to Peter) is introduced. This is because the context variable which is closest to ham is that of Anne, and confining the range of ham in terms of Anne leads to an absurdity (assuming she is not a transvestite). Intuitively, the Transparency Condition would, by default, make all x, y and z in (38) be different variables. However, there is in this sentence a more specific, explicit indication to the effect that y and z have the same value, since they pick out a salient, unique individual in two identical contexts.

While the puzzle above is solved by declaring the Transparency Condition a default interpretive strategy (and treating pronominal anaphors as logophors, in the spirit of Reinhart and Reuland 1993), instances of local anaphors involve a different process. Compare the Danish (39a) to the Galician-Portuguese (39b):

(39) a. Peter fortalte sig selv om Anne.
        Peter told SIG same about Anne
     b. Pedro dixose algunha cousa a si propio sobre de Ana.
        Pedro told-SE something to SI same about of Ana
        “P. told himself (something) about A.”

I take Lebeaux’s (1983) insight, particularly as pursued by Chomsky (1986b), to simply be that (39b) is pretty much the way (39a)’s LF should look. If so, some feature from within the Danish sig selv must raise at LF to wherever it is that se is in Galician-Portuguese.45 But what is the relation between se and the double si (and therefore between whatever moves covertly in Danish and sig)?

The Galician-Portuguese propio, literally “own,” gives us a clue, if we take it to be invoking “possession” as in Szabolcsi (1983), Kayne (1994), or more specifically as extended in Chapter 9 – see (40a). I propose in this light (and much in the spirit of Pica 1987) that the (se, si) relation invokes something like “x’s own persona” (40b), where pro is the null instantiation of a personal classifier (see Chapters 10 and 12):

(40) a. John [be+in (= have) [XP t [ t [AgrP a head Agr [SC t t ]]]]]
     b. [ … CLITIC … [XP DOUBLE [ t [AgrP pro Agr [SC t t ]]]]]

(40b) is intended to cover all instances of clitic doubling (modifying an analysis presented in Uriagereka 1995b), and not just anaphoric clitic doubling. This is desirable on two other grounds. First, it unifies the paradigm in (41):

(41) a. Le levanté a él la mano.
        him raised.1sg to him the hand
        “I raised his hand.”
        [ … le … [XP a él [ t [AgrP la mano Agr [SC t t ]]]]]
     b. Lo levanté a él mismo.
        him raised.1sg to him same
        “I raised him (himself).”
        [ … lo … [XP a él [ t [AgrP [ mismo [pro]] Agr [SC t t ]]]]]
     c. Se levantó a sí mismo.
        se raised.3sg to si same
        “He raised himself.”
        [ … se … [XP a si [ t [AgrP [ mismo [pro]] Agr [SC t t ]]]]]


(41a) is an inalienable possessive relation. The possessor clitic raises, leaving behind the small-clause predicate la mano “the hand.” Identically, in (41b), a normal transitive construction involving clitic doubling, the “possessor/whole” clitic raises, leaving behind the small-clause predicate pro. I take pro to be an individual classifier, as in Muromatsu (1995) for East Asian languages. We can discern the presence of pro by optionally adding the adverbial mismo “same,” as argued in Torrego (1996). Finally, (41c) is almost identical to (41b), except that the head of XP is the anaphoric clitic se.46

The second reason the proposal above is desirable is that it explicitly makes us treat (e.g. Romance) special clitics rather differently from (e.g. Germanic) regular pronouns (cf. (18) vs. (26)–(30)). Following Chomsky (1995b), I take regular pronouns to be both heads and maximal projections; if so, they will not involve the elaborate structures in (41). I think that precisely those structures account for why so few clitic combinations are possible in (18), and perhaps also why special clitics must be overtly displaced. Because clitics are non-trivial, functional heads as in (41), they are morphologically weak – in fact, rather more so than the idealized picture in (33) implies. Thus me, te, nos, os can be accusative and dative in all dialects; le(s) can be third person or formal second person in all dialects, and in Latin-American dialects any second person (particularly, when plural). In most Peninsular dialects there are no distinctions between accusatives or datives (they can all be of the le or of the la type); in most informal registers, le can be singular or plural. In Andean dialects lo can be just any third person (number and gender irrelevant), and in various sub-standard dialects los is used for first person plural. Regular pronouns, in contrast, are paradigmatically distinct, which we may associate with their being whole phrases, and thus being robust enough on morphological grounds to support stress, emphasis, heavy syllabic distinctions, and so forth. In the spirit of Corver and Delfitto (1993), I propose that, given their morphological defectiveness, arguments headed by special clitics are not safely distinguished by the Case-checking mechanism. More particularly, I propose that the Checking Convention is inoperative for special clitics, simply because the FF-bag of the clitic is incapable of hosting a Case-mark. And I should emphasize that I am not saying that the clitic does not check Case; rather, that it is not appropriately Case-marked (as per (32)) as a consequence of this checking. In a nutshell, if the clitic does not engage in a further process, the relevant FF-bag will not be distinguished from other FF-bags in the same checking-domain-set. We may see overt cliticization as a process aimed at explicitly marking, through syntactic positioning, certain distinct relations. For example, a strong clitic in Spanish comes before a weak clitic, making the two distinct.47 At the same time, any combination (regardless of number, gender and morphological case) of weak, or strong, clitics leads to undistinguished FF-bags, and a subsequent LF crash.

We can then return to what covert relation is relevant in the Danish (37a): by hypothesis one in which sig is subject to [selv[pro]], and the features of a null se raise to the domain of Peter. I should say that standard se has peculiar placement properties (see Note 48), which may relate to the usual assumption in the literature, forcefully expressed in Burzio (1996), that se is in essence an expletive element, with no features other than categorial ones (it is neither [+s] nor [−s], and thus behaves like neither of the clitics thus defined; besides, se is unspecified for gender, person, number, and Case). Ideally, matters of PF and LF interpretability decide on successful placements for se. For instance, as the clitic that it is, it should group with other clitics at PF, albeit as a wild card. In turn, the fact that it is expletive in character may relate to its interpretive possibilities, as follows.

8 Some thoughts on “sameness”

I set aside the ultimate semantic import of the relation between sig and pro, assuming (to judge from (41b)) some form of identity, with the rough content of "his person." The issue is the relation between the null se heading the complex anaphoric construction and its antecedent. By hypothesis, se cannot be marked for distinctness in terms of Case. Thus, although it is distinguished from [+s] and [−s] clitics in that it is a clitic with no [s] value, it is harder to see how it could be syntactically distinguished from a non-clitic expression whose features end up in the same checking-domain-set. So suppose the grammar in fact does not tell se apart from such a non-clitic expression. This leads to no LF crash if, in fact, se only has categorial features, and hence no uninterpretable feature to erase.48 In turn, the expression that se "collapses" with can be seen as its antecedent, with the rough semantics in (42) for the examples in (39) (though see Chapter 12):

(42) [the x:Peter(x) & X(x)] [the y:Anne(y) & Y(y)] x told [x's person] about y

The fact that se gets taken by the grammar to be the same item as the element it shares a checking domain with entails, ideally, the formation of an extended chain with two roles having been assigned configurationally before movement.

It must be stressed that although (42) comes out as anaphoric as (38) does, they achieve this interpretive result in radically different ways. In (42), anaphoricity is expressed through a single operator binding two variables, and a single chain receiving two roles. In (38), there are two separate chains, each involving their own operator, which nonetheless come out "anaphoric" (or logophoric) because of a lexical particle demanding contextual sameness.

The sort of analysis delineated above trivially extends to languages, like Basque, where anaphoric relations are expressed as in (43) (and see Chapter 10):

(43) a. Jonek bere burua ikusi du.
        Jon-S his head-the-O seen has
        "John has seen himself."

b. [the x:Jon(x) & X(x)] x saw [x’s head]


Obviously, the relation here between bere "his" and burua "head" is not the same as exists between sig and [[pro]selv]. Yet this is arguably a low-level lexical fact, assuming that "head" is the relevant classifier in Basque for persons. What is again crucial is that the null clitic se relates to its antecedent by being syntactically indistinguishable from it, leading to a fused chain.

I stress that nothing is wrong, a priori, with fused chains receiving two roles. Of course, the question is whether or not we should have a Thematic Criterion, a matter that Chomsky (1995b) debates and does not resolve. Certainly, nothing that I have said allows (44a) (meaning John used himself):

(44) a. [ John T [ t v [ used t ]]]
     b. [ John [ clitic-T [ t v [ used […t… ] ] ]]]

Note that the first movement violates the Last Resort condition (Jairo Nunes, personal communication). What allows the similar structure in (44b) is that the clitic placement does not violate Last Resort, by hypothesis. In general, fused chains should be possible only when Last Resort is appropriately satisfied.49 Explicitly:

(45) Chain Fusion Situation
     Let α and β be different chains. If α's head is non-distinct from β's head within a given checking domain, and α's tail commands β's tail, then α and β fuse into an integrated chain, subsuming properties of α and β.

I take (45) to be the source of true anaphoric coreference, which should be limited to se (and similar elements) "collapsing" within a checking domain.50

It is now easy to see that nothing in the properties of se makes it anaphoric, specifically. Consider (46):

(46) pro se mataron
         se killed.III
     "They killed themselves/each other."

Aside from group readings, this has three transitive, distributive readings. It can have the rough import of "they unintentionally caused their death." It also has two causative readings: a reciprocal one, and an anaphoric interpretation discussed above. Of course, readings like these are disambiguated in terms of "doubles," as they are in English with the overt elements themselves and each other. What I am saying, however, is that the "doubles" are not crucial to the dependency, in a way that se is, inasmuch as it induces chain fusion.

(47a) provides an argument that only chain fusion induces anaphoricity:

(47) a. *Juan se ha sido encomendado t a sí mismo.
        *Juan se has been entrusted to si same

     b. Juan le ha sido encomendado t a él mismo.
        Juan him has been entrusted to him same
        "Juan has been entrusted to himself."



Rizzi (1986) discusses an Italian example similar to the Spanish (47a) to argue for his Local Binding Condition. He suggests that what goes wrong with (47a) is that Juan is trying to form a chain over se, which has the same index as Juan. Surprisingly, however, the Spanish (47b) is quite good. Here, we have replaced se with the indirect object le. Generally, that should give the indirect object an obviative reference with respect to Juan, but we have loaded the dice by adding a logophoric double a él mismo, "to him same," which forces the reference of le to be that of Juan. If we only had indices to establish referential links, (47b) would fall into Rizzi's trap, being predicted as ungrammatical, contrary to fact. However, what I have said above provides a different account. Compare:

(48) a. [TP Juan se [VP t [ levantó [XP …t…] ]] ] "Juan raised himself"
     b. *[TP Juan se [ha sido [ encomendado[v] [VP [XP …t…] [ t t ]]]]]

A fused chain arises in (48a), since all the relevant elements stand in a command relation. In contrast, a fused chain does not arise in (48b), since the third and fourth elements do not command each other.51 Nonetheless, the featureless se cannot be told apart from the element it shares a checking-domain-set with (technically, the D features of Juan). This is a paradox: se cannot head its chain, and cannot be in a fused chain. The result is ungrammatical.

Compare finally (48b) to the grammatical (24b), repeated now as (49a). In this instance, we should note, a reflexive reading is actually possible (49b). However, a reading is also possible whereby se is taken to be some indirect object, a matter noted in Section 4 which we left pending until now.

(49) a. Juan se lo mostró.
        Juan se it shown
        "Juan showed it to him."/"Juan showed it to himself."
     b. [TP Juan se lo [ [ mostró [ v]] [VP t [[XP …t…] [ t t ] ] ] ]]
     c. [TP Juan se lo [ [ mostró [ v]] [VP t [ [XP …t…] [ t t ] ] ] ]]

When se is considered within the same checking domain as Juan, (49b) is as straightforward as (49a) (the fused chain succeeds in this instance, because command obtains throughout the members of the chain). In turn, (49c) involves the consideration of se within the same checking domain as lo.52 In this instance, a fused chain cannot be formed because, just as we saw for (48b), neither the third nor the fourth elements command the other. However, se can be told apart from lo, if what I have said in Section 4 is true: se is a different clitic from lo, in that the latter has a value for [s], while se is not specified for this feature. Whereas this property of se does not make it distinguishable from the D features of a regular subject like Juan, it does make it distinguishable from lo, la, etc. Therefore, se does not collapse with lo in (49c), and then it can indeed form its own separate chain. It will not be an anaphoric chain, but this is in fact desirable. We want the reading of the se chain to be one invoking a third person.53

9 A word on long-distance obviation

One can think of other potential problems for my approach, but by handling anaphoricity I have tried to show that there are reasonable ways of keeping the system to the sort of bare picture that minimalism expects. While I will not be able to extend myself at this point to many other instances that immediately come to mind, I will, however, say a word about the fact that my analysis does not predict the obviation in (50), and many similar instances:

(50) ! He thinks that John is a smart guy.

Now, consider an analysis along the lines of Reinhart (1995: 51):

The coreference generalization . . . is that two expressions in a given LF, D, cannot corefer if, at the translation to semantic representations, we discover that an alternative LF, D′, exists where one of these is a variable bound by the other, and the two LFs have equivalent interpretations. i.e. D′ blocks coreference in D, unless they are semantically distinct.

Reinhart lets structures like (50) converge; her economy analysis of an example of this sort is thus strictly not related to any of the derivational concerns raised here. Her intuition is that variable binding is a more economical means of identifying referential identity, provided that assignment of reference requires relating an expression to the set of entities in the discourse. This is post-LF economy in the performance systems, about which I have nothing to say.54

Suppose we were to accept this description of (50) – should it extend to all instances discussed here? While this is a fair question, the point cannot be conceded a priori, given the sort of issue this chapter is raising. A pattern of formal behavior does not immediately demand a unified, minimalist explanation. More concretely, what do (50) and the examples in (26)–(30) have in common? The fact that command matters for all is not significant beyond the fact that (given the model of the grammar argued for in Section 2) this commonality is necessary to just any LF process. Locality is in fact not common to these examples, obtaining only in the (26)–(30) paradigm, but not in (50). Then the only significant commonality is disjointness of reference, Relation R. However, as I have noted, it is plausible that the grammar only codes sameness and, by default, difference. If so, how surprising is it really that we find two unrelated phenomena which have in common one of these two effects? To demand commonality between long-distance and short-distance obviation would be like demanding commonality between long-distance and short-distance anaphora, which Reinhart and Reuland (1993: 658–60) – in my view, correctly – are careful enough to distinguish.

Once it is taken as an empirical question, are there in fact any advantages to keeping the local and the long-distance phenomena separate? I think there may be a couple. First, non-local relations are immediately suspect when attempting a standard syntactic analysis. They should not exist in derivational syntax. Then again, one may try to argue that what I have called local obviation is not a phenomenon of grammar either. While I do not have any special reasons to want obviation in the grammar, the analysis I have provided here gives a simple way of dealing with it, when it is local. My treatment follows trivially from an independent property of the system: the fact that it makes use of sets that we call checking domains, which are there in some form, whether we agree on the rest or not.55 As I have tried to argue, obviation is just one among several possible results of the marking-for-distinctness involved in tagging, specifically, different FF-bags within a checking-domain-set. Making these minimalist assumptions, we were able to account for the facts in (18), which involve FF-sameness, but not obviation. In turn, we were forced to look into the matter of anaphoricity, which in itself has interesting consequences: a simple distinction between logophors and anaphors, and an account of Rizzi's Local Binding Condition effects in terms of a mechanism of chain fusion. Finally, the present account gave us a motivation for what the mysterious uninterpretable Case features are. If we insist that whatever underlies long-distance obviation should predict the short-distance facts we will lose all of these rather natural results.

10 Concluding remarks

As I see it, true LF properties are the result of constructive processes. I have not said this before, but now I can be loud and clear. The dynamic model that I have summarized in Section 2 is in effect saying that LF and PF do not exist as levels, although they certainly exist as components (the locus of Full Interpretation in a radically derivational system). If LF and PF are not levels, there should not be any formal relations obtaining there, other than the ones that the derivation provides by way of its mechanisms of constructing structure. From this perspective, anything long distance should be post LF. In turn, perhaps everything short-distance is LF, in the sense that a constructive process of grammar is being invoked. In this respect, local obviation looks hopeless at first. Why would the grammar "bother" to code such a weird requirement? As a matter of fact, the very existence of obviation conditions is not a bad argument for Gould's position on language. Prima facie, obviation hinders communication, in that it limits the class of possible thoughts that are expressed by way of grammatical combinations. Nonetheless, this is a point about language function, not language structure. Structurally, it is still possible that obviation has a minimalist place in the grammar. I have suggested that this place is a mere reflex of something deeper: the matter of deciding on identical or different structures, given set-theoretic dependency constructs. The grammar tags symbols for distinctness and by default assumes they are also semantically different, in some way relating to mode of presentation or intended reference. In sum, the Minimalist Program forces us to ponder the nature of principles themselves, over and above the nature of phenomena. The latter are obviously important. Yet without a program to guide us into reflecting on what principles we are proposing (it may be stressed: what properties we are ascribing to the human mind), we may find ourselves redescribing the structure of something like a snail shell, without learning something deeper about what is behind that very pretty pattern.


Part II

PARADIGMATIC CONCERNS


9

INTEGRALS

with Norbert Hornstein and Sara Rosen

1 Introduction

Consider the following sentence.

(1) There is a Ford T engine in my Saab.

It embodies the ambiguity resolved in (2).

(2) a. My Saab has a Ford T engine.
    b. (Located) in my Saab is a Ford T engine.

(2a) depicts the Ford T engine as an integral part of the Saab. This is the kind of engine that drives it. Call this the integral interpretation (II). The meaning explicated by (2b) fixes the location of at least one Ford T engine. This engine need not be a part of the Saab. Were one in the business of freighting Ford T engines in the back seat of one's Saab, (2b) could be true without (2a) being so. Call this the standard interpretation (SI).

The central point of this chapter is that the ambiguity displayed by (1) between an I and S interpretation has a grammatical basis. In particular, (1) is derivable from two different kinds of small clauses, involving two different kinds of predication structures. Underlying an II is a small clause such as (3a). SIs derive from small clauses like (3b).

(3) a. …[SC My Saab [a Ford T engine]]…
    b. …[SC a Ford T engine [in my Saab]]…

(cf. “I saw a Ford T engine in my Saab.”)

We intend to interpret (3b) in the standard way. The locational PP is predicated of the small clause subject. This SC is true, just in case at least one Ford T engine is in my Saab. Clearly the SC in (3a) cannot have the same standard predicational structure. Whereas one can read (3b) as saying that a Ford T engine is in my Saab one cannot read (3a) analogously. The SC does not assert that my Saab is a Ford T engine. Rather, (3a) must be interpreted as true just in case my Saab "is (partly) constituted of" a Ford T engine. We discuss below what we intend by this notion, henceforth simply referred to as "Relation R." For present purposes, it suffices to distinguish the underlying small clauses in (3) and tie them to different kinds of predication within the grammar. We hypothesize that the predication structure displayed in (3a) underlies constructions involving inalienable possession, part/whole relations and mass term predications among various others. Our aim here is to show how making this assumption permits an explanation of the distinct properties such constructions typically have.

The structure of this chapter is as follows. In the next sections we explore the grammar of IIs and SIs. In particular, we show how to get from their respective underlying small clauses to the various surface forms that they display. We rely on and expand earlier work by Kayne (1993) and Szabolcsi (1981 and 1983). Section 3 deals with the definiteness effect each structure displays and traces it to different sources. In Section 4, we discuss the fine points of the syntax/semantic proposal we are making. Section 5 extends the range of data and shows how it fits with the grammatical analysis proposed.

2 Some syntax

Kayne (1993) argues that English possession constructions such as (4a) are structurally identical to those in Hungarian. Following Szabolcsi (1981, 1983) he assigns them a small clause phrase structure as in (4b) (Spec = Specifier).

(4) a. John has a sister.
    b. [Spec be [DP Spec D0 [[DPposs John] Agr0 a sister]]]

In essence, to get the manifested surface order, John raises to the Spec position before be, a sister raises to Spec D0 and the D0, which Kayne suggests is in some sense prepositional, incorporates into be. The resulting incorporated expression be+P surfaces as have. These elaborate movements are visible in Hungarian as Szabolcsi showed.

In Hungarian, if the larger DP is definite the possessor DP can remain in place. If it does, it surfaces with the nominative case. It can also raise to Spec D0, however. It appears with the dative case if it does. Once moved to Spec D0, the possessor DP can raise further to the matrix Spec position. In this instance, it takes the dative case along with it. Importantly, the dative possessor co-occurs with a postnominal Agr.

If the larger DP is indefinite, as occurs in cases such as (4a), the raising of the possessor is obligatory and the agreement morpheme occurs.1

Kayne extends this analysis to English. He suggests that English has a non-overt oblique D0 and that the possessor moves through its Spec to get to the surface subject position. The derivation is as in (5), with the D0 treated as (in some sense) a preposition.

(5) DPposs/i be [DP ti [D/P]0 [ti Agr0 QP/NP]]

Kayne suggests that the incorporation of [D/P]0 is required to change its Spec from an A′ to an A position. Illicit movement obtains without this incorporation.

Though we accept the basics of the Kayne/Szabolcsi analysis, we differ on some details. Before getting into these, however, consider how to extend this analysis to cover II constructions like (2a). Given the II, the relevant underlying structure must be as in (6a). Given that these constructions display a definiteness effect,2 the derivation must be (6b) with the DPposs moving all the way up to the matrix Spec.

(6) a. [Spec be [DP Spec [D/P]0 [[DPposs my Saab] Agr0 a Ford T engine]]]
    b. [My Saabi be+[D/P]0j [ti ej [ti Agr0 a Ford T engine]]]

(1) should have a similar underlying source on its II reading. However, here we run into technical difficulties if we adhere to the details of Kayne's analysis. The problem is where to put in and how to derive the correct surface word order. If we respect the leading idea behind the Kayne/Szabolcsi analysis, the correct source structure should be something like (7). This redeems the intuition that the D0 is somehow prepositional.

(7) [Spec be [DP Spec in [[DPposs my Saab] Agr0 a Ford T engine]]]

Given (7), however, we must alter the details of Kayne's proposed derivation. In particular, to get the right surface order we must raise the predicate a Ford T engine to the Spec of in and insert there in the matrix Spec position.

(8) [there be [[a Ford T engine]i in [my Saab ti]]]

If this is correct, it suggests that (mutatis mutandis) the derivation of (2a) also involves predicate raising. The derivation should thus be (9b) rather than (6b).

(9) a. [Spec be [DP Spec [D/P]0 [[DPposs my Saab] Agr0 a Ford T engine]]]
    b. [my Saabi be+[D/P]0j [a Ford T enginek ej [ti Agr0 tk]]]

This alternative derivation has some pleasant features. First, it allows us to extend the Kayne/Szabolcsi analysis to constructions such as (1). Second, as noted above, Kayne required the incorporation of [D/P] into be to alter the A/A′ status of the Spec position. However, it is not entirely clear how this change is effected by the incorporation. The derivation in (9b) gives another rationale for the required incorporation of [D/P] into be. Without it, movement of my Saab across the Spec containing a Ford T engine violates minimality. More precisely, we offer the following minimalist rationale for the incorporation.

Chomsky (1993b) provides a way of evading minimality restrictions in certain cases. It proposes that minimality can be evaded just in case the target of movement and the landing site that is skipped are in the same minimal domain. The effect of incorporating [D/P] into be is to place the two Specs in the same domain. This then permits my Saab to raise to the matrix Spec position without violating minimality.3

Note, furthermore, that there are analogues to sentences like (2a) that provide evidence for this derivation. II constructions have paraphrases involving overt pronominal forms and the particular preposition we find in these paraphrases is the same one that appears in the there-integral construction. If we exchange the preposition, even for a close synonym, the II reading fades. (10b,c) only have SI readings, in contrast with (10a) which is ambiguous.


(10) a. My Saab has a Ford T engine in it.
     b. My Saab has a Ford T engine inside it.
     c. There is a Ford T engine inside my Saab.

In a theory that treats movement as copying and deletion (like the minimalist theory), forms like (10a) are to be expected.4

The derivation of the SI reading for (1) proceeds from a different underlying small clause.

(11) There is [sc a Ford T engine in my Saab]

Here, a Ford T engine is the subject, not predicate, of the underlying SC. If the expletive is not inserted, raising is possible and we can derive (12).

(12) A Ford T enginei is [sc ti in my Saab]

Observe that (12) is unambiguous. It does not have an II reading. This follows if we assume (with Kayne) that movement from Spec [D/P] position is a case of illicit movement. (13) is the required derivation of an II analogue of (12). The movement from Spec DP to the matrix Spec is disallowed on the assumption that Spec DP is an A′ position.5

(13) a. [Spec be [DP Spec in [[DPposs my Saab] Agr0 a Ford T engine]]]
     b. *[[A Ford T engine]i is [DP ti in [[DPposs my Saab] Agr0 ti]]]

Note, furthermore, that incorporating in to be does not yield a valid sentence.

(14) * A Ford T engine has my Saab.

This follows if incorporating the preposition into the copula does not alter the A′ status of the Spec D0 position (pace Kayne).6

There is further data that this analysis accounts for. First, we explain why certain existential constructions do not have be-paraphrases. This is what we expect for an existential that has only an II source. Consider for example (15)–(17). Each (a)-example has an acceptable have-paraphrase. Note that the there-sentences only carry I interpretations. For example, (15a) means that Canada is comprised of ten provinces, not that ten provinces are located there.7 Similarly for (16a) and (17a). Consequently, these sentences must all be derived from SCs like (3a) above and so will possess have-paraphrases.

(15) a. There are ten provinces in Canada.
     b. *Ten provinces are in Canada.
     c. Canada has ten provinces.

(16) a. There are eight legs on a spider.
     b. *Eight legs are on a spider.
     c. A spider has eight legs.

(17) a. There are too many stories in my building.
     b. *Too many stories are in my building.
     c. My building has too many stories.


Second, we account for why it is that pied-piping disambiguates (18). (18) has two readings. On the II reading it tells us that this elephant has a big nasal appendage. On the SI reading it locates a piece of luggage. Interestingly, (19) is not similarly ambiguous. It only has the SI reading.

(18) You believe that there is a big trunk on this elephant.

(19) On which elephant do you believe that there is a big trunk?

This is what we should expect given the present analysis. It is only with the SI reading that on this elephant forms a constituent. With the II, on is actually a D0 that takes as complement an AgrP. The inability to pied-pipe with the which phrase in (19) is parallel to the unacceptability of pied-piping in (20).

(20) *For which person would John prefer to take out the garbage?

There is another parallel between on in the II reading of (18) and the for complementizer in (20). Neither licenses preposition stranding.

(21) a. *Which person would John prefer for to take out the garbage?
     b. *Which elephant do you believe that there is a big trunk on?

(21b) is relatively acceptable, but only with an SI reading, only in the construction where on the elephant forms a constituent. When we control this, we get unacceptability. Under the preferred reading, (22) has an integral interpretation. With this reading, the sentence is unacceptable.

(22) *Which bug do you believe that there are eight legs on?

There is one further piece of evidence that in my Saab does not form a PP constituent in (1) when given the II reading. The addition of PP specifiers like right disambiguates these constructions. (23) has only the SI.

(23) There is a Ford T engine right in my Saab.

This is what we expect if, on the II, in is not the head of a PP constituent. If it is not, there is no PP for right to relate to. Hence the presence of this element is only consistent with the SI of (1) because in this case in my Saab is a PP constituent.

A third property of existential constructions with II readings is that they show subtle but distinctive agreement properties. Consider the data in (24).8

(24) a. There appear to be no toilets in this place.
     b. There appears to be no toilets in this place.

(24a) is ambiguous. It can report that this room is not a men's room or it can be taken as saying that the toilet storage room seems to have been cleared out and left empty. Contrast this with (24b). It can have the first interpretation. However, it cannot be used to report on the inventory of toilets. In short, it can be used to say that this room has no toilets but not that it has been emptied of toilets. There is a straightforward way to present the generalization. To express the interpretation which has a be-paraphrase, agreement is required. If in cases such as these, number agreement is taken to be an LF phenomenon, then an expletive replacement account of the kind offered in Chomsky (1986b) could account for the obligatory agreement evidenced in the SI readings. Observe that standard expletive replacement is rather unlikely for the II readings given the lack of a be-paraphrase in such cases. This plausibly underlies the optional agreement pattern.

A further agreement effect is visible in the following pair.

(25) a. There are two gorillas in this skit and don't you change it.
     b. There's two gorillas in this skit and don't you change it.
     c. There are two gorillas in this skit; Horace and Jasper.
     d. *There's two gorillas in this skit; Horace and Jasper.

This too makes sense given our analysis. In (25a,c) the indefinite is actually an underlying SC subject. In (25b,d) it is an underlying SC predicate. Now observe that indefinites in argument and predicate positions function differently when it comes to explicit enumeration of their members. Thus, the contrast in (26).

(26) a. A doctor arrived, namely Paul.
     b. *He is a doctor, namely Paul.

The contrasts in (25) can be viewed similarly. (25a,b) involve no enumeration at all so they are both acceptable. In (25c,d) we do enumerate the gorillas. As expected, the non-agreeing (25d), in which two gorillas is actually a predicate rather than an argument, is less acceptable than the constructions with an SI reading.

Consider, finally, another important difference between these two kinds of there-constructions. They display rather different definiteness effects. As is well known, the associate in an existential construction must be a weak NP. What is curious is that not all weak NPs are felicitous in IIs and some strong NPs are. Consider the following contrasts.

(27) a. There are some people in John's kitchen.
     b. *There are some provinces in Canada.

(28) a. *There is every (possible) problem being addressed.
     b. There is every (possible) problem with this solution.

The (a) examples each have be-paraphrases. The (b) examples are paraphrased with have. Observe that the have-paraphrase with (27b) is also unacceptable while the one for (28b) is fine.

(29) a. *Canada has some provinces.
     b. This solution has every possible problem.

Given our view that IIs derive from SCs with a predicational structure like the one underlying these have-paraphrases, it is natural that their acceptability should swing in tandem. The key seems to be that some-phrases cannot act predicatively while the sort of every phrase we have in (29b) can. Consider (30) in this light. As expected, some pig, in contrast to a pig, cannot be used as a predicate NP. If the associate in an II existential must so function, the unacceptability of (27b) follows.

(30) a. Wilbur is a pig.
     b. *Wilbur is some pig.9

As for the acceptability of (28b), it seems tied to the fact that this NP, especially when possible modifies the head noun, can be used with an amount reading. Thus, (31) seems like a good paraphrase of (28b).

(31) This solution is as problematic as it could be.

These amount uses of every, then, seem to have the semantics of superlative constructions and these superlatives are possible as predicates. Given that the associate here is an underlying predicate, we can trace the contrast between (28b) and (28a) to the fact that the associate in the former is an underlying predicate, while this is not so in the latter.

If accurate, the above suggests that the definiteness effect is not a unitary phenomenon. In effect, the restriction on the associate in II existentials with have-paraphrases is due to the fact that the associate is actually an underlying predicate. We expect NPs that appear here to be capable of predicative uses. In SIs, in contrast, whatever underlies the definiteness restrictions must be tied to something else. As the be-paraphrases indicate, in these cases the subjects are indeed arguments. We return to these issues in the next section.

To recap, we have proposed that the ambiguity in (1) is due to two different underlying SC sources. The existence of these small clauses points to two fundamentally different kinds of predication structures that grammars make available.

3 Definiteness effects

We suggested above that there are two sources for the definiteness effect (DE) observed in existential constructions. The DE observed for II readings in there-constructions should be traced to the predicative status of the NP in associate position. In SI there-constructions, in contrast, the associate is never a predicate and so the DE in these constructions must have a different source. Several have been suggested.

English existential constructions are typically analyzed as having the structure in (32).

(32) Therei … be [NPi…]

There is an expletive sitting in subject position. The copula takes a small clause complement (Stowell 1978; Burzio 1986) and the subject of the SC (the associate) is related to the expletive by co-indexation. A salient property of (32) is the DE.

Two different syntactic approaches have been pursued in trying to account for the DE. One set of analyses traces the DE to some feature of the expletive. In Milsark (1974), for example, there is treated as a sort of quantificational expression in need of a variable to bind.10 Indefinites are analyzed as having both a quantificational and non-quantificational "adjectival" reading. In the latter guise, indefinites make available free variables which the expletive there (interpreted as an existential operator) can bind.

Other accounts are similar in conception, though not in detail. Higginbotham (1987) allows adjectival NPs to be interpreted absolutely. The expletive there is treated as without semantic content of any sort though it occurs with postcopular NPs that can have the required absolute interpretations. The NP in existentials is interpreted propositionally, with the indefinite acting as subject and the N′ as predicate. Higginbotham (1987) ties the DE to a stipulated capacity of adjectival quantifiers to get absolute interpretations. The syntax of these constructions is exemplified in (33a) and the interpretive structure in (33b). Note that (33b) is of subject-predicate form and that the expletive is taken to be without semantic consequence (in contrast to the Milsark proposal).

(33) a. There are [NP few men in the room]
     b. [Few (x)] [men x & in the room x]

Keenan (1987) makes a relatively similar proposal, though once again the details are different. He does not derive a DE. Rather he shows that existential sentences are logically equivalent to existence assertions if and only if the determiner is existential. He too treats the expletive (and copula) as semantically inert. However, he treats the postcopular expression as sentential, in contrast to Higginbotham. Semantically and syntactically the postcopular expression is a proposition.11

What unites all these analyses (for our purposes) is that they each treat the postcopular expression here as forming a standard predication structure with the indefinite (or the quantificational part thereof) being the subject. The problem with this, from our perspective, is that it will not serve to adequately accommodate I interpreted existentials. If we are correct above, then these are not cases of standard predication, and the associate is not a subject. Thus, any account that treats it as such is inadequate. The empirical evidence points to our conclusion. We saw in Section 2 that the distribution of these indefinites differs somewhat from the distribution of indefinites in SIs. As a further illustration, consider a fact discussed in Higginbotham (1987).

We pointed out above that one can find acceptable quantified associates in certain existential constructions.

(34) a. There’s every possible error in this solution.b. This solution has every possible error (in it).

This have-paraphrase underlay our suggestion that the universally quantified associate in (34a) is predicative. Higginbotham points out that we find every-predicates in sentences such as (35).

(35) John is everything I respect.

However, these expressions are not generally licensed in existential constructions. He notes the unacceptability of (36).


(36) *There was everything I respect about John.
     (cf. *John has everything that I respect (about him).)

What underlies the contrast between (34a) and (36)? Given our proposal, the contrast is tied to the fact that only the former has an acceptable have-paraphrase and so only here is the every phrase a predicate. In contrast, in (36) the every phrase is a subject, thus falling under whatever correctly limits associates to indefinites. (34a) is acceptable so long as the every phrase can be interpreted predicatively. While the have-paraphrase of (36) is unacceptable, the pair in (37) is considerably more acceptable.12

(37) a. There is every quality that I respect in John.
     b. John has every quality that I respect (in him).

There is a further point worth making. Keenan (1987) observes that have-constructions also display a definiteness effect.

(38) a. John has a brother (of his) in college.
        *John has the brother (of his) in college.
     b. This solution has a problem (with it).
        *This solution has the problem (with it).

An adequate account of DEs should deal with these cases (see Note 12). We claim that the distribution of indefinites in these constructions follows from the predicative status of the post have NP in underlying structure. We therefore expect the definiteness effect in these constructions to parallel those found in II existentials. What we find significant is that the DE manifested in the have-constructions is not identical to that found in SIs. We are claiming that a proper analysis of the DE in have existential constructions will extend to there existential constructions with an II reading provided that one adopts a Kayne/Szabolcsi syntax for these constructions.

A second influential approach to the DE ties the effect specifically to the indefiniteness of postcopular predicate NPs (Safir 1987). As should be clear, we agree that there are some DEs that should be traced to the fact that the NPs function predicatively. However, the flip side of our analysis is that not all instances of the DE should be derived in this way. In particular, the indefinite associate in SI interpreted there-clauses is not a predicate at any relevant level of representation. The crux of our proposal is that there are two types of DEs and that the two theoretical approaches that have been mooted each correctly characterize one of these. The data presented above empirically underwrites this dual approach to the DE.

4 Constitution and part/whole

We have suggested that, in addition to standard predication, grammars employ a kind of integral predication. This second type of predication (involving Relation R) underlies IIs. This section aims to clarify somewhat just what this second kind of predication amounts to. Our proposal is that in (1) above this kind of predication involves two different kinds of information, a claim that the SC subject is partially characterized in terms of the SC predicate and a claim that there is more to the subject than that. The first kind of information is what gives such sentences their inalienable feel, the second, the part/whole aspect of the interpretation. Consider each in turn.

Burge (1975), in discussing mass term constructions, proposes a novel variety of primitive predication.

(39) a. The ring is gold.
     b. That puddle is water.

The sentences in (39) are analyzed as in (40a,b) with C read as in (40c).

(40) a. C [the ring, gold]
     b. C [that puddle, water]
     c. "C" = "_____ is constituted of _____ at . . ."

Thus, we read (39a) as follows: "the ring is (now) constituted of gold."
The interpretation of the copula as C in cases such as this allows for a smooth account of the logical form of sentences such as (41).

(41) That engine was once steel but now it is aluminum.

The reader is referred to Burge for further illuminating discussion.
Let us assume that Burge is correct and that mass term constructions like those in (39) require postulation of the kind of predication that C embodies. We can adopt this for our purposes by proposing that an extension of C-predication is what we get in the II construction. Note that if Burge is right, then this is required even for constructions without a have-paraphrase, such as (39). Put another way, the sort of predication we envisage for IIs appears to have some independent motivation.

Interestingly, constructions like those in (39) have “near” have-paraphrases.

(42) a. The ring has gold (in it).
     b. That puddle has water (in it).

The main difference between the sentences in (39) and (42) is that the former appear to be stronger statements. (39a) says that the relevant ring is entirely constituted of gold. (42a) says that it is partially so made up. In other words, the have analogues invite the inference that there is more than gold in that there ring. The distinction observed here also occurs in non-mass term constructions.

(43) a. This dining room set is four chairs and a table.
     b. This dining room set has four chairs and a table (in it).

(43a) tells us what comprises the dining room set in toto. (43b) says that the set includes these five items but comes with more. We would like to suggest that in addition to the C-predication that is the inherent interpretation of the small clause, the DP headed by the D/P yields the part/whole semantics that (42) and (43b) display.13


Now for a few refinements. The above adopts Burge's proposal, but generalizes it. The relation of gold to the ring in (39a) is one of material composition. Gold is the stuff that the ring is made of. In (43), in contrast, the dining room set does not have four chairs and a table as stuff – rather these comprise the dining room set's functional organization. If C-predication is restricted to the elaboration of stuff alone, we may think of R-predication (via Relation R) as characterizing the functional structure of an item in some abstract space. So just as humans are made up of flesh and bones they are also (more abstractly) made up of noses, heads, arms, and so on. The latter kind of characterization we call the functional make up in contrast to the physical make up that is typical of mass term constructions.14

In sum, we take C- and R-predications to be primitive, putting aside now what unifies these two (though see Note 15). We furthermore suggest that these predications occur in SCs of the kind that Kayne and Szabolcsi first investigated.15

5 More facts

In light of the above, consider the data discussed by Vergnaud and Zubizarreta (1992) (V&Z). They discuss two kinds of possession constructions in French. The first, which they dub the external poss construction (EPC), is exemplified in (44). The internal poss construction (IPC) is given in (45).

(44) Le médecin a radiographié l'estomac aux enfants.
     the doctor has X-rayed the stomach to-the children
     "The doctor X-rayed the children's stomachs."

(45) Le médecin a radiographié leurs estomacs.
     the doctor has X-rayed their stomachs
     "The doctor X-rayed their stomachs."

(44) also has a clitic variant:

(46) Le médecin leur a radiographié l'estomac.
     the doctor to-them has X-rayed the stomach
     "The doctor X-rayed their stomachs."

V&Z point out some interesting properties that these constructions have. First, they observe that EPCs involve distributivity, in contrast to IPCs. For example, if leur "to them" refers to Bob, Sue and Sally in (47a) then each washed two hands. (47b), in contrast, is vaguer and can be true so long as each individual washed at least one.

(47) a. On leur a lavé les mains.
        they to-them have washed the hands
        "We washed their hands."
     b. On a lavé leurs mains.
        they have washed their hands
        "We washed their hands."


Second, observe that in EPCs the inalienably possessed part shows semantic number. In (44) l'estomac "the stomach" is in the singular given that children come with one stomach apiece. In (47a), in contrast, les mains "the hands" is plural given that people typically come equipped with two hands. In contrast, the number agreement in IPCs is constant. In both (45) and (47b) we find the plural s-marking on the head noun.

Third, following Authier (1988), V&Z note adjectival restrictions on EPCs that do not beset IPCs. EPCs do not tolerate appositive adjectives, though restrictive adjectives are permitted.

(48) a. *Pierre lui a lavé les mains sales.
        *Pierre to-him has washed the hands dirty
        "Pierre washed his dirty hands."
     b. Pierre a lavé ses mains sales.
        Pierre has washed his hands dirty
        "Pierre washed his dirty hands."
     c. Il lui a bandé les doigts gelés.
        He to-him has wrapped the fingers frozen
        "He bandaged his frozen fingers."

(48a,b) display the contrast between the external and internal poss constructions while (48c) shows that adjectival modification of the external poss construction is possible so long as the adjective is restrictively interpreted.

The account that we have elaborated above for inalienable constructions gives us an analysis of these three facts. Given our extension of the Kayne/Szabolcsi approach, (44) and (46) involve an SC structure such as (49).

(49) le médecin a radiographié [DP Spec D/P [[DPposs [les enfants] Agr [l’estomac]]]]

the doctor has X-rayed the children the stomach

To derive the correct output for (44), l'estomac raises to the Spec position and à is inserted; à+les is spelled out as aux. To derive (46), the clitic leur (corresponding to les enfants) moves up to clitic position. The derivations are as in (50).

(50) a. Le médecin a radiographié [DP [l'estomac]i [D/P à [[DPposs les enfants Agr ti]]]]
        the doctor has X-rayed the stomach the children
     b. Le médecin leuri a radiographié [DP [l'estomac]j D/P [[DPposs ti Agr tj]]]16
        the doctor to-them has X-rayed the stomach

Now consider the interpretation of these constructions. The SC source carries with it the R-predication interpretation. What lends the distributivity to these constructions is the R-predication. V&Z note that a predication structure is required to get the distributive readings. However, it is not just any predication which licenses it. For example, (51) can be true even if some strawberries are not spoiled. That is, despite the predication relation holding between the subject and the predicate, distributivity is not enforced in standard cases.

(51) The strawberries are spoiled.

What is required for distributivity is not standard predication but R-predication. This makes sense. In asserting that children have stomachs as constituents we mean that each child has one – or whatever the canonical number happens to be.17

Note, furthermore, that (following V&Z) we assume that the number marking in these French constructions is significant, in that it provides the cardinality of the predicate. Thus we account for the number facts noted above. After all, it is true that typically children are constituted of one stomach and two hands (and see Note 17). Last of all, given that the inalienably possessed element is actually a predicate on this view, it is not surprising that it cannot be appositively modified. Appositive modification holds for individuals, not predicates. For example, in English, appositive relatives are restricted to referring heads, whereas restrictive relatives apply to N′ predicates. This correctly bars appositive adjectives from EPCs, given our analysis.

In sum, our proposed extension of the Kayne/Szabolcsi account provides a rationale for the properties that V&Z described.


10

FROM BEING TO HAVING

Questions about ontology from a Kayne/Szabolcsi syntax†


1 Possession in cognition

The relation between being and having has puzzled humans for millennia. Among grammarians, Benveniste offers an excellent instance of both caution and open-mindedness when dealing with the details of this intriguing relationship. He tells us:

That to have is an auxiliary with the same status as "to be" is a very strange thing. [To have] has the construction of a transitive verb, but it is not. . . . In fact, to have as a lexeme is a rarity in the world; most languages do not have it.

(1971: 168)

This is more than a curiosity about an auxiliary verb. Think of the relation between the sentences John has a sister (, Mary) and Mary is a sister (of John's).

The traditional analysis for this phenomenon (for instance, as insightfully presented in Keenan 1987) is in terms of postulating a relational term sister, which has two variable positions, as a matter of lexical fact. Then the intuition is: one of two elements can saturate each variable position. If what we may think of as the referent of sister is promoted to subject of the sentence, we have Mary is a sister (of John's). If instead the other, possessor element is promoted to subject position, what we get is John has a sister (, Mary). All that be and have do is mark each relation.

But if Benveniste is right, be and have in fact cannot systematically mark each relation, particularly in languages that lack have. The immediate question is: what is then the difference between being a sister and having a sister? How do we know that one of these can only be a property of Mary while the other is a property of John, but may be Mary's as well? Is all of this out there, in reality, or is it somehow a function of the way humans conceptualize the world – and if so, how?

Interestingly, Du Marsais worried about this very issue. The following quote is taken from Chomsky.

Just as we have I have a book, [etc.] . . . we say . . . I have fever, . . . envy, . . . fear, a doubt, . . . pity, . . . an idea, etc. But . . . health, fever, fear, doubt, envy, are nothing but metaphysical terms that do not designate anything other than the ways of being considered by the points of view peculiar to the spirit.

(Chomsky 1965: 199, n. 13)

It is equally telling to see the context where Chomsky invokes his reference to Du Marsais, just after reminding the reader how "certain philosophical positions arise from false grammatical analogies" (p. 199). To support his view, he introduces the following quote from Reid (1785), alluding to having pain.

Such phrases are meant . . . in a sense that is neither obscure nor false. But the philosopher puts them into his alembic, reduces them to their first principles, draws out of them a sense that was never meant, and so imagines that he has discovered an error of the vulgar.

(p. 199)

Chomsky then goes on to suggest that "a theory of ideas" cannot deviate from the "popular meaning," to use Reid's phrases.

With this perspective in mind, consider the fact that all of the expressions in (1) have a possessive syntax.

(1) a. John has a house
    b. John has only one arm
    c. John has a sister: Mary
    d. John has a bad temper

When we say possessive syntax, we must not just mean that these expressions can go with have; they can also appear as in (2).

(2) a. John with a house
    b. John's only arm
    c. A sister of John's
    d. John is bad tempered

Certainly a relation, in fact a legal one, exists between John and the house he owns. Likewise, any part of John's may be expressed, with respect to him, in possessive terms. It is tempting to blame this on a more general notion of "inalienability." It is, however, not clear that one's parts are uncontroversially inalienable – or there would be no successful transplants. The notion "inalienable" is even harder to characterize once part/whole relations are abandoned. Family relations seem inalienable, but not obviously – as the child who recently divorced her mother can readily attest. And as we saw, matters get even more confusing with abstract possessions. Children are said to have the tempers of their nasty relatives and the looks of their nice ones. What does this really mean, if these notions are supposed to be inalienable?

It is also tempting to think that just about any relation between two entities can be expressed as a possession. This, however, is false. I relate to you right now, but it makes no sense to say "I have you." Numbers relate to each other, in a sense inalienably, yet what does it mean that "3 has 1.2?"

Against Reid’s advice, one could perhaps say there are a handful of core

F R O M B E I N G T O H A V I N G

193

Page 205: Uriagereka J. Derivations. Exploring the Dynamics of Syntax

primitive possessive relations, and the rest are accidents of history, gamespeople play or metaphors. It is, however, surprising to find the same types ofaccidents, games or metaphors, culture after culture. Take the examples in (3).

(3) a. Juan con una casa
       Juan with a house
    b. Su único brazo
       his only arm
    c. Una hermana suya
       a sister his
    d. Está de mal humor.
       is-3sg of bad temper

Basically the same things one is said to have in English, one is said to have in Spanish. Or in other languages, for that matter, as illustrated in (4).

(4) a. Vai: Nkun ?. be.
            my head exists
            "I have a head."
    b. Turkish: Bir ev-im var
                a house-mine is
                "I have a house."
    c. Mongol: Nadur morin buy
               to me a horse is
               "I have a horse."
    d. Ewe: Ga le asi-nye
            money is in-my hand
            "I have money."

I have chosen the instances in (4) from unrelated languages which exhibit superficial differences with both English and Spanish (for example, they do not involve have). Even so, the possessed elements here are hardly surprising. And as Benveniste (1971: 171) puts it, at "the other end of the world" (Tunica) there is a class of verbs that must carry prefixes of inalienable possession, and express emotional states (shame, happiness), physical states (hunger, cold), or mental states (knowledge, impressions). No such morphological manifestation exists in Spanish, but observe the examples in (5), which simply reiterate Du Marsais's point.

(5) Juan tiene …
    EMOTIONAL STATE    PHYSICAL STATE    MENTAL STATE
    vergüenza          hambre            conocimiento
    "shame"            "hunger"          "knowledge"
    alegría            frío              impresión
    "happiness"        "cold"            "impression"

If the conceptual agreement between pre-Columbian inhabitants of Louisiana and their brutal conquerors is an accident, this can be no other than the human accident.


In sum – and this is what should worry us as grammarians – there is no obvious way we have of defining possession without falling into vicious circularity. What expressions are capable of appearing in the context of have and the like? Possessive expressions. What are possessive expressions? Those that appear in contexts involving have and the like. So at the very least inalienable possession appears to be a cognitive notion, seen across cultures with minimal variations. Still, what does this mean? Consider an example taken from a famous commercial, the punch line of which reads as in (6).

(6) I want the soldier that belongs to this beer to step forward!

The individual in question is none other than John Wayne himself, which raises this question: what might the nature be of that improbable beer that happens to own the duke? Is that serious possession or military talk? Perhaps the latter, but the Spanish examples in (7) suggest otherwise.

(7) a. El oro tenía forma de anillo
       the gold had form of ring

    b. El anillo tenía (9 g de) oro
       the ring had (9 g of) gold

(8) a. La ciudad tenía (estructura de) barrios
       the city had (structure of) neighborhoods

    b. Los barrios tenían lo peor de la ciudad
       the neighborhoods had the worst of the city

The point of (7) and (8) is that, to some extent, they manifest an inalienable possessive relation and its inverse, with roughly the same syntax. Granted, these examples do not have the perfect symmetry of the John Wayne case in (6), but this may be a relatively low-level fact. Once we abandon the specific expression of possession through have or similar elements, we find (9)–(10).

(9) a. El peso de un kilo
       the weight of one kilo

    a'. Una concentración de 70°
        a concentration of 70°

    b. Un kilo de peso
       one kilo of weight

    b'. 70° de concentración
        70° of concentration

(10) a. Una organización de subgrupos
        an organization of subgroups

     a'. Un ensamblaje de partes
         an assembly of parts

     b. Subgrupos de organización
        subgroups of organization

     b'. Partes de ensamblaje
         parts of assembly

We could, of course, claim that (9) and (10) are not really possessive. It is unclear what that means, though, in the absence of an ontological notion of possession. Syntactically, we can say such things as the organization had subgroups or the subgroups had organization, as much in Spanish as we can in English; and certainly, there is a characteristic inalienability to all of the notions in (9) and (10). One can retreat, then, to saying that the organization the subgroups have is not the same organization that has the subgroups – but apart from hair-splitting, this is far from obvious. For the bottom line is: "are we more justified in saying this substance has form than we are in saying that this form has substance?" And if these are both grammatical, are we always going to insist on the opaque claim that the form this substance has is not the same as the form that has this substance?

There is a different way to proceed. Suppose we agree that all of the above, form and substance, organization and subgroups, concentration and degrees, and even John Wayne and his temper, his horse or even his beer, stand in a yet-to-be-determined Relation R, which in fact number 3 and number 1.2, or writers and readers, for some reason do not stand in. Crucially for our purposes, however, that Relation R has nothing to do, in itself, with the subject or object positions of a verb have or a preposition like of. Quite the opposite, an intricate syntax carries the terms of Relation R to either subject or object of the relevant syntactic expressions. Are there advantages in making such a claim?

2 Every term can be relational

First, I find it interesting that we cannot confine relational problems to so-called relational terms like sister. The minute we extend our coverage to part-whole possessions, just what is not a part of something else? Other than the lexical entries for God, all terms in our lexicon have to be thought of as relational. This immediately suggests we are missing a generalization that should be placed out of the idiosyncratic lexicon.

Second, consider the intriguing facts in (11) and (12).

(11) a. The poor neighborhoods of the city
     b. The city's poor neighborhoods
     c. The city has poor neighborhoods

(12) a. A city of poor neighborhoods
     b. *The/a poor neighborhoods' city
     c. The poor neighborhoods are the city's

Note that the part-whole relation (city, neighborhood) is kept constant in all these examples. Traditionally, this was expressed in terms of neighborhood having two variable positions, one referential and one possessive. Now, in (11) and (12) we see the city and the neighborhoods each promoted to subject position (or concomitantly, associated to the preposition of). This is as expected. What is not expected is that (12b) should be out.

One could try to blame that on the fact that, in this instance, the relational term neighborhoods is relinquishing reference to the other term, city. But this surely cannot be a problem in itself given the perfect (11b), where reference is to city. One might then try to take reference as an indication that, to begin with, we should not have compared the two paradigms; this, of course, would be because (11) invokes reference to neighborhoods, whereas (12) does only to city. If reference is an intrinsic property of a word, is this not mixing apples and oranges?

Keep in mind, however, the central fact that in both (11) and (12), the R relation between city and neighborhoods is constant, crucially regardless of the ultimate reference of poor neighborhoods of the city or city of poor neighborhoods. If we demand that these two have nothing in common, the implied lexicon is going to be even uglier, for now we need two relational terms, neighborhoods and city, since each of these can project its own structure and be related to something else. This is worse than before. We needed to say that all terms in the lexicon are relational, but now we have to sortalize Relation R: the way a city relates, as a whole, to a neighborhood is different from how it relates, as a part, to a state.

And never mind that: the greatest problem still is why on earth (12b) is impossible. I doubt this is a quirk, given that the very same facts hold in Spanish (as well as other languages), as shown in (13)–(14).

(13) a. Los barrios pobres de la ciudad
        the neighborhoods poor of the city

     b. Sus barrios pobres (los de la ciudad)
        its neighborhoods poor (those of the city)

     c. La ciudad tiene barrios pobres
        the city has neighborhoods poor

(14) a. Una ciudad de barrios pobres
        a city of neighborhoods poor

     b. *Su ciudad (la de los barrios pobres)
        their city (that of the neighborhoods poor)

     c. Los barrios pobres son de la ciudad.
        the neighborhoods poor are of the city

Indeed, the facts are extremely general, as (15)–(16) shows.

(15) a. Los brazos largos de un calamar
        the arms long of a squid

     b. Sus brazos largos (los del calamar)
        its arms long (those of-the squid)

     c. El calamar tiene brazos largos.
        the squid has arms long

(16) a. Un calamar de brazos largos
        a squid of arms long

     b. *Su calamar (el de los brazos largos)
        their squid (that of the arms long)

     c. Los brazos largos son del calamar.
        the arms long are of-the squid


Again, there are differences of detail between languages. For example, Spanish does not realize non-pronominal noun phrases in pre-nominal position, as does English; and English uses the expression a long-armed squid, with noun incorporation, for the corresponding Spanish "a squid of long arms" (16a). But neither language allows a form such as *the long arms's squid or *their squid (16b), meaning the one with long arms. Needless to say, the syntactician must also predict this surprising gap, but it is precisely here that syntax works. Consider in this respect the syntax in (17), the structure discussed by Kayne (e.g. 1994) and Szabolcsi (e.g. 1983) in recent years, presented in the more accurate guise in Chapter 9. The main difference with Kayne's structure is that, instead of bottoming out as an AgrP, (17) (p. 291) is built from a small clause, which is designed to capture Relation R in the syntax.

Although I keep Kayne's Agr intact, I think of this position as a referential site because, as it turns out, whatever moves to its checking domain determines reference. If we are to be technical, the moved item is assigned a referential feature that is attracted to the checking domain of Agr. This means the derivations in (18) start from different lexical arrays, which is as expected: despite the obvious parallelism, the terms of the relations differ, at least, in definitude.

Nevertheless, what is important is that the two expressions in (18) have the same, as it were, pseudo-thematic structure, and hence code the same Relation R.

(17) [DP D [AgrP [Agr' Agr [SC city neighborhoods]]]]

(18) a. [DP D [AgrP city[+r] [Agr' Agr(=of) [SC t neighborhoods]]]]

     b. [DP D [AgrP neighborhoods[+r] [Agr' Agr(=of) [SC city t]]]]


Now observe the interesting movements, as in (19).

(19) a. [DP neighborhoods [D' D(='s) [AgrP city[+r] [Agr' Agr [SC t t]]]]]

     b. [DP city [D' D(='s) [AgrP neighborhoods[+r] [Agr' Agr [SC t t]]]]]

Note that this time we involve the specifier of D. I think this is to check spatial contextual features, although this is really not crucial now. As is customary, we take the genitive 's to materialize as the head of D when its specifier is occupied, and following Kayne, we do not lexicalize Agr when it does not support lexical material, presumably because it incorporates to the D head. But these are all details. What matters is the shape of the movement paths. The ones in (19b) cross, while those in (19a) are nested. One possible account for the situation in (19) is in terms of the Minimal Link Condition. Intuitively, the head D cannot attract neighborhoods in (19a) because city is closer. But the main point is that, whereas the movements as shown in (19) are meaningfully different, we cannot say much about the relevant lexical correspondences – which would all seem to be licit. This simply means that it pays off to place Relation R in the syntax, contra the traditional assumption that views it as merely lexical.

3 The syntax of possession

We have, then, both conceptual and empirical reasons to suppose not only that the Kayne/Szabolcsi syntax for possession is right, but furthermore that this is where possession itself, as a semantic Relation R, is encoded – instead of lexically through relational terms. Ontologically, this is very interesting, since we now have to ask under what circumstances two terms enter into the pseudo-thematic relations involved in the small clause under discussion. We want the part (for instance, neighborhood) to be the small clause predicate, and the whole (for instance, city) to be the small clause subject, but why? Why not the other way around? And is this general? Likewise, if whatever moves to the Agr domain determines reference, what is the nature of reference? Is reference always coded in this relativistic manner?

If these questions seem too troubling, note that we can propose a very transparent semantics for the objects in (13)–(14).



(20) a. [∃e: city(e)] T1(city, e) & T2(neighborhood, e)
     b. [∃e: neighborhood(e)] T1(city, e) & T2(neighborhood, e)

We can think of each term of the small clause as satisfying some sort of primitive pseudo-role (T1 or T2), of Agr as the lexicalization of an event variable, and of D as a quantificational element. The small clause determines the pseudo-thematic properties of the expression, much as a verb phrase determines the thematic properties of a sentence; the primitive pseudo-roles of T1 and T2 do not seem, a priori, any more or less worrisome than corresponding verbal roles like AGENT or THEME. In addition to pseudo-thematic or lexical structure, the functional structure of the expression determines referential and quantificational properties, through a variable site and a quantificational site. This fleshes out Szabolcsi's intuition that nominal and verbal expressions are structurally alike.
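Purely as an illustration, the division of labor just described can be spelled out for a definite case like the poor neighborhoods of the city in (11a), assuming that D contributes the quantificational force, Agr the variable e, and the small clause the pseudo-roles:

```latex
% A sketch only, extending the notation of (20): D = (definite) quantifier,
% Agr = the variable e, small clause = the pseudo-roles T1/T2.
\[
  [\textsc{The}\ e : \mathit{neighborhood}(e)]\;
  T_{1}(\mathit{city}, e) \;\&\; T_{2}(\mathit{neighborhood}, e)
\]
```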

Architectural matters aside, though, we have to worry about the internal make-up of the small clauses. It is not enough to think of T1 and T2 as, say, WHOLE and PART roles, since we have seen the need for other inalienable relations. Of course, we could simply augment our vocabulary for each new sort of relation we find – but that is hardly illuminating. The bottom line is whether there is anything common to the three sentences in (21).

(21) a. El vino tiene 12°.
        the wine has 12°

     b. La organización tiene varios subórganos.
        the organization has several sub-organs

     c. La gente mediterránea tiene muchos parientes
        the people Mediterranean has many relatives

(21a) expresses a relation between a mass term and the measure of its attribute of concentration; (21b), the relation between a count term and the expression of its classifier of structure; (21c), the relation between an animate term and a specification of its kinship. Given their syntactic expression, these would all seem to be manifestations of Relation R – but what does that mean?

Note, also, the facts in (22).

(22) a. El vino tiene 12° de concentración.
        the wine has 12° of concentration

     b. La organización tiene varios subórganos de estructura.
        the organization has several sub-organs of structure

     c. ? La gente mediterránea tiene muchos parientes de familia.
        the people Mediterranean has many relatives of family

Each of the possessed elements in (21) can show up with an associate term which demarcates the type of possession at stake. Curiously, this term can be promoted, as shown in (23) (see [9]–[10] above).

(23) a. El vino tiene una concentración de 12°.
        the wine has a concentration of 12°

     b. La organización tiene una estructura de varios subórganos.
        the organization has a structure of several sub-organs

     c. La gente mediterránea tiene familias de muchos parientes.
        the people Mediterranean has families of many relatives

Again, the expressions in (23) do not mean the same as those in (22). However, Relation R is kept constant in either instance. The examples in (23) are also significant in that any naive analysis of their syntax will make it really difficult to establish a thematic relation between the matrix subject and what looks like the complement of the object. Plainly, thematic relations are not that distant. Fortunately, the Kayne/Szabolcsi syntax gives us (24).

(24) a. [D' D [Agr' Agr [SC degrees concentration]]]
     b. [D' D [Agr' Agr [SC organs structure]]]
     c. [D' D [Agr' Agr [SC relatives family]]]

Depending on what gets promoted, T1 or T2, we will find the same sort of distributional differences we already saw for (18) and the like. In turn, what the possessors in (22) and (23) – wine, the organization, and Mediterranean people – possess is the entire structure in (24), whatever it is that it refers to in each instance. That way, the thematic relation is as local in (23) as in (22) or (21), directly as desired.

A related question that we must also address is when auxiliary have appears, and when it does not. Compare (23) to (25), which involves, instead, auxiliary be.

(25) a. El vino es de 12°.
        the wine is of 12°

     b. La organización es de varios subórganos.
        the organization is of several sub-organs

     c. La gente mediterránea es de muchos parientes.
        the people Mediterranean is of many relatives

The structure of these examples is very transparent, as in (26). This is possessor raising, of the sort seen in many languages. Of course, possessor raising is also at issue in similar examples involving have. According to Kayne's analysis, a derivation of the form in (26) is involved in the paradigm of (21) to (23), with an associated D-incorporation to be. This, in the spirit of Freeze (1992), is in fact what causes be to Spell-out as have. If so, what is really the difference between (26), yielding (25), and a similar derivation yielding (21)? Why does only one involve D-incorporation, resulting in auxiliary have?


(26) POSSESSOR be … of POSSESSED
     [V' be [DP D [AgrP POSSESSOR[+r] [Agr' Agr(=of) [SC t POSSESSED]]]]]

There had better be some difference, because there is actually a rather important consequence for meaning in each structure. Thus, compare the examples in (27).

(27) a. La ciudad es de barrios pobres.
        the city is of neighborhoods poor

     b. La ciudad tiene barrios pobres.
        the city has neighborhoods poor

(27a) tells us what kind of city we are talking about, a poor city. However, (27b) tells us what kinds of neighborhoods the city has. Some are poor. It could be that it has others as well, and in fact there is an invited implicature to the effect that the city also has rich neighborhoods. No such implicature is invited in (27a), and the existence of rich neighborhoods in the scenario that (27a) makes reference to is contradictory with what the proposition asserts.

These sorts of contrasts indicate a different derivation for be cases, as in (26), and for have cases, as in (28). Note, first, that this derivation is consistent with the fact that we can say in English the city has poor neighborhoods in it. We may follow Kayne in taking in – a two-place relation – to be one of the possible realizations of the D trace, redundantly spelled out as a copy when the spatial context of [Spec, DP] forces agreement with it. Likewise, the clitic it spells out the trace of the raised possessor in the specifier of Agr.1 In both of these suggestions, I am assuming that movement is an underlying copying process, which may or may not involve trace deletion depending on linearization factors (as in Nunes 1995).


(28) POSSESSOR [be+D] (= have) … POSSESSED
     [POSSESSOR [V' be+D [DP POSSESSED[+c] [D' D [AgrP t[+r] [Agr' Agr [SC t t]]]]]]]
     ([+r] = reference; [+c] = context)

But of course, the most pressing question is why (28) does not violate the Minimal Link Condition, just as (19a) did. Arguably, this is because the structure in (28) involves the Freeze-type incorporation that results in have. The general import of this sort of incorporation is to collapse distance within its confines. If so, the major difference between (19a) and (28) is that only in the latter are the terms related to the Agr/D skeleton equidistant from higher sites, in the sense of Chomsky (1995b: Chapter 3) (though with differences of detail that I will not explore now). Of course, the Freeze incorporation is not done in order to salvage a derivation. It is just a possibility that Universal Grammar grants, by distributing appropriate matching features to the relevant lexical items, for instance, affixal features to the Agr and D that incorporate. Without the necessary combinations of features, the alternative derivations terminate along the way, and are thus not valid alternatives.

At any rate, why does this syntax entail the appropriate semantics? I have suggested in passing that the element that moves to the specifier of D codes contextual confinement for the quantifier in D; then we expect that movement through this site would have important semantic correlates. Concretely, in (28) the possessed serves as the context where Relation R matters, whereas the possessor determines R's reference. Concerning poor neighborhoods, the city has that, but there is no reason to suppose the city does not have something else. This is different from the semantics that emerge for (26), where the possessor, a city, may serve to confine the context (although in the diagram I have not represented the possessor moving to [Spec, DP]). In this instance, the referent of R and the context confiner of the quantification over events which are structured in terms of R are one and the same. Differently put, this is a decontextualized Relation R. Regardless of context, the city is endowed with poor neighborhoods. In other words, poor neighborhoods is a standing characteristic of the city, in the sense of Chapter 11.
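One rough way to picture the contrast, extending the notation of (20) with a hypothetical context predicate C standing in for whatever is checked in [Spec, DP], is the following sketch:

```latex
% Illustrative sketch only, extending (20); C is a hypothetical predicate
% for the contextual confinement checked in [Spec, DP].
% (27b), with have: the possessor fixes reference, the possessed confines
% the context (leaving open that the city also has other neighborhoods).
\[
  [\exists e : \mathit{city}(e) \;\&\; C(\textit{poor neighborhoods}, e)]\;
  T_{1}(\mathit{city}, e) \;\&\; T_{2}(\textit{poor neighborhoods}, e)
\]
% (27a), with be: referent and contextual confiner coincide, so Relation R
% holds of the city regardless of context (a standing characteristic).
\[
  [\exists e : \mathit{city}(e) \;\&\; C(\mathit{city}, e)]\;
  T_{1}(\mathit{city}, e) \;\&\; T_{2}(\textit{poor neighborhoods}, e)
\]
```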


4 Paradigmatic gaps

I trust that these derivational gymnastics have the effect, at least, of confirming that some serious syntax is plausibly involved in possession. But now we have obviously opened Pandora's box. If indeed matters are so transparently syntactic as I am implying, why are there any gaps in the paradigms?

(29) a. El kilo de carne que corte Shylock deberá de ser exacto.
        the kilo of flesh that cut Shylock must be exact.AGR

     a'. El grupo de truchas que estudio es interesantísimo.
         the group of trouts that study-I is interesting.SUP.AGR

     b. El kilo de carne que compres deberá de ser tierna.
        the kilo of meat that buy-you must be tender.AGR

     b'. El grupo de truchas que ví eran alevines.
         the group of trouts that saw-I were young.AGR

(30) a. El carro de leña que traigas deberá de estar engrasado.
        the cart of wood that bring-you must be oiled.AGR

     a'. Una bandada de pájaros está muy organizada.
         a flock of birds is very organized.AGR

     b. *El carro de leña que traigas debe de estar seca.
        the cart of wood that bring-you must be dry.AGR

     b'. *Una bandada de pájaros están piando como locos.
         a flock of birds are chirping like crazy.AGR

(29) is as predicted. Depending on what moves to the referential site in the Kayne/Szabolcsi structure, we refer to either term of Relation R, as we can attest through agreement. For instance, kilo in (29a) agrees in the masculine, whereas carne "meat/flesh" in (29b) agrees in the feminine. However, we can have referential shifts only with certain canonical measures or classifiers, such as kilo or group, but not with cart or flock, as shown in (30). This is confirmed in (31)–(32).

(31) a. La carne tendría el peso de un kilo.
        the meat must have the weight of one kilo

     a'. Las truchas tenían la estructura de un grupo.
         the trouts had the structure of a group

     b. La carne tendría un kilo de peso.
        the meat must have one kilo of weight

     b'. ? Las truchas tenían un grupo de estructura.
         the trouts had a group of structure

(32) a. La leña tendría las dimensiones de un carro.
        the wood must have the dimensions of a cart

     a'. Los pájaros tenían la organización de una bandada.
         the birds had the organization of a flock

     b. *La leña tendría un carro de dimensiones.
        the wood must have a cart of dimensions

     b'. Los pájaros tenían una bandada de organización.
         the birds had a flock of organization

Although all possessive relations can be expressed in the pedantic guise of indicating not just a certain measure or classifier of the possessor, but also the type of measure or classifier this is, a reversal of this expression is possible only with canonical measures or classifiers, as shown in (31), but not otherwise (cf. 32).

Observe also the curious facts in (33) and (34).

(33) a. (gramos de) oro con/tiene(n) *(forma de) anillo.
        (grams of) gold with/have(pl) *(form of) ring

     b. (*forma de) anillo con/tiene (gramos de) oro.
        (*form of) ring with/has (grams of) gold

(34) a. (conjunto de) truchas con/tiene(n) *(estructura de) grupo.
        (set of) trouts with/have(pl) *(structure of) group

     b. (*estructura de) grupo con/tiene (conjunto de) truchas.
        (*structure of) group with/has (set of) trouts

Why, together with gold with the form of a ring or trouts with the structure of a group, can we not say *gold with ring or *trouts with group? Why do we need to specify the notions "form" or "structure"? Conversely, we may say grams of gold with the form of a ring or set of trouts with the structure of a group, but not *form of ring with gold or *structure of group with trouts. Here what we cannot do is specify notions like "form" or "structure," though they seem to be semantically appropriate.

Note also that have is involved in the examples, which signals a derivation like (28). Curiously, though, the examples in (33) and (34) involve raising of the possessed (ring or group), instead of the possessor, as we saw in (28). I believe this is possible because of the Freeze incorporation, which leads to the spelled-out have, and which should allow either term of Relation R to be promoted to subject position.

To clarify the possibilities that this allows, I present the diagram in (35), with accompanying explanations.


(35) a. [POSSESSOR[+D] [V' be [DP POSSESSED[+c] [D' D [AgrP t[+r] [Agr' Agr [SC t t]]]]]]]
     b. [POSSESSOR[+D] [V' be [DP t[+c] [D' D [AgrP POSSESSED[+r] [Agr' Agr [SC t t]]]]]]]
     c. [POSSESSED[+D] [V' be [DP POSSESSOR[+c] [D' D [AgrP t[+r] [Agr' Agr [SC t t]]]]]]]
     d. [POSSESSED[+D] [V' be [DP t[+c] [D' D [AgrP POSSESSOR[+r] [Agr' Agr [SC t t]]]]]]]

Explanation of diagram
First, we distribute features: [+r], a referential feature; [+c], a contextual feature; and [+D], the Extended Projection Feature that makes something a subject. Observe how all items marked [+D] are promoted to subject position (the top element in the structure); how the items marked [+c] move to or through the contextual site, by assumption [Spec, DP]; and how the items marked [+r] move to or through the referential site, [Spec, AgrP]. Needless to say, I am assuming that different elements may involve different features, sometimes two of them.

Now, this should allow for more possibilities than are, in fact, attested: the possible combinations are as in (36), but only some are fully grammatical.

(36) a. * (grs of) gold with (the form of) a ring in it/them.
     a'. ? (set of) trouts with (*the structure of) a group in it/them.
     b. (grs of) gold with *(the form of) a ring (*in it/them).
     b'. (set of) trouts with *(the structure of) a group (*in it/them).
     c. (*form of a) ring with (grs of) gold in it.
     c'. ? (*structure of a) group with (a set of) trouts in it.
     d. ! (form of a) ring with (grs of) gold (*in it).
     d'. ! (structure of a) group with (a set of) trouts (*in it).

Let me abstract away from the merely questionable status of some of these examples, concentrating only on relative judgments vis-à-vis completely ungrammatical instances. In turn, observe how the examples marked with an exclamation mark are possible strings of words – but this is arguably a mirage. I use the in it lexicalization of traces as an indication of the structure that concerns us now; (35b) and (35d) do not allow for such a lexicalization, since the surface grammatical object lands in [Spec, AgrP]. In contrast, examples with the structure in (35c) have the desired output format, in addition to the curious raising of the possessed element.

For reasons of space, I will not examine all of the structures in (35) in any detail. The main point I am trying to raise is a simple one, though. Syntax alone does not predict the contrasts in (36) – at least I have not been able to determine how it could.

5 Toward a semantics for possession

Nevertheless, some intriguing generalizations can be abstracted from (36), in two separate groups. Those concerning referentiality are in (37).

(37) I. In possessive structures mass terms are not referential.
     II. A possessed T2 can be a subject only if referential.

If (37I) is correct, (35a) and (35d) are not viable derivations for mass terms in possessor guise, since these derivations would leave the referential Agr unchecked – a mass term being improperly forced into a referential site. This describes the ungrammaticality of (36a) and (36d). In turn, if (37II) is true, the movement to subject position in (35d) must be impossible, a non-referential possessed element ending up as subject; correspondingly, (36d) must be ungrammatical.

The generalizations in (38) concern the possible formats of possessed terms.

(38) I. When the possessed T2 is manifested in the referential site, it must be typed with an overt marker.

II. Elsewhere, the possessed T2 may not be overtly typed.


As we already saw, the terms of Relation R may surface in purely lexical guise (as gold or trouts), or through the more detailed expression of their extension (as some measure of gold or some set of trouts). In fact, even in its bare guise, a noun like gold in our examples really means some measure of gold, just as group means a structure of a group, and so on. In any case, these manifestations are generally possible, occasionally obligatory, and occasionally impossible. Curiously, the possessor term T1 of Relation R has no obvious restrictions. In contrast, (38I) describes obligatory manifestations of the possessed term T2, as in (36b) and (36b'); and (38II) describes impossible manifestations of the possessed T2, as elsewhere in the paradigm. In other words, it is mostly T2 that is responsible for the idiosyncrasies in (36). This might help us understand Relation R.

I have not really said much about what Relation R is. The question is very difficult. However, given the generalizations in (37) and (38), it seems as if T2, the second term of R, is less innocent than the semantics in (20) would lead us to believe. There, the possessed T2 is taken as a pseudo-role, just as the possessor T1 is. However, we now have reason to believe that T2 is special. For example, when T2 is promoted to a grammatical site where reference appears to be necessary, we must accompany this element by a grammatical mark that overtly marks its type, like set, for instance. Otherwise, we in fact cannot mark the type of T2. This would make sense if T2 is itself the kind of element whose general purpose is to tell us something about the more or less abstract type of an expression, a kind of presentational device for an otherwise unspecified conceptual space. The idea is expressed as in (39).

(39) [Diagram: a raw MENTAL SPACE shaped by a PRESENTATION operation]

Forgive my origami metaphor. The intention here is to talk about a raw mental space which gets measured, shaped, or otherwise topologically transformed, by way of a presentational operation. If this view of Relation R is on track, then T1 and T2 have a very different status. T2 is really an operation on T1, and the semantics in (20) would have to be complicated to capture this picture, a formal exercise that I will not attempt now (though see Chapter 14).

The intuition is that generalization (38II) is the default way of realizing T2. What we see then in (38) is the Paninian Elsewhere Condition at work. When in referential sites, presentational device T2 is forced out of its canonical realization. In these contexts, T2 surfaces in the specific format that makes its nature explicit, as a set, or whatever. This way of looking at T2 has nice, independent consequences. (40) is constructed so as to allow for a plausibly ambiguous quantifier interaction, while at the same time not invoking an irrelevant specific reading on the possessor.

(40) By 2010, most women in Utah will have had two husbands.
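For concreteness, the two construals that are a priori available for (40) can be sketched in restricted-quantifier notation as below; the discussion that follows explains why only the first is natural.

```latex
% Illustration only: the two a priori possible scopings of (40).
% Available reading: "most women" outscopes "two husbands"
% (two husbands per woman, e.g. through divorce).
\[
  [\mbox{Most } x : \textit{woman in Utah}(x)]\,
  [\mbox{Two } y : \mathit{husband}(y)]\; \mathit{have}(x, y)
\]
% Unattested reading: "two husbands" outscopes "most women"
% (the same two men shared by most of the women).
\[
  [\mbox{Two } y : \mathit{husband}(y)]\,
  [\mbox{Most } x : \textit{woman in Utah}(x)]\; \mathit{have}(x, y)
\]
```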

The example invokes reference to two husbands per woman, in a country that allows divorce. The alternative reading is a priori equally plausible in a state that allows polygamy, and where divorce is infrequent. However, the possessed term does not like to take scope over the possessor. We may account for this if the inalienably possessed element, a presentation device in the terms of (39), is frozen in scope because it is a predicate of sorts. This is an old intuition that squares naturally with the syntax of small clauses that we are assigning to the terms of the R relation, where T2 is a predicate. The other aspect of the generalizations concerning (36) that I find interesting is the fact that mass terms are not referential in possessive constructions. I do not know why that is, but I think it correlates with another fact illustrated in (41). The Spanish example in (41) shows the relevant grammatical ordering when more than one R relation is involved. Crucially, alternatives to it, such as (42), are ungrammatical.

(41) animal de 100 kgs (de peso) con varios órganos (de estructura)
     animal of 100 kgs (of weight) with several organs (of structure)

(42) *animal de varios órganos (de estructura) con 100 kgs de peso
     animal of several organs (of structure) with 100 kgs of weight

This suggests a structural arrangement along the lines of (43).

(43) [[animal] R [100 kgs (of weight)]] R' [various organs (of structure)]

Syntactically, (43) corresponds to the structure in (44).

(44) [SC' [SC [SUBJECT animal] [PREDICATE 100 kgs (of weight)]] [PREDICATE' various organs (of structure)]]

If this much is granted, we have the possibility for a recursive structure, with potentially various levels of embedding, each corresponding to some type-lifting, whatever that means.

I must emphasize that (41) is a simple piece of data, and (44) a simple structure to describe it. Had (42) been grammatical, we would have needed (45) instead of (44).


(45) [SC' [SC [SUBJECT animal] [PREDICATE various organs (of structure)]] [PREDICATE' 100 kgs (of weight)]]

But (42) is ungrammatical, and we should make much of this. We should because it makes no sense to blame the hierarchy in (44) on any sort of outside reality. Surely (44) looks Aristotelian, with substance coded logically prior to form, but there is no physical basis for this, or any similar distinction. Likewise, it makes no sense to blame (44) on any effective reality, of the sort presumably involved in how our species evolved. We have no idea what it would mean for us to code the world in terms of the alternative (45), simply because we do not.

All we know is we have (44), with or without a reason. That is enough for someone who is concerned with how the mind is, and pessimistic about finding how it got to be so. In these terms, a real question is how (44) is used to refer; apparently, standard reference in possessive structures is a phenomenon that starts in the second layer of structure in (44). This is like saying that the second presentational device in (44) is responsible for individuation, an intriguing restatement of generalization (37I).

6 A word on standard possession

I cannot close without saying something about simple possessions, having shown what I would like to think of as "ontological" possession. What are we to make of John Wayne simply having, say, a gun? Immediately, we can say this. Inasmuch as standard possession exhibits many of the syntactic properties of ontological possession (e.g. presence of have/with and similar elements), we should probably take this general sort of possession to be nothing but ontological possession as well. Needless to say, if we do that, we have to discover in which way standard possession is hiding some sort of ontological claim.

Having freed ourselves from the optimistic view that a possessor is just the subject of have, and the possessed is just its object, what prevents us from thinking that, in John has a gun, the gun (yes, the gun) is actually ontologically in possession of something like a stage of John? I do not find (46) accidental in this respect.

(46) El arma está en manos de Juan.
     the gun is in hands of Juan
     "The gun is in Juan's hands."

The question, of course, is why we take the gun to be in John's hands as a rough paraphrase for John having the gun.

Examples like (46) suggest that the gun is ontologically related to something which – for lack of a better word – we express in terms of a metaphor for a stage of John's, his hands. This is important, because we surely do not want to say that the gun, in this instance at least, is ontologically related to the whole of John (or else we would be invoking, again, the sort of inalienable possession that we have seen so far).

That is obviously just a speculation. I find it tantalizing, though, in one crucial respect. Once again the facts of language open a window into the facts of mind. Perhaps the small synecdoche we invoke in these instances, lexicalizing a part of an individual in place of one of its spatio/temporal slices, is no small indication of a surprising fact about standard possession, that it expresses an ontological, inalienable relation between what is alienably possessed and a spatio/temporal slice of what possesses it. At the very least, that is a humbling thought.

7 Conclusion

Let me gladly admit that much work lies ahead. The sketch in (44) is a syntactic structure corresponding to promissory semantics. Relation R may turn out to be a way of operating on a mental space of some sort, perhaps (somehow) lifting its dimensionality – but this is just a fancy way of talking about the topological little story in (39). Interestingly, although this may be thought of as a lexical semantics, it has to somehow make it into Logical Form, or else we will not predict the absence of scope interaction in (40). Basically, the possessed element associated to T2 does not take scope because it does not have the right type to be a scope-bearing element. Needless to say, (44) can be directly plugged into the Kayne/Szabolcsi syntax for possession, and may be seen as a generalization of their insight.

Philosophically, the main conclusion I would like to reach is perhaps not surprising for the linguist. The view that possession is out there in reality, and we code it trivially through little things like have, with, and all the rest, is mistaken. I also think it is wrong to think of possession as the manifestation of a lexical relation that certain terms allow. Possession is a kind of syntax, with well-behaved properties. Correspondingly, the semantics of possession seems to be a kind of presentational operation. If so, possession is just a cover term for something which may happen in various mental dimensions that embed within one another. Much as I would like to turn all of this into an argument against the capitalist dictum that derives being from having, I am satisfied with getting closer to an understanding of the distributional properties of possession, without blaming them on whim, metaphor or mistakes people make.


11

TWO TYPES OF SMALL CLAUSES

Toward a syntax of theme/rheme relations†

with Eduardo Raposo

1 Introduction

In this chapter we propose that the fundamental difference between "stage" and "individual" level predicates is not lexico-semantic and is not expressed in thematic/aspectual terms. We study the apparent differences between small clauses with an individual and a stage level interpretation (which are selected by different types of matrix verbs) and argue that these differences are best expressed by way of purely syntactic devices. In particular, we argue that what is at stake are differences in information (theme/rheme) structure, which we encode in the syntax through different mechanisms of morphological marking. There are no individual-level predicates, but simply predicates which in some pragmatic sense "are about" their morphologically designated subject (an idea already present in Milsark 1977). There are no stage-level predicates, but simply predicates which, rather than being about their thematic subject, are about the event they introduce. The distinction corresponds roughly to what Kuroda (1972) calls a categorical and a thetic judgment (a terminology we adopt). The former is about a prominent argument (for us, a CATEGORY), while the latter is simply reporting on an event. A minimalist grammar encodes differences of this sort in terms of morphological features. These features are checked in a designated site F which interfaces with the performative levels, where aspects of intentional structure are expressed. Having argued for this syntactic account, the chapter proceeds to pose two related semantic questions. We deal with why it should be that categorical (individual level) predication introduces a standing characteristic of a CATEGORY, while thetic (stage level) predication introduces a non-standing characteristic of a standard subject argument. We also propose a line of research for why CATEGORIES should be "strong" in quantificational terms, while standard arguments may be "weak," roughly in Milsark's original sense. We suggest that these two semantic properties follow from, and do not drive, the syntactic mapping. Our approach, thus, is blind to semantic motivation, although it is not immune to semantic consequence. Our main motivation in writing this chapter is that this is the correct order of things, and not the other way around.


2 Types of predication in small clauses

Higginbotham (1983a) shows that Carlson's (1977) distinction between individual-level and stage-level predication holds systematically even inside the simplest syntactic predication, the small clause. This raises an intriguing question, if small clauses (SC) are as proposed by Stowell (1981) (1), which leaves little room for expressing structural differences:

(1) [XP NP [XP Pred]]

Raposo and Uriagereka (1990) in specifically Carlson's terms, and Chung and McCloskey (1987) in comparable terms, show systematic differences in the distribution of individual-level and stage-level SCs. Thus, only stage-level SCs can be pseudo-clefted (2), right-node raised (3), focus fronted (4), and be dependents of what . . . but . . . constructions (5). We illustrate this with Spanish, although the same point can be raised more generally in Romance and Celtic languages:1

(2) a. Lo que noto es [a María cansada].
       what that note.I is to Maria tired
       "What I perceive is Mary tired."

    b. *Lo que considero es [a María inteligente].
       what that consider.I is to Maria intelligent
       ("What I consider is Mary intelligent.")

(3) a. Yo vi y María sintió a Juan cansado.
       I saw and Maria felt to Juan tired
       "I saw and Maria felt Juan tired."

    b. *Yo creo y María considera a Juan inteligente.
       I believe and Maria considers to Juan intelligent
       ("I believe and Maria considers Juan intelligent.")

(4) a. Hasta a Juan borracho vi!
       even to Juan drunk saw.I
       "Even Juan drunk have I seen!"

    b. *Hasta a Juan inteligente considero!
       even to Juan intelligent consider.I
       ("Even Juan intelligent do I consider!")

(5) a. Qué iba a ver, sino a su padre borracho?
       what went.he to see but to his father drunk
       "What could he see but his father drunk?"

    b. *Qué iba a considerar, sino a su padre inteligente?
       what went.he to consider but to his father intelligent
       ("What could he consider but his father intelligent?")

Certain heads, such as perception verbs, take only stage-level SCs. Others, such as opinion verbs, take only individual-level SCs. Furthermore, the individual-level SC must be associated to the head selecting it throughout a derivation, while this is not necessary for the stage-level SC, which can be displaced from the government domain of its head. So minimally a treatment of these matters must explain (i) how selection is done in these instances (i.e. what does one select if the structure is just (1)), and (ii) why the two types of SCs behave differently with respect to their dependency on the head that selects them (see the Appendix).

3 Some recent proposals

An approach taken for SCs by Iatridou (1990) and Doherty (1992), and more generally for other predicates by at least Diesing (1990), De Hoop (1992) and Bowers (1992b), builds on Kratzer's (1988) claim that only stage-level predicates introduce an event argument position e (a line discussed as well in Higginbotham 1983a). But it is not obvious what this means for SCs.

The first difficulty arises because, although the different kinds of predication are easy enough to ascertain, it is not clear that there are pure individual-level or stage-level predicates. Thus, one can see or feel John finished as much as one can consider or declare John finished. In many languages John is finished may take a stage-level or an individual-level mark, such as an auxiliary or a given Case form in John. In fact one wonders whether the most rigidly individual-level or stage-level predicates (I saw him angry vs. ??I consider him angry; ??I saw him intelligent vs. I consider him intelligent) are so narrow because of purely pragmatic considerations.2

But pragmatics aside, the grammar must provide a way in which a regular predicate may be taken as either a standing or a transient characteristic of its subject. This of course is the traditional intuition, however we may end up treating it. So a Kratzer-type approach forces us to systematically duplicate the syntactic representation of predicates like finished, angry or intelligent. In Kratzer's terms, this entails that all predicates are ambiguously selected into phrase markers as in (1): either with or without an extra argument.

The syntactic expression of this systematic ambiguity is not without problems. The intuition that all variants of Kratzer's approach pursue is this. At D-structure the subject of an individual-level predicate is outside the lexical projection of this predicate. There are different ways of executing this, but mechanics aside, the question for SCs is clear. What does it mean for a subject to be outside of an SC in D-structure? SCs are not VPs. They are simple predication structures. To be outside of an SC is not to be part of the SC. So either our conception of these constructions as in (1) is incorrect, or else subjects for these elements are simply not outside of their domain. More generally, within current syntactic views and particularly in the Minimalist Program of Chomsky (1993b), all arguments are projected within the lexical domain of a word, since there is no level of representation to project them otherwise. That is, there is no D-structure to say that the argument Y of X is outside of the projection of X. If Y is an argument of X, Y starts within the X'-shell associated to X.

Second, and more generally, it is unclear what it means for a predicate not to have a Davidsonian argument. The neo-Davidsonian project of Higginbotham (1985) and (1987) is rather straightforward about this. Clearly, Davidson's original motivation for the event positions holds inside the simplest of SCs. Thus, you can consider Hitchcock brilliant, and raise the consideration vis-à-vis other Hollywood directors, only for his American movies, and putting aside his sexism. All of this can be predicated of the eventuality of Hitchcock's brilliance, and it is unclear how to add these circumstances otherwise, short of falling into the polyadicity that worried Davidson and motivated event arguments to begin with.3

Third, empirical problems arise. Diesing (1990) argues that Kratzer's approach is incorrect. Citing evidence from Bonet (1991), Diesing notes that in Catalan all subjects are VP internal, including subjects of individual-level predicates. Bonet's argument is based on floating quantifiers, which following Sportiche (1988) she assumes originate VP internally. Floated quantifiers can be VP internal regardless of the nature of the predicate, as (6) shows:

(6) The pigs are all stout.

The floating quantifier in (6) tells us the underlying position of the subject, which must thus be internal to VP.

To address this issue, Diesing (1990) proposes two types of Infl. Stage-level predicates have an Infl whose subject is base generated in VP, with raising being a possibility. Individual-level predicates have an Infl that assigns a θ-role to its Spec, with the import "has the property x," x being expressed by the predicate. The NP in this Spec controls a PRO subject internal to VP, which gets the θ-role assigned by the V'. The floated quantifier in (6) modifies the PRO in VP. Note that Diesing's proposal alters the thematic relations by adding a θ-role to the structure. Each individual-level predicate that exhibits an adicity of n arguments is in effect of adicity n+1, with the "subject" involving systematically two arguments in a control relation, an overt NP and an extra PRO.

Following our 1990 proposal that SCs involve an Agr projection, Diesing's approach could be adapted to SCs as in (7):

(7) a. [AgrP NP [agr [XP PRO [XP IL Pred]]]]
    b. [AgrP [AGR [XP NP [XP SL Pred]]]]

(We use the notation agr vs. AGR to distinguish each type of Infl.) Here the structure of the SC is invariant (as in (1)), and what changes is the structure that selects this SC (the agr/AGR head).

Difficulties arise for Diesing's approach when extending it to SCs. The idea is incompatible with standard analyses of (8a), taken from a similar example in Rizzi (1986). The clitic me "to me" climbs from inside the predicate fiel "faithful" up to the matrix clause. Climbing is local, which follows from the ECP (Kayne 1991; Roberts 1994; Uriagereka 1995a). But if the clitic governs its trace in (8c), nothing prevents the PRO that Diesing hypothesizes from being governed from outside its SC:


(8) a. Juan me es (considerado) fiel.
       Juan me is considered faithful
       "Juan is considered faithful to me."

    b. __ es (considerado) [Juan [fiel me]]

    c. … me … [AgrP NP [XP Agr [PRO [fiel t]]]]

That PRO is indeed (undesirably) governed when it is the subject of a complement SC is shown in the ungrammatical examples in (9). Whatever the ungrammaticality of governed PRO ultimately follows from, it is unclear why PRO in a Diesing-style (8c) would be allowed to be governed.

(9) a. John tried [[PRO to be intelligent]]
    b. *John tried [[PRO intelligent]]
    c. *it seems [that [John is intelligent]]
    d. John seems [t (to be) intelligent]
    e. *it seems [PRO (to be) intelligent]

Consider also (10), a Dutch example discussed by De Hoop (1992):

(10) Els zegt dat er twee eenhoorns intelligent zijn.
     Els says that there two unicorns intelligent are
     "Els says that two (of the) unicorns are intelligent."

De Hoop notes that in (10) the individual-level subject is VP internal. These data, unlike Bonet's, cannot be explained away by positing a PRO inside VP. The specifier of IP is taken by an expletive.4 (10) forces us to assume, contra Kratzer, that all subjects start internal to the predicate projection, and, contra Diesing, that there are no special thematic relations associated to individual-level predicates. Then, if something similar to the initial intuition is to be pursued, subjects of individual-level predicates must be forced out of the predicate projection in the course of the derivation.

In the minimalist project, this conclusion is forced onto us. There are no levels of D-structure or S-structure. So if the distinctions noted in the literature are real, they must be expressed at (or by) LF. We discuss this next.

4 A more traditional approach

There are three proposals in the recent literature which we want to (freely) build on. In the spirit of Kuroda (1972) and Milsark (1977), Schmitt (1993, 1996) notes that individual-level predicates introduce a depiction of their subject, while stage-level predicates present their subject as a mere event participant (see also Suh 1992 for Korean). For Schmitt, these two are different in aspectual terms, the former lacking aspect entirely. In her analysis, θ-roles are not assigned in the absence of aspectual dependencies, and hence individual-level dependencies are pure predications while stage-level dependencies are n-adic relations of a thematic sort. Although we believe there is something predicative to individual-level dependencies which is not so clear in stage-level dependencies, we do not believe that this is to be expressed in terms of θ-roles missing in the first. Otherwise, we have to again posit a systematic ambiguity of predicates appearing in the individual-level or stage-level mode. For us all predicates are unique in having however many θ-roles they have, and if an extra predication of some sort is involved in individual-level instances, this is to be achieved in some other fashion.

For Herburger (1993a), which deals with the definiteness effect, it matters what the LF position of a subject is vis-à-vis the event operator's. Although this is not axiomatic for her, in individual-level instances the subject has scope over the event operator, while in stage-level instances the event operator has scope over a weak subject, a matter that we ultimately build on. But Herburger's individual-level and stage-level predications have the same initial phrase marker; thus, it is not possible in her system to select one or the other type of SC. Second, for her the LF site of scope-taking elements is a matter of QR. This raises a problem for subjects which are quantificational and take scope outside of the event operator. Something like everyone is available does not have to invoke an individual-level reading (see Section 7).
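Schematically, and only as an illustrative rendering, the two scope options at stake for everyone is available are these; the difficulty just noted is that the first, wide-scope LF can arise through ordinary QR without any individual-level import.

```latex
% Illustration only: subject scope relative to the event operator.
% "Individual-level"-style LF: the subject outscopes the event quantifier.
\[
  [\forall x : \mathit{person}(x)]\,[\exists e]\;
  \mathit{available}(e) \;\&\; \mathit{Theme}(e, x)
\]
% "Stage-level"-style LF: the event quantifier outscopes a weak subject.
\[
  [\exists e]\,[\forall x : \mathit{person}(x)]\;
  \mathit{available}(e) \;\&\; \mathit{Theme}(e, x)
\]
```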

De Hoop (1992) concentrates on morphological ways of signaling the individual-level/stage-level distinction, thus is prima facie interesting from a minimalist perspective. Her system is actually different from ours, and goes into a semantic typology which we need not discuss.5 Pursuing the idea that Case affects interpretation, we want to claim that subjects of individual-level and stage-level predicates are marked with a different form of Case. This recalls the well-known distinctions found in East Asian languages that present topic markers, and is very welcome in a system like ours where the LF mapping is driven by the presence or absence of given features.

The gist of our proposal builds on an intuition that both Kuroda (1972) and Milsark (1977) share. Individual-level subjects are what the sentence is about. More generally, (a sub-class of) topics are what the sentence is about. These "aboutness" subjects are highlighted by the grammar in a special way: a morphological case-marker, a phrasal arrangement, an intonational break, etc. We want to propose that this and nothing else is what IL-hood is, mere aboutness of a phrase which is appropriately (Case) marked.

From this point of view the right split is not between individual-level and stage-level subjects. Objects too can enter into this sort of relation, as is known from examples like the non-contrastive Caesar, Brutus didn't particularly like.6

This is the way in which the grammar allows us to talk about Caesar when this element is a grammatical object. Interestingly, strong restrictions apply in these topicalizations. For instance, Fodor and Sag (1982) discuss the impossibility of indefinite topics (??someone or other, Brutus didn't particularly like). Also, this sort of topicalization is much more felicitous with states than with active events, particularly if these are specified for space/time (??Caesar, Brutus killed in the Senate yesterday morning). This strongly suggests that, in the spirit of Chomsky (1977a, 1977b), we should take topics to be subjects of a particular kind of predication, and that this predication has the semantic import of holding of its subject in a standing manner, that is irrespective of the events at which this subject participates. Of course this looks like a description of individual-level predication, but it is a description of topicalization instead.

In sum, our intention is to construe individual-level predication as a sub-class of topicalization, itself a predication. To distance ourselves from other uses of this term, we reintroduce old terminology. We assume that the grammar encodes relations between PREDICABLES and CATEGORIES of various sorts, and that these need not be expressed in neo-Davidsonian (event) terms. That is, we take Caesar, Brutus didn't like to have the same eventive structure as Brutus didn't like Caesar, although the former invokes an extra predication between the displaced Caesar and the open expression left behind. More generally, we take something like Brutus killed Celts to be ambiguous between the obvious statement about what happened (say, at Brigantium in the first century BC) and an aboutness statement concerning Brutus. That was what Brutus characteristically engaged in. In the latter instance, we take Brutus to be displaced to a position outside the scope of the event operator.
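One informal way to picture this extra predication is to let the displaced CATEGORY be predicated of the open expression left behind, as in the following sketch:

```latex
% Illustrative sketch: the displaced topic is the subject of a predication
% over the open expression left behind (lambda abstraction over its gap).
% "Caesar, Brutus didn't like":
\[
  \mathit{Caesar} \;\; \big[\lambda x .\, \neg\, \mathit{like}(\mathit{Brutus}, x)\big]
\]
% The eventive structure itself is that of "Brutus didn't like Caesar";
% only this additional aboutness predication is superimposed.
```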

In order not to confuse matters with terminology from a different tradition, we adopt Kuroda's distinction between thetic (formerly, "stage-level"), and categorical (formerly, "individual-level") predications. A stage-level predication is henceforth referred to as a thetic-predication and an individual-level predication as a categorical-predication.

It is important to note also that we take topicalization to involve a particular site. Chapter 5 and Uriagereka (1995a, b) argue for an F category encoding the point of view of either the speaker or some embedded subject, which serves as the syntactic interface at LF with the pragmatic systems.7 We assume topicalization is to F because there are many predications that take place inside a regular sentence, but we take only one of those to be the main assertion. For example, in our proposal, the main assertion in John likes everyone is NOT about everyone (that John likes them), but rather about John, the topic of the sentence (that he likes everyone). Basically, F is the designated position for the pragmatic subject which the main assertion is about, regardless of other presupposed predications.

We have just illustrated our account with normal predicates, but a similar approach is possible for SCs, assuming the structures we argued for in Raposo and Uriagereka (1990). As Doherty (1992) shows, different functional projections can introduce SCs. This is illustrated in (11) for Irish (we assume that although the facts may not be that obvious elsewhere, they still hold more or less abstractly, with the same syntax needed for (11)). Note that the subject of a thetic SC (11b) receives a different Case than the subject of a categorical SC (11a). The latter is accusative, a default realization in Irish, while the former is nominative. The Agr projection introducing each SC is different as well. In the thetic SC we have a strong agreement element, the particle ina containing a subject clitic, while in the categorical SC agreement is abstract (pronounceable only in identity predications). Auxiliary selection is different too: the categorical auxiliary is vs. the thetic auxiliary ta.


(11) a. Is fhear e.
        is-cat man he-acc
        “He is a man.”

     b. Ta se ina fhear.
        is-thet he-nom in-his man
        “He is a man (now).”

Given these facts, several things follow. First, although SCs are always identical in structure, they are associated to two different sorts of Infl, in the spirit of Diesing’s (1992) distinction. It is these inflectional elements (whatever their nature) that are selected by different heads, thus solving the selection puzzle. Unlike Diesing’s Infls, though, ours do not introduce extra arguments, but simply entail two different forms of Case realization. The default Case associated to what we may call C(ategorical)-agr marks an LF topic, while the regular Case associated to an A(rgumental)-AGR does not. We assume that pragmatic considerations demand that sentences be always about something, and thus even when an argument is not marked with the appropriate features to be in topic position, something else must. We may think of thetic auxiliaries as the equivalent of topic markers for thetic predicates. Recasting traditional ideas, we assume that in this instance the predicate gains scope over the rest of the expression, which is thus, in a sense, about the predicate.8

From this perspective SCs are just the simplest instances where the system presented here operates.9 In the minimalist project, movements like the displacement of material for aboutness purposes need a trigger in terms of appropriate morphological features, and a target where the features are checked. For this we assume the F position, whose Spec is the landing site of aboutness phrases, among others. The appropriate features are assigned as in (12) below. Weak C-agr assigns C(ategorical)-CASE (12a), which is realized in the Spec of FP as a default Case (accusative in Irish).10 Strong A-AGR assigns a more standard A(rgument)-case (12b), which is realized in various forms in different languages. The latter is the usual element that signals a θ-dependency.

(12) a. [agrP __ [C-agr [XP NP [XP Pred]]]] (Categorical predication)
            C-CASE

     b. [AGRP __ [A-AGR [XP NP [XP Pred]]]] (Thetic predication)
            A-case

Though ultimately this is a technical matter, we may need to relax the Visibility Condition as in (13b), since many languages mark displaced arguments just with a C-case, which must suffice for the trace of this argument to be visible at LF. In turn (13a) is added to restrict the kind of elements that can appear in a topic position. Intuitively, only those elements which are appropriately marked can raise there.


(13) a. For X a lexical argument of PREDICABLE Y, X is the subject of Y only if X is marked as a CATEGORY through C-CASE.

     b. For X a lexical argument of Predicate Y, X is interpreted as an LF argument of Y only if X receives Case [either C-CASE or A-case].

To illustrate the mechanics, reconsider (11). In both examples, there is an SC [he [man]]. In (11b), where the SC is associated to AGR ina, “he” realizes nominative A-case (not C-CASE). This prevents the SC from being about a CAT se “he.nom,” given (13a). In contrast, in (11a), where the SC is associated with Agr, “he” receives C-CASE, a default accusative in Irish. The SC in this instance can be about a CAT e “he.acc.” But although Irish marks relations in this particular way, other variants are conceivable. The default mark of C-CASE may be nominative or a topic marker.11

5 Some semantic questions

Given this syntactic system, we can show that our approach may have consequences for two well-known semantic issues. One is why subjects of categorical predicables do not tolerate weak quantifiers. We return to this in the Appendix, but note that from our viewpoint this must be the same reason why aboutness topics cannot be weak quantifiers. The second question is why categorical predicables are taken as standing characteristics of their subjects, while thetic predicates are transient.

As a point of departure for that second question, consider the proposal in Herburger (1993a). Following Higginbotham (1987), she assumes that all predicates, including Ns, come with an event variable.12 If at LF the subject of a predicate is inside the scope of the event operator, this operator binds both the variable of the main predicate and that of the N. Thus in a man is available the event operator binds the event variable of available and the event variable of man. This translates as there being an event of availability, at which a man is the subject:

(14) ∃e [available(e) ∃x [man(x, e) & Subject(x, e)]]

If the subject of a predicate is outside the scope of the event operator, the operator does not bind into the NP. Therefore, in the man is intelligent the event operator binds only the event variable of intelligent. This translates as there being a (unique) man for an event of intelligence, of which the man is the subject:13

(15) [The x man(x, e)] ∃e [intelligent(e) & Subject(x, e)]
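The contrast can also be summarized in typeset form. The following is only a LaTeX rendering of (14) and (15) as given above, with comments recording which event positions each operator binds; nothing is added to the formulas themselves.

% (14), thetic reading: the subject is inside the scope of the event
% operator, which binds the event position of both the main predicate
% (available) and the noun (man).
$\exists e\,[\mathrm{available}(e)\ \exists x\,[\mathrm{man}(x,e)\ \&\ \mathrm{Subject}(x,e)]]$

% (15), categorical reading: the subject is outside the scope of the
% event operator, which therefore binds only the event position of the
% main predicate (intelligent).
$[\mathrm{The}\ x\ \mathrm{man}(x,e)]\ \exists e\,[\mathrm{intelligent}(e)\ \&\ \mathrm{Subject}(x,e)]$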

We will pursue a version of this approach, though not this one. It may seem that the mechanism just discussed gives us intelligence as a standing predicate, leaving availability as a non-standing predicate lasting while the man is at that event. However, take the situation raised in Note 5. Bobby Fischer is a genius (i.e. “genial” in the old sense of the word). Consider a logical form where we deny there being an event of geniality of which Fischer is the subject. This is the logical form that would result from having an LF in which the event operator has scope over the subject, resulting in a thetic predication, and denying that. The question is, when we say (last night) Fischer wasn’t genial, is that contradictory with the statement Fischer is genial?

The answer is surely “no,” but it is not clear what in the logical form yields this result. Thus, consider (16b), corresponding to the Spanish (16a) (we have replaced Fischer with the champion in order not to go yet into the semantics of names):

(16) a. El campeón es genial pero no está genial.
        the champion is-C genial but not is-T genial
        “The champion is genial but is not genial right now.”

     b. [[The x champ(x, e)] ∃e [genial(e) & Subject(x, e)]] & ¬[∃e [genial(e) [The x champ(x, e)] & Subject(x, e)]]

(16b) conjoins two statements in such a way that a contradiction ensues. The logic is clear. If geniality holds of the champion irrespective of his being in a given event (and that is what the first conjunct asserts), then geniality will hold of him at all events in which he participates (and that is what the second conjunct denies). However, Spanish speakers find (16a) sensible.
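One way to spell the clash out, as a sketch only and under the assumption that both occurrences of the definite description pick out the same individual c:

% First conjunct of (16b): the standing, categorical statement. With the
% description resolving to c, it guarantees some event of geniality of
% which c is the subject.
$[\mathrm{The}\ x\ \mathrm{champ}(x,e)]\ \exists e\,[\mathrm{genial}(e)\ \&\ \mathrm{Subject}(x,e)]\ \Rightarrow\ \exists e\,[\mathrm{genial}(e)\ \&\ \mathrm{Subject}(c,e)]$

% Second conjunct: the negated thetic statement denies any such event.
$\neg\exists e\,[\mathrm{genial}(e)\ \&\ \mathrm{Subject}(c,e)]$

% The two are directly incompatible unless, as proposed below, the
% subject is not in fact the same in both conjuncts.

The escape route pursued in what follows is precisely to give up the assumption flagged in this sketch: the two conjuncts involve differently contextualized subjects.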

Herburger (1993b) suggests that in this situation the first conjunct asserts something along the lines pursued by Chierchia (1986): the champion is generally genial. The contradiction then disappears. But this is not the way a predicate like genial is interpreted. To be genial you do not have to be generally that – most geniuses are rarely genial. It is unclear (and irrelevant) what counts in making someone genial. Whatever that may be, it is not obvious that (16) can be explained away without complicating the semantics suggested in (14)/(15) for the C/T distinction.

To avoid the contradiction in (16) we must modify the more general, categorical statement (which is falsified by the more concrete thetic statement). Two ways come to mind. We change the predicate in these instances (which is what Herburger (1993b) suggests), or we instead change the subject (which is what we plan to do). That is, a (trivial) way to avoid the contradiction in (16b) is to have the subject in each instance be a different subject. Suppose that we have a fine-grained semantics that allows us to distinguish Fischer at a given event from Fischer at some other event or irrespective of the event he participates in. Then we could avoid the contradiction. Geniality holds of Fischer decontextualized (“one” Fischer), and lack of geniality holds of Fischer in the context of some event (“another” Fischer).

Although syntactically straightforward, this approach may seem tricky for the semantics of names: by “splitting” Fischer this way we get into questions about what it means for “Fischer” to be a rigid designator.

Chapter 12 addresses this matter, rejecting a treatment of Fischer as a mere constant or a label for an object, by introducing Spanish examples of the sort in (17c):


(17) a. En España hay mucho vino.
        “In Spain there’s much wine.”

     b. En España hay mucho torero.
        “In Spain there’s much bullfighter.”

     c. Hay Miguel Indurain para rato porque aun queda mucho Indurain por descubrir. De todos modos, el Indurain que más me sigue impresionando es el contra-relojista.
        “There’s Miguel Indurain for a long time because there’s still much Indurain to discover. In any case, the Indurain that continues to impress me is the time-trialist.”

Notice how the name Indurain in (17c) appears in the same sorts of contexts as the noun torero “bullfighter,” which in Spanish can be the contexts where mass terms like vino “wine” are possible (see Chapter 15 on this).14 Some of the difficulties raised by (17c), in particular, or our approach to (16) for that matter, can be addressed by adapting ideas discussed by Higginbotham (1988), who builds on insights of Burge (1974). Contra an important philosophical tradition, rigidity is arguably not part of the nature of an expression, such as a name, but rather is a result of the linguistic system. In Higginbotham’s proposal, all predicates introduce an event variable and also a second-order context variable. How this free variable is set is a matter that Higginbotham leaves purposefully open for speakers to determine. It is perhaps a process involving cognitive mechanisms not unlike those involved in the contextualization of “measured” mass terms, as the parallelism in the examples in (17) suggests. The point is, we need contextualized notions such as “bullfighter” or “Indurain” as much as we need notions like “wine,” although presumably each of these is presented differently (e.g. individual terms vs. mass terms, etc.).

If we are ready to distinguish Fischer at a given context and Fischer at some other context (or no special context), a question to raise is what makes that Fischer. For us, this rigidity concern makes sense only as a linguistic property of some expression. In fact, it is relatively similar to that of what it is to be known as (the kind) wine, (the kind) bullfighter, and so on. All that we need here is the assumption that speakers figure this out in concrete cognitive terms, however it is they do it. Furthermore, speakers are capable of distinguishing that whatever it is that makes something what it is does not entail that it has to be so “unified” in all events, or (17c) would not make any sense. It is beyond the scope of this chapter to discuss a possible mechanism involved in these matters (though see Chapter 12). For our purposes here, it is of no concern what makes the two sentences in (16) be sentences about specifically Fischer or the champ. Since we take Fischer to be a predicate, we may assume that what makes Fischer Fischer is (perceived or imagined) “Fischerhood,”15 just as “winehood” makes wine wine, for instance.

The theoretical significance of all of this for present purposes is that we crucially need context variables, for it is in context that speakers present notions in various ways. Context allows us to speak of a mode of Indurain in (17c), or a decontextualized Fischer or Fischer at an event in (16). Our plan now is to achieve the results sketched in (14) and (15) in terms of a syntactic realization of these context variables, and not event variables.16

6 Contextual dependencies

Assuming every quantificational expression comes together with a context, in sentences of the form “S is P” we need at least two. We require a context of “S-hood” corresponding to the subject, and a context of “P-hood” corresponding to the main predicate. Suppose further that contexts are set within other contexts, much like quantifiers have scope inside one another. If so, assuming that X is the context of the subject and Y is the context of the predicate, a sequence of contexts ⟨X, Y⟩ is interpreted differently from a sequence of contexts ⟨Y, X⟩. The first of these sequences would introduce a context Y for predicate P within the context X for subject S. Conversely, the second sequence would introduce a context X of the subject within the context Y for the predicate. Let us say that under these circumstances a given context grounds another context.

As suggested before, let both arguments and predicates have the option of being displaced to a topic site. Starting out with a structure S is P, suppose that the subject is in the LF topic site, as is the case in a categorical predication. Here the subject with context X has scope over the predicate with context Y in situ.

This has the effect of confining the range of the context of the predicate to that of the subject, thus grounding context Y on context X. So a categorical assertion introduces a context X of an individual for a context Y of a predicate. In contrast, the context of the subject is not grounded on the context of the main predicate. This is what results in the main predicate holding of the subject as a standing predicate, for it is a characteristic of this subject in a decontextualized fashion, not within the context of any given event.

Consider next the LF for S is P where the predicate is in the LF topic site, as we assume to be the case in a thetic predication. The fact that thetic predicates are thus displaced derives their transient character. The subject is inside the scope of the event operator, and now it is a subject whose context X is confined to the context Y of the predicate. Whatever predicate may hold of the subject, then, will hold only of a subject grounded at the event the predicate introduces, not a decontextualized subject.17
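Schematically, and only as a LaTeX summary of the two configurations just described (with X the context of the subject and Y the context of the predicate; the bracketings are illustrative, not full LFs):

% Categorical predication: the subject occupies the LF topic site, the
% context sequence is <X, Y>, and Y is confined to (grounded on) X; the
% predicate holds of the subject in a standing, decontextualized way.
$\langle X, Y\rangle:\quad [\ \mathrm{Subject}_X\ \ldots\ [\ \exists e\ \ldots\ \mathrm{Predicate}_Y\ \ldots\ ]\ ]$

% Thetic predication: the predicate occupies the LF topic site, the
% context sequence is <Y, X>, and X is confined to (grounded on) Y; the
% predicate holds of the subject only at the event it introduces.
$\langle Y, X\rangle:\quad [\ \mathrm{Predicate}_Y\ \ldots\ [\ \exists e\ \ldots\ \mathrm{Subject}_X\ \ldots\ ]\ ]$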

The fact that in categorical predications the context of the predicate is grounded on the context of the subject should have an effect on the interpretation of the predicate, just as it does on the interpretation of the subject. In particular, a categorical predicable should be dependent on the context of the subject, in a way in which a thetic predicate is not. Observe (18) and (19) in this respect:

(18) a. I consider the sea/the plant green.
     b. I saw the sea/the plant green.


(19) a. Considero el mar/la planta verde sencillamente porque la planta/el mar es verde.
        consider.I the sea/the plant green simply because the plant/the sea ES green

     b. Vi el mar/la planta verde sencillamente porque la planta/el mar está verde.
        saw.I the sea/the plant green simply because the plant/the sea ESTÁ green

It would seem that in (18a)/(19a) the green we consider to hold of the sea is typically different from that we consider to hold of garden-variety plants (sea green seems bluer). However (18b)/(19b) exhibit no such “canonicity” of greenness. It may be the case that we saw the sea and the plant with the very same light green, say, as a result of a chemical spill – the typicality of the color is not invoked in this instance. Importantly, the causal continuations in (19) present auxiliary selection (ser and estar) in exactly the way we would want it. It would be unacceptable to use estar in the continuation in (19a) and somewhat odd to use ser in (19b).

These facts can be explained in terms of context grounding. Since the context of green in the categorical (19a) is grounded on the context of the plant or the sea, we get a canonical green in each instance. But our account also predicts that only categorical predicables are canonical, since only these are grounded on their subject. A thetic assertion introduces the context of a predicate for the context of an individual. Whatever characteristics green in (18b)/(19b) may have has nothing to do with the subject of the sentence, according to fact.

Consider next a precise implementation. Context variables are free variables, whose values are set pragmatically by the speaker. On the basis of what does the speaker set the value of a context variable? Minimally, background information is necessary. More importantly for us, online information is also relevant.

In concrete terms, we adapt the semantic interpretation proposed in Higginbotham (1988), following the schema in Burge (1974). For instance, (20b) is the categorical interpretation of (20a):18

(20) a. El campeón es genial.
        “The champion is genial.”

     b. In any utterance of (20a) in which the speaker assumes a context X, such that X confines the range of campeón to things x that Xx, for a context Y, such that Y confines the range of genial to events e that Ye, that utterance is true just in case:
        El x [campeón(x, e) & Xx] ∃e [genial(e) & Ye] & Subject(x, e)

To quote Higginbotham (1988), (20b) is taken to be

the normal form for linguistic data about the truth conditions of whole sentences. If so, then truth values are to be thought of as predicated absolutely of utterances, and the contextual features on which interpretation may depend are to be enumerated in the antecedent of a conditional, one side of whose biconditional consequent registers their effects on the sentence as uttered.

(p. 34)

The only thing we are adding to the Burge/Higginbotham semantics is the assumption that contexts confine contexts within their grounding potential. This is explicit in other frameworks, such as File Semantics (Heim 1982; Kamp 1984) or in the “dynamic binding” system in Chierchia (1992), from which we quote:

The meaning of a sentence is not just its content, but its context change potential, namely the way in which a sentence can change the context in which it is uttered. The different behavior of indefinite NP’s and quantificational elements is captured in terms of the different contribution they make to context changes.

(p. 113)

Although the mechanics implicit in (20b) are different from those implicit in either a file semantics or a dynamic binding treatment of contexts, the conceptual point raised by Chierchia still holds for us, even sentence-internally.19

With Burge and Higginbotham, we also assume that contextual matters affect not just indefinites or quantificational elements, but also names and events more generally, as discussed in Schein (1993). Then something like the semantics in (20b) is necessary, and the main point that is left open is how from a given LF we reach the details of the antecedent of the conditional in (20b).

There are two possibilities for this, both of which work for our examples. Hypothesis A encodes contextual scope at LF, for instance in terms of May’s (1985) Scope Principle:

(21) Let us call a class of occurrences of operators C a S-sequence if and only if for any Oi, Oj belonging to C, Oi governs Oj. Members of S-sequences are free to take on any type of relative scope relation. Otherwise, the relative scope of n quantifiers is fixed as a function of constituency, determined from structurally superior to structurally inferior phrases.

(21) is intended for standard quantification, but may extend to context second-order free variables under the assumption in (22):

(22) Given a structure …[…Xx…[…Yy…]…]…, the value of Y is set relative to the value of X [X grounds Y] only if the operator Ox takes scope over the operator Oy.

Hypothesis B expresses contextual grounding after LF. Assuming that syntactic LF representations are mapped onto some intentional (post-LF) Logical Form encoding relations of the sort in (14)/(15), relative contextual specifications may be read off of those representations. If this is the case, nothing as syntactically articulated as (21) and (22) would be at issue, and rather something more along the lines of traditional logical scope would determine that context at a given point serves as context for what comes next.20

7 Quantificational subjects

An important puzzle for our system is posed by the fact that the quantificational properties of the subject do not affect the Categorical/Thetic distinction. For instance, cada hombre “each man” in (23) does not force a categorical interpretation.

(23) Cada hombre está preparado.
     “Each man is ready.”

Context variables are introduced so as to provide an appropriate confinement for quantificational expressions like each man and the like. But note that if we were to care only about the context of the entire quantificational expression (henceforth the “large” context of the NP), this would not help us in getting thetic readings. That is, in (23) the subject quantifier has scope over the event operator, thus yielding multiple events of readiness. This presumably means that the large context of the quantifier grounds the entire expression, as it should. The context of the restriction of each man is confined to relevant men, so that the sentence means that for each relevant man, it is the case that he is ready. That is fine, but orthogonal to the point that each of the men in question is in a state of readiness. In our system, for each relevant man there has to be a context (henceforth the “small” context) such that this small context is grounded on the scope of the event operator, yielding a non-standing character for the predicate holding of each man.

From this viewpoint the issue is to get small contexts out of expressions apparently introducing only large contexts. Basically what we want is that the “large” context of the restriction of each be displaced together with the determiner, without displacing with it the “small” context somehow associated to the variable. There is a class of expressions where this is directly doable:

(24) Cada uno de los hombres está preparado.
     Every one of the men is ready.

Given the right syntax for partitive expressions, we may be able to scope out each . . . of the men and actually leave behind one. If this one element brings its own context with it, we are right on track. The intuition is that partitive expressions are contextually transparent, representing their large context through the partitive phrase of/among the N and their small context in terms of the element one.

The head of a partitive expression is one, of/among the N being attached at a different level in a phrase marker, both in Spanish and in English (and in many other languages). Uriagereka (1993) sketches an approach to these sorts of facts, building on the analysis of possession by Szabolcsi (1983) for Hungarian, assumed by Kayne (1993) for English. In the Kayne/Szabolcsi structure, a possessive DP is more complex than a regular DP, involving a relation (“possessor,” “possessed”). John’s car implies that John has a car (assigned), which translates, for reasons we will not go into, to the syntax in (25) (see Note 21):

(25) [Tree diagrams: (a) the initial phrase marker, a “possessive” DP whose AgrP relates the “possessor” DP John to the “possessed” NP car; (b) the structure after Move α, with DP John raised to the Spec of the possessive DP and D realized as ’s; (c) PF: John’s car.]

Uriagereka’s analysis suggests that, in partitive structures, the definite DP plays the role of the “possessor” while the indefinite NP plays the role of the “possessed.” One in each one of the men is to the men what car in John’s car is to John (both in English and in Spanish). Each one of the men implies the men include ones.21

The advantage of this syntax is that it provides us with a phrase marker where the men and one occupy entirely different nodes as “possessor” and “possessed.” In turn, the entire “possessive” DP can be the restriction of a determiner like each in (26) below.22 What is crucial there is that one, which is in the Spec of the possessive DP in the overt syntax, may move autonomously. Hence it is able to topicalize to FP, or to “reconstruct” to a VP-internal subject position, all of it at LF. Assuming that one moves around with its own context, we will get a structure which gives us the right meaning. In particular, each one of the men means something like each instance of a partition of the men has an associated one:

(26) [Tree diagrams, parallel to (25): (a) the initial phrase marker, a QP headed by each whose restriction is a “possessive” DP relating the “possessor” DP the men to the “possessed” DP one through Agr; (b) the structure after Move α, with DP one raised to the Spec of the possessive DP; (c) PF: each one of the men.]

We get “variable” men from this structure because we are partitioning the set of relevant men in ones, each corresponding to a part of the set. We placed the word “variable” in scare quotes since the element in question is not a variable in the standard logical sense, an artifact of the semantic system. Nonetheless, this syntactically represented element has a denotation which is similar to that of a variable (though see below).

Even if the QP in (26) undergoes QR in the sort of structure in (24), we can still use one with its associated “small” context to “reconstruct” inside the scope of the event operator, so that the “small” context is set in terms of the context of the event, as in other instances of thetic predications.

Once the syntax in (26) is motivated for partitives, we may use it even for (23), in order to separate the Q element with its “large” context from the “variable” obtained by predicating one of something. This is, as we saw, a “variable” resulting from a syntactic formative (one) which brings its own context, as it is conceived differently from the object it is a part of. In effect this mechanism divorces the quantifier from its otherwise automatically assigned semantic variable, and allows the “variable” to have its own context, as the predicate it actually is.23 The account then follows as in (26) – except, of course, for the relatively minor point that (23) does not exhibit an overt partitive format.

The relevant structure is (27):

(27) [Tree diagrams, parallel to (26): (a) the initial phrase marker, a QP headed by cada whose restriction is a “possessive” DP relating hombre to the empty classifier pro; (b) the structure after Move α, with DP pro raised to the Spec of the possessive DP; (c) PF: cada hombre.]

Here an empty (pro) classifier is generated in place of one in (26). Presumably cada pro hombre, literally “each one man,” means something slightly different from the standard cada hombre “each man.” In the latter there is nothing special to say about the values for the relevant variable, while in cada pro hombre there is. A “variable” effect obtains only because hombre “man” stands in an “integral” relation with regard to pro (and see Note 22).24

Consider in that respect the Spanish contrasts in (28):

(28) a. Todo campeón es/*está genial.
        All champion ES/ESTÁ genial


     b. Todos los campeones son/están geniales.
        All the champs ES.PL/ESTÁ.PL genial

In Spanish many quantifiers may, but need not, exhibit number specifications. Curiously, when they do not, they cannot be subjects of thetic predications either (though see (29)). We may suppose that number licenses the formative pro in (27) and similar environments.25 Then in (28a) there would be no pro classifier, and hence todo campeón “every champion” may not be the subject of a thetic predication. In particular, the quantifier todo forces QR outside the scope of the event operator, and since there is no separate, syntactically independent “variable” with its own small context to be left inside the scope of the event operator, expressions with subjects of this sort must be categorical.

An important qualification is in order. (28a) becomes perfectly good if a specific context is provided. For example:

(29) Todo campeón está genial alguna vez en su vida.
     All champion ESTÁ genial some time in his life

However, we submit that in this instance it is the adverbial alguna vez en su vida “some time in his life” that allows for the thetic reading, in that it provides an explicit context for the variable of todo campeón “all champion,” crucially inside the scope of the event operator. Note that, to the extent that these considerations are on track, it is clear that contextual considerations determine auxiliary selection for the thetic/categorical distinction, as we expect in the terms just discussed.26

8 Concluding remarks

It cannot be decided by just looking at a given predicate whether it will have an “individual-level” or “stage-level” interpretation. Most predicates can be used either way, depending on the context. For us the word “context” has a rather concrete meaning, in terms of the grounding of an event variable, whose syntactic realization we assume. The essential conclusion of this work is that the main distinction between thetic and categorical judgments arises as a result of the syntactic structure in each instance. In a categorical judgment the proposition is about a marked element, normally a subject but possibly an object, which has what amounts to a topic character. This topic’s context grounds the contextual specifications of whatever main event is introduced in the proposition. In contrast, in a thetic judgment the proposition is not about a salient topic. It is about the event itself, which gains structural prominence through dull syntactic mechanisms (e.g. auxiliary selection or Case/agreement properties). Intuitively, then, subjects of categorical judgments are structurally higher than subjects of thetic judgments. This has nothing to do with their thematic structure, and is rather a consequence of their intentional properties (in terms of context confinement). Aside from that main conclusion, the chapter discusses various interesting assumptions one needs to make for the analysis to work, in particular regarding the syntactic structure of quantifiers, as well as a couple of interpretive consequences, one of which is explored further in the Appendix.

Appendix

Suppose the condition in (1) is true:

(1) In a PREDICABLE(CATEGORY) relation, CATEGORY is anchored in time.

The intuition is that a judgment is evidential, and a point of view expresses through time its actualization of a given CATEGORY, a prerequisite for any judgment.

Assume also a syntactic condition on anchoring, as in (2):

(2) A anchors B only if B is local with respect to A.

“Local-with-respect-to” can be thought of in terms of government, or, if one has minimalist scruples with this notion, something along the lines of (i) in Note 7, repeated now as (3):

(3) B is referentially presented from [or anchored to] the point of view of the referent of A iff A is a sub-label of H, whose minimal domain M includes B.

To make (3) pertinent to (2), we could adapt it as follows:

(4) B is anchored to A iff A is a sub-label of H, whose minimal domain M includes B.

“Minimal domain” is to be understood in the sense of Chomsky (1993b), that is, a collection of dependents of a given head H (its complement, specifier, and all adjuncts to either H, H’s projections, or H’s dependents). A sub-label of H is either its own label or any feature adjoined to H, whose attraction triggers the transformational mapping. Note that (3) is a sub-case of (4), whereby anchoring to a formative with evidential characteristics – typically an element assumed to denote a sentient being – results in a point-of-view effect.

From the perspective of (4), (1) amounts to this corollary:

(5) Anchoring Corollary (AC)
    In a PREDICABLE(CATEGORY) relation, there is a temporal element T such that T is a sub-label of H, whose minimal domain M includes B.

For the AC to be met, there must be a temporal element T which is local with respect to the relevant CATEGORY. For instance:

(6) [HP [CATEGORY] [H T [H H]] [… [t PREDICABLE] …]]

This is a situation arising in topicalization, where H is the F element discussed in Section 4, and Tense is adjoined to F.


Another possibility for meeting the AC involves a situation where there has not been displacement of the syntactic element which we want to treat as the CATEGORY, but nonetheless it happens to be local with respect to a T; for example:

(7) [TP [T] [XP [NP…] [XP…]]]

If a SC is as simple as noted in (1), where NP is base-adjoined to XP, a T introducing the SC will be local with respect to the CATEGORY NP. This amounts to saying that, when SCs are involved, predications anchored to time are possible without the need to topicalize their subject.

The AC is a condition on whatever happens to be a CATEGORY. Given an appropriate source of C-Case, as in (12a), Section 4, for a particular nominal expression, that expression will immediately be the relevant CATEGORY. But matters are reversed in conditions of thetic predication, where in effect the CATEGORY is the predicate of the main event. In those circumstances, thus, the syntactic formative in question ought to be in the minimal domain of some item that also has a T element in its checking domain. That is the case for SCs, as in (7), since XP is also local with respect to T. Which amounts to saying that thetic predications in SCs are possible without displacing their predicate to any topic-like position.

Those were ultimately pragmatic considerations about anchoring CATEGORIES to time. But as we discussed in Section 6, the way we obtain the characteristic semantic differences between categorical and thetic predications is through having concrete elements ground the context variables in the rest of the expression. We saw how that takes place when topicalization is involved, but we have just seen that topicalization may not be needed in situations involving simple SCs (although it is possibly involved in languages, like Irish, where specific Case markings are deployed in these instances too). The question is whether SCs may present, at least in some languages, a structure similar to that of adjectives, which most likely does not determine the relevant placements of elements entering predications in terms of Case.

Note that categorical predicables can be substituted by clitic lo “it,” unlike comparable elements involving thetic readings:

(8) a. Alcohólico, es lo que considero a Pedro.
       Alcoholic is it that consider-I to Pedro
       “An alcoholic, is what I consider Pedro.”

    b. Borracho, es lo que he visto a Pedro.
       drunk is it that have.I seen to Pedro

This would be explained if, at the level of anaphoric linking with lo, in effect borracho “drunk” in (8b) is not acting as a predicate. That result would be obtained if borracho has to be displaced, for some reason, to a non-predicative site.

Assume that thetic SCs are in the end more complex than categorical ones, the former including the latter plus an extra “integral” layer of the sort discussed in (25) above:


(9) [Tree diagrams: (a) the thetic SC, with a DP “integral” layer above AgrP and the SC containing Juan, and with borracho displaced to the Spec of the DP layer, leaving traces behind; (b) the categorical SC, an AgrP directly above the SC containing Juan and alcohólico, with no DP layer.]

If this is the correct analysis, the displaced borracho “drunk” goes through the Spec of the sort of possessive DP we discussed in Section 7, and is no longer a pure predicate (thus explaining the contrasts in (8), under the assumption that clitic lo “it” can only hook up with predicates).

Intuitively, what (9) proposes is that alcohólico “alcoholic” is something that Juan is taken to be, whereas borracho “drunk” is a state that Juan is supposed to be in, or that he possesses in some abstract sense. Importantly, the contextual groundings in (9) are of the right sort. In (9a) the context variable of borracho “drunk” grounds that of Juan, whereas in (9b) the opposite is the case, with the context variable of Juan being in a position to ground that of alcohólico “alcoholic.” This is what we expect, without the need to invoke the particular Case system discussed before.

If those are the correct structures for SCs, the default orders of SCs in Spanish should be different. This is not obvious at first, given the facts in (10):

(10) a. He visto borracho a Juan/a Juan borracho.
        have.I seen drunk to Juan/to Juan drunk

     b. Considero a Juan alcohólico/alcohólico a Juan.
        consider.I to Juan alcoholic/alcoholic to Juan

Both orders in (10) are possible. Nonetheless, there is a clear preference for the first orders given in (10), which gets accentuated in some circumstances. Thus:

(11) Q: Cómo has visto a Pedro?
        how have.you seen to Pedro
        “How have you seen Pedro?”

     A: La verdad que… a. He visto BORRACHO a Pedro.
                           the truth that… have.I seen DRUNK to Pedro

                        b. He visto a Pedro BORRACHO.
                           have.I seen to Pedro DRUNK

        “The truth is that I have seen Pedro DRUNK.”


(12) Q: Qué (cosa) consideras a Pedro?
        what thing consider.you to Pedro
        “What do you consider Pedro?”

     A: La verdad que… a. Considero a Pedro ALCOHÓLICO.
                           the truth that… consider.I to Pedro ALCOHOLIC

                        b. *Considero ALCOHÓLICO a Pedro.
                           consider.I ALCOHOLIC to Pedro

        “The truth is that I consider Pedro AN ALCOHOLIC.”

In question-answer pairs, while both orders are fine in the thetic instance, the categorical one is more restricted, disallowing the SC subject in final position. This is explained if, when possible as in (10b), a post-adjectival order for the SC subject involves focalization, which is incompatible with a second focalization as in the (12b) answer.

Suppose also that the D element hypothesized for thetic small clauses, as in (9a), has temporal properties. If so it may be enough to sanction a predication internal to a nominal, in satisfaction of the AC. This would account for why these SCs, as we saw in (2)–(5), Section 1, have a characteristic independence from selecting V’s, and thus corresponding T’s introducing them: the AC is satisfied (practically) internal to the SC, which can then act as its own unit of predication – if carried by the DP layer.

Let us turn finally to the general specificity of CATEGORIES. Fodor and Sag (1982) present a traditional intuition about this which seems to us appropriate. One can only talk about specific topics. (13) states this concretely, assuming the issue is judgments and the elements of which they hold:

(13) Judgment Principle
     Judgments hold of actuals.

Plausibly, few things count as actual (although lacking a full theory of reference, this is a merely intuitive claim). We would put in that realm classified elements (the one/pro car), prototypical expressions (the automobile), abstract nouns (beauty), and perhaps kind or generic expressions (Americans, an American). These are the sort of elements which can be in the extension of a predicable, yielding a valid judgment. Given (13), we also predict that those events which enter into predications as CATEGORIES need to be actualized. This actualization is plausibly done through a pleonastic like there, which is otherwise mysterious as a subject.

Furthermore, suppose that actualization is done in human language through time (14), perhaps a cognitive imposition relating to the fact that time/place coordinates are perceptually unique:

(14) Actualization Principle
     Actuals are mapped to time.

The modular interaction of (13) (a semantic condition) and (14) (a pragmatic condition) deduces (1), repeated now as (15):


(15) In a PREDICABLE(CATEGORY) relation, CATEGORY is anchored in time.

The Judgment Principle in (13) has another important consequence. The CATEGORY of which a PREDICABLE holds is a classified element, or a prototype, or a kind, or an abstract notion. We take this to yield familiar Milsark-effects, which need not affect our syntactic account as they are taken to do in other analyses.

Milsark (1977) noted that the subject of a categorical predication is specific. Indefinites (16) or weak quantifiers (17) are not subjects of categorical predications (nominal predicates are used for the categorical reading and participials for the thetic one, as these prevent alternative readings; see Note 2).

(16) a. Un lobo es una bestia.
        a wolf ES a beast
        “A wolf is a beast.”

     b. Un lobo está acorralado.
        A wolf ESTÁ cornered
        “A wolf is cornered.”

(17) a. Algún lobo es una bestia.
        some wolf ES a beast
        “Some wolf is a beast.”

     b. Algún lobo está acorralado.
        some wolf ESTÁ cornered
        “Some wolf is cornered.”

The usual approach to these facts is in scopal terms, contra Milsark’s initial intuition (which we believe is correct). Given our analysis, there is no reason why scope should have the intended effect. That is, for us there are designated LF sites for subjects of categorical and thetic predications, which have nothing to do with their logical scope but, rather, encode pragmatic prominence. So we can have the paradigm follow in Milsark’s terms, from the fact that categorical predicables force the actualization of their subject, given the Pragmatic Principle. Hence, unspecific subjects of all sorts will not be interpretable as pragmatic subjects, which we take to be the reason why topics in general must be specific – or actual in our sense. These effects now follow from the LFs achieved in syntactic terms, and not conversely.


12

A NOTE ON RIGIDITY†

but were I Brutus,
And Brutus Antony, there were an Antony
Would ruffle up your spirits, . . .

(Shakespeare: Julius Caesar, III, 2)

1 Counterfactual identity statements

Sentences like the one above pose a well-known difficulty for the now standard theory of names. What makes Antony Antony, and not Brutus, is (in the Kripkean view) the fact that Antony originates “from a certain hunk of matter” and has some reasonable properties concerning a given “substance” essential to Antony. In other words, there is an object in the world with the relevant character which, by its very nature, is Antony and not Brutus.1 This is all fine, but then there is something nonsensical about Antony’s statement above. He seems to be referring to a chimerical, counterfactual creature without any referent whatever.

Antony is not, of course, referring to Brutus; Brutus, being who he was (and having murdered Caesar), would not attempt to ruffle up the spirits of the Romans upon reflecting on Caesar’s death. Then again, Antony is not referring to himself either, or he would not be using counterfactual speech; the whole point of his elegy to Caesar is to pose a dilemma: Antony, as who he is, does not have the clout to arouse the Romans into action, much as he would want to. Brutus, as who he is, (presumably) expects Caesar’s death to be forgotten – although he could, if he wanted to, move the Romans into action. So what Antony appears to be invoking is a creature with enough “Antonihood” to act in love of Caesar, but enough “Brutushood” to matter. The question is whether that creature is sensible.

What is crucial for my purposes here is that the creature, call it “the chimera,” does not have obvious grounds for rigidity in the standard view. The reason is direct. Traditionally, what makes us refer to this chapter rigidly is the fact that the object is what it is, right there in front of your eyes. You can call it “A Note on Rigidity” or whatever, but it sustains its character across counterfactuals simply because it is just whatever it is. Had there been no object with the relevant properties, there would have been no standard rigid designation. And we have seen that the chimerical Antony/Brutus does not exist.2 But before one poses questions about such deep and muddy waters, I propose a modest exercise, that we see whether there are any linguistic phenomena related to the problematic expression. If there are, they might shed some light on our difficulties with its semantics. The expression is easy to paraphrase:


(1) Were some of the modes of Antony some of the modes of Brutus, . . .

Of course, there has to be some cooperation for such a paraphrase to help. The relevant Antony modes have to be seen as those that lack “wit, . . . words, action, . . . utterance, . . . the power of speech,” what makes him but “a plain blunt man.” Those are the gifts Brutus has. Meanwhile, there have to be other Antony modes that justify that the chimera should ruffle up the Roman spirits. It is in those modes where Antony’s love of Caesar “lives.” Put those loving modes together with the gifted Brutus modes and you will have a vengeful orator.

2 An excursus into wholes and parts

(1) does not seem relevantly different from the plausible assertion below:

(2) Had the Trojans been the Greeks, the Trojans would not have taken the horse.

As far as I can tell, what (2) really means can be paraphrased in (3):

(3) Had some of the Trojans been some of the Greeks, the Trojans that the Trojans would then have been would not have taken the horse.

Of course, (3) is true in all circumstances in which a stronger version of (2), involving all of the Trojans and all of the Greeks, is true. But (3) has two advantages over that stronger version. First, it allows us to quantify over some relevant Greeks (who we take to be very smart, say Odysseus, Achilles and the like) and some relevant Trojans (who we take to be rather stupid, that is, crucially not sensible people like Laocoon, who did not want to take the infamous horse). Presumably, if stupid Trojans had been counterfactually swapped for smart Greeks, events would have proceeded differently from the way we are told – but arguably not otherwise.

Second, imagine (2) were true of all Greeks and Trojans. One could argue that, then, there would be no Trojans left; they would all have been Greeks. If so, one could argue that the situation of their being able to take the horse would have never arisen, this event being the culmination of a war between Greeks and Trojans. To put it differently, we need some Trojans for presuppositions of the assertion in (2) (which is, after all, about Trojans) to make sense.3

In the case of the counterfactual Trojans, it is all those Trojans (in fact most of them) that got involved in a war with the Greeks, did this, that and the other, and eventually found themselves in front of a weird horse. In the case of the counterfactual modes of Antony, it is all those modes (presumably also most of them) that made him a friend of Caesar, and all that. These parts of the expressions may be thought of as “rooting” them in some intuitive sense, thus making them successful ways to denote Antony or the Trojans.

At the same time, the chimeras are chimerical because not all their “chunks” are real: our recipe for their construction calls for invoking some smart Greeks, or eloquent Brutus modes, so that with the new and improved would-be creature we are justified in asserting something about not being fooled by weird horses or ruffling up the Roman spirits.

Had we been using (1), (2) or (3) (perfectly sensible, if stilted, English expressions), it would have been reasonable to conclude that nothing else is going on. In other words, we can, without much difficulty, invoke what I think is the chimera that concerns us here by means of roundabouts like the “modes of Antony.” It goes without saying that this raises a host of questions, starting with what are those modes, and going all the way down to what makes counterfactual modes be of who we assume is the real Antony. However, regardless of how we answer those questions, we have to say something for (1) – the tricks being within the lexical meaning of the entry mode and the grammatical meaning of some . . . of (Antony) in the appropriate context. Granted, the bare Shakespearean expression has no overt manifestation of anything other than Antony, but if generative grammar has taught us something, it is to expect underlying structures that are not visible to the naked eye.

In the remainder of this chapter, I try to show that it makes sense to assume that sort of hidden paraphernalia for the relevant uses of names in counterfactual identity statements, and ponder what this means for the structure of names in general. Before I proceed, though, I want to signal the way. The key is comparing (1) and (2), with the intended meaning in (3). I suppose the Trojans is a name.4 Then the issue is counterfactually changing the Trojans a bit, by swapping some relevant players. The chimerical Trojans are similar to the real old Trojans in that they have enough Trojans to count as “the Trojans”. One should still ask how many that is, but I will leave it at that.

More relevantly for our purposes, we could counterfactually swap some crucial modes of Antony for crucial modes of Brutus, again just enough to make the assertion true without messing around with presuppositions. The same questions obtain about how much is enough, and the same answers apply, however much one needs in the group instance. This is enough for us because we will have to say something about that instance, in which no particularly troubling considerations arise. The reason why this is, I believe, is simple. The ontological bite of the Trojans is in the group as such, not its members, which is like saying that the group is not equal to the sum of its parts (or the substance of the parts would essentially be the substance of the whole).

If we (can) think of Antony as an array of modes, conceiving him as a whole that is not the sum of its modes, then we are in business. We can keep Antony intact across counterfactuals while still swapping some of his modes.5

3 Two intriguing linguistic facts

Let us now move to two apparently unrelated and indeed very basic facts. The first one is why on earth Antony claims that, if the counterfactual were met, there would be an Antony that would ruffle up the Roman spirits. Certainly, Antony is not referring to just anyone bearing the name “Antony” that would show up for the occasion. The relevant Antony is the one and only, albeit with Brutus’s authority, power or what have you, to influence the Romans. So it seems that, in the end, Shakespeare is rigidly designating Antony, yet he is speaking, apparently paradoxically, of an Antony, as if he were describing someone.

That sort of parlance is very common among sports commentators: “A fiercely insane Tyson bit an ear off of his opponent!” The description introducing this sentence invokes reference to the infamous Tyson, albeit in a subtle way, describing one of his modes. So there we go again: the modes (this time of Tyson’s) showing up when we want to describe non-rigidly a (certain) peak in Tyson’s eventful career, while still rigidly (and perhaps indirectly) referring to Tyson himself. If Antony were referring to an Antony mode when saying that “an Antony would ruffle up your spirits,” then we would be out of trouble in two ways.

First, we would not need to invoke the peculiar idea of names ever being indefinite descriptions. What would be indefinite is the mode; the name it would be a mode of would not only be definite, but furthermore adequately rigid.

Second, if we could proceed the way I am suggesting, we might be able to show – within the very sentence we are trying to analyze – that we need to invoke these little mode notions in counterfactual identity statements. Shakespeare might have not only used the troubling counterfactual, but also given us a key to its logical form. From my perspective, his indefinite description of a mode of Antony is very useful. To start with, it has some overt grammatical structure (the “a(n)” bit) that we can piggyback on when trying to construct the support for the elaborate structure in (1). Equally importantly, a description introduces a frame of reference. One hopes that such a frame is involved in the counterfactual swap of modes that is at the core of this chapter.

Let me next move to the other linguistic fact, this time from Chinese:

(4) Na gen Ten Zin Gyatso
    that CLASSIFIER Ten Zin Gyatso
    “That T. Z. G.”

This language is one of those where nominal expressions are generally introduced by classifiers. In many languages, this is the way to quantify over count nouns, and in the case of Chinese the practice extends to instances of demonstration. For example, (5) is how we ostensively speak of a certain man:

(5) Na ge ren
    that CLASSIFIER man
    “That man”

Importantly, (4) denotes a given person called Ten Zin Gyatso – but as opposed to some other Ten Zin Gyatso. So, for instance, we may use (4) to contrastively refer to the Ten Zin Gyatso who is currently the Dalai Lama (and not somebody else).

Two curious aspects of (4) can be highlighted. First is the fact that names, contrary to nouns, do not take classifiers. Importantly, (5) does not mean “that man as opposed to some other man,” or any such thing; it simply means “that man.” In turn, if you want to name the Dalai Lama in a non-contrastive way you simply say Ten Zin Gyatso, as you would in English. Of course, this is neither more nor less remarkable than the fact that (4), in English, is also contrastive.6

Since English does not take (overt) classifiers, we say that names are not introduced by demonstratives – and this could be the explanation for the only reading available in (4).

But then there are clearly some instances where a name can be introduced by a demonstrative. For example:

(6) That happy Ten Zin Gyatso

Again, this is (several ways) ambiguous. It can refer to some particular Ten Zin Gyatso who is a happy fellow (the contrastive reading).7 And it can refer to a mode of the Dalai Lama – say one that could be witnessed last month. That relevant mode reading is highlighted in (7):

(7) Last month’s happy Ten Zin Gyatso contrasts sharply with that Ten Zin Gyatso we once saw leaving Tibet.

The most natural Chinese counterpart of such a complex nominal is (8):

(8) Shang ge yue xinfu de Ten Zin Gyatso
    last CLASSIFIER month happy of Ten Zin Gyatso
    “Last month’s happy T. Z. G.”

Importantly, though, (9) also yields the desired reading (although it apparently allows a contrastive interpretation as well):

(9) Na ge shang ge yue xinfu de Ten Zin Gyatso
    that CLASSIFIER last CLASSIFIER month happy of Ten Zin Gyatso

I find (9) a nice example because of its intricate syntactic properties.

My hope here is that Chinese offers some clues to the structure of what I am calling “nominal modes”. In a nutshell, I expect that what the (first) classifier ge in (9) classifies in the relevant reading is a mode of Ten Zin Gyatso. If this is the case, we would have found direct syntactic evidence for the alleged modes. In turn, the fact that these modes do not lexicalize in (6) and other English examples seen here (including, in my view, the Shakespeare quote) is no deeper than the fact that the relevant classifier does not lexicalize in the Chinese example in (8). I do not know why that is, but whatever is going on with the classifier lexicalization of (9) vs. (8) may be invoked for the much more general Chinese vs. English distinction (granted, a promissory remark – but one of a familiar sort within generative grammar).


4 On the issue of names and determiners

Classical name theory makes much of the fact that names do not generally take determiners (or as we have seen, classifiers – which as far as I know is an undiscussed fact). The received (Russellian) wisdom is that a determiner introduces a quantification, something considerably more complex than is needed for names, traditionally taken as logical constants.

It was Burge (1974) who first argued that classic rigidity effects can be captured even if names are not represented as constants, so long as they are introduced by a covert demonstrative. There are two different aspects to this view. First is the idea that a name can be used predicatively:

(10) a. She is a Napoleon, although of course he wouldn’t have approved of her.

     b. Every Napoleon has hated the name, after he gave it such a bad press.

c. Napoleon admirers were devastated when he was deported to Elba.

Whatever is concocted for these examples should not be too fancy, since Napoleon here serves to anchor the anaphor he. Likewise, the theory of (generalized) quantification is based on treating Napoleon in (10b) as the first argument of every, generally in the same vein it would treat a noun like man.

The second important aspect of Burge’s proposal is that demonstratives induce rigidity (for the same reason that names are classically taken to invoke rigidity; they pick out an object in the world). Put that fact together with the predicative analysis of names and you can have it both ways, name predicates, but aptly rigidified through a grammatical tool that picks out an object (of which the name holds as any other predicate would). The problem is that, as Higginbotham (1988) notes, Burge’s proposal is as interesting as it is wrong.

Higginbotham’s demonstration is an old linguist’s trick. Make Burge’s demonstrative overt and see what happens. We already saw what happens in (4), (6) or (9): an entirely new reading emerges.8

Higginbotham’s arguments also unearth new and interesting facts which any theory will have to account for. Thus, compare (6) to (11):

(11) Happy Ten Zin Gyatso

He points out that (11) can only be read non-restrictively – contrary to what we saw for (6), which admits a restrictive reading. He accounts for this fact by claiming that happy in (11) modifies a whole noun phrase, which as such does not take restrictive modification. In contrast, he takes an example like (10c) as an argument that names can, in certain contexts, be smaller than noun phrases – under the assumption that this example involves noun (as opposed to noun phrase) incorporation. Whatever the merit of that argument, it creates its own wrinkle, given a modification like the one in (12) – which must be read non-restrictively:

(12) Every boxing aficionado is a good-old-George-fan.


Be that as it may, (12) is a very serious counter-example to Burge’s view, since incorporation does not extend to nominal instances introduced by demonstratives (or other determiners). At the same time, (12) warns us against any trivial association of the rigidity of names to their (alleged) noun-phrase status; after all, by hypothesis (12) involves the nominal incorporation of something smaller than a noun phrase, and yet it still invokes reference to the one and only George (Foreman). Note, in particular, that (12) cannot mean that every boxing aficionado is a fan of anyone who is good and old and called George. (This may be an implausible reading, but a good-old-Kennedy-fan could reasonably, but may not factually, denote admirers of good old Kennedy fellows.)

Taking stock, after exploring the factual base, we appear to need a treatment of names that, apart from yielding their rigidity, allows us to have them “chunked down” into mode components, associate to (generalized) quantifiers and appear in various predicative contexts, and resist restrictive modification in their bare guise while allowing it when associated to demonstratives.

5 Assumptions for a new proposal

Let us now take seriously the idea that, just as there is a sound relation between a group and its members, so too there is some meaningful relation between an individual and its modes, whatever those are. In the case of the group, the relation can be of intricate complexity (cf. a team, family, corporation, country). Somewhat analogously, an individual can also be conceived in intricately complex ways (cf. a lump, object, organism, collective . . .). One may choose to express this through different formal mechanisms, but it is a mere and indeed obvious fact about human cognition.

Syntactically, it is quite clear what is going on. We must phrasally support expressions like some (individual(s)) among the Greeks or some (modes) of Antony, which I hope are semantically alike in reasonable ways. The latter expression may sound a bit odd and formal when fully expanded, but is quite colloquial in its reduced guise, as the famous song All of me directly attests. Similarly, we can naturally say all of Jordan was invested in that game, most of Madonna is just hype, some of Che Guevara has remained in all of us, and so forth. (At least) the of in these instances signals a partitive syntax, which syntactically places the name in the realm of some “space” which is partitioned. The “mode” I keep invoking should be seen as the relevant (abstract) partition.

These days we have a reasonable picture of what might be going on in (pseudo-)partitive expressions. We invoke for them the syntax of possession (roughly of the sort in Szabolcsi 1983, Kayne 1994, and Chapter 9), which allows both for nominal manifestations in partitive guise and in fact for garden variety possession, as in the Greeks had some outstanding warriors (chiefs, philosophers, individuals . . .) or Antony had some remarkable modes (moments, attributes, aspects . . .). For reasons of space, I will merely refer the reader to those sources, assuming without argument the sorts of structures in (13) (see Chapter 15):


(13) [DP [D some] [AgrP Agr [SC Greeks individual]]]

D is a quantifier like some, whose argument is AgrP. This is a referential phrase, whose details are shown immediately. The lexical components of the expression stand in a relation which is syntactically coded as a small clause (SC). Within this lexical domain, we encounter some conceptual space (in this instance, Greeks, which can be thought of as a space of “Greekhood,” whatever that is), and an operation on that space that presents it in a particular way, giving it a characteristic shape and a corresponding semantic type (here, individuals).

All theories have to code that sort of “integral” relation, whether they make it explicit or hide it into some lexical notion (as in “relational” terms). The only thing (13) is adding to traditional analyses is an explicit syntax in terms of well-attested properties of possessive constructions.

Observe that (13) invokes no reference unless we code what that is; as it stands, DP is a quantification over Agr, which for mnemonic purposes I will henceforth write as R for “reference.” “R” in turn syntactically associates to some integral presentation of an abstract space as this, that or the other. We may code a given lexical item with a crucial referential feature of the checking system – in Chomsky’s (1995b) sense. Say this is individuals in (13). Then this item must move to the checking domain of R, leaving a trace. In other words:

(14) [DP [D some] [RP individual[+r] [R′ R [SC Greeks t]]]]

The syntax in (14) corresponds to some space of “Greekhood” presented in individual guise – although it could have been in other guises: pairs, groups, etc., up to cognitive limitations. What those are depends on the properties of the relevant spaces, as well as those of the modeling presentation.

For example, if we are dealing with generalizations of Euclidean spaces called manifolds, the relevant spaces will exhibit special connectivity properties (see Kahn 1995: Chapter 7). Thus we can express an observation of Russell’s (1940: 33) which Chomsky (1965: Note 15 to Chapter 1) reinterprets on psychological grounds; a name is assigned to any continuous portion of space. Thus, Chomsky notes, humans do not name the four limbs of a dog as a single notion LIMB, of which the proposition “LIMB reaches the ground” could be true. It turns out that LIMB cannot be described as a (low dimensional) manifold.9

At the same time, a manifold of any given dimensionality acquires special properties depending on the sorts of twistings, foldings, gluings, etc., that one performs on it. This is what distinguishes a “cigar band” from a “Moebius strip” from a “torus” and so forth (all of them creatures of the same dimensionality). I call this sort of operation on a space a presentation because I would like to relate it to the Fregean notion of a “mode of presentation.”

As is well-known, for Frege the meaning (or sense) of a term is its “mode of presenting” its bearer (or referent). I want Frege’s notion to correspond to the mathematical tool I am alluding to. This view would not please Frege, for whom it is the referent that is presented in a given guise, not a mental space corresponding (somehow) to that referent. I have little to say here about the referent, beyond the fact that it is modeled (by the speaker/hearer) in terms of a type of space presented in some fashion. I syntactically code this modeling relation as “complementation to R.” In (14) the referent encoded through R is modeled in terms of a presentation of a space of Greekhood as individual.

Then what counts as a successful model for reference is still open. About the “successful” bit, I have nothing to say (like everyone else – pointing to an object only coding this problem). In any case, we must leave room for speakers’ intentions, under the assumption that terms do not refer, but rather it is speakers that use terms to refer. I code this by way of a context variable associated to the Q position in (14). This coding does not really address anything profound, but the matter will not affect what I have to say in this chapter.
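Just to fix ideas, the quantifier-variable-predicate arrangement in (14), with its context variable, might be transcribed roughly as follows; the shorthand is only that, and nothing in the analysis hinges on this particular notation:

    [some x: C(x)] [R(x) & individual(GREEK-space)(x)]

Here C is the context variable associated to the quantificational (Q) position, R(x) marks the referential position the quantifier binds, and individual(GREEK-space) abbreviates the small clause that presents the space of Greekhood in individual guise.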

On the basis of (14), something like some modes of Antony must be analyzed as in (15) below, whereby a space of Antonihood presented in a “mode” guise is quantified over, yielding an array of Antony modes:10

(15) [DP [D some] [RP modes[+r] [R′ R [SC Antony t]]]]

This syntax, in itself, adds or subtracts nothing to the analysis suggested for (1), or any variant. What matters for the counterfactual to be sensible is that the spaces of Antonihood or Brutushood (which model certain “wholes”) be constant across situations or worlds; the same would be said about spaces of Greekhood or Trojanhood. The way of having the cake and eating it too is that some spaces can be defined in a certain way regardless (at least up to a point) of what they are composed of. For example, the United States did not cease to be itself after acquiring Arizona, or Cervantes did not cease to be himself after losing his arm. Mathematically, one can concoct homomorphisms between the old and the new US or Cervantes, of course without anybody yet having a clear picture as to when the homomorphisms break down when things get multidimensional. In any case, I am adding to these familiar issues nothing beyond treating an individual as an array of smaller elements, which allows us to speak of counterparts in a way that we need anyway for standard groups. That is not saying much, but it is avoiding an absurdity and focusing the solution in terms of a problem that is at least well understood.

Two further problems remain: first, even if I am allowed to claim that (15) is the syntax for some modes of Antony, have I said anything about the syntax of Antony? Second, what does it mean for Antony to rigidly designate?

6 Toward a definition of name

I propose that a name has no internal conceptual structure, that is, is purely atomic. In contrast, a noun, as we saw, is arguably an articulated manifold, with a corresponding elaborate syntax. Thus, compare the following expressions:

(16) a. [Napoleon]
     b. [SMALL CLAUSE [SPACE man] [presentation CLASSIFIER]]

The intuition is to liken all languages to East Asian variants, where classifiers introduce nouns, but as we saw not names. The classifier, of course, is intended as the relevant presentation that articulates the n-D manifold in specific ways (see Muromatsu 1998). Evidently, no such classification is morphologically necessary in most Indo-European languages, but it does show up here and there in terms of gender agreement and, most importantly, the genitive/possessive syntax of integral relations (wholes and parts, kinship relations, elaborate shape/function expressions, masses and their measures, and so on). Traditional grammar treats these elements as derivative, in some form or another, on basic count terms denoting individuals. I am not: individuals are expressed through complex topologies, just as the rest is. Since I speak of these matters in Chapter 15, I will not repeat myself here. The important addition I am making is that names do not participate in any of this.

The cut I am suggesting in (16) gives us the phenomenology of rigidity without invoking reality. The name as characterized in (16a) is a primitive space. The essential difference between the name in (16a) and the noun in (16b) is that the latter has necessary properties expressed through the classifier. I should insist that I am not talking about properties of the bearer in reality of the name Napoleon or the description man; rather, I am talking about these terms themselves. For being man (ren in Chinese), a term must be a concrete kind of manifold (which in Chinese gets classified as ge, and by hypothesis in a similar, say, 4D fashion in all other languages). In contrast, for being Napoleon a term must simply be a mere, arbitrary, abstract manifold of just about any dimensionality. Thus, Napoleon can, in principle and in practice, name a (4D?) dog, a (3D?) ship, a (2D?) cognac, or even a (1D?) style.

What makes these syntactically opaque names denote rigidly across counterfactuals? The key is in what allows modeling, which is something with parts (and/or flexibility) to it. If I am going to model such-and-such, I need various (and/or flexible) so-and-sos to tinker with, and twist, fold, glue, etc., until I am done. If you give me a single rigid so-and-so, I can only model that very so-and-so, and no such-and-such, with it. I am supposing that a name is a rigid so-and-so for the very simple reason that, by syntactic definition, I literally do not give you any access to its internal structure. A name is the ultimate, unsplit atom. In contrast, a noun in my view is a flexible space which is, as it were, warped in a way that its classifier specifies.

Descriptive richness comes, at least in part, from the fact that a noun has not just one, but n-levels of structure, as many as dimensions it is expressed in as a space. So for instance a noun like man is intended as having a dimension corresponding to what Chinese speakers classify as ge, but what this morpheme actually classifies is itself a space of lower dimensionality, which in turn is presented in some other way specific to that dimensionality, and so on. All the way down to that dimension where the system bottoms out. The point is, being an n-D manifold allows for n ways in which to enrich the basic syntactic parameters that define you as whatever it is you are. In the meantime, there are various subtle ways in which an articulated model can be built, roughly corresponding to modes of presentation in terms of substance, change, movement, etc. (see Muromatsu 1998). All of this is missing from a name.

These articulated spaces that allow for complex models sanction reference in terms of descriptions. By its very nature, an articulated complex model can be applied to different referents, those that match its specifications. In contrast, a model with no flexibility can only either model something isomorphic to itself, or else be arbitrarily assigned to something else. Thus, whereas the question “why do you call this a dog?” is sensible and can be answered in terms of fixing the relevant n-D parameters in one way or another (“Does it move like a dog?” “Does it have dog parts?” etc.), the question “Why do you call him Fido?” is a mere topic for discussion, and can only be addressed in terms of an arbitrary matching between the term and the individual, with no syntactic systematicity (“I liked the name,” “He’s called that by his owner,” etc.). Whereas the rigidity of a term has a consequence for what things can be referred to by it, it is in this view not determined by the referent, but the other way around.

Note that a kind of rigidity arises, also, at the level at which any description bottoms out as an abstract concept. For example, man (as used in man comes from Africa) designates as rigidly as Mandela does in Mandela comes from Africa, a point first raised by Putnam (1975) (for certain kinds anyway, and for entirely different reasons). It is not that the terms man (used as a kind) and Mandela have the same dimensionality. That cannot be, since we know there are predicates that are picky with regard to the dimensionality of their arguments, and these clearly yield different results with man and Mandela (cf. Man waned from the forest vs. *Mandela waned from the forest). The point is, however, that Mandela has whatever dimensionality it has, but behaves like elements of the lowest dimensionality do in that neither has components. In the case of names this is so by definition, while in that of lowest dimensionality elements this follows from their very form.

7 Modes of names are nouns

This view of things allows us to treat modes of names in a straightforward way. A mode classifies a name, that is, turns it into a noun; which is good, for two reasons. First, it gives us a way of reconciling the fact that, although in Chinese names do not take classifiers, they do when a mode reading is invoked, as we saw in (9) above. The mode reading is by hypothesis a noun reading, which thus falls under the generalization that nouns (and not names) are classified.

Second, we have seen that modes of names behave as descriptions. This too is expected. It is the mode that allows the name space (otherwise rigid) to act as a flexible tool for describing something or other. Granted, that something or other has to be confined to the range of modes of which the name holds (for instance, Antony modes), but that is neither more nor less surprising than having the referent of a noun confined to whatever falls under the range of that very noun – ultimately the name for a kind. That is why we only call dogs “dogs.”

Assuming these ideas, we can now argue that Shakespeare’s quote need not be more troubling than (17), which of course troubles nobody:

(17) If a friend (of Caesar) were an enemy (of Caesar), . . .

The term friend can be applied to Antony just because the relevant syntactic parameters of this term can be fixed with what we know about Antony. Similarly, enemy can be applied to Brutus. But putting reference aside, both friend and enemy are, say, 4D manifolds that bottom out as different 1D spaces,11 in the process going through various presentations (classifiers and all that). There is nothing particularly interesting about a counterfactual in which Brutus could be modeled in terms of the apparatus associated to friend, and Antony, of that in enemy – a mere shift in parameter settings, often a matter of speaker’s opinions.

But the fact that Shakespeare’s quote is not troubling anymore does not entail that we know what it is supposed to mean. We still have to make sure that there is a rationale for the counterfactual speech, where the chimera is invoked. This is why I suggested a paraphrase of the sort repeated in (18):

(18) If (some) modes of Antony had been (some) modes of Brutus, . . .

Given the lexical and logical semantics that I am assuming and/or proposing in this chapter, we find all the aspects sketched in (19) in the relevant expression, regardless of whether the identity statement has the completely obvious form in (18) or the much more sketchy one in Shakespeare’s quote:

(19) a. If there existed an Antony space presented in a mode guise,
     b. and relevantly confined by the speaker to such-and-such,
     c. such that this space so modeled were actually
     d. a Brutus space presented in a mode guise,
     e. and relevantly confined by the speaker to so-and-so, . . .

Let us consider each of these lines in turn.

(19a) and (19d) are what introduces each nominal expression, and correspond to the quantificational structure that is coded as Q and R (the quantifier and variable positions) in the syntactic structure (15). The “presented” bit, in each instance, codes the integral relation between the Antony/Brutus spaces and their modes of presentation.

(19c) is the identity statement. I ignore matters pertaining to the mood of this statement and all issues that arise with regard to the quantificational structure of the Brutus expression. This is because all of those considerations (though certainly important to the logical form of the chimerical sentence we are dealing with) arise more generally, and are not exclusive to names.

Finally, (19b) and (19e) code speakers’ intentions, corresponding to the contextual specifications associated to Q and R in (15).
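Putting (19a–e) together – and again only as my own rough shorthand, not an addition to the analysis – the antecedent of (18) can be pictured as follows:

    If [some x: C1(x)] [mode(ANTONY-space)(x)] had been [some y: C2(y)] [mode(BRUTUS-space)(y)] – that is, x = y – . . .

where C1 and C2 stand for the contextual confinements of (19b) and (19e), and the two mode(. . .) predicates abbreviate the Antony and Brutus spaces presented in mode guise, as in (15).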

Now let us talk about the “swap” of Antony modes for Brutus modes that is essential to Shakespeare’s sentence, and generates the chimera. Two important factors make the swap on the one hand possible, and on the other relevant and sound. First is the fact that there is something to swap. If we had not split Antony and Brutus down to constituent elements, all we could do is swap Antony for Brutus, which as we saw is of no great help (in fact, leads to absurdity).

Second, the mode expressions are descriptive, which means they are quantifications of some sort with appropriate contextual confinements (as (19b) and (19e) show). This, in itself, means two different things, both crucial.

What is basically going on in the “swap” is that the speaker pretends some of the modes that he knows are Brutus’s are really not his, but Antony’s. How could this be done if there were no coding for relevant Brutus’s and Antony’s modes? Surely, not just any old mode of these two people could be invoked for the consequent of the conditional to be meaningful. The question, then, is where relevance is coded. The being itself (Antony’s being Brutus) is not very useful, since it is not clear what a relevant being would be here. All that is left then are the terms Antony and Brutus, by hypothesis reduced to modes. Then speaking of relevant modes (and given that the quantificational syntax encodes standard context confinement) is as trivial as everyone left making reference to a given set of individuals, and not applying to everyone in the universe.

But apart from giving us relevant modes, the assumption that we are dealing with descriptions, which involve the space associated to a name (as shown in (19a) and (19d)), directly “roots” the expressions in the individuals that matter, Antony and Brutus. This is in fact what holds the referential import of Antony and Brutus in place, even when relevant modes of each are swapped around and the consequent of the conditional is carried out, as it were, on the shoulders of an entirely chimerical creature.

8 A dynamic theory of naming

I think ultimately everything boils down to names, be they like Antony or Brutus, or like man or dog. The answer to the Shakespearean riddle “What’s in a name?” is simple: nothing. At the same time, it is useful to think of a name as a space which can be conceived flexibly, and be warped into the sorts of nuisances that make up a noun, e.g. a mode. Nothing may be in a name, but the name itself is something, a space of some sort, if I am right.

The direction I have suggested has some of the virtues of Burge’s, without falling into any of its pitfalls. As in Burge’s proposal, names for me are predicates, which can thus appear in predicative contexts like be a Napoleon, be arguments to quantifiers as in every Napoleon, incorporate as modifiers as in Napoleon admirer, and whatever else you think predicates should do. Like Burge, I also think that some (nominal) predicates can designate rigidly in some circumstances, but contra Burge, my name predicates are not rigid because of any demonstrative (which is what Higginbotham showed is wrong). Rigidity for me is a property of opaque structures themselves. I have extracted this conclusion fairly directly from the fact that Chinese and similar languages do not classify names, while of course they classify nouns in relevant contexts. That is surely a central fact for me, and my thesis can be directly destroyed by showing a language which systematically classifies names.

For whoever is interested in finding such a language, you must remember what will not do: a contrastive reading of a classified name, of the sort seen in (4). It is actually very interesting why an expression like that Ten Zin Gyatso (as opposed to some other Ten Zin Gyatso) is not simply ungrammatical, and instead this contrastive possibility arises. Note what I am forced to say; inasmuch as this expression is indeed classified in Chinese, it must be that it actually involves a noun, somehow. My hunch is that this sort of noun corresponds to the colloquial English expression that Ten Zin Gyatso dude/bloke.

If so, the issue is what specific operation dude/bloke, or corresponding Chinese classifiers, perform in the Ten Zin Gyatso space. Everything discussed so far had non-trivial effects within the relevant spaces. In particular, classifiers, measures, modes, etc., yield component elements of a given manifold. But one can imagine a trivial, identity operation whose value is the manifold itself. Say that is what dude/bloke does on the Ten Zin Gyatso space; then the question is what is the difference between Ten Zin Gyatso, the name, and a corresponding form to which the identity presentation has applied. Reference-wise, they are the same – we are picking out the same man. But we have syntactically produced a (trivial) description of that man. It is the x such that the Ten Zin Gyatso space obtaining of that x is presented in a dude/bloke guise.

When I normally refer to Ten Zin Gyatso, and there is no other Ten Zin Gyatso around to bear that name, all I need to do is utter the name in question. However, if another Ten Zin Gyatso becomes relevant in context, an obvious difficulty arises. We have two concepts each associated to different individuals, yet the linguistic term used for each is indistinguishable from that used for the other. Then we must go into something more complex, a description, which as we saw comes together with its handy context variable. Now we are in business. The context variable allows us to speak contrastively of this relevant Ten Zin Gyatso as opposed to that other, irrelevant, Ten Zin Gyatso.
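Schematically – and once more this is merely my shorthand for the reading, not new machinery – the contrastive use amounts to something like:

    [that x: C(x)] [dude(TZG-space)(x)]

where C is the context variable confining the description to the relevant Ten Zin Gyatso, and dude(TZG-space) abbreviates the (trivial) identity presentation just discussed.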

Similar issues arise when comparing Happy Ten Zin Gyatso with that happy Ten Zin Gyatso, the second one of which allows a restrictive reading for happy. The reason for this is that, given the form of the demonstrative expression, it is a description, which means it involves not just the Ten Zin Gyatso space, but also an associated presentation. It is this associated presentation that makes the expression a description of an individual (dude, bloke or whatever), to whom the restrictive modification applies, precisely via the presentation. In contrast, when just the name is introduced, the only modification we can have is that of the conceptual space itself, which yields an associative reading of the sort found in paratactic appositives (Ten Zin Gyatso, a happy man, . . .).

It is important not to equate non-restrictive readings with “true” names and restrictive readings with “predicative” uses of names. Incorporated nominals, as in Kennedy-admirer, show this conclusion to be factually wrong. They are reasonably modificational, therefore predicative; yet modifications of incorporated names can be non-restrictive, as in good-old-Kennedy-admirer. As far as I can see it does no harm to consider all names predicative, basically following Burge. Burge’s problem was not there, but in looking for a rigidifier external to the name itself, such as a demonstrative. I have shown a way in which the name can be rigid in and of itself, which is useful in these incorporated instances where there is no way a demonstrative could be incorporating.

Of course, if all names are predicates, we crucially need the sort of syntax in (13), or else we will not have a way of telling them apart from nouns. With that syntax, though, if all the components are invoked (in particular, the small clause bit), we will have a noun; otherwise, a name. This, as we saw, will mean that many apparent names will really be nouns, simply because they have the appropriate syntax. For instance, a fiercely insane Tyson, that Ten Zin Gyatso (dude/bloke), or even Antony and Brutus in were I Brutus and Brutus Antony.

An interesting topic for future research is whether every Napoleon takes a bona fide name or rather a noun as its restriction. Two facts suggest that the latter should at least be an option. First, it is fine to speak of every Napoleon dude/bloke, which in my terms signals a noun. Second, instances of conjunction of the sort in every Napoleon and artillery expert suggest that the semantic type of these names can be like that of nouns. But I do not know whether these plausibility arguments are enough to reach a stronger conclusion. The matter is interesting among other things because *the Napoleon is not good in English (although of course it is in many languages). In any case, I do not see how answering that question one way or the other will alter anything I have said.


Also for future research are the exact syntactic conditions under which names like Antony get the more specific (some) mode of Antony structure. In some instances this is just a matter of lexical choice. However, I take it that the interesting ones are those where the mode or quantificational parts are not pronounced, and yet they are interpreted, as in Shakespeare’s quote.

9 Some conclusions

I know I have said several non-standard things, syntactically, semantically, and philosophically. What follows is a brief summary.

Syntactic structure (13) is meant to replace more traditional noun and determiner phrases. It is still a determiner phrase, in the sense that it is headed by a quantificational element, but the complement of this element is not a simple noun phrase. Instead, it is a referential phrase, which itself introduces an “integral” relation as its complement. This relation is coded as a small clause, typically associating something like a noun to a classifier. So between the noun and the determiner there is much more than is usually assumed. I have not really given arguments for this view in this chapter but have done so in Chapters 10, 11 and will again in 15. It turns out to be crucial for the present analysis of names.

I have also argued elsewhere for the semantics associated to (13). It has a lexical part and a logical part, the former responsible for its intentional properties and the latter for its conceptual specifications. The lexical part corresponds to the small clause, and can be conceived as an operation on an abstract space, possibly an n-dimensional manifold. This concept so arranged creates a predicate which holds of the variable that glues the expression together. The variable corresponds to the referential position in the syntax, itself the complement of the quantificational site. That bit, quantifier-variable-predicate, constitutes the logical arrangement of the expression.

In the normal instance, to say that a nominal predicate holds of a variable entails that the speaker that invokes this predicate can use it to model something or other out there. The difference with standard analyses here is subtle but important. A predicate like dog does not denote the dog kind, the set of individual dogs, dogs in all possible worlds, or any such thing. For me, a predicate of that type is a complex array of spaces of different dimensions, all built on the basis of previous ones. That creates a mathematical apparatus – a topology of some sort – which may be used to model dogs.

Once this door is open, one certainly wants a picture of what these spaces are, how they can be tinkered with, and so forth. I have hand-waved in the direction of manifolds because of the restrictiveness of Euclidean spaces. If pushed, I can even hand-wave in the direction of inter-modular connections between the linguistic and the visual systems (to distinguish spaces that bottom out as, say, red vs. blue) and the motor system (for quick and slow) and so on. Which means this is all a program, albeit a fairly clear one (see Chapter 15).

The only reason that is interesting now is this: it is all irrelevant when it comes to names. That is the main idea of the chapter. A name does not have any of these intricacies, and hence serves no modeling purposes. That creates a sort of elsewhere scenario for nominal predicate association to the relevant variable. When there is nothing to model because there are no toys to tinker with, then the nominal predicate arbitrarily picks out something out there, of which the predicate holds as a designation, not a description.

That takes rigidity as the defining characteristic of a name. It is rigid because it does not have parts. In this way, we come up with an operational, dynamic definition of names and nouns. For instance, names to which a classifying device is added turn out to be descriptive, hence nouns at some level. This provides a solution to the problem of counterfactual identity statements.

I will not review that solution again, but I do want to point out that I have kept in mind, in particular, Lewis’s (1973) notion of a counterpart. Of course, nothing in my analysis makes ontological commitments about alternative worlds – or even the present one. Indeed, nothing I have said bears much on what reference is, altogether. The chimerical Antony is a counterpart of the real Antony only in that the speaker somehow models an Antony that has some, but not all of Antony’s modes. It is a counterpart to Antony inasmuch as the syntax says so, and it says so by making the relevant expression a description based on the Antony space.

I do not have a clue about this: what are those modes that we have to assume are Antony’s for him to still be Antony? Or, how many modes is enough? Or even, is this the right way to pose that question? So long as we understand that there is some relation between Antony and his modes roughly of the sort holding between the Greeks and people like Achilles or Odysseus, we are safe. The bottom line is: something having to do with how the Antony space is presented is enough for the speaker to “root” the name Antony in some appropriate individual. I am calling that process a modeling, and basically whatever it implies will do for my analysis of the Shakespearean quote, although of course I am very interested in the full details for the larger picture.

For what it is worth, I think we are going to need something of this bizarre type anyway for fictional characters, but I doubt that anybody that has not been moved by what I have said so far will be moved by the rigidity of Pegasus or Santa Claus. I think it is there, and it certainly has nothing to do with objects in the real world. Of course, the modeling system presented here treats Santa, or for that matter Rudolf the Reindeer, with equal seriousness, and makes quite a bit of Rudolf not being Pegasus, and so on and so forth.

Why bother with such arcane problems? Not because of their rarity, but because treating them pushes the theory in a direction that, for better or for worse, has much to say about intentionality and conceptualization. As a matter of fact, my approach purposely bites the bullet of separating intentional and conceptual representations. To my knowledge only an unwarranted assumption forces us to treat intentional and conceptual matters as part of, or the output of, the same level of representation. In “the other side of the grammar,” where we do treat matters of articulation and perception in a unified guise, we have a very good empirical reason to do so, the Motor Theory of Speech Perception. But in “this side of the grammar” we lump together concepts and intentions, as far as I know, because a) they relate to “thought,” and b) we have always done so.12

I cannot finish without a philosophical reflection. The philosopher is not interested in rigidity as a linguistic phenomenon. What he or she wants are more serious things, like grounding objects, ultimately scientific ones, at least enough to avoid relativism. There is much to say about whether this is the way to address that question, or whether the question (serious though it is) is even meaningful. But be that as it may, this chapter has tried to show that rigidity is a defining property of names, and thus a linguistic phenomenon. To the extent that rigidly designated objects are themselves rigid in a sense that interests the philosopher, this is an issue of what counts as a valid model, or how the linguistic objects that we come up with are appropriately used to designate whatever they designate. There may well be an issue there, but if so I do not see that it is of much interest to the linguist, or that it pertains to the Shakespearean quote analyzed here, or (more importantly) that it helps us understand much about linguistic categories and how they get to categorize.


13

PARATAXIS†

with Esther Torrego

1 Introduction

We would like to explore briefly two sorts of sentential dependencies. The paratactic view holds the following. To assert that Galileo believes that the earth is round is to assert something akin to “Galileo believed that,” with the object of believe being cataphorically related to the separate sentence, “the Earth is round”. This approach goes back to Andrés Bello’s original insights, and is defended, classically, by Davidson (1967b). In turn, the hypotactic view is familiar from syntactic analyses stemming from Chomsky’s. This view contends that there is a single sentence, with a complement which, rather than being nominal, is an entire clause. We will argue that both types of dependencies are realized in UG.

We will concentrate here on two non-interrogative finite connectives from the Romance languages, in particular Spanish, que (that) and como (how).1 We believe that these two exemplify canonical hypotaxis and parataxis, respectively.

2 The distribution of como

Descriptively, como has a far more restricted distribution than que. Clauses introduced by como can appear after the verb (1) but not before. That is, they cannot be subjects (2), topics (3) or left-dislocated constituents (4):

(1) Verás/te darás cuenta como tu madre llevaba razón.
    “You will see/realize how your mother was right.”

(2) que/*como la tierra es redonda es verdad.
    “That/*how the earth is round is true.”

(3) que/*como la tierra es redonda, veréis algún día.
    “That/*how the earth is round you’ll see some day.”

(4) que/*como la tierra es redonda (lo) veréis algún día.
    “That/*how the earth is round you’ll see some day.”

Selection of como is also restricted in lexical terms. Nouns/adjectives (5), and prepositions (6) do not take como-clauses:


(5) a. No me gusta la idea/el hecho ?(de) que/*como . . .
       I don’t like the idea/fact that/*how . . .

    b. Estoy harto ?(de) que/*como . . .
       I’m fed up that/*how . . .

(6) a. Para que/*como            b. Con que/*como
       So that/*how                 Inasmuch as that/*how

    c. Desde que/*como . . .      d. Entre que/*como . . .
       Since that/*how . . .         While that/*how . . .

As for verbs, several disallow como-clauses, for instance volitionals, factives, causatives:

(7) Quiero/lamento/hice que/*como . . .
    I want/regret/caused that/*how . . .

Note, also, that whereas there are various idioms of the form nominal-sentence with que, none comes to mind that invokes como:

(8) a. Juan se tragó la bola de que/*como . . .
       Juan swallowed the ball of that/*how . . .
       “Juan believed the lie that . . .”

    b. Juan nos vendió la moto de que/*como . . .
       Juan to.us sold the scooter of that/*how . . .
       “Juan lied to us that . . .”

    c. Juan nos contó la película de que/*como . . .
       Juan to.us told the movie of that/*how . . .
       “Juan was bullshitting that . . .”

So let us proceed tentatively under the assumption that these robust facts show, at least, two types of structures. Furthermore, recall that we want to argue that it is como structures that are paratactic. A strong prediction of this approach is that syntactic dependencies across como are barred, if parataxis involves two separate texts. This prediction is borne out, again with non-subtle data. Overt wh-movement is disallowed in the relevant contexts:

(9) qué os enseñó que/*cómo estaba escribiendo?
    What did s/he show to you that/*how s/he was writing?

Similarly, predicate raising across como yields ungrammaticality:

(10) A punto de llorar vieron que/*cómo estaba!
     Ready to cry they saw that/*how s/he was!

Likewise, “Neg”-raising and polarity items also show the opacity of the como-clause:

(11) a. No verás que/*como diga la verdad jamás.
        Not will-see.you that/*how say.s/he the truth ever
        “You’ll see that she never tells the truth.”


     b. No verás que/*como venga bicho viviente.
        Not will-see.you that/*how arrive bug living
        “You won’t see a soul coming.”

More generally, a paratactic analysis predicts the absence of even weaker syntactic dependencies, such as bound variable binding. Again, the facts confirm this prediction:

(12) a. Nadie ve que pro es tonto.
        Nobody sees that he is stupid.

     b. Nadie ve como pro es tonto.
        Nobody sees how he is stupid.

While (12a) allows a variable reading, (12b) does not.

3 A possessive structure

In essence, we would like to suggest that the sort of structure involved in the Spanish (1) is akin to the one in (13):

(13) You will realize/see the truth of your mother being right.

This raises the question of what sort of specific structure (13) is. Perhaps obviously, it does not involve a relative clause (cf. *The truth which the earth is flat). However, we can show that it is not a standard Complex NP either, of the sort in (14):

(14) John heard the rumor that the Earth is flat.

Stowell (1981) argued that all “nominal complements” invoke a predication relation between the nominal and the clause. While this is essentially correct, some interesting differences arise between the two sorts of structures, as the contrasts in (15) suggest:2

(15) a. The truth is that the Earth is round.
     a′. That the Earth is round is (only) the truth.
     a″. *That the Earth is round is a truth.
     b. The rumor is that the Earth is flat.
     b′. (*) That the Earth is flat is (*only) the rumor.3
     b″. That the Earth is flat is a rumor.

Note that these structures may or may not be transformationally related to (13) or (14). But it is a fact that the paradigm with truth and the paradigm with rumor differ, which indicates that we must distinguish two sorts of clausal dependencies on nominals. Largely for concreteness, we aim to capture the differences in association as follows. For rumor we will assume a standard merger analysis (16a). In contrast, we will argue that the structure relating truth to the CP is not a new category. Rather it is a segmental structure, as depicted in (16b):4


(16) a. CATEGORY: {X, {X, CP}}, where X = rumor
     b. SEGMENT OF CATEGORY: {⟨X, X⟩, {X, CP}}, where X = truth

In other words, we are proposing that (16b) involves a “base-generated” adjunction, or a small clause.

Thus far we have only provided a structure for (13). We must next address the question of the other structures with be in (15) (assuming they are transformationally related to the structure in (13)), and also – for completeness – those in (17):

(17) That the Earth is flat has (some) truth to it.
     (Cf. *That the Earth is flat has (some) rumor (to it).)

We have introduced three types of structures. One involving a relational predication with truth (13), another one involving a be Auxiliary and raising of the truth (15a) and finally, a structure involving a have Auxiliary and clausal raising (17).

In recent literature, there is a structure reminiscent of the one above, which also has three variants. Consider (18):

(18) a. This child of mine.
     b. This child is mine.
     c. I have a child.

It is intuitively clear that these three expressions should have a common source. Kayne (1994), building on ideas of Anna Szabolcsi, proposes that relational terms such as child come in different guises, depending on a variety of factors having to do with definiteness. For instance, the examples in (19) are out:

(19) a. (*) A child is mine.
     b. *I have this child.

Whatever this follows from, observe the similar facts in (20):

(20) a. The truth is that the Earth is flat.
     a′. (*) A truth is that the Earth is flat.
     b. That the Earth is flat has (some) truth to it.
     b′. *That the Earth is flat has the truth (to it).

The particular structure proposed in the Kayne-Szabolcsi analysis involves two layers: an AgrP and a DP:


(21) [DP D [AgrP Possessor [Agr′ Agr Possessed]]]

The important idea to keep in mind is that both possessor and possessed can raise to the specifier of D, and eventually to the matrix. When the possessed raises that far, Auxiliary be shows up. If the possessor does, the D element incorporates to the Auxiliary, and it is spelled out as Auxiliary have for irrelevant reasons (see Chapters 9 and 10). The point is that we can immediately accommodate the relevant facts in (13)–(17) to this sort of analysis.

We have noted above that the basic relation between the CP and a DP like the truth is predicational, and have suggested a concrete structure for it. We must thus enrich the structure in (21) to something along the lines of (22), a proposal independently argued for in Chapter 10 for related structures:

(22) [DP D [AgrP [Agr′ Agr [XP CP [XP truth]]]]]

The intuition is that the following three expressions have the same structural underlying source (modulo definiteness):

(23) a. The truth that the Earth is round.
     b. The truth is that the Earth is round.
     c. That the Earth is round has (some) truth to it.

Suppose that (23b) is a raising counterpart of (23a), and (23c), in turn, is the analogue of derivations involving the morphological suppletion of “be+D” as have, as in (18c).5

We furthermore raise the following data from Spanish:


(24) a. Juan explicó la verdad de que la tierra es redonda.
        Juan explained the truth that the earth is round.

     b. Juan explicó como la tierra es redonda.
        Juan explained how the earth is round.

     c. *Juan explicó como (de) que la tierra es redonda.
        Juan explained how that the earth is round.

The gist of the proposal is that (24b) has the semantic interpretation of (24a), although it involves a rather different syntax. The hunch is that the sentential connective como induces the same sort of effects that the more complex la verdad does. However, important differences arise as well, as (24c) shows. Contrary to what we see in (24a), como is incompatible with que. In what follows, we argue that como is not a complementizer at all.6

4 Implementation of the analysis

Etymologically, como derives from the Latin quod modo. It is then tempting to argue that como is bimorphemic in the synchronic system as well, involving a D and a predicative part. In the most radical version of our proposal, it is literally co- that surfaces as a D, while -mo would be the predicate.7 The architecture of the derivation then allows us to analyze (25b) as in (25a):

(25) a. . . . [DP [D co-] [AgrP . . . [Agr′ Agr [XP [CP tu madre llevaba razón] [XP -mo]]]]]
     b. . . . como tu madre llevaba razón
        how your mother was right

This predicts the absence of como idioms as in (8), since the -mo part occupies the lexical space of the nominal chunk of the idiom.

Recall also that no overt complementizer can appear in the CP of como-clauses (24c). Of course, no overt complementizer appears in main clauses either. It is then possible that this is happening in this instance as well. The dependent clause is a root clause.

Chomsky’s recent suggestion for why complementizers do not have a PF realization in matrix clauses has to do with the fact that, in general, lexical insertion after Spell-out is not an option, since the extension condition in (26) forbids it:8


(26) Extension Condition
     A Generalized Transformation (GT) extends the entire phrase structure containing the target of GT.

This restricts to the root node the possibility of inserting lexical material. In turn, if lexical material is inserted after Spell-out, the grammar cannot deal with its phonological features. Thereby, any such post-Spell-out insertions must involve no PF features. Finally, since involving no PF features is less costly than involving them, the radically null option prevails.9

One significant difference between clausal dependents of como and those of nominals such as la verdad “the truth” is that the dependent clause is in one instance associated to the genitive marker de, whereas in the other this is not the case:

(27) a. . . . la verdad *(de) que la tierra es redonda
        the truth (of) that the Earth is round

     b. . . . como (*de) la tierra es redonda
        how (of) the Earth is round

We think that this reflects a structural difference between the two, much along the lines of the contrast in (28):

(28) a. . . . the sister *(of) John’s
     b. . . . John’s (*of) sister

Within the Kayne-Szabolcsi analysis, (28) indicates different structural relations in the overt syntax. John’s is lower in (28a) than it is in (28b). If we apply this sort of criterion to our structures, we are led to the conclusion that the dependent clause in (27b) is structurally higher than the one in (27a).

Within the minimalist system this can only have one cause. The higher element has had a reason to move by Spell-out, whereas the lower element has procrastinated. This implies that the moved element is attracted to a strong feature, not present in the instance without movement. Consequently, we must postulate a strong feature in structures with como, unlike in structures with la verdad (the truth). The natural step to take is to say that whereas the D element which we hypothesize for como structures selects for a functional category with a strong feature, the same is not the case for the D heading the other structures.

The logic of the proposal makes one wonder whether the strong feature of the functional category hypothesized for como could not be licensing a null pro. Suppose it does.10 This makes a prediction. Dependent clauses introduced by como may have a null pro-like expression, relevantly licensed in discourse, just as null pronominals are in general. The same should not be true of clauses introduced by la verdad (the truth). Surprisingly, this obtains:

(29) a. A: La tierra es redonda.
           The Earth is round.

        B: Ya verás como *(sí)!
           Indeed (you) will see how yes
           (You will see how that it is indeed true.)


     b. A: La tierra es redonda.
           The Earth is round.

        B: *Ya verás la verdad (de) (sí).
           Indeed (you) will see the truth of yes
           (You will see how that it is indeed true.)

The emphatic marker sí is necessary for this sort of sentence to be grammatical, as shown in (29aB). This suggests that yet a further category exists between Agr and the lower level of structure, and that it is this category which must be selected by D, across a semantically inert Agr.

The natural candidate is Laka’s (1994) Sigma. The postulation of such a category has another advantage within the minimalist framework in Chomsky (1995b: Chapter 3). Strictly, Agr is not the locus of Case checking; rather Agr plus some other category is. Note that in the sort of structures we are hypothesizing the CP dependent is an argument of a predicate such as the truth or the -mo part of como. Assuming with Chomsky and Lasnik (1993) that all arguments need to check Case, it follows directly that such a CP must be in a position to check its Case by LF. This directly entails the existence of an extra category between Agr and the structure including the CP. Furthermore, Martins (1994) has argued that Sigma is the sort of category which is responsible for the traditional nominativus pendens. It is then reasonable to propose such an extension, which in turn motivates the presence of a lexically realized sí in (29b). Yet, (30) is not an option:

(30) *Ya verás como la tierra es redonda sí.
     Already (you) will see how the earth is round indeed

This suggests that the pro-clause element is licensed only if Sigma is specified for speaker-oriented features, such as those involved in the emphasis encoded by sí. We assume that this is an interpretative condition taking place after LF. If so, the following sentence is grammatical, indeed technically interpretable, but unintelligible, a straightforward possibility within the minimalist system:11

(31) a. La tierra es redonda.
        The Earth is round.

     b. #Ya verás como.
        You’ll see how.

Here, the pro-clause after como cannot be interpreted in the absence of the emphatic, point-of-view-dependent sí.

In turn, this suggests that the problem with (30) is the unnecessary spell-out of Sigma as sí. The matter directly relates to the familiar contrast in (32), analyzed in terms of economy:

(32) a. John (*did) leave.12
     b. *John not left.
     c. John didn’t leave.


It is reasonable to expect the emphatic sí to correlate with the emphatic do (as in Laka’s proposal). If this connection is granted, the matter of sí’s economy is very much the same as that of do’s, all other things being equal. In particular, there is no need for sí to be the spelled-out Sigma in (30), since, in our terms, there will be in fact no pro-clause to be licensed at LF.13

Finally, we propose that even in those instances where the CP is apparently associated to como in a rather direct way, this is only true in the LF component. In fact, it is always the case that como introduces a pro-clause item in the overt syntax. The gist of the analysis is that this pro-clause remains at the LF component if and only if it is appropriately licensed by a point of view element such as emphatic sí (as in (29b)). That is to say, when the pro-clause is so interpreted. However, in all other instances, a pro-clause is also generated in the initial phrase marker as the subject of -mo, ultimately moving to the Spec of Agr:

(33) [Tree diagram in the original: a D' in which the D head co-mo takes an AgrP complement; the pro-clause sits in the Spec of AgrP, and Agr takes an XP complement containing the trace(s) of movement.]

Two questions then arise: (a) "Why does it seem as if a whole clause is the dependent of como?" and (b) "Why can it not be the case that a real clause is generated in place of the pro-clause?"

The answer to question (a) relates to instances of tough-constructions, as in Chomsky's (1995b) analysis. The main feature of structures as in (34) below is that they involve a Generalized Transformation merging two different phrase markers:

(34) A man [who t is easy Op PRO to please t] is easy Op PRO to like t

In the minimalist system, there is no D-Structure level of representation. Therefore, it is possible (in fact, necessary in a case like (34)) to build separate phrase markers and merge them in the derivation.

In the spirit of Lebeaux (1988), we propose that a Generalized Transformation is responsible for paratactic dependencies. In the initial phrase marker, a pro-clause occupies the place which is otherwise taken by an entire clause. It is this item that enters into the syntactic derivation, engaging in checking just as any other syntactic formative would. At LF, however, two options exist. Either pro remains as such (in which case a point of view salience


is necessary for interpretation), or else a separate sentence, literally a separate text, substitutes into the pro-clause (35):14

(35) [Tree diagram in the original: the same DP configuration headed by co-mo as in (33), with a CP substituting into the position of the pro-clause in the Spec of AgrP.]

As for question (b) (Why can a clause NOT be base-generated in place of pro?), note the following property of paratactic dependencies. They need not invoke an overt Comp, and hence they cannot, within the logic of minimalism. To put it differently, if a main clause-like dependent is possible, then it must be chosen over a subordinate-like dependent. Economy alone grants this conclusion. In our terms, this means that, whenever possible, the grammar will prefer the presence of a pro-clause, instead of a full clause, perhaps much as overt pro-forms are avoided.

Finally, the matter arises of why a pro-clause is impossible in instances with la verdad "the truth" or more generally hypotactic dependents. This is now a matter of pro licensing, as in (35). In our terms, pro-clauses are licensed only in the Spec of an AgrP associated to a strong, point-of-view dependent Sigma head. There are no pro-clauses elsewhere, just as pro items in general appear only in association with AgrP Specs whose head has the appropriate characteristics (in terms of strength or whatever else is relevant). The reason why parataxis is so restricted is straightforward. It requires the presence of a pro-form, which is itself extremely restricted, in familiar ways.

That is intended also as a general way of predicting (2) through (7). In all these contexts, by hypothesis, the relevant syntax is impossible. Thus observe:

(36) a. (*La verdad de) que la tierra es redonda es un hecho.
        the truth of that the earth is round is a fact

     b. (*La verdad de) que la tierra es redonda aceptaréis algún día.
        the truth of that the earth is round you'll accept some day

     c. (*La verdad de) que la tierra es redonda lo aceptaréis algún día.
        the truth of that the earth is round it you'll accept some day


d. No me gusta el hecho de (*la verdad) que la tierra es redonda.

“I don’t like” the fact of the truth that the earth is round

     e. Desde (*la verdad de) que la tierra es redonda, . . .
        Since the truth of that the earth is round,

     f. Lamento (*la verdad de) que la tierra sea redonda.
        I regret the truth of that the earth be round

Why point-of-view dependent Sigma heads are impossible in all these contexts is of no concern to us right now, although of course that issue is in itself ultimately very important. All that we are saying now is that absence of constructions with la verdad in these instances correlates with absence of como, which is natural in terms of the sort of syntax we have argued for.

5 Some further considerations on complementizers

To conclude, note that given the logic of what we have said, complementizers which are not pronounced should outrank pronounceable counterparts (just as Sigma realizations have a PF matrix only if independently needed). However, consider (37):

(37) a. Galileo believed the Earth was round.
     b. Galileo believed that the Earth was round.

If the absence of a pronounced complementizer in (37a) were to be preferred, the option with the overt complementizer should be impossible, contrary to fact. This suggests that the null complementizer in (37a) has a PF representation, residual as it may be, and it is in fact a totally different lexical item from its overt counterpart. The point is that, inasmuch as the two complementizers are different lexical items, a comparison of derivations involving either of them would be illicit.15 Presumably, the null complementizer in (37a) arises as the result of cliticization to the matrix verb. Of course, in the matrix a null complementizer could not cliticize to any host, and a derivation involving it would crash.

This is all to say that the facts in (37) do not involve parataxis, even in (37a), where the complementizer is missing. Rather, (37a) involves a clitic complementizer which is a different lexical item from a full version, and is possible only in instances where cliticization is independently allowed.16

In contrast, what we are suggesting for radically null complementizers is Chomsky's proposal that elements with no PF features can be inserted in the LF component. These newly inserted null complementizers outrank their overt counterparts because they are taken to be the same lexical item.

Then we must determine why clauses introduced by radically null complementizers (i.e. main clauses) can only substitute into the sites which we are hypothesizing as paratactic, and not into the sites which everyone assumes are hypotactic. But now the answer is clear. The relevant substitution is a textual


reconstruction into a pro-clause, and only that (35). Therefore, once again, the distribution of pro-clauses holds the key to parataxis.

This approach predicts some recalcitrant data suggesting that the phenomenon of null complementizers is not unified. We will venture this specific claim: while the complementizer in (38) is a clitic, the one in (39) is the result of LF insertion:

(38) a. Deseo lleguen bien.
        I wish you arrive well. DESIDERATIVES

     b. Quiere no les falte de nada.
        He wants they miss nothing. VOLITIONALS

(39) a. Dijeron habían llegado ayer.
        They said they had arrived yesterday. DECLARATIVES

     b. Lamento no estés contento con tu trabajo.
        I regret you are not happy with your work. FACTIVES

The analysis predicts that the clitic complementizer should be restricted by the possibilities of cliticization, thus explaining the required adjacency between matrix and lower verb in (40):17

(40) a. Deseo (*los niños) lleguen bien.
        I wish the children arrive well.

     b. Quiere (*a sus hijos) no les falte de nada.
        He wants for his children nothing is missing.

In contrast, declaratives, epistemics and factives tolerate a preverbal subject after the null complementizer, as noted in Moll's (1993) dissertation:

(41) a. Decía los estudiantes apenas se habían quejado.
        He said the students had hardly complained.

     b. Lamentamos a tu hermana no le hayan dado el trabajo.
        We regret to your sister they haven't given the job.

     c. Pensaba a ellos les iban a hacer este honor.
        He thought to them they were going to do them that honor.

     d. Dijo a su confesor le había de contar tales cosas.
        He said to his confessor s/he would tell him such things.

Particularly interesting in this respect are instances of wh-extraction. The prediction is that movement across (radically) null complementizers should be barred, since such are, in effect, main clauses. In contrast, movement across clitic complementizers should be possible. We believe the prediction is borne out:18

(42) a. *Qué libro dijeron/pensaron/creyeron no habían leído?
        What book did they say/think/believe they hadn't read?

     b. *Con quién lamentas/das por sentado hayan hablado?
        With whom do you regret/conclude they may have spoken?


     c. Qué libro quieres/deseas/esperas hayan leído?
        What book do you want/wish/expect they may have read?

Recall, finally, that epistemic and declarative verbs allow dependent clauses which may or may not be introduced by que. Why are overt complementizers even possible in these instances? Notice, crucially, that the cliticization option is not at stake in the relevant Spanish instances. Therefore, we are led to conclude that the contrast involves two entirely different lexical structures, one paratactic and one hypotactic.19

The tests introduced in Section 1 confirm this extreme hypothesis. The facts are as in (43) through (45):

(43) Predicate raising:
     A punto de llorar pensaba/decía *(que) estaba.
     Ready to cry s/he thought/said *(that) s/he was.

(44) “Neg-raising” and the licensing of negative polarity items:20

     a. No pienso *(que) diga la verdad jamás.
        I don't think *(that) he'll ever tell the truth.
        (I think he won't ever tell the truth.)

     b. No pienso *(que) venga bicho viviente.
        I don't think *(that) a soul will come.

(45) Bound variable binding demanding an overt complementizer:
     Nadie piensa (que) es tonto.
     Nobody believes (that) he is stupid.

This is possible without the complementizer only if the embedded subject is read referentially.

We must thus allow declaratives and epistemics in two distinct subcategorization frames. By hypothesis, one of these must involve the complex sort of structures we have tried to motivate here. But at the same time, we must allow these sorts of verbs to appear together with simpler clausal structures. The latter should be the source of hypotaxis.


14

DIMENSIONS OF NATURAL LANGUAGE

with Paul Pietroski

1 Introduction

Human language is manifested primarily through the one-dimensional channel of speech, in which temporal order reflects certain linear relations among the parts of complex expressions. To a large extent, the linguist's task is to uncover aspects of grammar not manifested in the patent linguistic signals. For instance, a little reflection shows that language is at least two-dimensional, if only because hierarchical relations matter. A string of words like she saw the man with binoculars corresponds to more than one expression; and John thinks he likes spinach differs from he thinks John likes spinach in ways that go beyond mere differences in the linear order of constituents. But how far should such reflections be pursued? Do adjuncts, for example, "inhabit a different dimension" from arguments? Do causative verbs like kill exhibit a "higher" dimensionality than adjectives like dead? Or are these just metaphors?

In this chapter we explore two related theses. The adjunct system is brutely concatenative and thus essentially flat (apart from asymmetries induced by the history of concatenation). But the thematic system brings in dimensionality – and, as a consequence, nontrivial asymmetries. If correct, these claims bear on many current topics, including the nature of the substantive and grammatical sub-systems of the lexicon, as well as the place and overall nature of lexico-conceptual notions within the system at large. It may also shed some new light on the ongoing debate between atomists (like Fodor) and those (like Pustejovsky) who advocate lexical decomposition. For while a proposal that systematically codes structurally rich notions into elements bearing thematic relations cannot be purely atomistic, inasmuch as those notions cut across the entire fabric of the linguistic system (and perhaps beyond), neither will it be decompositional in the usual sense.

2 The asymmetric nature of language

Let us start by thinking about when talk of dimensions is appropriate. While physicists speak of space-time as having four (ten, twenty-one, . . .) dimensions, Euclidean geometry provides the most obvious examples. In terms of


successively lower dimensions, we contrast cubes with squares, line segments and points, or triangular pyramids with triangles, line segments and points, etc. As discussions of "Flatlanders" make vivid, one can capture all the facts about, say, two-dimensional (planar) objects without doing the same about three-dimensional objects. Unfortunately, linguists find themselves in the role of Flatlanders. We experience a one-dimensional object and, through various sorts of tests, must somehow figure out what higher dimensions it may correspond to.

While nothing guarantees success in that task, in other domains too the dimensionality exhibited by a given range of objects is far from obvious. For example, one can represent the numbers as points on a line, running east to west, with each point standing for a number greater than the number represented by any point to the west. Are all numbers formal objects of the same dimension? Arguably not, since there are more real numbers between 1 and 2 (in a sense made precise by diagonalization proofs) than positive integers. This difference in cardinality is masked with the "decimal point" notation system in which digits to the left of the point are associated with increasing positive powers of ten, while digits to the right of the point are associated with increasing negative powers of ten. This lets us say both that 101 is greater than 11 and that 1.01 is smaller than 1.1; and it lets us accommodate infinitely many real numbers between 1 and 2, like π/2, that can only be approximated with (non-repeating) decimal expansions. Correspondingly, the (one-dimensional) number line fails to encode many facts about numbers. If point P maps to π/2 and point Q to π, the distance between P and Q does not itself reflect the fact that the number corresponding to Q is twice the number corresponding to P.1
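For concreteness, the numerals just mentioned decompose as follows in that positional system (an elementary worked example, added only for illustration):

101 = (1 × 10²) + (0 × 10¹) + (1 × 10⁰) and 11 = (1 × 10¹) + (1 × 10⁰), so 101 is the greater;
1.01 = (1 × 10⁰) + (0 × 10⁻¹) + (1 × 10⁻²) and 1.1 = (1 × 10⁰) + (1 × 10⁻¹), so 1.01 is the smaller.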

One can also speak of different dimensionalities, without considering sets of different cardinality, if one is thinking about differences between certain operations. While there is an intuitive sense in which subtraction, division and roots are "inverses" of addition, multiplication and powers (and vice versa), there is also an intuitive asymmetry here. If we start out thinking about positive integers, adding, multiplying or exponentiating will not force us to consider anything new; whereas subtracting, dividing and taking roots will lead (via expressions like "1 − 1," "1 − 2," "2/3," or "√−1") to zero, negative numbers, fractions or imaginary numbers. Familiar considerations, reviewed in the appendix, suggest natural ways of thinking about each expanded class of numbers in terms of different dimensionalities corresponding to an inverse of an intuitively more basic operation.

We mention these points, first of all, as stage-setting for a discussion of some linguistic facts which suggest that natural language presents different kinds of unboundedness. In the most boring sense, sentences can be very very long because words like very can be repeated ad nauseam. We may call this iteration, a process which is easily describable in terms of simple-minded finite-state automata.2 A slightly less boring fact is that connectives like and, or, but, etc., in conjunction with finitely many "core" sentences, allow for endlessly many (more complex) sentences. Verbs that take sentential complements, as in Pat said that Chris believes that Tom sang, also introduce that kind of open-endedness. In neither of those instances is it enough to invoke iteration, and we


rather need some recursive mechanism of the sort provided by rewrite systems.3
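As a schematic illustration of the contrast (the rule format and category labels here are ours, chosen only for exposition), iteration of very is describable by a device that simply loops over one symbol, whereas sentential complementation calls for rewrite rules in which the sentence category reappears within its own expansion, so that each embedded clause is a constituent of the next one up:

iteration: very very . . . long, describable by the finite-state schema very* long
recursion: S → NP VP;  VP → V CP;  CP → that S  (rewriting S eventually reintroduces S)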

In turn adjuncts (as opposed to arguments) may present an especially interesting case of unboundedness, even more so in the case of "disjunct" adjuncts. It is important to determine whether these kinds of unboundedness somehow relate to the matter of dimensionality.

Moreover, we know that language manifests all sorts of asymmetries, hierarchies and sub-case conditions. Some are obvious, like the fact that words are arranged into phrases. Others are more subtle, but have been explored by linguists over the years. For example Kayne (1994) has argued that asymmetries in mechanisms of phrasal ensemble result in particular differences in word order. Work by many in the psycho-linguistic arena demonstrates that children can somehow select a given structural sub-case as the initial hypothesis, and then learn the elsewhere case, if different from the initial hypothesis, in the presence of positive data (see Crain and Thornton 1998). If the relevant organizing force is language itself, these sub-case relations must reflect a cut in the very structure of the system. In the last half century or so, researchers have also found, on solid empirical grounds, thematic, aspectual, nominal or referential hierarchies. Of course for some purposes it may be enough to just describe these "levels" and either take them as primitive or blame them on some outside reality (e.g. by correlating the thematic hierarchy with causality). But if one wants to understand how come these hierarchies arise in natural language, and how they relate to one another and to other such asymmetries, including those found in language acquisition, one ought to go deeper. In our view, one should then ask whether what one might call "the asymmetric nature of language" somehow reflects the different dimensionalities in its very fabric.

That question turns out to be specific. As an example of the general point, we consider below a much-discussed fact concerning causative constructions, namely, the "one-way" character of the typical entailments. For example:

if x boiled y, then it follows that x did something that caused y to boil;
but if x did something that caused y to boil, it doesn't follow that x boiled y.

Why should this be so? One can encode the facts by saying that x boiledT y means "x directly-caused y to boilI," where subscripts indicate transitive/intransitive forms of the verb and "directly-caused" is a term of art intended to capture the difference between "x boiledT y" and "x caused y to boilI." But not only does this seem circular; it fails to explain a crucial fact, "Why is the entailment one-way?" A related fact, also requiring explanation, is that "x boiled y on Monday" fails to be ambiguous in a way that "x caused the soup to boil on Monday" is. We take this to be a special case of a more general question. Why does natural language exhibit the asymmetries it does?

3 “Accordion” events

As many theorists have discussed (see, e.g. Parsons 1990),4 sentences like

(1) Pat boiled the soup.


have meanings that seem to be structured along lines indicated by

(2) ∃e∃x{Agent(e, Pat) & R(e, x) & Boiled(x) & Theme(x, the soup)};

where "R" stands for some relation that an event e (done by the Agent) bears to the boiling of the soup, and "Boiled" captures the meaning of the intransitive verb in

(3) The soup boiled.

We assume that the meaning of (3) is correctly represented with

(4) ∃e{Boiled(e) & Theme(e, the soup)}

If (2) is true, so is (4), and arguably, this explains why (3) is true if (1) is.

Following Chomsky's (1995b) development, via Baker (1988), of Hale and

Keyser (1993), suppose the syntax of (1) involves a hidden verbal element, like the overt causative element in many languages, with which the intransitive verb boiled combines. If the syntactic structure of (1) is basically

(1S) {(� Pat) [(� v–boiledj) [� tj (� the soup)]]},

where the intransitive predicate boiled (which originally combines with an internal argument) raises to combine with the covert v (thereby forming a complex predicate that combines with an external argument), then the question of why (1) implies (3) reduces to the question of why (1S) has a meaning of the sort indicated by (2). And while there is room for debate about the details (including the nature of v in general), it is not hard to see how such a story would go, at least in outline.
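In bare outline, and hedging on those details, the story might run as follows: the external argument Pat is interpreted via the Agent relation to the matrix event e; the covert v contributes the relation R between e and a sub-event x; and the incorporated boiled, together with its internal argument the soup, describes that sub-event. Conjoining these contributions yields (2), and (4) then follows from (2) by elementary steps:

∃e∃x{Agent(e, Pat) & R(e, x) & Boiled(x) & Theme(x, the soup)} (= (2))
entails ∃x{Boiled(x) & Theme(x, the soup)} (dropping conjuncts within the existential closure)
entails ∃e{Boiled(e) & Theme(e, the soup)} (= (4), relabeling the bound variable)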

This does not, however, explain why (1) differs semantically from

(5) Pat did something that caused the soup to boil.

or

(6) Pat made the soup boil,

both of which can be true in cases where (1) is false. Suppose Pat is an arsonist who torches a house that, unbeknown to Pat, contains a pot of soup. As a result of Pat's action, the soup boils. So (5) is true, and our judgment is that (6) is also true. Yet (1) is false. This shows that "R" cannot stand for a simple (extensional and transitive) notion of causation, thus raising the question of what "R" does stand for. But even if one has a hypothesis, say in terms of "direct" causation, that accommodates the facts, the question remains, "Why isn't (1) synonymous with (5)? Why does (1) have a natural meaning that restricts it to a sub-class of the cases that would verify (5)?"

Moreover, as Fodor (1970) notes,

(7) Pat boiled the soup on Monday.

is not ambiguous in a way one might expect given (1M). (7) cannot mean that Pat (directly) caused an "on-Monday boiling" of the soup, leaving open the possibility that Pat's action occurred on Sunday; and


(8) ∃e∃x{Agent(e, Pat) & R(e, x) & Boiled(x) & Theme(x, the soup) & On-Monday(x)}

is not a possible meaning of (7). But neither can (7) mean that Pat acted on Monday and thereby caused the soup to boil, leaving open the possibility that the boiling of the soup did not occur until Tuesday.

We take this as evidence that, for reasons that we return to, "R" stands for a whole-to-part relation. Call that Assumption One. The idea, developed in some form by many authors,5 is that if (7) is true, Pat is the agent of a complex "accordion-style" event whose final part is a boiling of the soup and whose first part is an action by Pat, and this event, which includes both Pat's action and the boiling of the soup, occurred on Monday. Thus, we would specify the meanings of (1), (3) and (7) with

(1M) ∃e{Agent(e, Pat) & ∃x[Terminator(e, x) & Boiled(x)] & Theme(e, the soup)}

(3M) ∃e{Boiled(e) & Theme(e, the soup)}

(7M) ∃e{Agent(e, Pat) & ∃x[Terminator(e, x) & Boiled(x)] & Theme(e, the soup) & OM(e)}

where "Terminator" expresses a kind of thematic role. If an event x is the Terminator of an event e, then x "participates in" e by virtue of being e's final part. This instantiates our Assumption One.

As it stands, (3M) does not strictly follow from (1M) without a further assumption. The Theme of an accordion-style event e is the Theme of any Terminator of e. Call this Assumption Two, formally:

Terminator(e, f) → [Theme(e, x) ↔ Theme(f, x)]

This is a plausible assumption about natural language, if, as Tenny (1994) and others have argued, Themes "measure out" events with duration by somehow establishing their "end points." We also return to this central assumption, although in less explanatory terms than we have for the first one.
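For concreteness, here is how the entailment from (1M) to (3M) now goes through (a sketch spelling out only what the two assumptions license). Suppose (1M) is true, and pick witnesses e and x: then Terminator(e, x), Boiled(x) and Theme(e, the soup) all hold. By Assumption Two, Terminator(e, x) and Theme(e, the soup) jointly give Theme(x, the soup). Since Boiled(x) holds as well, generalizing over x yields ∃e{Boiled(e) & Theme(e, the soup)}, which is just (3M).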

That handles some of the facts Fodor (1970) stresses; see also Fodor and Lepore (1998). If Pat sets the house on fire, thereby causing the soup to boil, it does not follow that there is any event e such that Pat is the Agent of e and the Theme of e is some boiling of the soup, that is, there may be no single (accordion-style) event that includes both Pat's action and the subsequent boiling.6

Still, Fodor's main question remains. Why can (8) not mean that Pat is the Agent of an accordion-event that ends with a boiling of the soup on Monday? Why is

(8M*) ∃e{Agent(e, Pat) & ∃x[Terminator(e, x) & Boiled(x) & OM(x)] & Theme(x, the soup)}

not a possible meaning of (8)?


If the syntactic structure of (8) is

(8S) {(� Pat) [�[(� v–boiledj) [� tj (� the soup)]]] (on Monday)}

where the predicate "(� (� v–boiledj) [� tj (� the soup j)])" combines with the adjunct "on Monday" to form a still more complex predicate (that combines with the external argument "Pat"), one can hypothesize that independent syntactic principles block the adjunct from combining with "v–boiled." And if

(8S*) {(� Pat) [�[(� v–boiledj) (on Monday)] [� tj (� the soup)]]}

is not a possible structure of natural language, that might well explain why (8) fails to be ambiguous in the way Fodor stresses. Of course, the question is to determine precisely why (8S*) is bad.

4 Two approaches to sub-event modification

At this point we must extend our database in ways generally inspired by examples in Pustejovsky (1995), adapted to our purposes. Consider (9):

(9) Jack grew shiitake mushrooms for weeks at a time in the 1990s.

The art of mushroom growing involves spore inoculation into a log, which then just sits there for half a dozen years or more. All that most shiitake farmers do is wait for a good rain, and then mushrooms grow like, well, mushrooms. This normally happens twice a year. It seems that if (9) is true, then some event of Jack's growing mushrooms lasted close to a decade. However, there are various sub-events of growing mushrooms involved as well, each lasting less than a month. At first, this might suggest that one can (after all) use adjuncts to describe "internal events" that are integral parts of the larger matrix event.

In support of this idea, one might also note a classic example from the 1960s:

(10) The king jailed the prince in the tower.

has a reading that seems to mean (roughly) "the king brought it about that the prince was jailed in the tower." These facts are hardly positive for Fodor, since they suggest meanings that ought to be unavailable on his view. But they are also puzzling if (in reply to Fodor) one holds, like we have, that adjuncts cannot modify an incorporated element.

On the other hand,

(11) Jack grew shiitake mushrooms in 1995.

is not ambiguous in the relevant way. It cannot be true if Jack inoculates his logs in 1993, dies in 1994 and the mushrooms finally come out in 1995. Which means there are two sorts of facts at issue. Modifiers denoting open temporal events (like for weeks) can, if appropriately chosen, be used to say something about sub-events. By contrast, modifiers denoting concrete times (like in 1995) modify the whole event.

There are at least two approaches one can take to the facts just mentioned.


One is to assume that some adjuncts can adjoin to verbs like "grew" (or adjectives like "jailed") prior to incorporation, and then go along for the ride when incorporation occurs. Call that Hypothesis A. Another possibility, however, is that all these adjuncts are formally predicates of the matrix event (corresponding to the post-incorporation transitive verb), but some predicates apply to an accordion-event by virtue of how certain parts of that event are related to the whole (as per Assumption One). That would be Hypothesis B.

To clarify Hypothesis B, consider an old puzzle concerning sentences like

(12) Jack took the beans.

(13) Jack took the beans intentionally.

If Jack tried to take the beans and did so, (12) and (13) are both true. But if Jack successfully tried to take a box, which unbeknown to Jack contained the beans, (12) is true but not (13). This makes it hard to see how (13) could be true iff ∃e[Agent(e, Jack) & Took(e) & Theme(e, the beans) & Intentional(e)].

But the first part of an accordion-event will typically be some action such as an attempt to do something by the relevant agent. That is, for accordion events:

Agent(e, x) → ∃a[Initiator(a, e) & action-of(a, x)].

And suppose that actions are associated with propositional satisfaction conditions; see Pietroski (1998, 2000). Then, as a first-pass approximation, an accordion-event e is intentional if the condition associated with the action that initiates e is satisfied by the occurrence of e itself. In which case, an event of Jack taking the beans is intentional if it starts with Jack trying to take the beans, while an event of Jack taking the beans is not intentional if it starts with Jack trying to take a box.

Similarly, one might say that a complex event e of shiitake-growing satisfies the predicate "for weeks at a time" if e has sub-event parts of the same sort each of which lasted for weeks. Again, this shows Assumption One at work: a single complex event, composed of multiple episodes of shiitake-growing-by-Jack, could be both "in the 1990s" and "for weeks at a time."
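One way to spell this out, purely as a sketch (the predicate Episode-of is ours, not part of the analysis in the text), is to let the adjunct remain a predicate of the matrix event while its satisfaction conditions look inside it:

For-weeks-at-a-time(e) ↔ ∀f[Episode-of(f, e) → Lasts-for-weeks(f)]

where Episode-of(f, e) holds of the relevantly complete growing sub-parts f of the accordion-event e. On such a formulation the adjunct is formally predicated of e alone; it reaches the sub-events only through the part-whole structure of e, as Hypothesis B requires.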

In either hypothesis A or B one has to understand how come certain adjuncts modify into sub-events, either through direct adjunction prior to incorporation-to-v (in Hypothesis A) or by way of modifying the whole in a way that depends on certain part-whole relations (in Hypothesis B). In other words, in both instances adjuncts have to be selective, as a result of their lexico-semantic properties (say, whatever relevant point distinguishes for weeks from in 1995), much in the spirit of Ernst (2001). Hypothesis A, however, makes those selection properties relevant as the derivation unfolds, with modifiers becoming active pretty much at the point that the sub-event they modify becomes active. This view is obviously decompositionalist. In contrast, in Hypothesis B modifiers are active only "at the tip of the iceberg," and manage to modify into the internal structure of complex predicates as a result of the part-whole make-up of these predicates. Of course, Hypothesis B is more congenial to an atomist treatment


of lexical items than Hypothesis A is. (Thus our reply to Fodor is compatible with at least some of Fodor's theoretical commitments.)

5 Context anchoring of sub-events

We have seen sub-events active through the use of targeted modifiers, some of which may reach them. Other grammatical mechanisms, this time of an argumental sort, yield similar results and allow us to illustrate the role of Assumption Two. To see this, consider the effect of clitic climbing on sub-events. This is an ideal test-ground because, although pronouns can be standard arguments of given predicates, they climb when in clitic guise, in which case they end up introducing peculiarly narrower readings.

Consider this situation. Somewhere, at high noon, a terrorist launches a missile that strikes a hospital at 12:20. In that time interval, doctors remove a patient's heart (which was in its original place at 12:00) and store it in a laboratory at the other end of the hospital, while a new heart is placed (at 12:15) in the patient's chest. The missile strikes and the following Spanish sentences are uttered in the news to describe the horrifying event:

(14) a. El terrorista destrozó el corazón del paciente.
        the terrorist destroyed the heart of-the patient

     b. El terrorista destrozó su corazón del paciente.
        the terrorist destroyed his/her heart of-the patient

     c. El terrorista le destrozó el corazón al paciente.
        the terrorist DAT destroyed the heart to-the patient

The question is, according to these reports, where did the missile hit? Was it the operating room where the patient was, or the laboratory where the original heart was at 12:20?

It is not easy to translate those sentences into English, as they all mean "the terrorist destroyed the patient's heart." As it turns out, though, Spanish has a way of determining whether "the patient's heart" in question is the one in the chest. This subtlety is not exhibited in (14a), which has pretty much the import of the English counterpart. It is not shown in the very stilted (14b) either, which would be the equivalent of the English "the terrorist destroyed the patient's own heart." Unfortunately this is of little help in this instance, as the heart stored in the laboratory is definitely, inalienably, indeed genetically the patient's own, albeit non-functioning, heart; and the minute the new heart is connected in the patient, barring a rejection, that heart is also definitely, inalienably, if not genetically, the patient's own. The important sentence is (14c). There is no doubt in this instance. The destroyed heart is the new heart, not the old one.

The precise mechanics for why that is are immaterial (see Uriagereka 2001a). Descriptively, we must ensure that clitic le, referring to the patient, serves as contextual anchor for the heart's destruction.7 If the destroyed heart is not just any old heart, but the heart at the patient, then the intended semantics would follow.8 But the important thing is this. The verb destroy has the rough logical


form of boil in the examples above, lexical differences aside. One must then ensure that the clitic le contextually anchors just the destruction part. At the time of the missile's launching by the terrorist (12:00), the heart which would be hit was still attached to the patient's body. Qua that causing event, then, le (the patient) as contextual anchor should be possible. The issue is how contextual anchoring of the embedded sub-event works.

On an A-style explanation each sub-event is syntactically active, and thus context-confinement of the result sub-event via le can take place without affecting its associated, causing sub-event. Again, if this is the only possible analysis of the facts, it constitutes a counter-example to the atomist view.

However, a B-style explanation is also possible. The key from that perspective is that we cannot just contextualize the causative part of the event ignoring the rest of the accordion-event. Which heart was at the patient's chest when the accordion-event started is quite irrelevant to the event at large. The heart at the patient's chest when the event ended is what is crucial, even if, in the case that concerns us now, that heart was not around at the event inception. The lexico-conceptual contribution of the heart is tied to the intransitive (pre-incorporation) verb "destroy." So the relevant heart is the one at the patient's chest when its destruction takes place, and, as per Assumption Two, this is the only theme for the accordion-event at large. Once again, this sort of approach saves the atomist perspective, as no direct contextualization of the internal event ever takes place in this view.

6 The internal make-up of events

To sum up what we have seen in these last couple of sections, note that either A- or B-style approaches to the possible modification or contextual anchoring of internal events require that these sub-events be somehow represented, either at some initial syntactic level or in logical form. The main point of this chapter is the nature of that representation. Our suggestion is that it has dimensional characteristics. Intuitively, if you detect a causative layer in an accordion-event you a fortiori know that there are lower layers, in much the same way that an n-dimensional apparatus underlies an n+m one.

That sort of reasoning is direct for A-style approaches, as the different syntactic layers correspond to the various dimensions. But even in B-style approaches, mindful of atomistic considerations, the reasoning follows. After all, what allows us to recover the presence of information in the "viscera" of an event is its specific information make-up. This make-up may not be syntactic, in the sense of being available for various syntactic operations; still, it has enough properties to support modification (by appropriate adjuncts) and various sorts of contextual anchors. We think this kind of make-up suggests that the relevant expressions exhibit dimensionality that is not initially obvious.9

The main argument for our claim stems from the fact that it provides an explanation of what we have called Assumption One. To repeat both it and Assumption Two:


Assumption One
If an event x is the Terminator of an event e, then x "participates in" e by virtue of being e's final part.

Assumption Two
The Theme of an accordion-style event e is the Theme of any Terminator of e.

Consider the intuition that events (as opposed to states) introduce the idea of change over time, and processes somehow "extend" that idea to introduce Agents responsible for the change; see Mori (forthcoming) for this general sort of idea, and references. Modulo concerns of the sort raised by Fodor (discussed above), the inferential relations among:

(15) Jack opened the door,

(16) The door opened,

(17) The door was open,

suggest a hierarchical relation among the adjectival, intransitive and transitive forms of open. To a first approximation, one can think of the adjective open as a predicate of individuals, thus representing the meaning of (17) with "the(x):door(x)[Open(x)]." Or one might think of the adjective as a predicate of states, conceived as eventualities that "hold" through time (whereas events "culminate" with their themes being in certain states), rendering the meaning of (17) with "∃s[Open(s) & the(x):door(x) [Theme(s, x)]]"; see Parsons (1990, 2000) for discussion. Either way, one can go on to think of events as changes (of state) in individuals, thus treating intransitive verbs as predicates of changes, and accordion-events as processes that terminate in an event of some individual coming to be in the relevant state.

Correspondingly, one might think of all predications as involving (at the very least) ascription of a property to an object y. A more sophisticated and informative predication would have the implication that y was not always open. It underwent some change of state (over time). We could represent this kind of predication with "∃e∃s[Open(s) & Theme(s, y) & Change(e, s)]," where "Change(e, s)" means that e is an event in which the theme of s, and thus the Theme of e, comes to be in state s. A still more sophisticated and informative predication would have an implication concerning the source of the relevant change, and thus implicate another event participant – namely, the Causer. We could represent this kind of predication with "∃e{Agent(e, x) & ∃f[Terminator(e, f) & ∃s[Change(f, s) & Open(s)]] & Theme(e, y)}." Clearly, (18c) implies (18b), which implies (18a):

(18) a. ∃s[Open(s)]  STATE
     b. ∃f[∃s[Change(f, s) & Open(s)]]  EVENT
     c. ∃e{Agent(e, x) & ∃f[Terminator(e, f) & ∃s[Change(f, s) & Open(s)]]}  PROCESS
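To see why, note that each implication amounts to no more than dropping conjuncts under the existentials (a routine sketch): from (18c), instantiating e and then f, we have Terminator(e, f) together with ∃s[Change(f, s) & Open(s)]; generalizing over f gives (18b). From (18b), instantiating f and then s, we have Change(f, s) and Open(s); keeping the latter and generalizing over s gives (18a).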


It does not immediately follow that the relevant representations should be dimensionally expressed. One could, for example, just state the facts in terms of "meaning postulates" and leave it at that. But this leaves the question of why, in addition to the brute hierarchical array in (18), the more subtle central facts embodied by Assumptions One and Two should also hold.10

The central intuition behind Assumption Two is the idea that an eventuality is somehow built around a "lower" Theme. In particular, the thesis

Terminator(e, f) → [Theme(e, x) ↔ Theme(f, x)]

states that if we extend a simple event f into a process e, then the Theme of e just is the Theme of f. That does not follow on any simple-minded interpretation of the hierarchy in (18). Why could it not be, for instance, that if we extend f into e then the Theme of e is an entirely separate entity? Hierarchies, as such, are a dime-a-dozen. For example, one can hierarchically arrange straws according to their length, without any given straw standing in any particularly interesting relation with regard to any other. Accounting for Assumption Two evidently requires a hierarchy that arises because its levels are, in some sense, defined in terms of dimensions involving the Theme space.

Similar considerations apply with regard to Assumption One. Nothing in the sort of hierarchy loosely represented in (18) entails anything with respect to a given sub-event being part of a larger event. Again, in the straw hierarchy we just mentioned there is no sense in which a small straw is part of a larger one. Yet without our narrower assumption, coupled with the other assumption just discussed, we would not be able to address Fodor's important concerns, and our theory would be wrong.

To repeat, we are not saying that the tight nature of the hierarchy hinted at in (18) must be dimensionally represented. There are other, more or less cumbersome, ways of adding the extra assumptions. Indeed, one could just list them. For now, our point is more modest. A dimensional view of the hierarchy would fit nicely with our (empirically motivated) Assumptions One and Two.

Even that last sentence ought to be clarified. Nothing that we have said forces Themes, specifically, as a foundation for the dimensional system. But it is easy enough to see the kind of extra step one would need in order to make Themes privileged. For instance, making the very definition of a verb work around the designated Theme role, in much the same way as a dynamic function (e.g. the derivative over time of space, that is, velocity) is built around some static space. But that extra step is not formally central to the concerns of this chapter. One could have had a dimensional approach to the issues of concern now without events being built around themes. The fact that they are still needs an explanation, though we will not provide one in this chapter.

7 The locus of dimensional shifts

We have suggested that the apparatus implicit in (18) is, in fundamental respects, like the one underlying familiar notions from geometry or arithmetic.


An analogy is that processes are to events are to states as cubes-over-time are to cubes are to squares:

Process(e) → ∃f[Terminator(e, f)] → ∃s[Change(f, s)]
4-Dimensional(x) → ∃y[Temporal-Projection-of(x, y)] → ∃z[Depth-Projection-of(y, z)]

But even if something along these lines is correct, one wants to know whether specific lexical items (or something else in the grammar) are responsible for these dimensional shifts.

On the neo-Davidsonian account defended here, one specifies the meaning of (7), repeated below

(7) Pat boiled the soup on Monday.

as:

(7M) ∃e{Agent(e, Pat) & ∃x[Terminator(e, x) & Boiled(x)] & Theme(e, the soup) & OM(e)}

The transitive verb (derived from the intransitive) is a monadic predicate of events, as is the partly saturated phrase on Monday. Likewise, Agent(e, Pat) is a monadic predicate of events. Pat, like Monday, the object of a preposition, makes its semantic contribution as the argument to a binary predicate that expresses a relation between events and other things. Similar remarks apply to the soup.

As is often noted, thematically elaborated event analyses treat arguments and adjuncts on a par, since both are treated as conjuncts of a complex event description. Indeed, verbs are treated the same way. Whatever special role verbs may play in sentence formation, for purposes of interpreting the sentence formed, verbs are (like arguments and adjuncts) treated semantically as conjuncts of an event description. This is a simple and fairly radical idea. For the suggestion is that, modulo an occasional existential closure, phrase markers are interpreted as conjunctive predicates.

This requires that arguments like Pat and the soup be interpreted via thematic roles, as by themselves they are not predicates of events. Thus, neo-Davidsonians are committed to a limited kind of type-shifting. When Pat appears as the subject of a verb like boiledT, it is interpreted as the monadic event predicate "Agent(e, Pat)" – or making the argument position more explicit, "∃x[Agent(e, x) & Pat(x)]"; and when the soup appears as the object of such a verb, it is interpreted as the monadic event predicate "Theme(e, the soup)" – or making the argument position more explicit, "∃x[Theme(e, x) & the-soup(x)]." In this sense, arguments and thematic roles introduce a twist to the basic compositional apparatus. This is unsurprising, in so far as event analyses are designed to account for the compellingness of inferences involving adjuncts, like "Pat boiled the water on Monday, so Pat boiled the water," as instances of conjunction-reduction (in the scope of an existential closure; see Pietroski (forthcoming a, b) for details).
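For illustration, the adjunct inference just cited comes out as follows (a sketch, with Boiling standing in for whatever complex event description the transitive verb ultimately contributes):

∃e{Agent(e, Pat) & Boiling(e) & Theme(e, the water) & On-Monday(e)}
entails ∃e{Agent(e, Pat) & Boiling(e) & Theme(e, the water)}

since the second description merely drops, within the scope of the existential closure, the conjunct contributed by the adjunct.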


We find it significant that no language we know of has lexical items synonymous with the (metalanguage) expressions "Theme," "Agent," "Benefactive," and so on. One can say that there was a boiling of the water by John; but "of" and "by" do not mean what "Theme" and "Agent" mean. This is of interest. Languages have words for tense, force indicators, all sorts of arcane quantifications and many others. Yet they do not lexically represent what seems to be a central part of their vocabulary. Similarly, case-markers are not correlated with θ-roles (except, perhaps, for a handful of restricted, so-called lexical cases). Thus the accusative him can bear all sorts of θ-roles:

(19) a. I like him.
     b. I lied to him.
     c. I believe him to be a genius.
     d. I literally used him as a counterweight to lift the piano.
     e. I lifted the piano with him as a counterweight.

This is typical, and so familiar that it is hardly ever noticed. Why is there no language that distinguishes him in, say, (19a) and (19e) as in (20)?

(20) I like theme-him, but I used instrumental-him to lift the piano.

It is not clear to us why most familiar analyses do not predict some variant of (20), with either morphemes attached to him or entirely separate words in each instance (with a paradigm for pronouns of the sort witnessed for person, number, gender, definiteness, and so on).

We think this sort of fact reveals a simple truth. θ-roles are not part of the object-language. This makes perfectly good sense in the neo-Davidsonian view, where the normal mechanisms for composition that language has are utterly trivial, dull predication. But as we just saw, language has a mechanism for "stepping out" of this simple-minded predicative routine. And perhaps this is relevantly like the way that "inverting" arithmetic operations can lead to "stepping out" of a given domain (see Appendix). θ-roles let speakers use otherwise simple linguistic expressions, initially "designed" for simple predication, to describe domains (and causal relations) with elaborate structures.

More technically, we take a θ-role to be a type-lifter, which raises the type of an argument to that of a predicate, which can then relate appropriately to other predicates. Whenever the system uses one of these type lifters, bona fide functions with an argument and a value, it must step out of its boundaries. This is empirically reflected in the fact that these devices are not syntactic formatives. And it strengthens the argument for the dimensional view. It is natural to think of these external-to-the-lexicon items as co-extensive with dimensional cuts (a view first presented in Mori 1997).11
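A minimal way to picture the type-lift, in a lambda notation that is ours rather than the text's: where Pat denotes an individual, the Agent role maps that individual to a predicate of events,

Agent ↦ λx λe [Agent(e, x)], so that applying it to Pat yields λe [Agent(e, Pat)],

which can then be conjoined with the other event predicates in a description like (7M). The θ-role is thus a genuine function, with an argument (an individual) and a value (an event predicate), and precisely not one more conjunct in the otherwise flat predicative routine.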

8 The place for adjuncts

We have only given plausibility arguments for dimensions. It is natural to have them as points where θ-roles come in, and the resulting representations are


tightly articulated in interesting ways (with part-whole implications built around the notion of a theme space). But in truth those same results could be achieved in other ways, perhaps less elegantly. In this section, we would like to explore the possibility of a stronger argument. It will be even more speculative, but perhaps also more tantalizing in its form.

The issue is where adjuncts fit in the general minimalist picture of grammar. In a "bare" phrase-structure system of the sort explored by Chomsky (1995b), one can easily define heads, their complements and their (multiple) specifiers, but adjuncts are a nightmare to define. Similarly for other important works, which fit adjuncts at the price of assimilating them to specifiers (Lasnik and Saito 1992; Cinque 1999) or conversely (Kayne 1994). Adjuncts are special in other respects as well, having no complex syntactic properties to speak of. Pure adjuncts are not selected, do not move to check features, do not get bound or bind, do not surface as empty categories (hence do not control and are not controlled). In addition they disallow movement across them, and their wh-movement (if present at all) is destroyed by the weakest of islands.

Uriagereka (2001b) develops an argument, suggested by Chomsky in recent class lectures (Spring 2001) and reminiscent of ideas presented in Lebeaux (1988), that adjuncts inhabit their own (especially simple) dimension. Suppose that phrase-markers with adjuncts cannot be appropriately labeled, since labeling mechanisms reflect finitely many different types of core linguistic relations.12

In essence a verb-phrase does not change its type because of relating to some adjunct, and this sort of relation is in principle unbounded. If adjuncts do not have labels, how does the system tell apart one syntactic object X with adjunct Y from the same syntactic object X with adjunct Z?

In one sense, it ought to be simple. The adjunct, after all, is there. And from a neo-Davidsonian semantic perspective, the adjunct is indeed "just there" as a mere conjunct (among potentially many) in an event description. But it is not clear what formal properties (if any) the grammar tracks when dealing with the relevant object, X associated to Y or to Z, if (by hypothesis) it is not tracking sub-parts of the object by their labels. Moreover, one can keep adding modifiers as in

(21) Beans grew for weeks, for years, for decades.

Suppose we code each modified sub-expression with a number, so that beans grew for weeks is labeled "1," beans grew for weeks, for years is labeled "2," beans grew for weeks, for years, for decades is labeled "3," and so on. Then the algebraic structure of these syntactic objects, resulting from unbounded modification, will be like that of the numerals that stand for positive integers. Next consider (22):

(22) Jack [grew beans for weeks, for years, for decades…] twice, three times, four times …

Imagine Jack living long enough to have grown beans for weeks at a time for years, and to have done this for decades, etc., and to have done all that twice,


and to have done all that three times, etc. Continuing with the notational task, let us code each new modification (at the next level) with a second numeral in an ordered pair, so that Jack grew beans for weeks, for years, for decades twice is labeled "(3, 1)," Jack grew beans for weeks, for years, for decades twice, three times is labeled "(3, 2)," etc. Now the algebraic structure of the relevant syntactic objects will be like that of the numerals that stand for rational numbers (. . . 3/1, 3/2, . . .), and as discussed in the Appendix, this is plausibly viewed as a dimensional difference.

That suggests, though it does not prove, that adjunctions stem from a very simple (perhaps brutely concatenative) method of expanding symbols (unlike the richer and more constrained hierarchical thematic system). In essence, within the confines of a given dimensionality of lexical expression adjuncts are just derivationally added, without ever being attached to the phrase-marker. No issue, then, arises about their bare phrasal representation. Similarly, their discontinuity, and lack of transformational syntax, follows. In turn, their semantic scope for the purposes of compositionality ensues from the sheer order of activation in the derivational workspace.

9 Infinite regress

The last comment we made in the previous section about the scope of modification ought to provide another test scenario for the idea that adjuncts are very different from arguments. Suppose that two (or more) adjuncts are simultaneously activated in a derivational workspace. Then they ought to show no relative scope differences. Examples of that sort exist in natural language, and occasionally go by the name of "disjuncts." Thus, for instance, aside from the obvious differences in meaning between (23a) and (23b),

(23) a. Lawyers behave nicely rudely,
     b. Lawyers behave rudely nicely,

there is a certain, open-ended reading for which those two sentences mean the same thing, something paraphrasable as: lawyers behave nicely, rudely . . . , rudely, nicely . . . who knows? both ways, as life is messy in court. Disjuncts, aside from having a peculiar intonation, must come to the right of the head they modify (cf. nicely, rudely, (*. . .) lawyers behave). This allows us to construct a more radical sort of test for the view that different sorts of adjuncts associate with different dimensionalities. It has to do with the possibility that sentences containing adjuncts could be, in some non-trivial sense, infinitely long.

If that were the case, as Langendoen and Postal (1984) have shown, then the class of sentences involving these disjuncts would be transfinite, thus of a different dimensionality from that of sentences not involving them. We emphasize that latter point because it makes no sense, in our terms, to have sentences with infinitely many arguments, assuming that these require transformational syntax to converge (e.g. in terms of Case assignment). A transformation maps a definite input to a definite output, thus cannot involve infinitely long inputs or


outputs. However, disjuncts (adjuncts more generally) do not, by hypothesis, require transformational syntax to converge. If so, all bets are off with regard to them (at least in these particular terms) and the size of the sentences that bear them.

It is virtually impossible to establish whether there are sentences of infinite length, given contingencies about both human existence and the linear nature of our phonetic system. However, it might be possible to test the idea with regard to pure LF representations, with no associated PF and hence no obvious need to be linear. One particularly interesting ground to construct an experiment comes from domains in the literature where we know that infinite regress puzzles arise in ellipsis. We can see what sort of intuition speakers have about these when they involve disjuncts and when they involve arguments.

Consider these Antecedent Contained Deletion (ACD) examples:

(24) a. Inevitably Monday follows every Monday that Sunday does.
     b. Inevitably Monday follows Monday, which Sunday does as well.

meaning "a Monday follows every Monday a Sunday follows" and "a Monday follows a Monday, which Sunday follows too." For these readings to be possible, the direct object (every) Monday must be capable of scoping out of the verb-phrase, so that the elided material (follows Monday) does not involve the elliptical phrase contained in its very antecedent (that/which Sunday does), or else in this phrase too we would have to recover the ellipsis, and so on, ad infinitum. (24a) is an example of the sort first discussed by May (1977), and (24b) one discussed by Hornstein (1995a) and Lasnik (1999). In both instances some transformation carries the material we do not want within the ellipsis out of the antecedent domain, Quantifier Raising in (24a) and standard A-movement for Case/agreement checking in (24b). In this context, though, we are interested in seeing what happens when we do not provide a mechanism for that, and so we in fact force an infinite regress.

The verb follow is normally interpreted transitively. However, it also has an intransitive reading (as in a news bulletin follows) which allows us to build a test case, when Monday is interpreted as a bare nominal adverb, with the meaning on Monday. Observe then (25):

(25) Inevitably Monday follows (on) every Monday that Sunday does as well.

This is possible with the reading "Monday follows (intransitively) on every Monday that Sunday (transitively) follows." This much is not surprising, as the quantifier adjunct every Monday can in principle move out of the verb phrase through Quantifier Raising. (26) is more interesting:

(26) Inevitably Monday follows (on) Monday, which Sunday does as well.

In spite of involving no Quantifier Raising (Monday is not a quantifier) or any movement for the purposes of Case/agreement checking (Monday is not an argument in this instance), the sentence has a meaning along the lines of “Monday follows (intransitively) on Monday, which Sunday (transitively) follows.” Note that we elide just “follows,” and not “follows (on) Monday,” thus again avoiding the infinite regress. Apparently that is possible with adjuncts. One can target the ellipsis of an X to which some adjunct Y has adjoined without having to include Y as well.13 Now we are ready for the test case, involving disjuncts.

The relevant example is (27), where we have added the word etc. in order to suggest an open-ended, rising intonation on as well, thus at least allowing the disjunct interpretation:

(27) Inevitably Monday follows (on) Monday, which Sunday does as well, etc.

For speakers we have consulted, this can roughly mean “Monday (intransitively) follows on Monday, which Sunday (transitively) follows as well on Monday, which Sunday (transitively) follows as well on Monday, etc.” It is a sort of interpretation that somehow invokes the infinity, indeed monotony, of time. The sentence, which is a generalization, is false if taken strictly (assuming time will come to an end), though perhaps it is still felicitous as a generic claim whose exact truth-conditions would be hard to pin down precisely.

Having seen an acceptable, interpretable (in the sense that some well-known Escher lithographs are) sentence involving an infinite regress with adjuncts, we must now consider what happens when arguments are involved. For that we have to find a situation where the argument does not have an “escape hatch” by way of either Quantifier Raising or any other movement. Observe (28), where which crucially modifies Monday and not the day before Monday:

(28) Inevitably Monday follows the day before Monday, which Tuesday does too (etc.).

In this instance there is no Quantifier Raising, and if A-movement carries something out of the VP, that ought to be the day before Monday. This, though, does not provide a relevant ellipsis. In particular, there is no way to get the meaning, “Monday follows the day before Monday, the day before which Tuesday follows too.” That is expected, but why can the sentence not just mean, “Monday follows the day before Monday, the day before which Tuesday follows too, the day before which Tuesday follows too, the day before which Tuesday follows too, etc.?” Arguments disallow infinite regress.

It may be hard for speakers to force the which in (28) to go with Monday instead of the day before Monday. Since otherwise the test is irrelevant, consider also (29):

(29) Inevitably male descendants follow the family of their ancestors, . . .
     a. . . . who of course female descendants do too.
     b. . . . which of course female descendants do too.

The claim is that (29b) is better than (29a), even though there are, in principle, infinite regress readings for both of these sentences (thus for (29a), “male descendants follow the family of their ancestors, who female descendants follow the family of, who female descendants follow the family of, who. . .”). By way of movement of the family of their ancestors (29b) does not need to go into an infinite regress to be grammatical with the meaning “male descendants follow the family of their ancestors, which female descendants follow as well.” But that option does not exist for (29a), hence an ungrammaticality ensues. This shows again that infinite regress is impossible with arguments.

The latter conclusion is of course not novel, although the possibility, highlighted in (27), that such regresses are possible with adjuncts (and more specifically disjuncts) is, so far as we know, an original claim. For our purposes, the point is that a system that allows infinitely long expressions ought to be of a higher dimension. Since the argument we have just provided can (in principle) be repeated at any level at which disjuncts are possible (the domains that θ-roles/type-lifters define) we expect each of these levels to correspond to different dimensions.

We should clarify that, of the three types of unboundedness mentioned at the outset, the first and third kind might actually be related. Iterativity is easy to model in a finite-state automaton, which cannot model recursion, and recursion is easy to model in a phrase-structure grammar, which cannot strictly model anything involving strings of infinite length, of the sort we have just seen. Still, it is interesting to note that iterative structures, like very very difficult, do not ascribe meaning to each very (created by a separate loop in the system). There is no real meaning to the idea that a very very very difficult proposal is only (say) three fourths as difficult as a very very very very difficult proposal. Both of these are just emphatically difficult proposals, period (the amount of emphasis being a function of the speaker’s passion, dullness, stuttering or whatever). But this arguably means that what we need in this instance is a very loose system, perhaps not even something as fancy as a loop; possibly modifications are “just there” and do not add much to meaning because, strictly, they are not being added to the semantic representation. Be that as it may, this opens the question of what, precisely, modification is (see Uriagereka 2001b on this), and suggests that the open-endedness we are now experiencing is related to iterativity, the latter being just a trivial sub-case of the former. What that open-endedness really amounts to, especially in the case of disjuncts, is something we will not go into now.
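To make the contrast just alluded to concrete, here is a minimal sketch (our own illustration, not part of the original text): a single finite-state loop suffices for iteration of the very very . . . sort, whereas a recursive phrase-structure rule generates nested dependencies of the a^n b^n kind that no finite-state device can capture.

```python
# Iteration vs. recursion, in miniature.
import re

# Iterative "very very ... difficult": one loop, expressible as a regular expression.
iterative = re.compile(r"(very )+difficult")
print(bool(iterative.fullmatch("very very very difficult")))   # True

# A recursive phrase-structure rule S -> a S b | ab generates a^n b^n,
# which no finite-state automaton can model; a recursive procedure can.
def generate_anbn(n: int) -> str:
    return "ab" if n == 1 else "a" + generate_anbn(n - 1) + "b"

print(generate_anbn(3))   # "aaabbb"
```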

10 Some conclusions

We have shown that a dimensional interpretation of familiar hierarchies is at least possible and perhaps even plausible. We have done this within two closely related views, stemming from a minimalist syntax and a neo-Davidsonian semantics. In essence we take adjunction to be a more basic syntactic relation than Merge, and correspondingly, predication to be a more elementary semantic notion than θ-role assignment.

We have not explored the syntax of either of these notions in any detail, but the assumption has been that adjunction is flat, whereas hierarchies emerge as a result of argument taking, for which the system steps out of its boundaries, quite literally, creating new representational spaces or dimensions. It is through dimensional shifts, associated with argument taking, that asymmetry enters the picture, thus predicting characteristic entailments of the sort analyzed in this chapter. We also mentioned, though have not examined, how this kind of imbalance should be responsible, in the end, for just about any area in language where some sort of hierarchical, sub-set or otherwise asymmetric relation obtains (see Chapter 15).

In the last couple of sections we have explored the intriguing possibility that, within dimensions induced by arguments, adjuncts just “sit there” with no real syntax to speak of and, as a consequence, the possibility of not just unbounded, but also infinite expressions ensues at least for disjuncts. The latter result is obviously tentative, but potentially very significant, since if true it would suggest, in the spirit of conclusions reached by Langendoen and Postal (1984) for the system at large (which we do not embrace), that a full representation of at least disjunct properties is impossible (in computational terms).

For that chunk of language responsible for disjuncts it is possible that expressions exhibit systematicities, but not strictly compositional properties. We would not be surprised if, for instance, the Escher-style ACD examples involving disjuncts are interpreted, to the extent that they are, in roughly those ways. Of course, semantics is compositional, so strictly speaking disjuncts fall outside of the realm of standard semantics. But disjuncts are just a sub-case of adjuncts, which do seem to have standard compositional properties (if not necessarily strict compositionality, as they arguably do not compose as part of a phrase-marker). So in a hierarchy of strictures, disjuncts precede adjuncts precede arguments, in terms of mere systematicity, compositionality, and strict compositionality.

That suggests a kind of open-endedness for a chunk of language that can only be understood biologically if the system is so underspecified that it has virtually no cognitive limits, and thus is, in itself, relatively limited as a system of thought, communication, and so on. Yet that very system, with a minor improvement (argument taking), all of a sudden becomes constrained, plastic, creative in useful terms, and otherwise familiar as our human language. For anyone interested in understanding the evolution and emergence of language, this curious transition should be of some interest.

Our conclusions also have a bearing on an important debate between atomists and decompositionalists, mainly carried on in the pages of Linguistic Inquiry over the last couple of years, which has circled around the following four theses:

Thesis One: The lexicon is productive.
Thesis Two: A simple, typically first-order, formal language is enough to capture the subtleties of natural language semantics.
Thesis Three: In fundamental respects all languages are literally identical.
Thesis Four: Analyticity is an achievable goal.


The two sides in the debate have had opposing views on these theses. For instance, Pustejovsky (1995) assumes all four, while Fodor and Lepore (1998) deny all of them.

Nothing that we have said here entails that the lexicon should be productive, in the generative sense that Pustejovsky and others (going all the way back to generative semantics) advocate. In fact, we do not even know how to show that the lexicon is potentially unlimited,14 so we assume it is not. Nonetheless, the fact that we do not assume a generative lexicon does not entail that we allow no systematic structure for words. This is where our dimensions come in.

Nothing that we have said, either, entails that natural language semantics should involve any fancy mechanisms. Quite the opposite, we have made a big deal of the fact that language has θ-roles and these are not pronounced, hence by hypothesis are not part of the substantive lexicon. Our view, thus, is that “higher order” devices, if that talk is even appropriate for the sorts of entities we are analyzing,15 lie outside of the object language. Again, that is where dimensional cuts are signaled.

So in those first two theses we align with atomists, and contra (standard) decompositionalists. However, our view is the exact opposite with regard to theses Three and Four. With Chomsky throughout his career we assume, and find ample reason to believe in, the deep uniformity of languages. Similarly, also in the spirit of much of Chomsky’s heritage, we do believe in analyticity. It is analytic that y boiled if x boiled y, and that y was open (at some point) if x opened y; see Pietroski (forthcoming a, b). But this is not because words reduce to other words. It is because language exhibits hierarchies and (one-way) relations between “dimensions” like event and state. By trying to argue for certain dimensions of language, we have attempted to explore the prolegomena of a theory that seeks to determine what the analytical foundations of language are.

We interpret the project of Hale and Keyser (1993 and elsewhere) in roughly these terms, so do not claim any originality for the broad picture we present. In other words, we seek some constrained analyticity, in particular through the use of dimensions that cut across the lexicon. There are two major ways of implementing this overall program, which we have termed Hypothesis A and Hypothesis B. The former is less sympathetic to atomism than the latter, which is why we have pursued Hypothesis B (after all, we are atomists at least in terms of theses One and Two). But in either instance there has to be some component of the system that analyzes lexical concepts in dimensional ways.

Appendix

Imagine a familiar kind of invented language in which “1” is a symbol, the result of concatenating any symbol with “*” is a symbol, and if “X” and “Y” are symbols, “+(X, Y)” is a symbol, as is “=(X, Y).”16 The symbols of this language include “1*,” “1**,” . . . , “+(1, 1*),” . . . , “+(+(1*, 1**), 1*),” “=(1*, +(1, 1)),” and so on. An obvious possible interpretation is that “1” denotes the smallest positive integer. “X*” denotes the successor of whatever “X” denotes. “+(X, Y)” denotes the sum of whatever “X” and “Y” denote, and “=(X, Y)” denotes exactly one of two arbitrarily chosen objects, call them “T” and “F”, depending on whether or not “X” and “Y” denote the same thing.

There are endlessly many expressions of this language, and one can speak of several different types of expressions. There is also a sense in which complex expressions of the language exhibit hierarchical structure. For example, in the sentence =(1****, +(1**, 1*)) there is an asymmetric relation between “=” and “*.” Likewise, there is an asymmetric relation between “1****” and “1**.” Strictly speaking, there is also an ordering of the asterisks in “1**,” which is the result of concatenating the symbol “1*” with “*”; though intuitively, this is a less interesting kind of hierarchy.

It is easy to imagine simple mechanical procedures for determining whether sentences of this language denote T or F. An expression of the form “+(1^m, 1^n),” where “m” and “n” stand for numbers of asterisks, can be replaced with an expression of the form “1^(m+n+1)” by alternately erasing components of “1^n” and adding asterisks to “1^m,” and then erasing “+” (its brackets and comma) when nothing remains of “1^n.” A slightly different procedure, of erasing a component from both “1^n” and “1^m” until at least one of them is completely erased, could be used to evaluate expressions of the form “=(1^m, 1^n).” If both “1”s are erased in the same round, replace “=(,)” with “T.” Otherwise, replace whatever is left with “F.” The language can also be expanded via the following rules. If “X” and “Y” are symbols, so are “#(X, Y)” and “∧(X, Y)”; where by stipulation, “#(X, Y)” denotes the product of whatever “X” and “Y” denote, and “∧(X, Y)” denotes whatever “X” denotes raised to the power of whatever “Y” denotes.17

One can use erase/write procedures for evaluating the resulting expressions, at least in principle, in terms of addition; see, for example, Boolos and Jeffrey (1980). If “#” and “∧” are defined in these terms, one can speak of procedural meanings for expressions of the language, where the meaning of each expression determines its denotation.
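To make the procedural idea concrete, here is a minimal sketch (our own illustration, not part of the original text) of the erase/write procedures just described, written with “+” and “=” for the two operator symbols as reconstructed above; the function names are ours.

```python
# A toy evaluator for the invented language: numerals are "1" followed by
# asterisks; "+(X, Y)" is evaluated by moving components of Y onto X as
# asterisks; "=(X, Y)" is evaluated by erasing one component from each side
# per round until at least one side is empty.

def eval_sum(x: str, y: str) -> str:
    """Rewrite +(1^m, 1^n) as the numeral 1^(m+n+1)."""
    while y:                      # erase one component of y, add one asterisk to x
        y = y[:-1]
        x += "*"
    return x

def eval_equal(x: str, y: str) -> str:
    """Rewrite =(1^m, 1^n) as "T" or "F"."""
    while x and y:                # erase one component from each side per round
        x, y = x[:-1], y[:-1]
    return "T" if not x and not y else "F"

def split_top(s: str):
    """Split "X, Y" at the comma that is not inside brackets."""
    depth = 0
    for i, c in enumerate(s):
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
        elif c == "," and depth == 0:
            return s[:i], s[i + 1:].strip()
    raise ValueError("not a pair")

def evaluate(expr: str) -> str:
    """Recursively reduce an expression to a numeral or to "T"/"F"."""
    expr = expr.strip()
    if expr.startswith(("+", "=")):
        op, inner = expr[0], expr[2:-1]
        x, y = (evaluate(part) for part in split_top(inner))
        return eval_sum(x, y) if op == "+" else eval_equal(x, y)
    return expr                   # a bare numeral evaluates to itself

print(evaluate("+(1**, 1*)"))            # "1****", i.e. 3 + 2 = 5
print(evaluate("=(1****, +(1**, 1*))"))  # "T"
```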

One could use mechanisms of the sort just described to determine the (absolute value of the) difference between two unequal integers, and one can stipulate that if “X” and “Y” are symbols, so is “−(X, Y).” But this is not yet to introduce a device for representing subtraction. One needs to know how symbols like “−(1, 1)” and “−(1, 1*)” should be interpreted, since the procedures described thus far do not settle the question. A now obvious, but initially dramatic, thought is that the successor function is “reversible,” and one can encode the idea of a predecessor function by allowing for symbols like “*1,” “**1,” etc. Given the possibility of both right-concatenation and left-concatenation, there can be symbols like “**1***” and “***1**.” These are easily transformed, via alternating erasures, into symbols like “1*” and “*1.” But this still raises new interpretive questions, in a way that introducing “#” and “∧” did not. Put another way, the meanings thus far assigned to expressions do not yet determine the denotata of sentences like “=(−(1, 1), *1)” and “=(−(*1, 1), **1).” Do they denote T or F or neither? This extension of the basic language requires a conception of the relevant domain (of possible denotata for symbols of the language) as including more than just the number 1 and its successors.

Similar considerations arise if the language is extended to allow for symbols of the form “%(X, Y),” interpreted as whatever “X” denotes divided by whatever “Y” denotes. For while one can think of division as an inversion of multiplication, the denotatum of “%(1***, 1**)” is hardly determined by the procedures described above, and such expressions cannot be reduced to expressions of the form “+(X, Y).” In this sense, the meaning of “%(1***, 1**)” is something new under the sun, expressible in a new dimension, whereas “#(1***, 1**)” is equivalent to “+(+(1***, 1***), 1***).” Correspondingly, one now needs to think about the relevant domain as including all the rational numbers.18
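Extending the sketch above (again our own illustration, not the text’s), left-concatenation and the alternating-erasure reduction take only a few lines; what no such procedure settles, as just noted, is how the resulting mixed symbols are to be interpreted.

```python
# Predecessor extension of the toy language: "*" may also be concatenated on
# the left, and mixed symbols are reduced by erasing one left and one right
# asterisk per round.

def reduce_mixed(sym: str) -> str:
    while sym.startswith("*") and sym.endswith("*"):
        sym = sym[1:-1]
    return sym

print(reduce_mixed("**1***"))   # "1*"
print(reduce_mixed("***1**"))   # "*1"
# Nothing here decides whether, e.g., "=(-(1, 1), *1)" denotes T, F or neither;
# that requires enlarging the domain of denotata beyond 1 and its successors.
```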


15

WARPS

Some thoughts on categorization†


1 Introduction

Language presents paradigmatic regularities, together with the usual syntagmatic ones that syntax is designed to capture. This chapter proposes a way of deriving systematic hierarchies by analyzing linguistic categories through the algebraic structure of numbering systems (hence by way of dimensions, each recursively defined on the previous). The goal is not to equate “vertical” structuring and “horizontal” syntax, but rather to explore the properties of the former in order to predict certain well-known implicational facts. New recalcitrant data are also brought to bear on the issue, as well as a proposal for acquiring lexical categories in present terms which, it is argued, successfully mimics the acquisition sequence by infants.

Combinatorial or “horizontal” approaches to linguistic structuring, by their very nature, are not designed to capture implicational or “vertical” properties of human language. An old debate, recently refueled on the pages of Linguistic Inquiry,1 is concentrated on whether these “vertical” properties are real. I will suggest below that they are, at least to some extent. More importantly, though, all present analyses of the implicational properties of language merely restate them formally. That is, category X is taken to implicate category Y only because some formal relation stipulates the entailment in some level of linguistic or non-linguistic representation. I do not see the explanatory power behind that.

The basic idea in this chapter is that there is a more profound way of doing things, with interesting empirical consequences. When all is said and done, current approaches stipulate each class of implications possible from a given categorial unit. In the alternative I suggest, a single stipulation is intended to work for all classes of implications. Furthermore, the stipulation in question is needed independently to conceptualize the structure of numbering systems.

I make much of the fact that the species which is able to speak is also capable of counting, subtracting, doing fractions, and so on. I think the structure of both systems is not just related, but in fact identical, when seen at the appropriate level of abstraction. This suggests that the stipulation required in order to obtain implicational structure is derivable from deeper properties of cognition. Interestingly, if this is the case, language will have a curious “dimensional” property to it, which I suspect relates to other cognitive abilities that make use of this property. Differently put, human lexical concepts will be recursively built in layers that can be expressed derivationally, though they need not be, which means the system in question need not be of the same sort as standard syntax.

The present proposal has certain structural properties following from cognitive principles. However, in other proposals I am familiar with, standard conceptual frames (usually expressed in a variant of a first-order predicate calculus) are taken to underlie syntactic structures. What we will develop here is intended to map something with conceptual and associated intentional properties. However, in itself the underlying structure is purposely abstract, hence not even remotely similar to what is assumed in current versions of generative semantics or related sub-disciplines. This is all to say that I take the autonomy of syntax rather seriously, but for syntactic reasons I propose a kind of unfamiliar structure to account for a phenomenon that has, so far as I know, never been analyzed in these terms.

2 The problem with categories

Recent work within the Minimalist Program has explored the possibility that the faculty of language is a pure derivational system. Nowhere has this idea been as explicitly advocated as in Epstein and Seely (2002), who, citing the work of complexity theorists, claim that theoretical appeal to macro structure properties (of the sort apparent in levels of representation) fails to explain macro structure itself. In essence, for Epstein and Seely, “if you have not derived it, you have not explained it.” In this chapter, I concentrate on an issue that is still left open even in the radically derivational system that Epstein and Seely advocate. It operates with representations such as V, N and so on. If one has not derived those, has one explained them?

Arguably we do not need to explain that. Any system has primitives, and perhaps those are the primitives of ours. It is slightly troublesome, though, that some of those (V, N and the other lexical categories) were postulated over two millennia ago, roughly contemporary with Democritus’s atom. Perhaps linguists had the right cut at the time, although physicists did not. Then again, perhaps linguistic understanding has not advanced as much as physical understanding has. On a related note, the set of functional categories is growing by the month. Cinque (1999) has shown some forty categories of the relevant sort. Does that mean we have forty functional primitives?

None of these questions have a priori answers. One can deny the facts, but the linguists that have proposed them are clearly competent and honest. One can try to divert the facts to a different domain, for instance insisting on not decomposing whatever one’s favorite lexical atom happens to be, and blaming the putative structure that “clouds the picture” on the structure of thought, the world or something else. Or one could also try to bite the bullet and reflect on the nature of the structure of these creatures that derivations start on, how it differs from syntactic structure, and yet how it seriously affects it.

We can pose the question thus. Should linguists stop their research when faced with V, N and so on? Or should we split the lexical atom to see what kinds of interactions are possible at that level of linguistic reality? And if the latter, what is the nature of those interactions?

Perhaps it is worth pondering for a moment a relatively similar situation that arose within physics in this century. Atoms were not expected to have parts, which is why they were called atoms, but they did. Then the issue was how many, with what properties and so on. By the 1940s and 1950s, sub-atomic particles were being found by the dozens. Did that mean that the primitives of physics were counted in dozens? For some (unclear yet powerful) reason physicists do not like that. They observed that those various particles exhibited regularities coded in terms of “conservation laws,” which led to the postulation of “particle families.” Soon new regularities emerged, and the quark model was postulated in the early 1960s.

We may also keep in mind that the realm of the sub-atomic (where electromagnetism and the strong and weak nuclear forces obtain) obeys principles which are mathematically incompatible with those obeyed by the realm of the super-atomic (where gravity reigns). Contemporary physics has lived almost a century in this contradiction, without the project having stalled. Certainly many physicists try to unify the situation, but a successful, grand unified theory is still missing.

Demanding more from present-day linguistics than from physics seems unreasonable. That is to say that I am not going to be troubled if the result of our research into sub-lexical units forces us into a theory that is different from, or even incompatible with, the one we have for super-lexical stuff. If that is the case, so be it. We shall seek unification in due time.

3 A vertical dimension in language

Returning to basic facts, it is obvious that language is organized not just syntagmatically (in a “horizontal” way), but also paradigmatically (in a “vertical” way). Syntagmatic organization is what our familiar derivations are good at capturing: morphemes, words, phrases, transformations. About paradigmatic organization, much less is understood.

Take for instance a Vendler-style classification of verbs into states, activities, achievements and that sort of thing (on these matters, see Pustejovsky (1995) and his references). Standard derivations have nothing to say about this. For if a derivation is to run smoothly, it makes no difference if states are higher or lower in the classification than achievements. The same can be said about thematic hierarchies (Theme low, Agent high, etc.), complexity hierarchies within nominal structures (mass low, count high, etc.), auxiliary hierarchies (modal, perfective, etc.), and just about any other hierarchy that has been proposed in the literature. It is easy to see, for instance, that a Phrase-structure Grammar would have different rules depending on whether we want modal to appear higher than perfective or vice versa, but the grammar itself does not care about the class of productions it admits, so long as they are permissible.

The reality of these hierarchies should hardly be in doubt. There are many quibbles about the details of how they ought to be expressed in syntax (“Is theme higher or lower than goal?” and so forth; see for instance Baker 1996), but it is clear that language presents vertical cuts, and little of substance is ever said about them.

There is actually not much to say about them, in principle, in standard derivational terms. A Chomsky-style derivation is a generalization of a concatenation algebra, which is designed for just that: concatenation, or horizontal organization (see Chomsky 1955). But one wonders, in the Epstein/Seely spirit, whether it should not be the case that the very fabric of grammar yields both the familiar horizontal ordering and the vertical one as well.

It is of course tempting to blame the nature of the vertical cuts on the nature of reality, which language is supposed to represent. That would work like this. Reality is (or appears to humans to be) vertically structured into familiar classes. Language is used to speak of those classes. Ergo it is normal for language to reflect the structure of what it speaks of. For instance, say that an expression like a chicken presupposes some chicken stuff, whereas an expression like chicken does not presuppose a (whole) chicken. But is that a linguistic fact? Is it not enough to say that all chickens are made of chicken stuff, whereas not all chicken stuff is necessarily attached to a chicken?

In the absence of a theory of reality, this is hard to ascertain, but one can certainly play the game. So suppose the world does work like that for the chicken instance. Unfortunately, things get more messy right away. To see this, suppose we formalize the chicken example in a bit of detail. Say we have an “ontology” of individuals (like chickens, lambs and so forth), and since these are made of whatever they are made of, we have a corresponding statement in the theory to say that individuals have some mass.2 The mass of an individual is denoted with such words as chicken, whereas the individuals themselves are denoted with that predicate plus some determiner. With that machinery in mind, let us move on to more interesting examples.

Let us start with something as everyday as fish. Surely, that fits the picture we have just sketched. But now consider a dish of very small fishes, baby eels. When we eat that, we say that we are eating fish, although at no point are we eating parts of fish; rather, we gobble entire fishes, dozens of them. But that is fish too, so fish has to be not just the stuff that makes up an individual fish, but also whatever makes up sets of those, if they are tiny. That qualification is important. Imagine I have eaten one thousand sardines in my life; I could refer to the entire event of sardine eating in my life as a fish-eating event. I ate both fishes (one thousand of them) and fish (in a given amount). People consider both of those propositions natural. In contrast, suppose I have just eaten baby eels in an amount which, if I count, involves precisely one hundred individual baby eels. The proposition that I have eaten fish is natural, although the proposition that I have eaten one hundred fishes seems somewhat odd to speakers, although it is still reasonable in terms of the individuals I have eaten.

What is going on is not too arcane. Somehow a perspective about the size of what I am eating is relevant in natural language. Fish, as used to denote what humans eat in their meals, is canonically measured in terms of the size of a dish. It does not matter whether what I eat is part of a large salmon, an entire sardine or a whole set of baby eels. What counts is to fill up a plate. And what fills up a plate is fish, regardless of its individual nature.

That tells us that a connection to reality is not as clean as suggested above.3 I understand, in biological, chemical, or physical terms, what it is to say that an eel, a sardine, a salmon, have eel, sardine, or salmon flesh inside their skins. But, in those terms or any terms other than cognitive-linguistic ones, what does it mean to say that for fish to be naturally used to denote some food, you must have many baby eels, one sardine, or a chunk of salmon: the size fitting a dish?

Similar considerations apply for Vendler-type ontologies. If I compose a tune, it comes into existence. My tune composing event in time entails a permanent state of tune existence. The tune I have composed will remain composed forever. But is that a fact about the world? Suppose we say that all that is required to understand the relation between my composing the tune and its state of existence is something basic like an (obvious) causal relation.4 Is X being caused by Y a necessary or sufficient condition for Y to stand in some kind of implicational relation with X, expressed through lexical information?

The existence of my tune may have caused my neighbor to cry in desperation. If just about any causal relation can find its way to the linguistic system because it exists, I should be able to say that “*My tune cried my neighbor” or “My tune Xed my neighbor” (for X any verb) with the entailment that my neighbor cried because of my tune. Certainly I can say that my tune bothered my neighbor (say), but that does not entail anything about his crying, even if that is what actually happened. In fact, I do not know how to express such a thought other than in a sentential manner (as I have). So causal relations between given events in the real world are clearly not sufficient for these events to relate to each other in purely lexical terms.

In turn, causal relations between given events in the real world do not seem to be necessary for them to relate to each other in lexical terms. If Marlow follows Shorty to the crime scene, it makes sense to say that Marlow’s following event entails Shorty’s followed state because Marlow’s action caused Shorty’s state; but, if Marlow follows his sister in his family, although it is true that Marlow’s following condition entails his sister’s state of being followed, it is less obvious that the condition caused the state. Presumably, what caused the state is something about Marlow’s parents and their moods, chance, and whatever is involved in family planning. That is where standard causality would seem to stop, but one can still speak of events, conditions and actions that go beyond basic causal relations.

One could stretch one’s theory of causality to accommodate these sorts of situations, but it seems beside the point. Instead of making cumbersome theories of reality or causality bend themselves to the demands of human concepts as expressed through language, it is worth exploring the alternative view: that reality, causality and all that are fine as they are, but human perspective through lexical concepts has much to add, in particular the generative edifice that imposes the vertical cut in the implicational reasoning entertained above.


In what follows I suggest one way of approaching this paradigmatic, vertical aspect of language, which manifests itself through lexical regularities. I believe that this aspect and the sort of mechanism I have in mind for expressing it can be useful in understanding the nature of syntactic categories more generally (not just stative and eventive verbs, or mass and count nouns, but even the very distinctions between verbs and nouns themselves). Given what I have said so far, it should be obvious that I am not making any direct claim about “outside” reality. Only “inside” cognition is at issue.

4 A word on existing theories, plus an alternative

There are two theoretical takes on the vertical cut of language. A Fodor-style, atomistic approach denies that any of this is linguistic (see Fodor and Lepore 1998). What it is from this perspective (whether something in the structure of thought, or in that of reality) has not been worked out in any detail, so far as I know. A Jackendoff-style, decompositional approach manifests itself in various linguistic guises, with more or less emphasis on syntactic (à la Hale and Keyser 1993), or semantic (à la Pustejovsky 1995) aspects of the system. All of these present serious conceptual differences among themselves, but share, I believe, the following formal property.

Say I want to state that denotation X (or syntactic object X corresponding to denotation “X,” or, mental or real, world structure X corresponding to denotation “X”) is in some sense a proper part of structure Y. Then I propose a formal mechanism that basically states that fact in some first-order language:

(1) X is a proper part of Y.

(There are fancier ways of stating (1), but plain English will do.) If we allow ourselves statements of this kind, we may construct all sorts of elaborate arrangements, trivially. Kinship trees, for instance, work just like that, where the notion “descendant” substitutes for “proper part” in (1), and allows familiar ensembles. In linguistic terms, a statement like (1) allows one to express, for instance, a Vendler-style classification, directly through relevant denotations (“state X is a proper part of event Y”), or corresponding syntactic objects (VP is a proper part of vP), or even related (mental) world objects.

Consider, in contrast, how we would classify something else in a different realm. For instance, numbers, abstractions with no meaning which may allow us to reflect on the structure of the problem without attaching any extrinsic significance to the formal question. Say we want to express the fact that the set of objects like 2 is part of the set of objects like −2, and that set in turn is part of the set of objects like 2/3.5 Try (2):

(2) a. The set of natural numbers is included in the set of whole numbers.
    b. The set of whole numbers is included in the set of rational numbers.

These statements are obviously true, but unsatisfactory.

First of all, there is a reason behind these claims. The naturals are part of the whole numbers because those can be obtained by inverting the mechanism that gives us the naturals (succession or addition). When we start subtracting naturals we stumble onto the “rest” of the whole numbers, the negatives. And indeed, all naturals have an expression in terms of the mechanism that yields the rest of the whole numbers (thus −(−1) is the negative expression of 1, and so on). Similarly for the whole numbers and the rationals, which can be obtained by inverting a natural mechanism within the whole numbers (multiplication, an addition of additions). When we start dividing the whole numbers we stumble onto the “rest” of the rationals, the fractionary. And indeed all whole numbers have an expression in terms of the mechanism that yields the rest of the rationals (2/2, 3/3, etc., are fractionary expressions of 1, and so on).

Second, the reason (2a) is true is related to the reason why (2b) is true, although at a different dimension. The previous paragraph expresses this intuition. Division (the inverse of multiplication) is to the rationals as subtraction (the inverse of addition) is to the whole numbers. We can think of each of these as generating functions, or the particular kind of relation whose effect on the generating set (the naturals for subtraction, the whole numbers for division) is the “larger” set, which the original set was a part of.6 Indeed, we could proceed generating “larger” sets in roughly the manner indicated, by choosing another natural operation within, say, the set of rational numbers (powers, a multiplication of multiplications) and inverting it. That way we obtain irrational numbers, so that the following statement is also true:

(2) c. The set of rational numbers is a subset of the set of real (rational and irrational) numbers.

All of the statements in (2) are true, but not primitive. They follow from the algebraic structure of numbers and how they are constructed. One can give a computational characterization of the “generating” operations outlined above (subtraction on the naturals for the whole numbers, division in the whole numbers for the rationals, and so on; see Chapter 14: Appendix). What is more, a higher order characterization would be possible as well. In English:

(3) The inverse −O of a closed operation O in number set X yields a set X′ which X is a part of.

(3) is a defining statement about what I have been calling a “generating function,” and is obviously not a number statement, but a statement about numbers. If a statement about 1 or 2 is of order n, a statement about “inverting” operations within the system in which 1 or 2 obtain is clearly of a superior order n+m.
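As a rough illustration (our own, not part of the original text) of the generating idea behind (2) and (3), inverting an operation that is closed in a set forces denotata outside that set, and the enlarged set contains the original one.

```python
# Naturals are closed under addition but not under its inverse (subtraction);
# the whole numbers are closed under multiplication but not under division.
from fractions import Fraction
from itertools import product

naturals = range(1, 6)                       # a finite stand-in for the naturals

differences = {a - b for a, b in product(naturals, naturals)}
print(sorted(differences))                   # zero and negatives appear: the whole numbers

wholes = sorted(differences)
quotients = {Fraction(a, b) for a, b in product(wholes, wholes) if b != 0}
print(any(q.denominator != 1 for q in quotients))   # True: fractions appear, i.e. the rationals
```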

The moral is an important one, I believe, if we judge from the fact that, just as natural language has obvious horizontal and vertical cuts, so do numbers. Thus, 2/3 + 2 is a linear expression whose syntax could be trivially generated by a concatenation algebra, just as that of John loves chicken can, but the fact that “+” is qualitatively different from “2,” and that the set of objects like “2/3” includes the set of objects like “2,” is not expressed in concatenation fashion, any more than the differences between John and loves or John and chicken are. In the case of numbers, however, we need not blame those vertical correspondences on any outside system (the world, some semantic translation, or whatever). The very formal structure of numbers necessarily yields the vertical dimension that we are alluding to. That is, 2/3 would not have been 2/3, nor 2, 2, without the implications we have been discussing. Could this way of dealing with the vertical property of numbers tell us something about the vertical property of lexical expressions?

One is tempted to answer in haste: of course not. There is nothing necessary in a chicken having chicken stuff, or ontologies dividing into individuals and actions. Those are contingent properties. It could have turned out (I suppose) that the universe came out with no individuals whatever, and had just stuff, or that stuff was composed of individuals and not the other way around, or that the universe were static, with no actions. Ah, but those are all considerations about reality, which I said I was setting aside. In a sense the question could be posed this way: “Is there meaning to the claim that the vertical structure of lexical concepts (their classifications, entailments and other ‘algebraic’ properties) happens to be necessary, and to some extent at least, essentially the one found in the algebraic structure of the numbering system?”

That question is more sensible than it might seem at first, although it presupposes a view of cognition that is worth exposing. Once we grant that there is “independent” structure to lexical concepts, somehow determined by the properties of mind/brains, then it becomes an empirical issue what that structure is. That it might share aspects of the structure of numbers, or any other structure present in the mind, should not be particularly troubling, especially in a biological creature, subject to the twists of evolution. It is a fact that humans have the capacity to understand numerical relations, and furthermore that they are the only species with that capacity. Then that the structure in question may be used for this or that purpose is entirely within the realm of possible uses of evolutionary results. Just as one does not find it philosophically hard to swallow that the same mental capacity that underlies mathematics should underlie our musical abilities, a priori it should be equally possible that this very structure (or at any rate some structure with the relevant algebraic properties) might underlie lexico-conceptual knowledge. Consider how that would work.

The good news is obvious. We can import the algebraic structure of the mathematical system (whatever is responsible for (3)) to give us the observable hierarchies and other relations. No need, then, to restate the relevant relations, blame them on reality or anything of the sort. At the same time, the devil is in the details. How would that tell us the difference between V and N, or the various classifications of each V and N, and so on?

The remainder of the chapter proposes a way of addressing that sort of conceptual question. But I want to be very clear about the difficulty that the related intentional question poses. Suppose I convince anyone that N is “this algebraic object” and that V is “that algebraic construct”. Still, what does it mean to succeed in describing John by way of the expression “man” or to denote a peanut eating event by saying that “a man ate peanuts”? Throughout this chapter, I only have speculations about that very tough question. I should insist, however, that the way the truth calculation of “a man ate peanuts” is usually done is different in spirit from what I will attempt here. As far as I know, all standard theories assume a given ontology (of men, peanuts, eating, and so forth) which already suffers from the difficulties I am trying to avoid, presupposing the vertical relations that I want to account for. My problem is the reverse. By postulating an abstract mathematical structure for the array of lexical concepts, the (relatively) easy part is capturing the vertical relations. They are there (“inside”). The hard part is to make that correspond to anything familiar out there (“outside”), since by going abstract in the vertical dimension we will distance ourselves from an apparatus of easily observable items, discerned through pointing and other such devices which (I think) researchers implicitly assume when deciding on their ontologies.

That is all to say, my peanuts or my eating is not going to be anything as trivial as a little picture of either. What it will be remains to be seen, but I honestly do not think that presently existing alternatives are as straightforward as customarily assumed. That I can point to a bag of peanuts (also a unique human activity) already presupposes an incredibly difficult intentional operation. How I got my mind there is what I would like to understand, and theories so far merely presuppose that it got there. But if things were so straightforward, it should be easy to program a robot (or get a dog) to understand that innocent pointing. I know of no results in that respect.

5 The basic idea

Humans are good at what, as a homage to Star Trek, I will call the “warping” task. My wife twists and folds a thread (a unidimensional entity) until with enough knots it creates a variety of forms in what can be seen as a two-dimensional lattice: a scarf, a sock, a sweater. Even I can twist a piece of paper (a bidimensional entity) until after a few calculated folds and pulls I can create a fairly decent airplane, an entity which can be described as a three-dimensional object. Some of my Japanese friends do better and produce frogs, birds, balloons. And almost everyone in my department plays a game which consists of taking a three-dimensional object from one part of a three-dimensional field to another. We do that for a variable amount of time, and whoever puts the object more times into a designated part of the opponent’s field is said to win.7

What goes on in those situations is mathematically fascinating, a transformation of n dimensional space into an n+1 dimensional object. How is that “magic” performed? We could think of it this way. The trick is to subvert the rules of some n dimensional space, by literally warping it. When one thinks of the characteristic Cartesian coordinates one draws in a notebook, one plays within the rules of Euclidean space. Inside those one, two or three lines one draws mundane things such as segments, triangles, fake cubes, and so on. But one can do better by tearing off the page from the notebook and warping the Cartesian axes until the page forms, say, a tube. That very tube (topologists call it a “cigar band”) is already a pretty nifty trick. It looks like a three-dimensional object from our three-dimensional space, although of course it is still a two-dimensional thing in its own world, since all one did to get it was warp the page. In turn, if one warps the “cigar band” carefully (particularly with soft paper), one can insert one end of the tube into the other. At this point one gets a sort of doughnut (topologists call it “torus”).

Those activities are uniquely human, and I suspect entirely universal. I would describe them thus. Humans have the capacity to exploit a given formal system so much that they can even use its formal limitations to come out of that system, and take the system itself as an object at a higher dimension. This of course recalls Goedel’s results. Simply put, Goedel found that no formal system which is complex is also complete. There is always a basic statement that must be expressed outside the system.8 That fact has serious consequences for what is provable, as opposed to “merely” true. Not all true statements can be proven. Furthermore, that fact creates a hierarchy of the sort we need, because of the formal limitation we are describing.

Recall the number instances we were playing with before. For example, in addition within the set of naturals, adding any two of those yields another and closes the operation in that set. But humans cannot help but ask: “If I can add x to y to yield z, I should be able to express the relation backwards, so that taking z and y I get x.” It is only natural to explore this “inverse” situation. But watch out. Everything is fine if z is larger than y, but if z is smaller than y (a possibility within the naturals), then the result of that operation is not defined in the original set. Here the boring mathematician will blame us for cheating. But the child in us asks why: can we not have a new type of number? Therein is born a little monster whose existence arises because imagination cannot be constrained by the answer “you cannot do that.” Of course, the imaginative creature is not an “anything goes” kind of object. It has properties. It resulted precisely where expected. It has the form it should have there – and it opens a whole new world of possibilities.

The same occurs with the folded Cartesian axes. Normally we are not allowed to do that; we must draw within them. But any normal ten-year-old is prone to ask why she or he should draw the coordinate lines straight. Why can they not be circles (or wavy lines, etc.)? At that point we are on the verge of stepping out of the more basic object. The price will be a very hard time with proofs, but the prize will be new objects, and for free.

Given what I have just said, I obviously have (and believe there is) no standard way of proving this, yet it may be true:

(4) If operation −O is not closed in system X, applying −O to the objects x of X creates new sorts of objects x′, so that a new system X′ is created with the x′ objects, such that X is a part of X′.

We have seen this with numbers in the previous section ((3) is a sub-case of (4)). But (4) is true even in the topological examples discussed in this section.

What we did when we identified the two edges of the page to form the “cigar band” is prohibited in the standard Euclidean plane. Prior to the trick, we had the two parallel lines in (5):

(5) ________________________
    ________________________

Warping the paper makes us identify these two parallel lines. Euclid’s fifth postulate prevents two parallel lines from ever meeting. Ours, however, meet at all points after the warp. That is cheating within the rules of classical space, but the reward is something which, when looked at from the perspective of standard (Euclidean) three-dimensional space, is a new object.9
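Purely as an illustration (ours, not the chapter’s), the two warps can be written out as standard parametrizations: gluing the vertical edges x = 0 and x = L of a page of height H yields the cigar band, and gluing the remaining pair of edges as well yields the torus (with radii R > r > 0):

```latex
% Cigar band: one pair of edges identified
(x, y) \;\mapsto\; \left(\cos\tfrac{2\pi x}{L},\; \sin\tfrac{2\pi x}{L},\; y\right)

% Torus: both pairs of edges identified
(x, y) \;\mapsto\; \left(\Big(R + r\cos\tfrac{2\pi y}{H}\Big)\cos\tfrac{2\pi x}{L},\;
                         \Big(R + r\cos\tfrac{2\pi y}{H}\Big)\sin\tfrac{2\pi x}{L},\;
                         r\sin\tfrac{2\pi y}{H}\right)
```

Each map is still two-dimensional in its own terms, but its image only fits in three-dimensional space, which is the sense in which the warp delivers a higher-dimensional object.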

All that was mathematics. What does it have to do with concepts? If that obvious human ability is at work in concept formation in the linguistic system, then we will get a couple of things in the same package: vertical hierarchies, and indeed new objects with no dependencies on the real world, of course, for better and for worse.

I say the latter because this is not what is going on: we mentally warp a mental piece of paper into a mental torus, and we use that to pick out doughnuts. It cannot be that simple (think of abstract or intangible concepts, to point out the obvious). The picture I am about to suggest is considerably more abstract than that, and any temptation of relating everything out there to concepts “in here” which can be iconically “likened” to the objects should be abandoned at this point. We must just bask in a more modest glory. We can obtain necessary hierarchies from the warp situation, and some objects arise as well, even if useless – at least until I say something else – for intentional purposes.

6 The duck problem and porous modularity

The intentional problem I have just noted relates to another traditional problem, which has been emphasized in recent years in the work of Jackendoff (e.g. 1990: 32). There are some featural distinctions that seem reasonable, like mass and count, which correlate with obvious selectional restrictions and manifest themselves in various morphemic ways across languages. However, think of the featural distinction that separates a duck from a goose. What is it? Is it plus or minus “long neck”?10 In present systems, the way to associate phonetics to meaning is precisely through semantic features which happen to fall in the word that bears the relevant phonetic distinctions. There is no “direct” connection between phonetics and meaning, if standard models are correct. Obviously, we use phonetic differences to tell lexical types apart, and we assume that those types correspond to different semantic types. But how do we do the latter? That is where we need something like a lexico-semantic difference, so that then the truth or presentation conditions of I saw a duck come out different from those of I saw a goose. Atomists like examples of this sort, since they make them suppose that the only real feature distinguishing those two is plus or minus “duck.” Of course that is like saying that “duck” is a primitive concept.

Jackendoff uses this sort of case to suggest a different kind of solution. A duck looks different from a goose. Why could we not distinguish ducks and geese the way other species (presumably) do, by the visual, or other systems? If the lexical entry for duck has a visual instruction to it, say, then the problem of telling a duck from a goose in linguistic terms becomes much more manageable in principle.11

But what kind of system is that? One of the most cherished aspects of Fodor’s modularity thesis (1983) is that information is encapsulated within given systems. The linguistic system is not supposed to know what is going on in the visual system, or vice versa. If we start allowing the linguistic system to use visual information (or auditory, olfactory, tactile information, more generally) then it is hard to see how in the end we will not have an anti-modular, connectionist network in front of us.

Now think of the issue from the perspective of multiple dimensions of mental systems. One might suppose that, while systems are encapsulated interdimensionally (the modularity thesis), they are intradimensionally porous (a restricted connectionist thesis).12 That is, suppose all relevant cognitive systems that enter into the computation of lexical meaning are dimensional, or layered, in the sense above. One might suppose that the layers in question communicate, though the systems are otherwise encapsulated. The picture would be as in (6):

(6)  [Diagram: the modules Language, Vision, Audition, Motor, etc. appear as parallel columns, each layered into 1-D, 2-D, 3-D and 4-D representations; layers of the same dimension communicate across modules]

Suppose the “long-neckedness” of geese is a property that the visual system distinguishes in its 3-dimensional (3D) representations; then the linguistic system should be able to use that very representation in one of its 3D representations. More generally, since a 3D visual representation implies a 2D representation, if it is the case that some 2D visual representation is accessible through the 3D one, the linguistic system should be able to access that indirectly as well from the perspective of a 3D representation. What should not be possible is for a 3D linguistic representation to have access to a 4D visual representation, or even directly to a 2D visual representation which is not accessible to the 3D visual representation. We can think of this as “porous modularity.”

I should emphasize that, when I speak of dimensions in the visual system, I do not mean the obvious Euclidean ones, or even necessarily what has been proposed in the literature, following work by Marr (1982). It could happen to be that those particular dimensions are the ones we need for porous modularity to work, but it is also logically possible that we require something more (or less) abstract. That is, at any rate, an empirical issue.

If porous modularity holds of the human mind, then the lexicon can be a more interesting thing than is usually assumed. The familiar set of idiosyncratic associations between sound and meaning would really be a set of intradimensional associations among several modules. The dimensional system that I am trying to sketch can be the natural locus of the association.

Seen in this light, the temptation of trivializing the dimensional picture in terms of the origami metaphors reduces quite considerably. True, part of the lexical model of a goose might be something very much like an origami model, qua its visual aspects. But the linguistic representation would be richer than that, including other sensory information, as well as any kind of information that happens to be there in the human mind and is used for these tasks. That of course still leaves open the question of what the dimensions are in each kind of system, and in short I have no idea. Luckily, there are indirect ways of going about this which, if coupled with a serious study of the different modules involved, might give us a better understanding of what I suspect is a phenomenally complex task.

7 Structural complexity, semantic complexity?

One of those indirect ways of finding the dimensions of each kind of system involved in the linguistic faculty might be to look at the structural complexity of expressions. Stating this is easy. If expression X is syntactically more complex than expression Y, we expect expression X to correspond to a semantically more complex object than expression Y. But do we?

It depends on our assumptions about the syntax-semantics interface. To see the difficulty in full generality, consider a simple mathematical result which is, again, based on Goedel's work. Part of Goedel's strategy for his Incompleteness Theorem was to come up with what are usually called Goedel numbers, natural numbers associated with any arithmetic operation. You can assign a number x to 2+2=4, a number y to 2+3=5, and so on.13 Arithmetic has the same expressive power as any first-order language, including predicate calculus or the sort of grammars that yield familiar phrase-markers, so there are also corresponding Goedel numbers for those. The problem is that once you assign a number to a phrase-marker chunk (any number and any phrase marker) you can assign anything you want to that phrase-marker (the objects in the periodic table, the humans that ever existed, the stars in the universe). There is no "interpretation" of the phrase in question which can be seen as more or less reasonable, a priori, than its alternatives. Anyone who has examined an unfamiliar writing system realizes that much.
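To make the arbitrariness concrete, here is a minimal Gödel-style numbering sketch (the prime-exponent encoding and the toy symbol codes are my own illustrative choices): a string of symbols gets a unique natural number, and nothing about that number favors one interpretation of the string over any other.

    # Goedel-style numbering: encode a sequence of symbol codes as the product
    # of successive primes raised to those codes. Unique by prime factorization,
    # but silent about what the symbols "mean".
    PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

    def goedel_number(symbol_codes):
        n = 1
        for p, code in zip(PRIMES, symbol_codes):
            n *= p ** code
        return n

    # A toy phrase-marker chunk flattened to codes (say 1 = 'the', 2 = 'cat', 3 = 'sleeps'):
    print(goedel_number([1, 2, 3]))    # 2**1 * 3**2 * 5**3 = 2250

The number 2250 could just as well be paired with an element of the periodic table as with a phrase; the encoding itself decides nothing about interpretation.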

So then, what does it really mean to say that a syntactically complex expression should correspond to a semantically complex expression? Why should it not correspond otherwise, or be a random combination? Few ways around this difficulty exist.

One possibility is to go iconic, but it is never very clear what that means. Another possibility is to attempt a minimalist reasoning. Nothing demands the correspondence we are alluding to, but if language is an optimal solution to an interface problem, it is not unreasonable that the complexity we see on one side should correspond to the complexity we see in the other.14

Tacitly, something very much along those lines is already assumed, for instance by way of the Compositionality Thesis. At times it is forgotten that this thesis is an empirical one. Human language need not have been compositional;15 it is just that this is the way we think it is. Thus if I see nine symbols in this sentence, I expect some factor of nine to tell me the number of semantic relations I must pay attention to (usually many more than that are expected). With that sort of empirical thesis in mind (whatever its ultimate justification), we can come back to the problem of finding dimensions in human expressions.

Take, for instance, the count/mass distinction of chickens vs. chicken. In the former instance there are more symbols involved than in the latter; chickens is chicken plus -s. Does that reflect something about the semantics of the expressions?

Semanticists do not usually think that way. If their ontology works with individuals (as most do), then that is where it all starts. They of course have treatments of mass terms, which are generally thought of as lattices of individual atoms. Here, the mass term is more complex than the individual term, ignoring the grammatical fact that languages never express the mass term as grammatically more complex than the count term.16

If we go with syntactic complexity, as presented in the languages of the world through agreement, selection, displacement possibilities, ellipsis, and similar syntactic devices, a hierarchy of the sort discussed by Muromatsu (1998) or Castillo (2001) arises (see in particular (7c)):

(7) a. [figure]
    b. 4-D: Mutatio
       3-D: Forma
       2-D: Quanta
       1-D: Qualia


    c. [Figure: the nominal hierarchy, with abstract noun (bare NOUNS), mass noun (MEASURABILITY), inanimate noun (COUNTNESS), and personal noun (ANIMACY)]

The observation that bare nouns are mere predicates goes back to generative semantics, and is explicitly defended in current cognitive grammar.17 From that point on, grammatical complexity takes over, through measure phrases (for 2D), noun classifiers (for 3D), and still unclear elements for the animates, suggested to be coding "change potential" in Bleam's (1999) thesis. Traditional analyses have little to say about these added morphemes, or about the fact that they are needed. Why, for instance, do languages resort to noun classifiers to individuate nouns which are counted (as in Japanese) or demonstrated (as in Chinese)? If matters were as trivial as implied by standard semantics, should it not be enough to say "three cats" to mean three cats? Why do languages bother to express that as three instances of cat, with that very partitive syntax?18

If one is moved by that grammatical fact, it is not unreasonable to take one's semantic ontology not to have such things as individual cats. Rather, one deals with a raw conceptual "cat" space, instantiated into a token cat if an element like a noun classifier is added to the expression.

The reader might be wondering what a raw conceptual "cat" space is. I do too; but I also wonder about what a primitive cat is, as in "the one in my ontology." The question is equally puzzling in both instances (answering that the cat is there will not do, since how we know it is a cat is part of the question. Besides, many things are not there and we want to refer to them as well). At any rate, for present purposes it suffices to say that the conceptual space in question is some network of associations between the different modules involved in this lexical entry. It is, I suppose, the kind of lexical space that tells us of a certain look, smell, this and that, but crucially nothing about the set of cats, or that set in all possible worlds, or any such thing that involves individual cats. From this perspective the "concept" cat is a mental network. Token cats are denoted only if a grammatical formative enters the picture to say, essentially, "having that raw cat space, now you present it as an individual entity."

This might be a complex way of doing things, but that should hardly be an issue. What we have to determine is whether this (coherent) view is how natural language actually works. This is really what is being said. At 1D, a nominal conceptual space is a given network of associations with other mental modules; given the porous modularity thesis, the associations must be confined to unidimensional ones (the system cannot see highly elaborate dimensions unless they are built in, thus certainly not from the 1D layer). One might think that this already kills the idea, since after all the way a cat looks is presumably a visual 3D representation. But this is not true.

I said we cannot be naive about visual representations. So far as I know, what counts as a 1D visual representation has nothing to do with width, length or height. Whether it has to do with colors, contours, shades, or simultaneous maps of all that, is an empirical question. Whatever the answer, the linguistic system could use that for its 1D representation of the noun cat. Say for concreteness that this has to do with simultaneous visual maps (color, contour, shade, etc.); then that kind of information would go into the lexical entry cat, although nothing about higher order visual dimensions (whatever those are) would. Those, however, might enter the picture once the term cat is presented in token fashion by way of an individual classifier.

The next question is how to move from the 1D space we have been exploring to the more complex dimensions. How do we get mass into the picture? Regardless of whether our 1D space is that of a cat or a square root or whatever (differences in terms of the various modules involved in the relevant network of connections), one thing we can do with abstract 1D space is warp it back and forth, until we get a lattice. The very fact that my wife knits the scarf proves, Dr. Johnson style, that you can get a lattice from a one-dimensional object. Once you have it, familiar mass properties, as customarily analyzed, follow directly. For instance, more "knitting" is more scarf, less is less. Two scarf halves are still a scarf, and so on.

Similarly, we can ask how to move from the 2D space of the mass expression (the scarf representing whatever concept we are dealing with) to a token element. Again, we can take the topological metaphors seriously.19 With the scarf plane we can get a cigar band or some other origami-style three-dimensional element. It does not have to look like a cat or chicken or a square root; that is not the point. It has to be objectual in some sense, which we can represent in terms of whatever folding carries us from the previous dimension correlated with mass to the new dimension, where boundaries arise. Whether the thing looks like an individual cat or a chicken, smells like that, and so forth, are properties that will come from the other interconnected modules, now accessible at 3D.

It is reasonable to ask why we chose to express mass at 2D and countable elements at 3D and not the other way around, for instance. There is, first, a phenomenological reason for that. When we look at language from the point of view of the syntax/semantics complexity correlation, that is what we actually find. Count terms are more complex than mass ones. It might have been otherwise, but it just is not.

Then again, could it really have been otherwise, given the general system we are presenting? That is a more difficult question, but it is not unreasonable to speculate in the direction of a necessary mapping. A topology is a more complex kind of mathematical object than a lattice. You get things like toruses and so on from folding lattices into themselves (twice, for that particular topological form). In other words, form in the sense in which we are exploring it is mathematically more complex than absence of form. A lattice has neither form nor boundaries. The elements with which we are modeling tokens of conceptual spaces do have form through boundaries (and perhaps also parts and other intricacies, depending on the various warps and folds, which mass terms lack). If this is correct, things could not have been otherwise.20

Once this picture is in place, it trivially predicts the kind of implicational correlations that we have become familiar with for lexical terms. I have not attempted to build a case of the sort sketched in Muromatsu's (1998) thesis for lexical spaces other than nominal ones, but it is easy to see that similar considerations apply to verbs, for instance. That is precisely what Mori (forthcoming) tries to show, making a correlation of the sort in (7) for the Vendler ontology. In that view, states are akin to abstract terms, activities to mass terms, and more complex events to count terms. Already familiar questions arise for the basic source of the relevant conceptual spaces, but once again the kinds of intermodular, porous connections mentioned above would be very relevant, in principle. Issues about the boundaries of more complex events expressed through topological notions, as opposed to the lattice character of simpler verbs, also have an obvious expression in the terms discussed above for nouns.

8 Some predictions

That a given event implies some state, or a count term some mass, is now expected. To be exact, the entailment obtains at the level of the mathematical structure we have been discussing and use to model the various concepts.

Aside from these vertical relations, we can now also capture some curious super-horizontal relations which do not obtain at the usual syntagmatic cut. For example, stative verbs behave like abstract nouns, processes like mass terms, and events like count nouns. Thus observe the kinds of correspondences in (8):

(8) a. much-little coffee     He grew much-little
    b. one-two-etc. cats      He sneezed once-twice-etc.

Mass quantifiers like much-little go with mass terms like coffee in (8a), unlike count quantifiers like many-few or the numerals. Similarly, they go with processes like grow, in adverbial fashion. Conversely, count quantifiers go both with count nouns like cats in (8b) and with delimited events like sneeze, in adverbial fashion. This correspondence in behavior between different lexical classes is easy to state, but hard to code within the theory. However, in the terms presented here, the correspondence in behavior is expected, since the mathematical structure modeling each sort of element, whether nominal or verbal, is the same. This will force us to think about what the difference between nouns and verbs is, but I will return to that.
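A minimal sketch of that super-horizontal correspondence (the dimensional assignments and the small quantifier inventory are illustrative assumptions of mine): a quantifier combines felicitously with a nominal or verbal expression just in case the two share a dimensionality, regardless of the noun/verb divide.

    # Toy statement of the correspondence in (8): quantifier classes match
    # expressions (nominal or verbal) of the same dimensionality.
    QUANTIFIER_DIM = {"much": 2, "little": 2, "many": 3, "few": 3, "twice": 3}
    EXPRESSION_DIM = {"coffee": 2, "grow": 2, "cats": 3, "sneeze": 3}

    def matches(quantifier, expression):
        return QUANTIFIER_DIM[quantifier] == EXPRESSION_DIM[expression]

    assert matches("much", "coffee") and matches("much", "grow")      # (8a)
    assert matches("many", "cats") and matches("twice", "sneeze")     # (8b)
    assert not matches("much", "cats")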

We can also predict certain surprising behaviors of expressions which seem to be of different sorts simultaneously. One can say, for instance, (9):


(9) My dog is soft, small, and very intelligent.

The predicates here hold of the dog presumably as a mass, an object, and a sentient creature; that is, of an expression that has various dimensionalities. In present terms, each predicate applies to one of the dimensions of my dog.

Something related can be seen with selectional restrictions. There are certain verbs that select for direct objects of a given dimensionality. For instance, a predicate like weigh selects direct objects with substance, thus the contrasts in (10):

(10) a. *Weigh mass. b. Weigh coffee.

Note, there is nothing incoherent in the idea of weighing the mass property of particles (as opposed to, say, their spin), but it is odd to express that as in (10a), with the abstract term mass (which implies no actual mass). At any rate, given the assumptions above, all higher dimensional nominals imply the low-dimensional substance, hence weigh is able to combine with them:

(10) c. Weigh tables. d. Weigh people.

In contrast, there are predicates, like tame, which select direct objects of a very high dimensionality:

(11) a. Tame beasts.   b. *Tame trees.   c. *Tame rice.   d. *Tame life.

Again, I am trying to give a chance to the meaning of these expressions. The objects in (11) all either have life or denote it. One could imagine the verb tame as selecting animate objects, thus expressing the idea of turning their properties from the wild to the domestic. However, physical animacy is not what the higher cognitive dimension would seem to care about, but rather change potential, in Bleam's (1999) sense. Physically, trees, rice, and life surely change, but cognitively they seem to be generally analyzed as more static, thus at a lower dimension, hence the unacceptable combinations in (11). The important thing, though, is that selection of a low dimension, as in (10), allows for the presence of higher dimensions, irrelevantly for the selection process, but selection of a high dimension forces a unique sort of combination. Nothing short of that high dimension can be present in the selection process.
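The asymmetry can be put in a small sketch (the numeric dimension assignments and the lexical entries are illustrative assumptions of mine): a verb names a minimum dimensionality for its theme, any noun reaching that minimum qualifies, and since animacy sits at the top of the hierarchy, a verb selecting it tolerates nothing lower.

    # Toy model of dimensional selection: each dimension implies the ones below
    # it, so a verb's requirement is a minimum that higher-dimensional nouns
    # also satisfy.
    NOUN_DIM = {                       # illustrative assignments, cf. (7c)
        "mass": 1, "life": 1,          # abstract terms
        "coffee": 2, "rice": 2,        # mass terms
        "tables": 3, "trees": 3,       # inanimate count terms
        "people": 4, "beasts": 4,      # animate terms
    }
    VERB_MIN_DIM = {"weigh": 2, "tame": 4}   # weigh wants substance; tame wants animacy

    def selects(verb, noun):
        return NOUN_DIM[noun] >= VERB_MIN_DIM[verb]

    assert selects("weigh", "coffee") and selects("weigh", "people")    # (10b, d)
    assert not selects("weigh", "mass")                                 # (10a)
    assert selects("tame", "beasts") and not selects("tame", "trees")   # (11a, b)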

Nouns or verbs are canonically expressed in a given dimension, as we saw above. However, these are easily altered in some languages, with an appropriate choice of determiner. Thus, in English we can get a coffee or a wine, going from the 2D mass terms to the higher dimensional expressions. We can also lower the dimensionality of an expression, particularly when we are referring to its stuff (chicken, lamb, etc.).21

In some languages this is done very systematically, even when we are not referring to an expression's physical mass;22 for example, consider the Spanish (12):

(12) Aquí hay mucho torero.
     Here has much bullfighter


It is hard to translate (12). It does not mean there are proportionally many bullfighters here, but that there are cardinally many. At the same time, however, we cannot invoke reference to token bullfighters with that expression. Thus:

(13) a. Diez de mis amigos conocen a muchos sinvergüenzas.
        Ten of my friends know to many rascals

     b. Diez de mis amigos conocen a mucho sinvergüenza.
        Ten of my friends know to much rascal

(13a) is ambiguous, having both wide and narrow scope readings for muchos sinvergüenzas, "many rascals." But (13b) only has the narrow scope reading of mucho sinvergüenza. My ten relevant friends either know the same or different numerous groups of rascals, but it cannot be the case that they each know a couple of rascals, and jointly all those rascals (known a few at a time, now a numerous group) are known by my ten relevant friends. I take that to mean that mucho sinvergüenza cannot act as a quantifier that generates readings where another quantifier can anchor its referential import, as normal individual quantifiers can.23 The expression is acting like a mass term not just in its morphological guise, but also in its semantic properties.

If nominal expressions have different dimensionalities to them, we should not be surprised if otherwise inappropriate quantifiers (e.g. a mass quantifier with a count noun) can target those hidden dimensions. Why there are restrictions on this is something to study. For example, why is the phenomenon not observed in English? And why, even within Romance, is it restricted to those contexts where bare plurals are possible?

(14) a. Aquí hay sinvergüenzas/mucho sinvergüenza.
        Here has rascals/much rascal

     b. Yo he visto sinvergüenzas/mucho sinvergüenza.
        I have seen rascals/much rascal

     c. *Sinvergüenzas/mucho sinvergüenza gobierna el país.
         rascals/much rascal rules the country

Both types of expressions are grammatical only in direct object guise, (14a) and (14b), for reasons that are not obvious.

English does have a phenomenon that resembles what we have just seen. Compare:

(15) a. Some of my team mates scored many times.
     b. Some of my team mates scored much.

(15a) is ambiguous. The interesting reading is, "Many times, some of my team mates scored." That wide-scope reading for the adverbial is impossible in (15b). Normally, an event like score would be quantified in terms of a count quantifier like many. However, we can also use a quantifier like much to quantify over the scoring, but then we do not obtain individual token scorings; rather, we quantify over a raw scoring space which cannot anchor the separate scoring instances necessary to allow for the narrow scope reading for the subject some of my team mates.


The use of non-canonical quantifiers, restricted to event quantification in English, does not seem to be constrained by subject-object asymmetries as in Spanish. However, the appearance of bare plurals in English is not constrained that way either:

(16) a. I want peanuts/my team mates to score much.
     b. Peanuts/For my team mates to score much would be fun.

It is likely that bare plurals themselves, which we see patterning with non-canonical quantification in both Spanish and English, introduce a lower dimensionality reading for an otherwise count noun.24 Thus compare:

(17) a. Some of my team mates scored many goals.
     b. Some of my team mates scored goals.

It is rather hard to get a wide-scope reading for the bare plural in (17b), which is possible in (17a). Perhaps the expression goals without a quantifier is in essence acting as a mass term (semantically that is known to make sense, since a set of goals is unbounded, and can be modeled as a lattice). Then the question would be why in English non-canonical mass terms cannot go with mass quantifiers, even when they exist. In Spanish, in contrast, not only do those expressions appear systematically where we expect them (co-occurring with bare plurals), but they can furthermore be introduced by standard mass quantifiers.

In any case, regardless of all those important grammatical details, the expressions of interest now exist, and in effect lower the dimensionality of an otherwise canonical count expression. Interestingly, that is done without invoking the actual mass of the denoted entities; that is, much bullfighter in Spanish or corresponding bare plurals in English obviously do not refer to the bullfighter's flesh and bones. This should dissuade us from taking the dimensional notions as telling us something about the real world, or from obtaining the unexpected concepts by way of "grinding" operators and the like. Surely humans can conceive bullfighters as made of blood and guts, but more abstract and relevant mass perspectives are possible as well. One can literally find bullfighter stuff in a gory bullfight (the reading is possible). However, when one talks of having met much bullfighter (in Spanish), one is speaking of other lattice-like attributes of the relevant concept. Each dimension is built on a network of associations among mental modules, so there is no need to limit ourselves to the tangible, smelly, physical stuff. Only if we are bound to build our ontology on physical, smelly, tangible individuals do we have to commit to that.

9 Nouns and verbs

It should be clear by now what sorts of things the present system expects to find and can predict, and that each vertical dimension postulated, whatever its details, corresponds to a category. Now we are beginning to approach the Epstein-Seely goal. Although we have not been invoking standard derivations, the dimensional apparatus we have described could be modeled derivationally.


I do not want to go now into the matter of whether that modeling, pushed to its formal limits, would be desirable. A dimension could be seen as a level of representation, a logical entry to the next level; thus we could vertically articulate the non-terminal elements in customary horizontal derivations. However, the ordering imposed on concatenative derivations (for instance from D-structure to LF) is extrinsic, whereas the order implicit in the dimensional system is intrinsic. That is a non-trivial formal difference between the two sorts of systems, and it may or may not indicate something deep about vertical and horizontal syntax, which I am not trying to confuse into a single type of process (see the last section).

Be that as it may, the horizontal, concatenative derivation does not have to care about what the vertical system provides, so long as its demands are respected, for instance, in terms of selectional restrictions. Similarly, the vertical, paradigmatic system does not care about what combinations the horizontal derivation permits with its legitimate items. Logically, the vertical system is prior, since it generates the vocabulary for the horizontal one. And since the two are not the same, they may have different properties (I return to some at the end).

But a serious question remains. We have said something about how verbs and nouns fit the present picture and how they even behave in parallel fashion, as expected, but how do they relate to each other? If we have not derived the verb/noun distinction, we have not explained it.25

I will pursue an idea alluded to by Mori (forthcoming), who in turn summarizes much previous work. Theme arguments are characteristic functions of verbs. In general, a standard verb must have a theme. If so, and provided that themes are standardly nouns, we could probably define verbal elements as functions over nouns, in some sense. If we succeed at that, we would have articulated the vertical dimension around nominal mental spaces.

Defining verbs around nouns, in a precise mathematical sense, would have an added advantage. Theme arguments are known to delimit events, to use Tenny's terminology. When I drink a beer, my drinking event lasts while there is beer left. That sort of situation may help us model the verb, in essence a function that monitors the change of the beer mass from some quantity to less of that, or no quantity. Functions of that sort are often conceived of as derivatives over time. Since in previous sections we spoke of mental spaces for nouns with proper mathematical correspondences, it should not be difficult to interpret (18):

(18) A verb expresses the derivative of its theme’s space over time.

Needless to say, to give non-metaphorical meaning to (18) we have to provide values for the theme's space, or for that matter for time in a grammatical sense. I will not attempt this here, because it is formally trivial but conceptually very difficult.26
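Purely to fix the formal shape of (18), though, here is a toy illustration (the beer example and the particular quantity function are my own simplifying assumptions, not an account of grammatical time or of the theme's space): the theme's space is collapsed into a single magnitude q(t), the amount of beer left at time t, and the drinking event is the rate at which that magnitude decreases, lasting exactly while that rate is positive.

    # Toy rendering of (18): the verb as the time-derivative of its theme's
    # quantity. The event holds while the derivative is non-zero and delimits
    # itself once the quantity is exhausted (Tenny-style delimitation).
    def q(t):                          # illustrative: beer left at time t
        return max(0.0, 1.0 - 0.2 * t)

    def drinking_rate(t, dt=1e-3):
        return -(q(t + dt) - q(t)) / dt    # finite-difference derivative

    event_holds = [t for t in range(10) if drinking_rate(t) > 0]
    print(event_holds)                 # [0, 1, 2, 3, 4]: the event lasts while beer is left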

As expected, different kinds of verbs care about different dimensions of space, in the obvious way. You can see, kill and eat a lobster, say, and you are clearly targeting different dimensions in each of those actions, which are thus bounded in different ways: the killing is conceived as terminating with the animal's change potential (animacy), while the eating is not conceived as concluding until much later, when the substance is taken to be consumed. So-called "internal aspect," thus, is a by-product of the dimensions we are articulating, specifically in terms of the dynamic function in (18).

All verbs seem to be dynamic in that respect, although as Atutxa shows in work in progress, the rate of change for the theme argument in some is zero (states), and in others limiting instances exist for that argument (achievements and accomplishments). That those possibilities should arise is expected, but it is still interesting to ask what gives verbs their different dimensionality.

That question was tough enough for nouns, but there we blamed it on the mathematical complexity of lattices vs. topologies. One expects a similar result in the realm of verbs, although events are harder to analyze directly in these terms, not as a matter of principle, but because they have an elusive dynamicity added, which makes even intuitive calculations hard to pin down.27

An intuition that Mori pursues in her thesis is that the more arguments a verb has, the more complex it is in its dimensionality. That is in the spirit of the minimalist syntax/semantics interface conjecture presented before. The more symbolic representations enter into the computation of an expression, the more semantic complexity we expect for it. Thus, if faced with the following two expressions:

(19) a. John Xed a house.
     b. John Yed himself a house.

we intuitively expect Xed to be a conceptually simpler verb than Yed, simply because the extra himself should be contributing something to the meaning, and if it is an argument of the main event, it will arguably affect its dimensionality. So one way to understand the dimensionality of a verb is in terms of its argument structure. But immediate complications arise.

Some are uninteresting. Arguments may incorporate, and thus are hard to spot. That just means one has to be careful in the analysis. Other difficulties are much more subtle. It appears, for instance, that some verbs have a rather high dimensionality (they are, say, achievements as opposed to states) in spite of the fact that they have as many arguments as verbs of a lower dimensionality. For example:

(20) a. Oswald killed Kennedy.
     b. Oswald hated Kennedy.

In spite of both verbs being transitive, kill would seem to require a high dimensional theme, unlike hate. One can hate, but not kill, say, wine (setting metaphorical readings aside).28 Then a question that Mori poses is whether the high dimensionality of the theme could not in principle "boost" the dimensionality of the verb defined over it. Mathematically, this is not unreasonable, but it remains to be seen whether it is the right conclusion.

Although many other difficult questions remain, Mori is probably correct in noting that the thematic hierarchy follows trivially if her general approach is right. Since for her a verb is defined over a theme, theme arguments should be special in that sense, and "lowest" in the hierarchy, since they would be presupposed by any verbal space. Then other arguments would add more verbal complexity, thus correlating classes in the Vendler hierarchy with arguments in the thematic one. These would not be two separate hierarchies, but only one, and the consequence of the general dimensional system. Again, this is reasonable, but it has to be tested seriously.

This much seems true. It is easy to build a model where verbs are derivatives of their theme's mental space over time, in which case the main task of relating nominal and verbal categories could be achieved. Note, incidentally, that this dynamic view of verbs need not make them in principle higher in dimension than nouns (pace Mori's "boosting" situation). That you can map a function to time need not make that a higher dimensional function, just a dynamic one. In fact, recall that certain verbs and nouns are known to behave alike with regard to their process/mass or event/count properties, which indicates in present terms that the models we use to represent them have the same dimensionality, hence the same mathematical properties.

A related question, of course, is why we have this static/dynamic duality between nouns and verbs, or why a verb is a dynamic perspective of a noun. These are deep questions, it seems, about human cognition and our capacity to understand the universe as both permanent and changing, for in a nutshell that is what we are saying. Both nouns and verbs correspond to mathematical spaces of various dimensions, the difference between them being whether those spaces are seen as permanent or mutable.29

Needless to say, these conjectures constitute a program, but not an unreasonable one, even in practical terms. For instance, any of the current relational maps in the literature would do for our purposes as a starting analytical point. Surely there will be many details that will have to be imported and which may test the theory, after we translate those maps to the dimensional picture argued for here. Nonetheless, I am more worried about getting non-obvious generalizations that are not usually mentioned than about obtaining total descriptive adequacy with regard to the immense wealth of data.

For example, there is an observation of Bertrand Russell's to the effect that normal human names have a characteristic continuity.30 We name a dog dog, but we do not name the set of its four legs *limb, say.31 How do we represent that, and what does it mean? It is easy enough to state the fact in terms of a claim like (21):

(21) Human concepts are generalizations of Euclidean space.

Note, we have warped Euclidean space shamelessly, but we have not torn it. We could have. But perhaps human cognition does not use that meta-operation in ordinary language. If so, ordinary lexical concepts would be what topologists call "manifolds" of various dimensionalities. That is either right or wrong, but the question is rarely even posed.

Many other simple, grammatical questions arise. For instance, the syntax of possession, as studied by Szabolcsi (1983) and Kayne (1994) (see Chapter 10), would seem to enter the conceptual picture we are studying. Thus observe the expressions below:

(22) An animal with two pounds of weight, with structure of symmetrical organs, with families of polygamous composition.

The use of the preposition with immediately indicates possessive syntax, as does the fact that those predications can be expressed with inalienable have ("the animal has a family of polygamous composition"). This is universal, indicating that possessive syntax enters the realm of ontological classification, perhaps serving as the interface between the vertical dimension studied here and its horizontal manifestation. Note, incidentally, that the order in (22) is the natural one, corresponding to the neo-Aristotelian substantive claims made by Muromatsu (1998):

(23) [[[[entity] substance] structure] relations]

Likewise, many of these conceptual notions manifest themselves adjectivally, as in "a long-haired animal," and as Muromatsu suggests, familiar adjectival hierarchies should also fall in line with the dimensional structuring, with orderings alternative to those in (24) sounding either emphatic or ungrammatical:

(24) [promiscuous [symmetrical [light [animal]]]]

On a related note, grammatical functors surely affect all these relations we are talking about. A nominalizer will have the effect of coding a verbal space, as complex as it may be, as a nominal one. I take it, though, that processes like that are not the default ones on which the system is based, which is possibly why those elements are morphologically specified. The category C may well also have that nominalizing effect; after all, a complex event which is normally presented in propositional guise to make an assertion becomes something which a higher verb can use as theme, by way of C:

(25) I hate that Oswald killed Kennedy.

Oswald's killing of Kennedy can be a very complex, high dimensional verbal expression. Nonetheless, one can hate that as much as one can hate coffee. That is plausibly because the complementizer automatically translates the event into a noun of sorts.

10 Learnability considerations

We should not conclude this chapter without reflecting on how the present system would square with the acquisition task, and if it does, whether it can reproduce the actual acquisition process. A system which does not pass these tests is basically worthless.

Let us first clarify what would be a reasonable demand for the system. Given situations of ambiguity, whereby a word is uttered in front of a child in a situation which could be analyzed in terms of more than one of the dimensions above, can the child decide on one analysis, and if so, is that the correct one? That is a sound question. To expect that the system presented here would tell us something about the general, overall acquisition sequence would be utterly disproportionate, among other things because it is not even clear what that sequence is.

The logic of the interesting test may be presented in a narrative fashion. A child observes a rabbit go by when a native speaker points at the creature while uttering, "gavagai!"32 How does the child know what the native meant? Was it "rabbit" or "fur," or more exotic combinations? Let us stick to those two to simplify. Suppose the analysis in terms of "rabbit" is a 4D approach in Muromatsu's sense, whereas the one in terms of "fur" is an instance of a 2D, thus in some definable informational sense simpler, take.33 What should the child, or a model Language Acquisition Device (LAD), conclude in the absence of explicit training? Does gavagai mean rabbit or fur here?

We know what children do. They interpret gavagai to mean "rabbit"; in the dimensional terms, that would mean they go with the highest dimension. If one assumes the dimensional theory and is a behaviorist of any form (including a connectionist), this means trouble. What sense would it make for the blank, or scarcely organized, mind to go with the more complex structure without having acquired the simpler one first? What one should expect from this perspective is that, in the situation just outlined, the child should start with the simpler hypothesis, taking gavagai to mean the lower dimensional "fur." So either the dimensional or the behaviorist approach is wrong.

Now take the question from the innatist perspective. The dimensions are already in place, in the human mind prior to experience. Of course, the child does not yet know that gavagai is arbitrarily associated to this or that, but the child does have the mathematical equipment in place in his or her mind to analyze the incoming situation. However, since the situation can be ambiguously analyzed, how does the child decide?

Once again, the only relevant metric is informational complexity. However, far from hypothesizing the informationally simpler structure, a LAD with all the structures in place does best in hypothesizing the informationally most complex structure among those present. The logic is familiar: Panini's Elsewhere Condition. This conservative learning strategy is often expressed in terms of the Subset Principle, but that particular formulation will not do for non-Extensional Languages, like the one assumed in the Minimalist Program. Nonetheless, the logic is clear enough, and it can be stated in terms of a Sub-case Principle:

(26) Subcase Principle
     Assuming:
     (a) a cognitive situation C, integrating sub-situations c1, c2, …, cn;
     (b) a concrete set W of lexical structures l1, l2, …, ln, each corresponding to a sub-situation c;
     (c) that there is a structure lt corresponding to a situation ct which is a sub-case of all other sub-situations of C; and
     (d) that the LAD does not know which lexical structure lt is invoked when processing a given term T uttered in C;
     then: the LAD selects lt as a hypothesized target structure corresponding to T.


Note that it is situations that enter into subcase relations. Tacitly, I am assuming that:

(27) Given cognitive sub-situations c and c′, obtaining at a situation C, for l and l′ linguistic structures corresponding to c and c′, respectively, and where d and d′ are the dimensions where l and l′ are expressed, we can say that c′ is a sub-case of c if and only if d′ ≥ d.

Once a complete theory is in place telling us exactly how more or less complex situations are analyzed by the dimensional system, (27) (though true) should be a mere corollary of the system, but to make the reasoning work now, it must be stated.

To see how these notions provide an analysis of the case discussed, consider two different scenarios:

(28) Scenario 1: In fact gavagai means "fur."
     Scenario 2: In fact gavagai means "rabbit."

And let us now see how the analysis works in terms of (26). Assume (a) a cognitive situation C (the perceived event), integrating sub-situations c1 ["a 4D rabbit"] and c2 ["2D fur"]; (b) a concrete set W of lexical structures l1, l2 (the different possible interpretations of a word associated to the perceived event that universal grammar allows), each corresponding to a sub-situation c; (c) that there is a structure lt (which involves four dimensions of the basic syntactic structure) corresponding to a situation ct (concretely c1, the "4D rabbit") which (as per (27)) is a sub-case of all other sub-situations of C (concretely c2, the "2D fur"); and (d) that the LAD does not know which lexical structure lt is invoked when processing a given term T (concretely, gavagai) uttered in C; then: the LAD selects lt as a hypothesized target structure corresponding to T. In other words, the LAD selects the "4D rabbit," a sub-case of the "2D fur" in the sense of (27), as the meaning for gavagai, regardless of the factual details of the two scenarios in (28).
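A minimal sketch of that selection procedure (the dimensional assignments and the data structures are illustrative assumptions of mine): given the candidate analyses of the ambiguous scene, the LAD hypothesizes the one of highest dimensionality, that is, the most specific sub-case in the sense of (27).

    # Toy LAD applying the Sub-case Principle to the "gavagai" scene: among the
    # analyses compatible with the situation, guess the highest-dimensional one.
    candidates = {"rabbit": 4, "fur": 2}      # illustrative dimensional assignments

    def guess_meaning(candidate_dims):
        return max(candidate_dims, key=candidate_dims.get)

    lexicon = {"gavagai": guess_meaning(candidates)}
    print(lexicon)                            # {'gavagai': 'rabbit'}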

Now, in Scenario 1, the child is wrong. However, she or he will not produce an "erroneous expression" if deciding to utter the now (for better or for worse) acquired gavagai, assuming all rabbits are furry. (If we do not assume that, then the whole experiment is entirely irrelevant, for we are trying to study situations of complete analytic ambiguity, that is, sub-case scenarios.) In Scenario 2, the child is, of course, right.

So how does the child come out of her error in Scenario 1? Suppose she assumes an Exclusivity Hypothesis concerning lexical meaning in the acquisition stages:34

(29) Exclusivity Hypothesis
     Entities only have one name.

If so, the child can retreat from the initial mistake either by hearing the word gavagai used for any other furry object for which she or he already has a name, or by hearing another word used for "rabbit." Thus correcting the mistake necessitates no instruction. In contrast, imagine a LAD which did not select the conservative, most specific hypothesis in terms of the Sub-case Principle, one hypothesizing "fur" for the original meaning in the sub-case situation. Such a LAD would have difficulties with Scenario 2, and produce an erroneous expression when uttering gavagai before just any furry thing. It would not be easy for a child corresponding to that LAD to retreat from such a mistake. All other uses of gavagai from the community would reinforce the wrong guess, assuming all rabbits are furry. Someone might utter a different word in front of the rabbit, one in fact meaning "fur." That will not be helpful, since the LAD would presumably take that to mean "rabbit," assuming the Exclusivity Hypothesis. Of course, that new term would also be used by the community applied to other furry things, so the LAD would then be confused. What it took to mean "fur" is reinforced as meaning fur, but what it took to mean "rabbit" is actually challenged, without much room for maneuver, unless the entire edifice of choices is destroyed and decisions are again made from scratch. Or some correction by the community is provided.

Linguists of my orientation take such corrections to be pointless and thus are pretty much forced to look the other way. Interestingly, psycholinguistic evidence points in precisely that direction: children, given ambiguous situations, go with the more object-like analysis of nouns, or event-like analysis of verbs; in our terms, with the higher dimension.35

A couple of wrinkles remain. First, why do children not analyze gavagai to mean "Peter Rabbit"? That opens up the issue of names, and how they fit into this system. In Chapter 12, I argued that their characteristic rigidity is best analyzed as absence of internal structure of the sort relevant to us here. If so, a name should not compete with the lexical notions discussed so far (it would be part of an entirely different paradigm), or if it does it should come in last, as an element of lowest (or no) dimensionality.36

Next, why do children not go with an analysis of cumbersome dimensionality, a 6D flag manifold, say, of the sort used to study quarks? That poses the question of what the upper limits of grammatical dimensions are in ordinary lexical cognition. If we go with the phenomenology, things seem to stop pretty much at three or four dimensions like those involved in countable vs. animate nouns or achievements vs. accomplishments.37 Needless to say, I have no clue as to why that limit exists for cognition, but once it is there, it is reasonable not to expect the child to come up with a more complex analysis.

Finally, is it really true that children assume things only have one name, as crucially implied in the reasoning above? And what about synonymous expressions, like car and automobile? Obviously those exist, and suggest the picture has to be complicated to include the notion "lexical paradigm." It is quite unclear what that is, but it seems to exist even beyond these examples. For example, cases studied by Horn (1989) show that given a system that covers a certain "lexical space," such as that involved in possible, impossible and necessary, redundantly expressed notions like *innecessary (something which is either possible or impossible) yield their existence to the more specific concepts. That itself is arguably a consequence of the Sub-case Principle, but it only works within paradigms; thus, unnecessary is perfectly fine, but it does not lexically compete with the paradigm above (since it uses the prefix un- instead of in-).

11 Some conclusions

The notion "paradigm" is interesting not just as a solution to the puzzle we posed in the previous section for the Exclusivity Hypothesis. It is telling us that lexical relations of the sort studied here have peculiar properties. There are no paradigms in horizontal derivations, no need to speak of exclusivity within them or to allude to a Sub-case Principle when acquiring them.

Another specific property of lexical relations, canonicity, has been noted in Sections 3 and 8. Whatever that is, it should tell us that normally bullfighters are animate entities, although we can interpret the concept as a mass term in some instances. We can also interpret coffee as count, but it is normally a mass. Canonicity appears all over the place in the formation of complex predicates. To saddle a horse is not just to put the saddle on the horse. You would not saddle the horse if you put the saddle on its head. It implies a kind of arbitrariness that is observed, also, in the actual words that languages use for classification or measure expressions, which as is well known vary culturally. I have nothing to say about the variation here, nor do I think there is much to say in standard terms. My concern was mostly with the fact that these kinds of symbols have a certain dimensionality to them; thus, for instance, measures are in a meaningful sense lower in dimensionality than classifiers. But the fact that they exhibit canonicity restrictions should be accountable in standard scientific terms, since it is a universal.

When we speak of the differences between the horizontal and vertical dimensions of languages, narrow syntax and the paradigms of the lexicon, most of us insist on three properties that fueled the linguistics wars: transparency, productivity, systematicity. Lexical structures are opaque to internal syntactic manipulation, random in the relations they allow, and idiosyncratic, unlike narrow syntactic structures. But that perspective is loaded. Vertical structures are different from horizontal structures. That there should be opacity may just be signaling this cut, as conservation laws do in physics. A particle's conservation of the handedness of its spin may have an influence in the formation of a field, but it may be essentially unaffected by a "higher" order gravitational component. The universe happens to be layered. On the other hand, the alleged randomness or idiosyncrasy may just be because of the wrong theoretical demands. If we impose the logic of relativity on quantum mechanics we get randomness or even nonsense. But why should we? When looked at from its own perspective, things happen productively and systematically within lexical paradigms. It is when we start looking across them that randomness and idiosyncrasy take over.

If this is the right way of looking at things, what we really must understand is why vertical syntax manifests itself in lexical paradigms, unlike horizontal syntax. This might have to do with the fact that only vertical syntax is significantly acquired, horizontal syntax either being entirely universal or having parametric choices whose values ultimately reduce to properties that are best captured within vertical syntax. Why that should be so, implying that vertical syntax feeds horizontal syntax, is to my mind a deep question. I think it was a mistake of generative semantics to confuse these two general cuts on the fabric of language, but I think it would also be a mistake not to take the vertical cut seriously, just because we understand the horizontal one better.

A related question is whether the dimensions I have been talking about here are a matter of syntax or semantics, and if the latter, whether it is lexical or logical semantics. In part, these are terminological issues. It is clear that some structure resides behind the notions discussed in this chapter, with tight hierarchical properties. I find it useful to express the relations in question in "small clause" fashion, in part because I would like to relate the relevant structures to possessive ones, and those, I think, bottom out as predications of an "integral" sort. But whether we need standard or "integral" small clauses, or some other more abstract object such that each dimension is in effect an object in the next (thus the order of the formal language augments with each warp) is a formal matter which should not be decided without empirical argumentation. My own feeling is that syntax should code various orders of formal complexity, which would make its mapping to a corresponding semantics with various, progressively more complex types all the more transparent. But other than a minimalist (that is, naturalistic or aesthetic) reason for this, I hide nothing behind this hunch, although a serious study of functional categories, which I have set aside now, may have something to say about the matter.

Deciding on that would also bear on whether the mathematical difference presented here between the vertical and the horizontal systems is to be taken seriously. Again, order in the vertical system is intrinsic, if what I said is correct, whereas in the horizontal syntax order is customarily taken to be an extrinsic mapping (from D-structure to LF, or corresponding objects). Could it be that this order, too, should be expressed in dimensional terms? Is that the right move, though, if the vertical and horizontal syntax do have significantly different properties?

One may also ask what the "ultimate" reality of these vertical hierarchies is, how they differ from whatever underlies numbers, and so on. I do not know. Evidently, a mental module for mathematical reasoning is not the same as the language faculty, or those of us who are terrible practitioners of mathematics would not be able to finish a sentence. Luckily it is not like that, nor is there any reason, even from the perspective presented here, why it should be so.

It is true that I am making an explicit connection between the number system, generalizations of Euclidean space, and lexical concepts. This is in part because we seem to be the only species that moves around comfortably within those parameters, and more importantly for us here because that may have something to say about the vertical cut of language, which otherwise has to be blamed on some unclear properties of reality. God only knows how the human mind has access to that general ability. More concretely, but equally mysteriously, some evolutionary event must have reorganized the human brain to give us access to that. Perhaps the fact that we seem to be limited to lexical conceptualization in (roughly) four dimensions might even relate to the non-trivial fact that the world we occupy comes to us in four dimensions, at least in this stage of its physical evolution. Before (in the first few milliseconds after the Big Bang) it apparently had more. Although less than a speculation, that would be a curious state of affairs, with the structure of what is "out there" heavily influencing aspects of what's "in here."

We have no way of knowing that, with present understanding in any of the sciences. Yet, it is reasonable to suppose that once the evolutionary event in point took place, whatever its details and causes, other parts of the brain got co-opted to develop the faculty of language, mathematics, music, and perhaps other faculties. It is hard to imagine what else other than this sort of "exaptation" (in the sense of Gould (1991) and elsewhere) could have taken place, at least in the case of music, whose adaptive properties are non-existent; in my view similar cases can be built for mathematics (most of which is useless, certainly in a survival-of-the-fittest sense) and even language (although the latter is controversial, see Uriagereka 1998). In any case, presumably different interfaces exist for each of the modules just mentioned, hence the behaviors they allow are all significantly different, even massively so. This is expected in complex systems more generally, which may obey similar principles of general organization, but end up with structures mathematically as similar, yet functionally as diverse, as a sunflower corolla and a peacock's tail.


NOTES

2 CONCEPTUAL MATTERS

1 My appreciation, first and obviously, to Noam Chomsky, for more than I can express. Section 1 has benefited from comments from Elena Herburger, Norbert Hornstein, Howard Lasnik, Roger Martin, and Carlos Otero. I assume the errors, etc.

2 I thank Cedric Boeckx, Elena Herburger, David Lightfoot, Roger Martin, Massimo Piattelli-Palmarini, Johan Rooryck, and especially Peter Svenonius for useful commentaries on Section 2.

3 Icelandic is trickier because it presents both V and P agreement. Hence it should pattern with (5a) languages, but it can exhibit the (A, P) order. However, it is known that in this language the associate can move past the left periphery of VP (e.g. across auxiliaries). It is thus likely that at the stage of the derivation that concerns me the relevant order is (P, A) even in Icelandic.

4 P in the graphs below stands for "Participial head," which may or may not exhibit agreement. ". . ." denotes any arbitrary number of irrelevant categories. Move is represented through arrows (e.g. as in (9b)) whereas attract (prior to move) is signaled via dotted lines (e.g. as in (10)); impossible dependencies are crossed.

5 (11b) is locally convergent (i.e. within the derivational horizon that is relevant for determining entropy), even if not using up the pleonastic will eventually lead to the crash of that derivational line. It does not matter, as all that these would-be lines do is determine the most entropic, actual path. At the point of the derivational evaluation, the specific step considered is convergent, as desired.

6 Given what we said about Icelandic in Note 3, this should be the derivational fate of that language. Importantly, Icelandic is the typical example within Scandinavian languages where expletives can be null (non-existent in our terms). Expletives, though, may be pronounced in Icelandic, which recalls the situation in Western Iberian (null-subject and an overt expletive). In both instances, the overt element is associated with the complementizer system, hence irrelevant.

7 The idea can also be stated in terms of intervention by the participial head itself. It would take me too far afield, however, to present the system in these terms, since the mere presence of a rich head does not create an intervention effect if the expletive is missing (as in Romance).

8 Section 3 has benefited from the useful help of Danny Fox, Joel Hoffman, Esther Yeshanov, Lilian Zohar, and very especially Hagit Borer, regarding the Hebrew data, and general comments from Cedric Boeckx, Elena Herburger, and Roger Martin.

3 MULTIPLE SPELL-OUT

† The contents of this chapter have been presented in several lectures, at the Universities of Connecticut, Delaware, Maryland, Pennsylvania, Stuttgart, Porto Alegre, Potsdam, and Rio de Janeiro, the City University of New York, the National University of Comahue, Oxford and Yale Universities, the Max Planck (Berlin), San Raffaelle (Milan), and Ortega y Gasset (Madrid) Institutes, and the School of African Studies in London. Warm thanks to the generous hosts of these institutions, as well as the various audiences, for comments, questions and criticisms. I am indebted to Juan Carlos Castillo, Stephen Crain, John Drury, Jim Higginbotham, Howard Lasnik, Roger Martin, Javier Ormazabal, and especially Norbert Hornstein and Jairo Nunes for very useful commentary, and both Maggie Browning (initially) and Norbert Hornstein (eventually) for their interest, as editors, in these ideas. This research was partially funded by NSF grant SBR 9601559.

1 For instance, in Bresnan (1971), Jackendoff (1972) or Lasnik (1972). Tree-adjoining grammars explored in, for example, Kroch (1989) also have the desired feature.

2 The reasons why compounds and spelled-out phrase markers are "frozen" are completely different (a real compound does not collapse), but the formal effect is the same.

3 This would be very much in the spirit of Hoffman's (1996) idea that syntactic unification is not given by the derivation itself.

4 I assume the standard definition of a sequence ⟨a, b⟩ as a set {{a}, {a, b}} (see, for instance, Quine 1970: 65). Jim Higginbotham (personal communication) observes that the notation {a, {a, b}} would also have the desired effects, although touching on a deep issue concerning whether one assumes the Foundation Axiom (or whether the individual "a" is allowed to be identified with the set {a}). For the most part, I would like to put these issues aside, although I cannot fail to mention two things. One, if one assumes Quine's notation, as we will see shortly, syntactic terminals will ultimately turn out to be defined as objects of the form {terminal} rather than objects of the form terminal. Two, this might not be a bad result, given that in general we want to distinguish labels from terms, which could be done by way of the definition of term in (6), stating that labels are members of (set) phrase markers that are not terms. Then the problem is terminal items, which clearly are terms but need to be labeled as well. One possibility is to consider a given terminal term as labeled only after it has been linearized, hence having been turned by the system to a {terminal} (the whole object is a term; thus, terminal is its label).

5 Note that the most natural interpretation of the radical version of MSO ships non-complements to performance prior to the rest of the structure, thus proceeds top-down. This matter becomes even more significant below, when we discuss antecedence.

6 If specifiers are adjuncts, one can then attribute their being linearized prior to corresponding heads to the (poorly understood) concept of adjunction.

7 If command is not defined for an intermediate projection, this category will never command (hence precede) its specifier. The converse is true by fiat, given that a specifier is by definition a maximal projection. At the same time, intermediate projections must be relevant in computing command; if they were not, a head and its specifier would command, hence precede, each other.

8 The most difficult case does not arise when a specifier projects (the system prevents this on grounds of chain uniformity and similar considerations pertaining to checking domains – although see Chapter 6). Rather, it arises when the system sees an intermediate projection as a branch to Spell-out and later, after spelling it out, continues projecting it by merging it with a specifier. That should be perfectly fine, and it leads to an object that is linearized “backward,” with the specifier coming last.

9 See Kitahara (1993) and (1994) for the source of these ideas.

10 The presentation that follows owes much to useful discussions with Jairo Nunes and to Nunes 1995.

11 Ormazabal, Uriagereka and Uribe-Etxebarria (1994) and (independently) Takahashi (1994) do make proposals about the matter, which prevent extractions from inside subjects in terms of the Uniformity Condition. However, unlike the present proposal, neither of these naturally extends to extractions from inside adjuncts (assuming adjuncts are noncomplements). This view is generally contrary to the spirit of Larson (1988) – in particular, the idea that direct objects are structurally high in the phrase marker.

12 Note that the facts are no different if the subject is a pronoun. That is, (i) allows no more of a focus projection than (15b).

(i) HE painted those frescoes

This must mean that, despite appearances, pronouns (at least focused ones) are complex enough, in phrasal terms, to trigger a separate Spell-out.

13 Certainly, instances of a phrase simultaneously agreeing with two heads are not attested (setting aside chains, which are not phrases). The intuition is that multiple agreement as in (19) creates a “Necker cube” effect, which the Agreement Criterion explicitly prevents (see Section 6).

14 Interestingly, father does not turn out to be a term. In fact, no “last” element in a right branch ever turns out to be a term after a phrase marker is linearized. (Of course, prior to that, these elements are normal terms.) Technically, this entails that such elements cannot have an antecedent, which if pushed to a logical extreme might well mean that they cannot have reference. This would lend itself nicely to the idea, expressed in Chapter 15, that the “last” element in a right branch is always the predicate of a small clause; and it bears on the analysis of examples like (i).

(i) every politician thinks that [some picture of him] should be destroyed

Castillo (1998) argues that the correct structure for picture of him involves, in the lexical base, the small clause [him [picture]]; if that analysis is correct, him is actually not a complement, but a subject of sorts (of which picture is “integrally” predicated, in the sense developed in Chapter 9). If so, him turns out to be a term and can be bound by every.

15 At least, it is not obvious that an antecedent buried inside a “left branch” can hook up with a variable in a different command path. There are well-known (apparent?) exceptions, such as (i), or similar instances involving “inverse linking” or what look like bound anaphors in East Asian languages.

(i) ?everyone’s mother likes him

To the extent that these examples are acceptable, they may well involve a process akin to, but formally unlike, variable binding. If pronouns like him in (i) can be analyzed as incomplete definite descriptions, then (i) may have the meaning of something like (ii):

(ii) everyone’s mother likes “the one that is relevant”

By cooperatively confining the range of the context variable of him, we may end up with a semantics that is truth-conditionally equivalent to that implicit in (i). (See Chapter 8 for a similar treatment of certain anaphors, which may extend to the East Asian instances.) Then the question is what conditions govern context confinement, something that need not be sensitive to the strict command restrictions that are presently being explored for the syntax (see Chapter 11).

16 I mention this to address a reasonable objection that Jim Higginbotham raises in personal communication: semantically, it makes sense to say that “an anaphor seeks an antecedent”; but what does it mean to say that “an antecedent seeks an anaphor”? The issue is turned on its head immediately below, where I show how the radical version of the MSO proposal can deal with general issues of antecedence.

17 Evidently, I am speaking of bound-variable pronouns, not of anaphors subject to local principles – which presumably involve some sort of movement to the antecedent. Those, of course, are impossible unless antecedent and anaphor share the same CU, as expected given the MSO architecture.

18 The Agreement Criterion does not preclude an antecedent from binding two different pronouns, since the definition of antecedence requires only that the would-be antecedent agree with the phrase containing the bindee(s).

19 Why is it the lower and not the upper copy that deletes (although to solve the linearization riddle either one would do)? Here, Nunes relies on the mechanics of feature checking. Basically, feature checking takes place in the checking domain of the attracting phrase marker, and thus it is the copy in that domain that remains within the system. For present purposes, this is immaterial.

20 In the past few years, this sort of idea has been revamped by various researchers (see, e.g. Pica and Snyder 1995).

4 CYCLICITY AND EXTRACTION DOMAINS

†We are grateful to Norbert Hornstein, Marcelo Ferreira, Max Guimarães, Sam Epstein, and an anonymous reviewer for comments and suggestions on an earlier version of this chapter. Jairo Nunes is grateful for the support CNPq (grant 300897/96-0) and FAPESP (grants 97/9180-7 and 98/05558-8) have provided for this research, and the same applies to Juan Uriagereka, who acknowledges NSF grant SBR 9601559.

1 For purposes of presentation, we ignore cases where two heads are in mutual c-command. For discussion, see Chomsky (1995b: 337).

2 In Chomsky (1995b: Chapter 4), the term LCA is used to refer both to the Linear Correspondence Axiom and the mapping operation that makes representations satisfy this axiom, as becomes clear when it is suggested that the LCA may delete traces (see Chomsky 1995b: 337). We will avoid this ambiguity and use the term Linearize for the operation.

3 See Chapter 3 for a discussion of how agreement relations could also be used as addresses for spelled-out structures.

4 Following Chapter 3, we assume that spelled-out structures do not project. Hence, if the computational system applies Spell-out to K instead of L in (9), the subsequent merger of L and the spelled-out K does not yield a configuration for the appropriate thematic relation to be established, violating the θ-Criterion. Similar considerations apply, mutatis mutandis, to spelling out the target of adjunction instead of the adjunct in example (14).

5 That is, regardless of whether adjuncts are linearized by the procedure that linearizes specifiers and complements or by a different procedure (see Kayne 1994 and Chomsky 1995b for different views), the important point to have in mind is that, if the formulation of the LCA is to be as simple as (7), the lexical items within L′ in (15) cannot be directly linearized with respect to the lexical items contained in the lower vP segment.

6 In principle, the rule of Spell-out can be interpreted as immediately sending spelled-out material for pronunciation, or rather as freezing relevant material as PF-bound, but with actual pronunciation taking place later on in the phonological component. For present purposes, the second interpretation is the assumed one.

7 The approach outlined above is incompatible with a Larsonian analysis of double object constructions (see Larson 1988), if extraction from within a direct object in a ditransitive construction is to be allowed.

8 The computation of nondistinct copies as the same for purposes of linearization may be taken to follow from Uriagereka’s (1998) First Conservation Law, according to which items in the numeration input must be preserved in the interpretive outputs.

9 Notice that the structure in (24b) could also be linearized if the head of the chain were deleted. Nunes (1995, 1999) argues that the choice of the links to be deleted is actually determined by optimality considerations. Roughly speaking, the head of a chain in general becomes the optimal link with respect to phonetic realization as it participates in more checking relations. For the sake of presentation, we will assume that deletion always targets traces.

10 The sequence of derivational steps in (25) has also been called inter-arboreal operation by Bobaljik and Brown (1997) and paracyclic movement by Uriagereka (1998).

11 Recall that the label of a spelled-out object encodes the information that is relevant to the computational system; that includes the information that is required for a thematic relation to be established between file and [which ⟨which, paper⟩] in (30b).

12 See Brody (1995) for a discussion of this kind of “forking” chains from a representational point of view.

13 See the technical discussion about the structure of linearized objects in Chapter 3, where it is shown that constituents of linearized objects such as copy3 in (33) come out as terms in the sense of Chomsky (1995b: Chapter 4).

14 As for the computation of the wh-copies inside the adjunct in (33) with respect to the whole structure in the interpretive component, there are two plausible scenarios to consider. In the first one, the interpretive component holds the spelled-out structures in a buffer and only computes chain relations after the whole structure is spelled out and the previously spelled-out structures are plugged in where they belong; in this case, identification of chains in terms of c-command is straightforward, because the structural relations have not changed. In the second scenario, the interpretive component operates with each object it receives, one at a time, and chain relations must then be determined in a paratactic-like fashion through the notion of antecedence. The reader is referred to Chapter 3 for general discussion of these possibilities (see also Chapter 6).

15 See Hornstein (2001) for a similar analysis.

16 This is arguably what excludes the parasitic gap construction in (i), since sideward movement of who places it in two thematic configurations within the same derivational workspace.

(i) *whoi did you give pictures of ei to ei

17 This raises the very serious question of whether deletion ought to be cyclic, and if it is not, what that means. In Martin and Uriagereka (forthcoming) a different approach to these matters is attempted in terms of chains “collapsing” at different occurrences, without invoking deletion at all.

18 Needless to say, as stated this implies a representational view of chains, which must satisfy c-command conditions regardless of how they were generated.

19 It is not our intention here to present an analysis for all the different aspects involved in parasitic gap constructions. The aim of the discussion of the so-called S-Structure licensing condition on parasitic gaps was simply to illustrate how sideward movement is constrained. See Nunes (1995, 1998), Hornstein (2001) and Hornstein and Nunes (1999) for deductions of other properties of parasitic gap constructions under a sideward movement approach.

20 Following Chomsky (2000), we are assuming, largely for concreteness, that the maximal projection determined by a subarray is either vP or CP (a phase in Chomsky’s (2000) terms). In convergent derivations, prepositions that select clausal complements must then belong to the “subordinating” array, and not to an array associated with the complement clause (otherwise, we would have a PP phase). Hence, the prepositions after and without in (59) and before in (61) belong to subarrays determined by a light verb, and not by a complementizer.

21 For further evidence that sideward movement must proceed in this strongly derivational fashion, see Hornstein (2001) and Hornstein and Nunes (1999).


5 MINIMAL RESTRICTIONS ON BASQUE MOVEMENTS

†For various reasons (including my own procrastination) this chapter has lived several lives. Its rudiments go back to early work with Itziar Laka, presented at ESCOL (Pittsburgh) and NELS (Cambridge) a decade ago. Later elaborations were presented at the Universities of Comahue, Connecticut, the Basque Country, Maryland, Pennsylvania, Porto Alegre, Santa Catarina, and the GLOW Summer Courses at Girona. Lakarra and Ortiz de Urbina (1992) includes a paper where most of this early work is summarized, and the barriers solution discussed in the third section of the present version is suggested. This version has been entirely rethought in minimalist terms – where I believe the most natural solution emerges – and this has taken me quite some time because the Minimalist Program is still rather open ended. I thank all the audiences who attended the relevant lectures and the editors of corresponding publications for various comments and questions. Likewise, I appreciate careful and useful NLLT reviews, as well as extensive editorial assistance from Alec Marantz. In addition, I am grateful for concrete comments (concerning the various versions) from Xabier Artiagoitia, Andolin Eguskitza, Rikardo Etxepare, Elena Herburger, Norbert Hornstein, Istvan Kenesei, Joseba Lakarra, Howard Lasnik, David Lightfoot, Jon Ortiz de Urbina, Beñat Oyarçabal, Georges Rebuschi, Pello Salaburu, Ibon Sarasola, Esther Torrego, Amy Weinberg, and especially Jairo Nunes, Javier Ormazabal, Myriam Uribe-Etxebarria, and also Mark Arnold and Ellen Thompson (who edited the piece). Itziar Laka deserves to be mentioned essentially as a co-author, although I do not hold her or anyone else responsible for my claims, particularly the minimalist extensions.

1 The classical introduction to Basque from the principles and parameters point of view, which contains also many relevant references within and outside generative grammar, is Ortiz de Urbina (1989).

2 The agreement marker in (2c) is not introducing a theme, in spite of appearances. The word lan “work” there is not associated to a determiner, hence is arguably not a real argument, as it is in (i), a true transitive sentence:

(i) Aizkolariak lana egin du
lumber jack-the-E work-the/a-A make 3-have-3
“The lumber jack has done the/a work.”

3 A reviewer points out that scrambling could also be at issue in the examples in (3) or (5) below. This is true, but I refrain from that possibility because I know of no systematic study of scrambling in Basque. In any case, all of the facts I discuss here arise regardless of whether the phrases involved may not have plausibly scrambled (e.g. because of their indefinite character).

4 In (7) I am not being exhaustive, but the point is Jonek and Mireni can appear anywhere except between verb and wh-phrase.

5 This is something that Ortiz de Urbina acknowledges. Note that the embedded auxiliary in (8) associates to the element (e)la “that.” In (9) below we see the embedded auxiliary associated to (e)n, a wh-complementizer that I gloss as “if.”

6 The complementizer nola should not be confused with the equally pronounced question word nola “how.” The same is true in English:

(i) The professor explained how the earth moves around the sun.

(i) is of course ambiguous, making reference to either the professor’s explanation about the mechanics of the earth moving, or to the mere fact of the explanation, whose truth is assumed.

7 (10) directly contradicts any attempt at analyzing the phenomenon under scrutiny as in situ wh-questioning: the question word is displaced from its base position. Long-distance wh-movement is shown below to behave as expected.

8 Matters get more complex if we assume Kayne’s (1994) analysis, which involves overt movement of IP to the Spec of (a universally first) C in languages where C appears last. This predicts no long-distance extraction in Basque (contrary to fact), unless multiple specifiers are assumed, as in Chomsky (1995b: Chapter 4). If so, though, it is unclear why multiple specifiers do not salvage (10b).

9 Laka’s “sigma” is, by hypothesis, a left-periphery head. This raises the same questions that Ortiz de Urbina’s leftward C does, and suggests that Kayne’s universal head-first analysis is on track. Nothing that I say bears on Basque complementizers being last, and if I raised the issue before it was only as a possible question for the V2 analysis. The reason I will not pursue the Kayne line here is that it is extremely cumbersome to present in full generality; but we have in fact sketched the details for that sort of analysis in Ormazabal, Uriagereka and Uribe-Etxebarria (1994).

10 I thank a reviewer for observing that a proposal along these lines is Jelinek (1984). Chomsky (1986a: 39) suggests this view, too.

11 It will not help to claim, à la Rizzi (1990), that why and similar adjuncts do not leave traces and hence only modify what they directly associate to. Under other circumstances (with subjects out of the way), IP adjuncts modify long distance.

12 Consider also (i):

(i) Nork mahaia bedeinkatuko du?
who-E table-the/a-A bless-fut 3-have-3
“Who will bless the table?”

Sentences of the sort in (i) were raised in L&U as a further problem for Ortiz de Urbina’s analysis. Sarasola (personal communication) has provided several examples, from written classical texts, of the format wh S O V; apparently, it is harder to find exceptions of the sort wh O S V. Modern speakers have varying judgments with respect to (i), Northeastern speakers allowing it more readily than others for whom the construction is clearly stigmatized in normative grammars. It is reasonable to suppose that (contrary to both L&U and Ortiz de Urbina 1989) (i) involves (vacuous) LF wh-movement, perhaps only if the direct object is unspecific, as a reviewer suggests. Vacuous movement may be happening, also, in multiple questions, which would otherwise create a serious paradox for anyone’s analysis (only one wh-phrase can be left adjacent to V):

(ii) Ez dakit nork zer ikusi duen.
not know-1 who-E what-A see 3-have-3-if
“I do not know who has seen what.”

13 See Chomsky (1995b) on whether this should be an Agr head.

14 See also Raposo (1988), Ambar (1992), among others. As for why IP is not a barrier in English, see Sections 4.2 and 4.3.

15 Although this analysis was possible in the L&U framework, it was not pursued, mistakenly assuming that the trace of a wh-phrase should not count as a valid specifier of a given category.

16 The interested reader can find an introduction to details and conceptual foundations of this system in Uriagereka (1998).

17 The system also allows for convergent, optimal derivations which have no semantic interpretation. This is irrelevant now.

18 The idea of Multiple Spell-Out creating a “giant compound” is essential not just to the present analysis, but to everything said in Chapter 3, where the system is introduced. The result of (Multiple) Spell-Out is independent of what in the grammar forces the application of this rule – in this instance the repair strategy in (29). A reviewer suggests that the reason (29) should force early Spell-out is that the specifier and the head must undergo Halle-Marantz type fusion under adjacency conditions; this seems to me very plausible.


19 Multiple Spell-Out of exhaustively merged sub-units of structure derives a number of notions and conditions, as discussed in Chapter 3.

20 This line has been pursued by Chomsky in his 1997 Fall lectures and in Chomsky (2000), and is consistent with everything I say, so long as we keep the morphological repair for specifiers of “heavy” heads.

21 Although adapting it to present minimalist concerns, I am now pursuing a modified version of an interesting idea suggested by a reviewer: “Why not say that pro does not need Case until LF, and so stays within VP until after all overt movement?” A pro moved at LF is nothing but a feature. I would like to reserve that specific sort of pro for Asian languages which do not exhibit strong agreement, which are discussed in the next section.

22 This is sometimes referred to as “Taraldsen’s generalization.” Taraldsen (1992) essentially pursued the sort of analysis argued for here. Note also that a feature pro must be a neutralized head/maximal projection, in Chomsky’s (1995b) “bare” phrase structure sense. That is, pro is a sort of clitic which is enough of a projection to constitute a real argument, and enough of a head to move in order to check morphology.

23 Chapter 3 shows that for this particular case of Kayne’s LCA we do not need a separate axiom; under reasonable assumptions, the result follows from economy considerations.

24 The creation of these partial objects of well-formedness suggests that the notion “level of representation of LF/PF” is playing no role, since all convergence issues are decided in terms of local sub-phrase-marker, very much in the spirit of the sub-trees of Tree Adjoining grammars explored by Aravind Joshi and Tony Kroch, and their associates (see Chapter 7). We still crucially need the notion “component of LF/PF representation,” which is what virtual conceptual necessity ensures anyway in a system that relates “sound” and “meaning.”

25 The notion proposed here is essentially identical to Chomsky’s (2000) “phase,” although it remains to be seen precisely what constitutes a phase.

26 That is under the assumption that features (here, an instruction to pronounce or not to pronounce, depending on what is taken to be basic) can be added in the course of the derivation in given structural contexts.

27 Chung suggests that VSO should not be achieved through head movement. The alternative she presents (her (11)) is incompatible with the minimalist system as presently being explored. I will abstract away from that possibility.

28 The fact that a complementizer incorporates to the matrix verb, obviously leaving the moved verb behind, suggests that the verb itself moves no higher than Laka’s (1990) “sigma” position. Chung considers and rejects this general sort of analysis for three main reasons. First, she adduces technical complications which are now solved. Second, she has interpretive reasons to proceed the way she does; the reasons are well taken, but can be kept as a consequence, not the cause of the syntactic phenomena. Third (her most important reason), she reasonably wonders why the trace of wh-phrases, and not the head of the wh-chain, triggers agreement; that, in present terms, is tantamount to asking why C incorporates across a wh-trace, but not a wh-phrase, which must be because if C does not incorporate, long distance wh-movement is impossible in this language. I will not address this intriguing matter here, though see the end of Section 7.2.

29 As a reviewer points out, subject extraction from the post-verbal position is correlated, in some Romance variants, with absence of overt agreement. This can be interpreted in various ways, among them assuming that, in those instances, the subject is pleonastic. Whether this case can be generalized to instances where overt agreement shows up is hard to know.

30 See Kiss (1995) for a review of proposals and various references.

31 Kiss notes that the facts are slightly more complex. The agreement is generally optional, although it becomes obligatory when the moved phrase is in the accusative Case (as in (54b)). This sort of Case can be acquired in Hungarian in the course of the derivation (see Kiss 1987: 140 and ff.).

32 This is somewhat consistent with well-known obviation effects that subjunctive clauses induce, particularly if they are analyzed as in Kempchinsky (1986).

6 LABELS AND PROJECTIONS

†We thank participants in our seminars for questions and comments, as well as an audience at USC, especially Joseph Aoun and Barry Schein. Thanks also to Elena Herburger, Paul Pietroski and Anna Szabolcsi. This work was funded by NSF Grant BCS-9817569.

7 A NOTE ON SUCCESSIVE CYCLICITY

†We are grateful to audiences at the University of Maryland and the University of Iowa. We are especially indebted to Alice Davison, Norbert Hornstein and Paula Kempchinsky, as well as an anonymous reviewer.

1 A reviewer points out that, according to Chomsky, phases have the effect of limiting the search space and thus reducing the complexity of the computation. True as this may be, it still does not explain why the edges of these syntactic objects should be accessible from the outside, or why phases should be impenetrable, for that matter.

2 A reviewer points out that the TAG analysis also violates the constituency of the sentence, because the subtree is not a constituent. This is not strictly true in the TAG formulation, given that the foot of the tree contains an empty label that indicates the kind of constituent that has to be added at that point, represented in (2b) as IP. This is problematic nonetheless in a BPS framework, given that this category is not derived through the projection of a lexical item.

3 This is in the spirit of proposals in Chapter 8 about different Cases within a phase as diacritics on otherwise identical D elements, and Castillo’s (1999) account of weak pronouns in Old Spanish, where multiple specifiers are only distinguished through the A/A′ difference.

8 FORMAL AND SUBSTANTIVE ELEGANCE IN THE MINIMALIST PROGRAM

†This is a version of a talk delivered at The Role of Economy Principles in Linguistic Theory, Max Planck Institute, Berlin. I wish to thank the organizers of the conference for their invitation and their useful editorial comments, and the audience for their very helpful comments. I also thank my students and colleagues at College Park for their cooperation when sorting out some of these ideas in my Spring seminar on minimalism. I am indebted to Elena Herburger, Norbert Hornstein, David Lightfoot, and Jairo Nunes for their comments on a draft. Usual disclaimers apply. This research was partly financed by a Summer Research grant from UMD at College Park.

1 Exaptation

Evolutionary theory lacks a term for a crucial concept – a feature, now useful to an organism, that did not arise as an adaptation for its present role, but was subsequently coopted for its current function. I call such features “exaptations” and show that they are neither rare nor arcane, but dominant features of evolution [serving] as a centerpiece for grasping the origin and meaning of brain size in human evolution.

Gould (1991: abstract)

2 This is not to say, of course, that procedures to integrate, for instance, go through each possible variation. That is a matter of implementation.


3 For instance (i), which is ungrammatical as part of (ii) (because (iii) is a better solution), but grammatical as part of (iv) (because there is no better, convergent alternative). See Chomsky (1995b: Chapter 4).

(i) [a man to be t here]
(ii) * [there was believed [a man to be t here]]
(iii) [there was believed [t to be a man here]]
(iv) [I believe [a man to be t here]]

4 In standard problems in dynamics, we can define a quantity, called a Lagrangian L, which ranges over velocity and position, and equals the kinetic energy minus the potential energy. This quantity can be used to rewrite Newton’s law, by way of the Euler-Lagrange equation, to describe a particle’s motion in one dimension (an idealized version of the problem, which involves more than one particle; see Stevens (1995: 27–39 and 59–68) on these matters):

d/dt (∂L(x, ẋ)/∂ẋ) − ∂L(x, ẋ)/∂x = 0

The particle path that satisfies the Euler-Lagrange equation makes the function A (the action) a minimum:

A[x(t)] = ∫_{ti}^{tf} L(x, ẋ) dt
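As a worked illustration (my own, not part of the original note), take the standard one-dimensional Lagrangian L(x, ẋ) = ½ m ẋ² − V(x). Then ∂L/∂ẋ = m ẋ, so d/dt (∂L/∂ẋ) = m ẍ, while ∂L/∂x = −V′(x); substituting into the Euler-Lagrange equation gives m ẍ = −V′(x), which is just Newton’s second law with force as the negative gradient of the potential, and the trajectory obeying it is the one that minimizes the action A.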

5 This corresponds to the procedure of “adiabatic elimination” of fast relaxing variables. The procedure is used, for instance, in reducing degrees of freedom within a probabilistic equation (see Meinzer 1994: 66 and ff.).

6 Uriagereka (1998: Chapter 6) offers a speculation as to what it means for a feature to be “viral” (as assumed in Chomsky 1995b).

7 This is not meant metaphorically. The Slaving Principle is surely a clearer determinant factor in the behavior of turbulence than in linguistic examples.

8 Why the growth of the snail shell proceeds the way it does is a complex matter involving genetic information and epigenetic processes of various sorts. On an early, extremely insightful view on this matter, see Thompson (1945: Chapter VI).

9 This version of the Linear Correspondence Axiom is discussed in Uriagereka (1998: Chapter 3), and is adapted to fit a bare-phrase structure theory.

10 Domination can be defined in terms of set-inclusion of constituent elements within terms, as in Nunes and Thompson (1998).

11 (3b) cannot be monotonically assembled into a unitary phrase-marker, given a “bare” X′-theory; instead, the system allows the merger of structures which have been previously assembled, by way of generalized transformations.

12 A command unit roughly corresponds to one of Kayne’s (1984) “unambiguous paths,” appropriately adapted to the present system. As Dave Peugh and Mike Dillinger independently point out, command units can be generated in terms of Markovian systems, and are thus iterative (not recursive) structures. In the present system, recursion is obtained through a generalized transformation.
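The iterative/recursive contrast can be made concrete with a small executable sketch (mine, not the author’s; Python is assumed and the function names are invented): a command unit grows by a Markovian loop that keeps merging new items into the current structure, while embedding one independently assembled unit inside another takes a further, recursive step – the generalized transformation.

    # Hypothetical sketch: iterative growth of a command unit versus
    # recursive embedding via a separate (generalized) application of Merge.

    def merge(a, b):
        """Binary Merge, represented here simply as a pair."""
        return (a, b)

    def grow_command_unit(items):
        """Iterative (Markovian) growth: each new item is merged with the
        current structure, yielding a single command path."""
        structure = items[0]
        for item in items[1:]:
            structure = merge(item, structure)
        return structure

    cu1 = grow_command_unit(["saw", "Mary"])
    cu2 = grow_command_unit(["picture", "of", "John"])

    # Only a further application of Merge to two independently assembled
    # units -- the generalized transformation -- embeds one in the other.
    embedded = merge(cu2, cu1)
    print(embedded)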

13 In Kayne’s terms, the correspondence is of the following sort (for A an abstract root node: ⟨b, c, d, …⟩ a sequence of ordered terminals, and ⟨t1, t2, t3, t4, t5, …⟩ a sequence of time slots in the A/P components):

(i) A→ t1

A b→ t2

A b c→ t3

A b c d→ t4

A b c d … → t5


What we must determine is why this correspondence obtains, as opposed to other possible mappings (an issue first raised by Samuel D. Epstein, as far as I know).

14 Otherwise, one would have to define a new structural relation, deduce it and show this to be a better alternative to command; I do not see what that could be.

15 Generally: y = f(x) (for f a variety of procedures). For instance: y1 is mapped to the x value three-times removed from 0; y2 is mapped to the x value prior to x1; y3 is mapped to the x value three-times removed from x2; and so on.

(i) [diagram not reproduced here: the y-values plotted against PF slots 1–7 under such a mapping]

(i) converges (the hierarchical ordering is appropriately mapped to some sequence of PF slots); but is not a simpler realization of y = f(x) than y = x.
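One way to make the contrast concrete (a sketch of my own, based on one reading of the informal description above; Python is assumed and the function names are invented) is to compare the identity mapping with a convoluted but still convergent alternative:

    # Hypothetical sketch: two convergent mappings from hierarchically
    # ordered elements y1, y2, y3, ... onto PF time slots.

    def identity_mapping(n):
        """y = x: the i-th element goes to the i-th slot."""
        return list(range(1, n + 1))

    def zigzag_mapping(n):
        """One reading of the mapping described in the note: alternately
        jump three slots forward and one slot back."""
        slots, current = [], 0
        for i in range(n):
            current = current + 3 if i % 2 == 0 else current - 1
            slots.append(current)
        return slots

    print(identity_mapping(5))  # [1, 2, 3, 4, 5]
    print(zigzag_mapping(5))    # [3, 2, 5, 4, 7] -- convergent, but no simpler than y = x

Both assign each element its own slot, so both converge; the point of the note is that only the first is the simplest realization of y = f(x).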

16 Whether or not these mechanics in terms of substitution are necessary depends on the particular version of the Multiple Spell-Out system that one assumes (see Chapter 3).

17 More generally, the proposal has empirical consequence whenever we find structures that do not depend on merger (such as discourse representations or paratactic dependencies involving adjuncts). See Hoffman (1996) on this.

18 The LF component being internal, here we can be bold and talk not just of structural properties, but in fact of universal structural properties.

19 The flattened word-like object that results from L has to be understood as a word-level unit. Following a suggestion by Jairo Nunes, in Chapter 3 I deduce from this the impossibility of relating subject/adjunct internal structure to the rest of the phrase-marker (Huang’s (1982) CED effects).

20 If a proposal first noted in Chomsky (1964), and attributed to Klima, is on the right track, perhaps a discourse representation line can be pursued for these examples. Klima’s suggestion was that wh-expressions hide an indefinite predicate of existence; who in (8b) is akin to which x and exists x. This essentially indefinite predicate of existence should be sensitive to Wasow’s (1972) Novelty Condition, with the pronoun his introducing a more familiar expression inducing an odd interpretation. The same can be said about everyone in (8a), if this element too contains a hidden indefinite one, as its morphology indicates (see Chapter 3).

21 I do not make any commitments, however, as to whether this is the case.

22 Just as the Last Resort Condition reduces computational complexity in derivations, so too does the Minimal Link Condition, if derivations that violate it are canceled. I do not see how this condition might follow from something like the Slaving Principle, but it might conceivably relate to other conditions on systems imposing locality and predicting field and “domino” effects.

23 In this guise: A head α moves to a head β only if α and β are L-related.

24 We are talking about structural properties which are conserved across derivational processes. It should be remembered, though, that quantity conservation laws in physics have helped in the understanding of particle families, and predicted new particles that were later on discovered.


25 These are argument clitics. In contrast, sentences such as (i) are possible:

(i) te me vas a resfriar
you me go to get.a.cold
“You’re going to get a cold on me.”

However, me is not an argument of resfriar “get a cold,” but is something more akin to an argument of the assertive predicate introducing the speaker’s perspective. An analysis of this and related cases would take me too far afield.

26 This is welcome. The nominal carrying an uninterpretable feature must then raise to check the number feature in the determiner (see Longobardi 1994). Had the feature in the nominal been interpretable, this raising would be unmotivated. We would then have to say that in Spanish the determiner has a strong feature for the nominal to check, unlike in English – which would lead to two rather different LFs for each language. Interestingly, Jairo Nunes points out that several variants of Brazilian Portuguese overtly encode number only in determiners. These ideas also relate to Chomsky’s (2000) notions of “probe” and “goal.”

27 Nunes & Thompson (1998) modify the notion “dominates” so as to have it hold of features. In class lectures (Fall, 1995), Chomsky abandoned the concept of “checking domain” altogether in favor of a theory of sub-labels. This is in the spirit of everything I have to say here, where it is crucially features, and not categories, that matter for various syntactic purposes. See also Nunes (1995) for much related discussion.

28 See Halle and Marantz (1993: 129 and ff.) for a recent treatment of why forms like *oxens are not attested. This case is slightly different from the one I am discussing, *lionses, in that the latter involves two obviously identical plurals. It may be significant that infants do produce *oxens but never *lionses.

29 This idea was suggested in Uriagereka (1988a: 54).

30 That is, while the grammar codes the presence of a context variable (in essence, [+s] is a contextual feature), it does not assign a value to it (I or II), any more than it assigns values to other context variables.

31 These were noted in Uriagereka (1995b): Section 4.

32 As Viola Miglio points out, (25d) contrasts with the perfect Italian (i):

(i) Qui glielo si invia.
“Here one sends it to them.”

Thus, (25d) cannot be out for semantic reasons. In contrast, (ii) (provided by Jairo Nunes to illustrate a phenomenon which Eduardo Raposo also notes) indicates that the impossibility is not merely phonological either. Thus, while sentences involving the ⟨se, se⟩ sequence are impossible in Portuguese, (ii) is perfect:

(ii) Se se morrer de amor…
if se(impersonal) would.die of love
“If one were to die of love…”

Crucially, though, the first se here is not a pronoun, but a complementizer.

33 I am not implying with this that se has a [+s] feature; see below.

34 I am abstracting away from the exact status of (29a). What follows is a version of an analysis I attempted in (1988a), but was not able to put together. It owes much to the seminar on binding and Case taught by Luigi Burzio at College Park. Although I do not follow, specifically, his approach to these matters, they have greatly influenced my way of looking at the problem. See Burzio (1996, 2000).

35 That is, the direct object features do not directly move to v, particularly because in many languages involving V movement, this would have to imply incorporation onto a trace, plausibly barred by Chomsky under the view that chains are integral objects whose parts cannot be transformationally targeted.

36 In the version of the theory being explored by Chomsky in class lectures (Fall 1995), only heads are targeted for featural movement, specifiers being involved as a morphological side-effect. Then the checking domain reduces to the sub-labels of a head, the set of dependents of this head which associate via adjunction. Everything else said here remains unchanged, provided that the relevant domain of checking (technically, not a “checking domain”) is a set.

37 Mathematically, this does not go into orders of complexity different from those involved in standard features. Matrices are needed either way.

38 The intuition is to relate local obviation to switch reference phenomena, of the sort studied in Finer (1985). Matters have to be slightly more complicated than implied in the text, given the fact that the subject of an ECM verb’s complement (normally marked with the accusative value) has to be disjoint from the object of this verb (also marked accusative). There are different ways to address this puzzle, but I will put it to the side now.

39 This parameter is otherwise extremely hard to motivate, if Case is an uninterpretable feature pertaining, ultimately, to the covert component.

40 The Mojave example in (i) (attributed by Lasnik (1990) to Langdon and Muro (1979)) suggests that this view is correct, in light of what is said in Note 37:

(i) ?inyec pap ?-∧kxi:e-m Judy-c salyi:-k
I.sg potato I-peel-DR Judy-subj fry-Tense
“After I peeled the potatoes, Judy fried them.”

As Lasnik observes: the whole switch reference system is still exhibited even with I and II pronouns. While this is surprising from a semantic point of view, it is natural from the purely formal perspective that I have just discussed.

41 Regardless of this, the issue is moot if only D features make it to the same checking domain of t that the formal features of the pronoun do. This is particularly so if we assume that names, just as any other arguments, are headed by D, which is what gets to be in the checking domain of T. See Longobardi (1994), who builds on the essentials of Higginbotham (1988).

42 The Danish data are courtesy of Sten Vikner, to whom I am indebted for an insightful discussion of these issues. See Vikner (1985) for a full presentation.

43 These ideas go back to Burge (1973), who took name rigidity to follow from an implicit demonstrative. Higginbotham (1988) reviews and reworks Burge’s insight in terms that have inspired the proposal in the text.

44 Why this should be so is, in and of itself, interesting, but I have nothing of any profundity to say about it, other than it fits well with the rest of the system.

45 See Cole and Sung (1994) for this sort of analysis. Uriagereka (1988a: Chapter 4) presented an analysis along these lines as well, with empty operator movement to Infl + Tense. This specific analysis is more in the spirit of what I have to say immediately below, since I do not think it is either sig or selv that moves.

46 For some reason that I do not understand the adverbial mismo is obligatory.

47 Templatic conditions of this sort are known to be relevant in various areas of morphology, across languages. Schematically, for Romance [+s] tends to come before [−s] (notorious reversals exist in Aragonese and Old Leonese). In turn, the unspecified [s] clitic (se) is a bit of a wild card. In Spanish, for instance, it comes first, before strong and weak clitics. In Italian, in contrast, it comes after weak clitics, but before locative clitics (Wanner 1987). In Friulian, it comes as a verbal prefix (see Kayne (1991: 664) for an analysis), and in archaic Italian, as a verbal suffix (Kayne 1991: 663).

48 For discussion on how this affects the Case system see Raposo and Uriagereka (1996), where it is argued that structures involving se may involve Case reversal situations, as expected.

49 None of the arguments that Chomsky gives for the Thematic Criterion carry through. For instance, it is said that without a Thematic criterion, (ia) should outrank (ib), by involving one transformation less:

(i) a. [John t [v [used Bill]]]
    b. [John T [t v [used Bill]]]

However, at the point of moving John, the sentences involve two different partial numerations, and are thus not even comparable for optimality purposes. Chomsky also wants to prevent (ii) in thematic terms:

(ii) I believe [t to be a great man]

But as John Frampton (personal communication) points out, it is not obvious how a great man receives Case if the sort of believe that allows raising (selecting for the relevant sort of infinitival) is essentially unaccusative.

50 These same mechanics can be extended to (expletive, argument) pairs, without needing to stipulate that the former are morphemically related to the latter. All that matters is that the associate’s features end up in the same checking-domain-set as the expletive features, as argued in Chomsky (1995b).

51 In (47b), I am assuming a simplified version of Larson’s (1988) analysis.

52 The idea here is that checking domains are set-theoretic notions super-imposed on phrasal dependencies (see Uriagereka 1998: Chapter 5 for a detailed definition).

(1996). I should note, however, that our interpretation of indefinite se creates anapparent problem, since the interpretation of se in dative sites (47c) is not necessarilyindefinite. I suspect this relates to another fact about dative clitics which is discussedin Uriagereka (1995b): their double can be definite or indefinite, something which ispeculiar (the double of accusative clitics cannot be indefinite). Arguably, then, theinterpretation of dative clitics is simply unspecified for definiteness.

54 Reinhart’s proposal is in many respects rather different in spirit from Chomsky’s. She believes that “interface economy . . . determines the shape of the numeration: . . . it is at this stage of choosing the ‘stone blocks’ that speakers pay attention to what it is they want to say” (Reinhart 1995: 49). In contrast, Chomsky asserts that “there is . . . no meaningful question as to why one numeration is formed rather than another . . . That would be like asking that a theory of some formal operation on integers – say, addition – explains why some integers are added together rather than others . . . Or that a theory of the mechanisms of vision or motor coordination explains why someone chooses to look at a sunset or reach for a banana. The problem of choice of action is real, and largely mysterious, but does not arise within the narrow study of mechanisms” (Chomsky 1995b: 237).

55 Actually, any sets would do the trick, although checking domains as set-theoretic objects are natural domains for everything I have said here to happen.

9 INTEGRALS

1 Observe that it would be consistent with this account if we added a layer to the underlying phrase structure.

(i) [Spec be [DP Spec D0 [DPposs [Spec Agr0 [SC John a sister]]]]]

This would yield an underlying small clause structure without any functional material. The derivation would then proceed with John moving to Spec Agr0 and then to DPposs from there on. The derivation would be as in the text.

2 This is discussed in Section 3. The matter was already discussed in Keenan (1987) and De Jong (1987), and observed as early as in Benveniste (1966).

3 This does not mean to say that the movement of the [D/P] occurs so that a minimality violation can be avoided. Such altruistic movement is barred given the assumptions in Chomsky (1995a, 1995b). More likely, the movement occurs to license the [D/P]. This would make sense if in these sorts of cases the [D/P] is null and that incorporation is required to license the expression. See den Dikken (1992) for suggestions along these lines for other null prepositions.

4 The details of this process are not fully understood. We do not know when deletion is required and when not. Furthermore, it appears that copying is understood as preserving reference rather than copying as such, given the presence of the pronoun. For relevant discussion see Fiengo and May (1994) and their discussion of “vehicle change.” Alternatively, the pronoun would be the Spell-out of the relevant lexical material (see Uriagereka (1994) for discussion of this traditional idea).

5 This assumption is defended in Sportiche (1990). In other work, we suggest a different alternative, by semantically analyzing each structure as follows.

(i) a. ∃e [Infl (a-Ford-T-engine, e) & in-my-Saab (e)]
    b. ∃e [Infl (My Saab, a-Ford-T-engine, e) & in (e)]

(ia) is invoked in SI instances and is simpler than (ib), the structure associated to II instances. While the former is an unaccusative structure, the latter is transitive, representing a relation of in-ness (or of-ness, to-ness, and similar instantiations of relation R). Following Bresnan (1994), we assume that only locative unaccusative structures allow the processes of long predicate raising. (For discussion of predicate raising, see den Dikken 1992.) This is exemplified in (2b), where the predicate in my Saab raises to satisfy the EPP. In (13b) it is a Ford T engine that raises, which is also a predicate in our terms. Note, however, that the predicate in this instance comes from a transitive structure (ib), hence cannot be long-moved.

6 Alternatively, it follows trivially from the suggestion made in Note 5.

7 Note, for example, that Canada has nine provinces is false, and would be true if the expression could convey the thought that nine provinces are located in Canada (which is true). This problem does not appear with SI constructions. There are two doctors in NYC is obviously true (even if there surely are more than two doctors in NYC in toto).

8 This is particularly clear in Spanish where plural associates do not trigger plural agreement in II interpreted existentials.

(i) a. Había muchos wateres en el tercer piso.
“There was (sg.) many toilets in the third floor.”

b. Había(n) muchos wateres en el tercer piso.
“There was/were many toilets in the third floor.”

(ia) means that the third floor had many toilets. (ib) means that many toilets were on the third floor. Note that it is the SI existential that can show agreement in Spanish, for reasons that we will not go into here.

9 This is the unstressed some. When stressed it can be used exclamatively.

(i) Wilbur is SOME pig!

10 A similar assumption is made in Reuland (1983), Loebner (1987), and Hornstein (1993), among others.

11 For sentences such as (i) Keenan proposes that there is a null predicate.

(i) There is [a God XP]

12 Keenan (1987) ties the DE in have-constructions together with those in there-existentials. He does this by licensing the subject in (i) by binding an open position that the relational noun inherently has.

(i) John has a brother (of his) in college.

In (i), John binds his and gets its θ-role accordingly. If so, (i) does not have a thematic subject and, if one assumes that have is semantically vacuous, the structure of the post-have material is (ii), similar to the post-copular material in there-clauses.


(ii) [sc [NP a brother of John’s] [in college]]

This early analysis is reminiscent of much of what we have to say, with Keenan arguing that some of the sentences we take to be ungrammatical are simply not interpreted existentially. For instance, consider (iii).

(iii) Michael Corleone has Sonny’s brother in NYC.

(iii) can mean that Michael has kidnapped Sonny’s brother Dino in NYC, but not that Michael has a brother in NYC who also happens to be Sonny’s brother. It is hard to see how a more traditional analysis would deal with these sorts of facts, but we do not think that Keenan’s interesting proposal should be extended to all regular existential constructions with there.

13 The following also have a partitive feel to them.

(i) I have a brother (of mine) in college.
(ii) John has a picture (of his) in the exhibition.

These sentences invite the inference that I have other brothers and John other pictures. In other words, they have a partitive undertone. This contrasts with sentences such as (iii) in which the post-verbal NP fails to offer a similar invitation.

(iii) John saw a picture at the exhibition.

14 We do not wish to convey the idea, however, that the functional/physical distinction is an ontological one, one being more “material” than the other. Throughout, we are talking about cognitive spaces.

15 Similar relations, at an even more abstract level, need to be postulated for “inalienable” possessions. Of course, there is no obvious sense in which we are constituted of our relatives. This suggests that what is central in unifying the C and R relations has little to do with constitution proper. One possible approach is as follows (for examples of the sort in the text):

(i) [Extension (x,e) & Division (y,e) & in (e) & Saab (x) & Ford-T-engine (y)]

The semantics in (i) translates an IS expression as a quantification over an event of in-ness (an integration) spatially extended in terms of a Saab, and expressing a spatial division of this extension in terms of a Ford-T-engine. Suppose the roles EXTENSION and DIVISION are primitive cognitive notions, which map onto traditional axiomatic operators on Boolean algebraic spaces (analogous to “+” and “×” for arithmetics). These operators express part-whole relations when operating on eventualities of in-ness, but can express other sorts of relations as well: (inalienable) possession when operating on eventualities of of-ness (the father of the bride); abstract constitutions for eventualities of to-ness (there’s various sides to this issue); and perhaps others. The point is, whereas an extension at a material level is divided in terms of constituent parts, an extension at a more abstract level may be divided in more abstract terms. Crucially, what is extended in (i) is a relation of in-ness, through a Saab (it is not the Saab which is being extended). Along these lines, consider (ii):

(ii) [Extension(x,e) & Division (y,e) & of(e) & bride(x) & father(y)]

There is no reason why (ii) should say anything about a bride being constituted of a father. All that (ii) (partially) describes is an eventuality of of-ness (an abstract relation), extended through a bride and measured through a father. In that abstract space, the relation is one of bride-dom, and fathers can apparently serve as appropriate measures of such spaces.

16 It is possible that the clitic first adjoins to D/P and then the whole complex moves out of the small clause. This prior adjunction would evade minimality restrictions.

17 Thus, for instance, in (i) four stomachs are typically at issue:


(i) On leur a lavé les estomacs aux agneaux.
they to-them have washed the stomachs to the lambs
“We washed the lambs’ stomachs.”

10 FROM BEING TO HAVING

†This chapter would not have been possible without the many comments, debates, criticisms, and advice from the participants in my recent seminars at College Park and the Instituto Universitario Ortega y Gasset. I cannot credit everyone adequately for their valuable contribution. I cannot do justice, either, to the vast amount of literature that has emerged in recent years around the Kayne/Szabolcsi structure. Let me just say that the present chapter should be seen as a mere companion to those pieces. Finally, my gratitude goes to the generous hosts of LSRL 27 at Irvine (especially Armin Schwegler, Bernard Tranel, and Myriam Uribe-Etxebarria) and the audience at my lecture, whose suggestions have helped me focus my own thoughts. The present research was partially supported by NSF grant # SBR 9601559.

1 This of course is not obvious, and would force us to treat this element essentially as a resumptive pronoun.

11 TWO TYPES OF SMALL CLAUSES

†Parts of this material were presented at the GLOW conference in Lund, and at colloquia at the University of Rochester and the CUNY Graduate Center, as well as a seminar at the University of Maryland. We appreciate comments from all these audiences, as well as from our students and colleagues. We also appreciate critical commentary from an anonymous reviewer and Anna Cardinaletti and Maria Teresa Guasti, the editors of Small Clauses.

1 Many of these examples are somewhat marginal in English, perhaps because Case realization in this language in the SC subject is not through a dative marker, as in Spanish.

2 Nominals seem like purely individual-level predicates, thus:

(i) ?*I saw him a man.

However, Schmitt (1993) notes that (ii) is fine in Portuguese, with the import of “he has turned into a man” or “he looks like a man.”

(ii) Ele está um homem.
he ESTÁ a man

This suggests that the impossibility of (i) in English is not deep, but perhaps again a result of Case theoretic matters (see Note 1). In turn, participial elements seem like purely stage-level predicates. So far as we know, (iii) is out in all Romance languages where the estar auxiliary is used:

(iii) *Juan es despedido.
Juan ES fired

(cf. “Juan está despedido.”)

Of course, (iii) is fine with a passive interpretation, which might relate to why this sort of predicate cannot be coerced into an individual-level reading.

3 Kratzer (1988) claims that certain individual-level structures are more constrained for modification purposes than comparable stage-level structures are. Thus:

(i) a. Most people are scared in Sarajevo.
    b. Most people are black in Louisville.


(ia) can be true of most of the inhabitants in Sarajevo or of most of the people that happen to be there. In contrast, Kratzer takes an example like (ib) to be true only of the inhabitants of Louisville, not the people that happen to be there. However, Schmitt (1993) points out that there may be a pragmatic factor involved here. Thus, consider (ii):

(ii) Most children are intelligent in Central High School.

(ii) is ambiguous, apparently in the same way that (ia) is. It can mean that most children in that school are intelligent, or that when in that (mediocre) school any child actually stands out as intelligent.

4 A reviewer points out that, in instances of this sort, the subject may precede negation and some adverbs. If this is optional, the point still holds for the option where the subject does not precede negation or the adverbs. The reviewer also notes that PRO could be inside VP with the lexical subject occupying some intermediate projection in a more articulated clausal structure.

5 De Hoop works within a Montagovian system partly enriched in DRT terms (Heim 1982; Kamp 1984). We assume neither, and instead work our proposal out in a neo-Davidsonian system. One other semantic proposal that we will not go into here is Chierchia (1986), which analyzes the individual-level/stage-level distinction in terms of an implicit genericity operator for individual-level predications. This sort of approach may run into difficulties with (i), from Spanish:

(i) Bobby Fischer es genial, pero no estuvo genial en Yugoslavia.“Bobby Fischer is genial, but he wasn’t genial in Yugoslavia.”

This is a very typical instance where auxiliaries ser and estar can be used to distin-guish the standing nature of a characteristic vis-à-vis its transient state. It is one ofFischer’s standing characteristics that he has genius, but that does not mean that hecannot have a bad day. Conversely, consider (ii):

(ii) Soy triste de tanto estarlo.“I’m sad from being that so much.”

This asserts that being in a general state of sadness makes the poet sad in a standingmanner. If the latter (expressed through ser) presupposed the former (expressedthrough estar), then (ii) would be an uninformative tautology – which it is not.

6 There are a variety of topics that are entirely irrelevant for our purposes. We are just concerned with those which do not introduce emphasis, contrast, focus, etc., but are neutral starting points for a sentence.

7 A proposal of this sort was explicitly made in Chomsky (1977b), with different machinery and assumptions. See also Lasnik and Saito (1992) for an alternative in terms of adjunction, and references. Uriagereka (forthcoming) argues for a principle along the lines of (i), responsible for obviation facts, referential clitic placement, and others:

(i) B is referentially presented from [or anchored to] the point of view of the referent of A iff A is a sub-label of H whose minimal domain M includes B.

From this perspective (discussed in the Appendix), what drives processes of fronting to the vicinity of a subject – which is to be responsible for a given judgment – is the need to place the raised material in the minimal domain (essentially, the binding domain) of the responsible subject. If (i) is correct as an LF principle, it may be immaterial whether the landing site of the fronting is a Spec (e.g. the Spec of F) or an adjunction site à la Lasnik and Saito. However, it may be the case that (i) is an interface principle of the post-LF mappings, in which case it would indeed matter whether at LF a feature checking mechanism drives the relevant movements (which would argue for a separate category like F). We will proceed assuming F for concreteness and because of the issues to be discussed immediately below.


8 See below on this. To insist, this operation is not a result of QR or any semantically driven mechanism, pace Herburger (1993a), where the intuition is taken from. See also Guéron (1980).

9 A reviewer raises the question of whether the F position is inside the SC. The answer must be no, assuming the simplicity of these objects. The F position is needed solely for the pragmatic subject to land on at LF. In all instances, it is outside of the periphery of the clause, be it small or regular. However, see the Appendix for more on this.

10 We adapt this idea on nominative in Romance from Zwart (1989). Note that realizing a default Case does not mean that the Case is assigned by default. Assignment is in the usual way, but default realization emerges in peripheral sites.

11 Similarly, the realization of A-case may be morphological or in terms of government by a Case assigner. Redundantly (or alternatively), the entire structural process may be signaled through a given auxiliary.

12 A reviewer asks whether N’s come with an event variable even when N is not a predicate. In this system, though, every N is a predicate at some level, even if not necessarily the main predicate. This is true even for names (see below).

13 There is a complication with this approach. Consider (i):

(i) A former world champion raped a beauty contestant.

The predicate here is thetic, which means the variable in former world champion must be bound by the event operator. Presumably this means that the rapist was a former world champion at the event of raping. However, this is not necessary. Thus, suppose that the rapist was world champion at the time of the event, although he is not now. The speaker may choose to refer to him as a former world champion, and rightly so for he is not a champion any more. It is not entirely clear how former is interpreted outside of the event of raping if the event variable of the noun is bound by the event operator. If this is indeed a problem, it will cease to be an issue once we develop the contextual system we propose below.

14 A more standard form of (17b) exists with many instead of much, but the properties of this expression are significantly different. For instance, the latter binds a pronominal variable, but the former does not:

(i) En España hay mucho torero # que está desempleado.
    “In Spain there’s much bullfighter # who is unemployed.”

See also (28) in the text.

15 A reviewer is concerned with the meaning of Fischer in (i):

(i) Fischer is our best friend.

If Fischer is a predicate, what is our best friend? The latter is the main predicate of the assertion. But surely there are other predicates here. The difference between all of them is how they are bound. We take it that Fischer is bound by something like a rigidity operator internal to the projection of the subject, and hence its predicative status does not carry over to the main assertion.

16 A reviewer is concerned about the difference between the notions of context and event. Our system essentially builds on Schein (1993) on this point. We take context variables to be predicated of event variables – hence the two are of a different order. Note, incidentally, that we are not suggesting that we should get rid of event variables (this would not make any sense from our perspective). Rather, event variables are not the mechanism to deal with the issue of the transience of thetic predications, and for that we need context variables.

17 In fact, this is the essence of Herburger’s insight, now reinterpreted in minimalist terms enriched with a realistic semantics.

18 Xx just means that X holds as a predicate of x.


19 In fact, even within simplex sentences like the matrix one in (i):

(i) Every golfer hit the ball as if he/she was going to break it.

Incomplete definite descriptions such as the ball in (i) need a previous context for their uniqueness to hold. That is, (i) in its most salient reading means that every golfer hit the ball that he/she hit as if he/she was going to break it. The content of “that he/she hit” is expressed for us through a free context variable. The value of this variable must be set in terms of a context associated to each of the hitting events (see Uriagereka 1993).

20 The hypotheses make different predictions. A predicts that context is determined hierarchically, whereas B predicts that context is determined linearly. However, both approaches make their prediction with respect to highly elaborate LF or post-LF representations, and not overt structures. Hence, for our purposes now it is immaterial which of the hypotheses holds, since at LF we literally scope out the element which anchors subsequent contexts. This element is both hierarchically superior and linearly precedent vis-à-vis the element whose context it is intended to set.

21 For Szabolcsi or Kayne, sentences like John has a brother studying Physics and ?There is a brother of John’s studying Physics have a similar source, roughly (i):

(i) [BE [John [a brother]]]

Each sentence is derived by way of either John or a brother raising for various reasons. Chapter 9 interprets the relation [John [a brother]] as a “possession” SC, and extends these possessive relations to a number of related instances. See also Keenan (1987) for similar ideas involving “integral” relations.

22 The main point of Szabolcsi’s analysis is to show the independence of “possessors” vis-à-vis “possessed.” Uriagereka (1993) also discusses how to ensure that in an expression like every one of the men is available, every can take as a restriction the sort of structure in (25), and still have the same truth values as every man is available, where the restriction is much simpler. The issue is of no relevance to us now. It remains a fact that this paraphrase holds, and it cannot be explained away in the usual semantic claim that reduces every one of the to a determiner (see Keenan 1987). Syntactically, this is unacceptable.

23 Evidently, the move also forces us to consider “variables” as objects of a predicate type, which poses questions about the relation between this predicate and the one denoting the set where the partition occurs. Though real, those very questions arise, independently, for partitive expressions in general. The only radical move being made at this point is to assimilate those partitive expressions to at least some quantificational ones, although not to all, as becomes apparent immediately.

24 Chapter 9 suggests extending an idea along the lines in (27) to expressions which do not come in a partitive format.

25 Intrinsically distributive quantifiers, such as cada “each” in (23), apparently license pro as in (27) even in the apparent absence of overt number marking. Importantly in this instance, though, relevant quantifiers do not tolerate plural expressions; thus *cadas los hombres “each-PL the men” contrasts sharply with todos los hombres “all-PL the men.” This suggests that the distributive quantifier does have number specifications associated to it, but they must be singular.

26 It is worth pointing out that some concrete expressions with auxiliary estar have lexicalized into a frozen idiom which does not tolerate auxiliary alternations with ser. For instance, in the Castilian dialect: estar (*ser) loco/como una cabra “to be crazy/like a goat (nuts),” estar (*ser) buena/para parar un tren “to be good looking/ready to stop a train (gorgeous).” Interestingly, these readings invoke standing characteristics of individuals, regardless of the auxiliary. Predictably, in these circumstances examples of the sort in (28a) are fine, without any further contextual qualifications:


(i) a. Todo portero está loco/como una cabra.
       All goal-keeper is crazy/nuts

    b. Toda quinceañera está buena/para parar un tren.
       All teenager is good-looking/gorgeous

12 A NOTE ON RIGIDITY

†I wish to express my gratitude to Chris Wilder and Artemis Alexiadou for their interest in this piece, and for editorial assistance. I also thank Qi Ming Chen and Yi Ching Su for the Chinese data and valuable discussion, and Norbert Hornstein and Elena Herburger for commentary and (dis)agreement. Any errors are mine.

1 It is not at issue that we might have called Antony “Brutus” or any other name.

2 I’m not trying to imply that there is no known solution to this puzzle. “Counterpart” theory and complex enough versions of modal logic have the grounds to solve it, for instance.

3 One could complicate the picture in other directions – e.g. invoking characteristics of Greeks or Trojans as nations, or each of the individuals in the relevant sets. However, I am trying to idealize here precisely to understand what might be involved in the more elaborate modes in (1).

4 If this sounds too much like a description, substitute it for the name of your favorite group, for instance, Nirvana. Literally everything said applies to them vis-à-vis, say, the Bee Gees.

5 It may seem that by changing the parts of a whole one always changes the whole, but this depends on the nature of both whole and part. For example, a complex dynamic system like that involved in the crystallization or vaporization of water involves a whole whose parts are H2O molecules. You can change this or the other H2O molecule, and ice will still be ice and steam steam, under similar conditions of temperature/pressure/volume. Note, incidentally, that the issue has nothing to do with parts being essential or accidental. It is essential that ice is composed of H2O molecules, yet changing some of these molecules for other H2O molecules has no consequence.

6 Putting aside the “affective” reading that arises when saying “that (drunkard/bastard/glorious) Buster Keaton was a true genius!”

7 It can also affectively refer to (oh!) that happy Dalai Lama, a reading I am setting aside.

8 Burge could always try to stick to his guns by claiming that overt demonstratives do not behave like covert ones, but that seems unilluminating and unfalsifiable.

9 Bear in mind that manifolds can have different dimensionalities, and what is apparent connectivity at some level is not at a lower level. This may be going on in “archipelago” or “silverware,” which are not contiguous in obvious dimensions, but are indeed temporally contiguous (which can be thought of in a fourth dimension). I do not know whether this extends to “country” or “furniture,” or whether there we need contiguity at an even higher, perhaps modal, dimension.

10 I shall now ignore the element of, which need not be present in other realizations of the relevant expression, such as some Antony modes or Antony’s modes. The fact that all these are possible constitutes evidence for the syntax implied here, but I shall not repeat the arguments I give in Chapter 15 for this.

11 That is, they are articulated manifolds built from progressively more complex spaces, all the way down to a more elementary space.

12 I read these last paragraphs again and they all sound like excuses. Therefore, I shall leave them. If I did not feel insecure about this, I would say that much of what I have done here is in the spirit of Larson and Segal (1995) at the intentional level and Jackendoff (1990) at the conceptual level. However, since I have pushed matters quite a bit in directions that not even they have pursued, I am ready to take the blame gallantly, all by myself.


13 PARATAXIS

†This chapter was read at the Georgetown University Round Table on Linguistics (1995). We are grateful to the organizers of the event, especially Hector Campos, and the participants in the workshop for useful comments and suggestions.

1 The how we have in mind is not adverbial in the intended reading. Rather, it appears in “he explained to us how we should never be late,” which is of course different from “he explained how we should behave” where the mode of behaving is part of the explanation. There are no obvious modes of “being late.”

2 Ross (1967) discusses data like (i) as a surprising violation of his Complex NP Constraint:

(i) Which company did you hear rumors/a rumor that they have squandered?

Uriagereka (1988a) presented this as evidence for the complement status of the clausal dependent of the noun. But while the facts are clear, it is far from obvious what they constitute evidence for in the present system.

3 We mark this sentence with an asterisk even though it is somewhat acceptable with an incomplete definite description interpretation for the predicate. The sentence improves as (i):

(i) That the earth is flat is the rumor that I heard.

The contrasts shown by Spanish in (ii) are significant in this respect:

(ii) a. El que la tierra es plana es el rumor que escuché.
        The that the earth is flat is the rumor that I heard.

     b. *Que la tierra es plana es el rumor que escuché.
        That the earth is flat is the rumor that I heard.

(iia) is good, but notice that the CP is a DP: el CP. Hence, it seems as if (iia) involves a relation between two nominals, and not a sentence and a nominal. In fact, it seems as if the nominals in (ii) are both DPs, and the relation is equative. It may be the case that the same is true about the English (i), even when the sentence in subject position does not have the obvious form of a nominal. At any rate, when the relation between the sentence and a nominal is clear (as in (iib)), the result is ungrammatical. It should perhaps be emphasized, also, that it is not the case that all Spanish sentences in subject position need to be nominal. Thus:

(iii) Que la tierra es redonda es verdad.
      That the earth is round is true

4 In (17), linear order is irrelevant, and is assumed to follow after the rule of Spell-out from Kayne’s (1994) LCA, in the version presented in Chomsky (1995b). For our purposes, it is also irrelevant whether rumor or truth project as N’s or as D’s, and we will simply use an “X” to label whatever projection is relevant – admitting, even, that it may be different in each instance. The notation employed in (17) is taken from Chomsky (1995b). Note, in particular, that categorial labels are underlined, and that the labels are different in each instance. In (17a), the label is the object X. In contrast, in (17b) the label is the ordered pair ⟨X, X⟩. This indicates that the merger in (17b) projects a segment, while the merger in (17a) projects a category. The rest of the information in the categorial notation in (17) is not ordered. That is, the information coding the constituent sets is simply stating that, for instance, the label in (17a) is obtained by merging X and CP, in no particular order.

5 Curiously, we need “to it” to make the sentence fully grammatical:

(i) ??That the earth is round has some truth.


In Chapter 9, similar elements are observed in the context of mass terms, although in those instances they are fully optional:

(ii) This ring has some gold (in it).

At this point, we simply note the fact, and have no deep explanation for it.

6 These sorts of issues have been explored at length in Etxepare (1997).

7 Although this of course is not necessary, it could be that in the present system como is an unanalyzable item, which nonetheless targets a null D for checking reasons.

8 Kitahara (1997) allows for LF-mergers which do not extend phrase structure. However, his assumptions are not the same as those in Chomsky (1995b: Chapter 4), which we are using as our basic framework.

9 However, optimality is a matter of comparison, which entails that structures involving radically null complementizers in matrix clauses must compete with structures involving standard complementizers. Technically, this presupposes a common lexical array for both. Such a possibility exists only if we weaken the strong lexicalist approach in Chapter 3 of Chomsky (1995b) to a version allowed in his Chapter 4 approach, having incorporated the theoretical proposals in Lasnik (1995). Simply put, we must distinguish items in the lexical array (technically a multi-set referred to as a numeration) from items in the lexicon proper. As is standardly assumed, the lexicon is a repository of idiosyncrasies. Yet, lexical items make it into a numeration with a considerable degree of systematic features. This indicates that such features are added upon considering the lexical item as a member of the numeration for syntactic computation. Then, the matter arises as to whether such features should or should not constitute a mark of difference with respect to the reference set of derivations for optimality considerations. The matter is currently under investigation, and partial results suggest that, at least in some languages, such features may not force entirely different reference sets. It is crucial for Chomsky’s reasoning about complementizers that, if radically null ones outrank overt ones, such comparisons be part of the system.

10 In the spirit of the proposals in the 1980s, Chomsky (1995b), Chapter 3, still assumes that strong Agr is associated to the presence of pro. Although this is not a necessary assumption, we will pursue its logic to see where it leads.

11 Convergence involves legibility conditions at the interface. Nonetheless, a derivation may converge as gibberish in post-grammatical components, the assumption for this instance.

12 We still do not explain, though, why (32) is out with an emphatic reading, but acceptable otherwise, while (30) is simply out.

13 Note that whether an economy analysis is tenable within Chomsky’s (1995b) system depends on the details of the numeration. If do is its own separate lexical item, then there is no hope of getting the relevant examples above into a competition, simply because the sets of relevant derivations are not identical.

14 This is somewhat in the spirit of Fiengo and May (1994).

15 This is under the substantive assumption that only derivations stemming from the same lexical array are comparable. Once again, this is a topic of much debate (and recall Notes 10 and 14).

16 This, incidentally, aligns directly with the description of the facts in Stowell (1981), whereby null complementizers in English are shown to appear in governed positions only. In current terms, we would say that these are precisely the sites from which cliticization is possible (see Uriagereka (1988a) for an early version of this line).

17 The exclusion of the preverbal subject in (39a) is symptomatic that Spanish preverbal subjects are outside IP altogether. In fact, under present assumptions, they will have to be outside CP (so that they prevent the cliticization of the complementizer; see Barbosa 1995). Assuming this, a preverbal subject is the signature of extra functional material, which should directly prevent cliticization across it. As expected, also, in languages where the overt subject does not appear in the periphery of the clause, complementizer cliticization over overt subjects is allowed. Hence, the grammaticality of the Italian example in (i) (see Torrego 1983):

(i) Voglio Sandro arrive domani.
    I want Sandro arrive tomorrow.

18 Quite a different picture should emerge, however, with verbs that establish a hypotactic relation with their clausal dependents. In general, there is no reason why they should bar wh-movement across them, and this is in fact the case. Interestingly, though, we find asymmetries in instances of long wh-movement (which are dealt with in Torrego 1983, within a different framework). Basically, long extraction of a wh-phrase forces all intervening complementizers to either be present, or else to all be absent:

(i) a. ¿qué libro esperas que quieran que pueda conseguirte yo t?
       what book do you expect that they want that I may get?

    b. ¿*qué libro esperas – quieran que pueda conseguirte yo t?
       *what book do you expect they want that I may get?

    c. ¿*qué libro esperas que quieran – pueda conseguirte yo t?
       *what book do you expect that they want I may get?

    d. ¿qué libro esperas – quieran – pueda conseguirte yo t?
       what book do you expect they want I may get?

Within the minimalist framework, there is an immediate approach to this sort of asymmetry. Movement is driven by feature-checking. The overt movement of the wh-phrase in the sentences in (i) will have to be successive cyclic. It could be that every time a wh-phrase enters into a CP, it does so attracted by a strong feature. Suppose this is the case. It is natural to assume (in fact, necessary, to motivate cliticization) that the features of the overt complementizer are not the same as the features of the cliticized version. In (i), the wh-phrase that checks successive cyclic wh-features is one and the same. Since each link of the chain has to be attracted by a strong feature, it follows that one and the same wh-phrase cannot be attracted by two different features. Technically, it could be the case that each Comp happens to attract the wh-phrase for entirely different reasons. However, this would have to be argued for, and we know of no empirical evidence that would support such a claim.

19 It is common in various languages (e.g. in the Balkan peninsula) to have different complementizers for each sort of structure, and allow extraction only out of hypotactic contexts (see Spanish data below and Note 20). An interesting question is what happens in instances where wh-extraction is more or less marginally possible, at least in some languages, across paratactic domains. We suspect that other strategies are involved in this instance, including the licensing of parasitic gap structures of the rough form in (i), where the italicized element is perhaps either a clitic or not pronounced at all:

(i) Of whom do you think that Mary likes ’im.

(cf. Whom do you think that Mary likes)

Pursuing this idea here would take us too far afield, especially given structures of the form in (ii), where a parasitic gap analysis should not be possible:

(ii) Why do you think that Mary left?

Then again, this sort of structure is known to be somewhat exotic, and it certainly degrades easily with further embedding or extraction across domains which do not agree in tense:

(iii) a. Why do you think that Mary said that Peter left?
      b. Why will you say that Mary has left?


These examples are odd with a deeply embedded reading for why, suggesting that something peculiar is at stake in the acceptable (ii), perhaps related to the event structure of the expression.

20 The verb decir, when followed by a subjunctive (otherwise necessary for “Neg-raising” and polarity licensing), takes on an imperative flavor. Under this interpretation, the raising is perfect with the complementizer, but not without it.

14 DIMENSIONS OF NATURAL LANGUAGE

1 Or put another way, our modern system for representing numbers (as governed by the relevant rules for transforming those representations) encodes far more than linear order; “1/2” describes that real number as 1 divided by 2. In this sense, “1/2” represents the rational number in question as a “mirror” of “2,” since 2 × (1/2) = 1. Likewise, “−1” represents the negative number in question as a “mirror” of “1,” since 1 − 1 = 0, where “0” is a label for the number whose successor is “1.”

2 In particular, through the creation of a loop (responsible for pronouncing very) when hitting some particular state (e.g. one pronouncing the word long).
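
For concreteness, the following is a minimal sketch of such a device (the state names and the toy lexicon a, very, long, story are invented here purely for illustration): a finite-state machine whose single loop pronounces very at one state, so it yields a long story, a very long story, a very very long story, and so on without bound.

```python
import random

# Toy finite-state machine: state -> list of (word, next state); None ends the string.
TRANSITIONS = {
    0: [("a", 1)],
    1: [("very", 1),       # the loop: re-enter state 1, pronouncing "very" again
        ("long", 2)],
    2: [("story", None)],
}

def generate(max_words=20):
    state, words = 0, []
    while state is not None and len(words) < max_words:
        word, state = random.choice(TRANSITIONS[state])
        words.append(word)
    return " ".join(words)

print(generate())  # e.g. "a very very long story"
```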

3 Involving recursive rules such as S → S and/or/but S, or recursive systems of rules such as S → NP VP, VP → V S.
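
A comparable sketch for the second, recursive rule system (again with an invented toy lexicon, only for illustration): because VP can rewrite as V S, the start symbol S reappears inside its own expansion, so clauses embed clauses without bound.

```python
import random

# Toy recursive grammar: S -> NP VP, and VP -> V S (the recursive rule) or VP -> left.
RULES = {
    "S":  [["NP", "VP"]],
    "NP": [["John"], ["Mary"]],
    "VP": [["V", "S"], ["left"]],
    "V":  [["thinks"], ["says"]],
}

def expand(symbol, depth=0, max_depth=3):
    if symbol not in RULES:                    # terminal: just a word
        return [symbol]
    options = RULES[symbol]
    if symbol == "VP" and depth >= max_depth:  # cap the recursion so the sketch halts
        options = [["left"]]
    return [word for part in random.choice(options)
                 for word in expand(part, depth + 1, max_depth)]

print(" ".join(expand("S")))  # e.g. "Mary thinks John says Mary left"
```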

4 For defense of the general neo-Davidsonian “eventish” framework in which such accounts of causatives are embedded, see Davidson (1967a, 1985); Higginbotham (1983a, 1985, 2000); Taylor (1985); Parsons (1990); Schein (1993); Pietroski (1998, forthcoming a, b); Higginbotham, Pianesi and Varzi (2000); Herburger (2000), etc.

5 Feinberg (1965); Davidson (1967a); Goldman (1970); Thomson (1971, 1977); Thalberg (1972); etc. See Costa (1987) for a review. See Pietroski (1998, 2000, forthcoming b) for discussion.

6 There are, no doubt, questions about the metaphysics of events lurking here. And we are committed to denying an unrestricted mereology principle according to which the fusion of Pat’s action and the subsequent boiling would itself be an event with Pat as Agent and the soup as Theme. But we see no reason for thinking that theories of meaning for natural language must adopt such profligate principles concerning which events speakers tacitly quantify over when using eventish constructions like (8).

7 This can be made rather precise if event quantification is restricted (Herburger 2000), and typically quantifier restrictions are contextually confined by speakers. Apparently, context confinement has some syntax to it, and is not completely free (see Castillo 2001 on this).

8 Perhaps it ought to be emphasized that the readings have nothing to do with (in)alienability. Again, both hearts in question are inalienably the patient’s, but it is the one in the chest that makes the difference.

9 In essence these two approaches correspond to whether lexico-conceptual structures occupy a separate syntactic component of the system or are rather a mere aspect of some other level.

10 These concerns are no different from the ones presented in Muromatsu (1998), Kamiya (2001) or Castillo (2001) for nominal expressions, which make all of them argue for a dimensional view.

11 Within Montague semantics θ-roles are not necessary at all, to start with. So their lexical absence could perhaps be explained as total absence. At the same time, a language system without θ-roles would be empirically adequate in other respects but we would have no way of capturing familiar, observable hierarchies. (Moreover, a traditional Montagovian semantics requires a more widespread appeal to type-lifting in order to accommodate adjuncts.) So our full argument is this. We need roles, yet they are never pronounced, and this is because they are elements outside the system.

12 Collins (2001) deduces labels in syntactic objects from more basic notions. This is fine with us, although it is important to clarify an idea that is often confused. The fact that something is deduced does not mean that it does not exist at some level. The moon’s existence follows from the laws of gravity when applied to celestial objects resulting from asteroid collisions with earth. But the moon exists. Indeed, its presence is crucial in determining tides, and with them, for instance, the emergence of life. Similarly for labels, deduced or not, they exist, and they are crucial, for instance, in the “reprojection” process studied in Chapter 6. When we say below that adjuncts do not have labels, we do not mean that labels are deduced for them. In that instance they truly do not exist.

13 That itself is telling with regard to the syntax of adjuncts. It is as if they were not there, at least with regard to aspects of the grammar that care about formal properties.

14 Lasnik and Uriagereka (forthcoming, Chapter 8) show how difficult it is to construct an argument for an unlimited class of words, similar to the one existing in the literature for an infinite class of sentences.

15 In a certain sense it surely is not, if human I-languages are not formal E-languages.

16 Here and throughout, we use quote marks (instead of corner quotes) when talking about schemata involving variables ranging over expressions. For our purposes, this simplification will do no harm.

17 Note that order matters for “∧” but not for “#” or “�.”

18 Alternatively, one can enter a stipulation (ad hoc from a syntactic perspective) that such inscriptions are not legitimate symbols of the language. One must also be prepared to leave certain well-formed symbols, like “%(*1, *1),” undefined on pain of paradox.

15 WARPS

†I would first of all like to thank those students who have been “crazy” enough to even listen to these arcane ideas, especially those who put their thesis at risk because of them: Aitziber Atutxa, José Luis Ariel-Méndez, Tonia Bleam, Cedric Boeckx, Juan Carlos Castillo, Rikardo Etxepare, Analía García, Enrique López-Díaz, Nobue Mori, Keiko Muromatsu, Lucía Quintana. Without their work, these pages would have been worthless. I am grateful also to my friends of the La Fragua group, who participated in most of the discussions behind this system, and Pablo Bustos even pursued the visual connection with the help of Felix Monasterio. I would like to thank also my colleagues Norbert Hornstein and Sara Rosen; it was quite a challenge to put an intentional and a conceptual semanticist literally on the same page, but I at least have benefited tremendously from that clash of insight, which I have developed in the present direction (knowingly in ways which neither Norbert nor Sara probably care to endorse, and which should not be blamed on them either). My psycholinguistic babbling would not have been possible without the supervision of Stephen Crain and Rozz Thornton, who presented it to me in “parentese” fashion; again, no blame should go to them for my errors. I had the privilege of discussing many of these issues with Ian Roberts during my stay in England a couple of years ago, particularly the syntactic connection that I have not really developed here. The pleasure of (too) many beers with him might account for part of their radicalness. Finally, I thank everyone who attended lectures on these topics in various places, too numerous to mention, the editors of working papers where versions of this work came out, and lastly Frank Beckmann for his unrelenting enthusiasm and support, which are directly responsible for the present publication. The research behind it was partly supported by NSF grant BCS-9817569.

1 See for instance volume 29, number 2, or volume 30, number 3.

2 See for instance the type of model in Dowty, Wall and Peters (1981). A “meaning postulate” would relate elements in the individual ontology to the corresponding mass.


3 This is a point that Chomsky often emphasizes, for instance in (1993a) and (1994). Philosophers do not usually even take the question seriously, although see Atlas (1989).

4 The view about causal relations with regard to the relevant events is not uncommon. See Pietroski (2000) for a general discussion of these matters.

5 This kind of approach is common within the framework of Category Theory (see Barr and Wells (1995) for a useful introduction). Here I am using sets (whenever I can) in order to avoid a more arcane discussion.

6 I have put the word “larger” in scare quotes because, of course, the size of the sets is the same. They are all infinite; yet it is still true that one set meaningfully generates the other, and not vice versa, and in that sense we can conceive of the generating set as “smaller.”

7 Playing might count as a four-dimensional object, particularly if it is not given a set amount of time. It is arguable at least that what gives play its fourth dimension is messing around in three dimensions, usually carrying an object from here to there by way of some elaborate patterns. I owe most reflections about games at this quasi-theoretical level to a conversation with Ignacio Bosque. One other relevant example might be dancing, since the entity obtained through the dance in three-dimensional space clearly incorporates a crucial temporal dimension. Similar considerations apply to hunting. Baking (cakes, ceramic, etc.) might also be relevant, or for that matter any thermodynamic process that incorporates some feedback of energy in order to produce some kind of result (most kitchen tasks, but also agricultural activities).

8 There are many introductions to Gödel’s proof (the classic being Nagel and Newman 1958), but nothing nearly as delightful as Hofstadter (1979).

9 For Star Trek fans, the trick in (5) is a warp in the “technical” sense of the series. That is, somehow ordinary space-time is warped in such a way that a spacecraft travels from point a to point b, distant light years, through the “approximation” that the warp creates, thus much faster. That is warp 1. Once you get into that kind of super space, if you warp it again, you take the next short cut, warp 2, etc. (see Figure (7a)). Your speed grows logarithmically. Unfortunately, this is all physical nonsense, not to speak of the philosophical paradoxes it would create if possible. For a very instructive discussion of these issues see Kraus (1995).

10 It might be thought that the feature in question is not linguistic. This is far from obvious, though, given present assumptions. Surely duck starts with a [d], whereas goose starts with a [g], and that difference (plus the other phonetic distinctions) contributes to the fact that we take the two terms to be different. Roughly, this is true:

(i) Different phonetic forms in N correspond to different referents or at least different modes of presentation for entities intentionally associated to N.

11 Philosophers will not be satisfied with this, since they are concerned with “ultimate reference,” as opposed to what I like to think of as “ordinary reference.” Thus, it could be argued that Jones might see (what I think of as) a duck and either confuse the length of its neck, or have a problem in his visual system so that he sees things longer, or any such concoction. Then is he referring to the same thing I refer to when I say “duck”? Perhaps not, but this is, so far as I can see, completely orthogonal to ordinary reference. What I expect my intentional theory to tell me is how humans normally succeed in picking out ducks when saying “duck.” There might be confusions, mistakes, lies and so on, but just as there are aberrations in any other realm of natural science, still, we study the norm. (Which may or may not be accounted for in Jackendoff’s terms, by invoking the visual, or other systems, assuming they work similarly in different humans, as is to be expected.) Needless to say, rather different questions arise if one is not interested in what humans do when they talk to each other, but is rather concerned with when they reflect about science, philosophy and the like. There what one means by “atom” or “V” and so on is a theoretical issue, and I do not think that Jackendoff’s suggestion is of any help. At the same time, I do not see that there is any point in considering that question from the semantics of natural language perspective. See Uriagereka (1998: Chapter 2) on these issues.

12 This idea was inspired by fruitful discussion with other members of the La Fragua group, in its third meeting in the mountains of Toledo. I am particularly grateful to Pablo Bustos, Felix Monasterio and Juan Romero.

13 This part of the point I am making is, again, well explained in Hofstadter’s book. The rest of the point can be appreciated in the useful commentary by Putnam (1983).

14 I have suggested this kind of meta-theoretical approach in Chapter 8 for the LF side of the grammar (by way of the Transparency Thesis), and in Chapter 3 for the PF side (in attempting to deduce the base step of Kayne’s LCA from economy considerations).

15 See Hornstein (1995a) for a useful discussion of the differences between “compositionality” and “systematicity,” the latter being a necessary property of a communication system, but certainly not the former. Larson and Segal (1995) is one of the few places I know where compositionality is explicitly taken to be an empirical thesis.

16 I believe this general criticism obtains of all well-known treatments in the literature.

17 See Langacker (1987), and for the generative semantics source, see Newmeyer (1980).

comes up with an admittedly Worfian proposal. Humans have (at least) two modes ofcognition. The perspective I am advocating, explicitly defended by Muromatsu withregards to classifiers, is a priori more reasonable. All languages have a classifier system(overt or covert). But that has semantic consequences of the sort I am now outlining.

19 For a useful introduction to topological matters of the sort implied here, see for instance Gemignani (1967).

20 Incidentally, we are talking about concepts here, but if one considers how form actually emerges in the natural world, exactly the same conclusions apply. One only has to think of the growth of a baby from a cell to a tube to something with various axes and aggregating tissues, which fatefully crumbles upon death. It is within the realm of reasonable possibility that natural form, whether in a physical sense or internal to our cognitive capacities, must be of this sort. From that perspective, it would not be unthinkable that the cognitive capacity reflects the nature of reality itself; after all, it is a part of it. See the last section on these questions.

21 Generally, these processes are presented in terms of operations on the conceptual units. For instance, Jackendoff (1991) proposes a “grinding” mechanism.

22 This is important. A “grinding” mechanism of the sort proposed by Jackendoff would give the wrong meaning here.

23 The notion “generate readings” is borrowed from Szabolcsi and Zwarts’s (1993) concept of a “generator”; see also Hornstein (1995a). Muromatsu (1998) prevents any form of quantification from appearing in lower dimensionalities. I actually think that is too strong, since the lowest-dimensional concepts can be quantified by very, and certainly mass terms can too, with much and few. However, I suspect Muromatsu is essentially right about her guess, although it should be restricted to bona fide generalized quantification, which for some reason necessitates the highest dimensions. This is perhaps because it is in those that individuals arise, and thus corresponding sets, necessary for generalized quantification. Evidently, the idea of set-demanding quantifiers must be connected with “generators” of the Szabolcsi/Zwart sort.

24 How that is done is a very difficult question, for obviously a plurality is comprised of individuals yet plural expressions behave in some sense as masses. In other words, the question is, “What do we do in order to conceive a plurality of individuals as a notion which we generally think of as being presupposed in the understanding of individuals?” It would seem as if the plural morpheme allows for a kind of “loop back” from higher to lower dimensionalities.

25 Similar questions obtain for the other lexical categories, but I will concentrate on the major cut, setting the others aside.

26 For instance, I do not know what sorts of values nominal spaces should be conceived of as having, more or less, depending on degree of a quality, amount of mass, number of elements, all of the above (hence in the general case something more abstract that ranges over all of them). Presumably that is the way to go, since after all it matters for bounding the event whether there is, say, this much or that much of beer. But does language care about exactly how much, or does it simply express that there is a certain amount, and leave the rest unspecified? I suspect the latter is the case, but it does not matter for my general purposes now, since all those possibilities would work.

27 “What are event boundaries for a verbal topology?” or “What are the atoms of a verbal lattice?,” for example, are immediate hard questions.

28 Regardless of the fact that wine is, strictly, a living entity, again, the issue is not reality, but abstract spaces, where wine is conceived of as a low-dimensional lattice, while people and other animate entities are conceived as very complex topologies with change potential.

29 One other place where the permanent vs. mutable contrast comes out is the rigidity of names vs. the flexibility of descriptions. Interestingly, nouns denote kinds, which have been argued to be rigid (a noun could be seen as the name of a kind). In contrast, verbs with their arguments do not obviously denote kinds, and can be thought of as the description of an event.

30 As noted by Chomsky (1965: footnote 15), who extends Russell’s observations beyond logical proper names, and raises related issues about the internal structure of objects with parts (e.g. when a cow breaks a leg we cannot conclude that a herd breaks a leg).

31 There are potential counterexamples to this, but they may have a simple explanation. For instance, silverware, china or a forest are non-continuous notions, but these would seem to be usable as mass terms, and then the discontinuous elements are nothing but the atoms of the corresponding lattice. At that atomic level everything is discontinuous, trivially.

32 This is intended in the spirit of the puzzle posed in Quine (1960, Chapter 2), although of course he was not speaking of language acquisition by children.

33 I do not care right now whether “rabbit” is 3D or whatever, so long as its dimensionality is higher than the one required for “fur.”

34 On these issues, see Markman (1989), especially Chapter 2.

35 See Markman and Wachtel (1988). I am indebted to Stephen Crain for discussing these matters, and providing me with the observation about event vs. state interpretations. For much relevant information and various references, see Crain and Thornton (1998).

36 The fact that children acquire names at about the same time they do everything else does not allow us to decide between these two alternatives. As I said before, we need a much more controlled environment. In this instance, the question is whether a child would take the description of something as its name; I am predicting that this is not the case.

37 It is in fact somewhat unclear whether those two sorts of notions can be grammatically distinguished. For instance, although I know of languages that limit certain processes to animate expressions, I know of no language that has a grammatical morpheme to code them (equivalent to a noun classifier or a plurality marker).


BIBLIOGRAPHY

Alexiadou, A. and C. Wilder (eds) (1998) Possessors, Predicates and Movement in the Determiner Phrase, Amsterdam: Benjamins.
Altube, S. (1929) Erderismos, San Sebastián: Gaubeka.
Ambar, M. M. (1992) Para uma sintaxe da inversão sujeito-verbo em português, Lisbon: Colibri.
Aoshima, S., J. Drury and T. Neuvonen (eds) (1999) University of Maryland Working Papers in Linguistics 8, College Park, MD.
Atlas, J. D. (1989) Philosophy without Ambiguity, Oxford: Oxford University Press.
Atutxa, A. (forthcoming) “Aktionsart: from the simplest to the most complex,” unpublished manuscript, University of Maryland.
Authier, J.-M. (1988) The Syntax of Unselective Binding, unpublished PhD dissertation, University of Southern California, Los Angeles.
Baker, M. (1988) Incorporation, Chicago: University of Chicago Press.
—— (1996) “On the structural positions of themes and goals,” in J. Rooryck and L. Zaring (eds), 7–34.
—— (1997) “Thematic roles and syntactic structure,” in L. Haegeman (ed.) Elements of Grammar, Dordrecht: Kluwer, 73–137.
Barbosa, P. (1995) Null Arguments, unpublished PhD dissertation, MIT.
Barlow, M. and C. Fergusson (1988) Agreement in Natural Language, Chicago: University of Chicago Press.
Barr, M. and C. Wells (1995) Category Theory, New York: Prentice Hall.
Benveniste, E. (1966) Problèmes de Linguistique Générale, Paris: Gallimard (trans. (1971) Problems in General Linguistics, Coral Gables, FL: University of Miami Press).
Bleam, T. (1999) “The syntax of clitic doubling in leista Spanish,” unpublished PhD dissertation, University of Delaware.
Bobaljik, J. D. (1995) “Morphosyntax: the syntax of verbal inflection,” unpublished PhD dissertation, MIT.
Bobaljik, J. D. and S. Brown (1997) “Inter-arboreal operations: head-movement and the extension requirement,” Linguistic Inquiry 28: 345–56.
Bonet, E. (1991) “Morphology after syntax: pronominal clitics in romance,” unpublished PhD dissertation, MIT.
Boolos, G. and R. Jeffrey (1980) Computability and Logic, Cambridge: Cambridge University Press.
Bouchard, D. (1984) On the Content of Empty Categories, Dordrecht: Foris.
Bowers, J. (1992a) “Extended X′ theory, the ECP, and the left branch condition,” Proceedings of the 7th West Coast Conference on Formal Linguistics.

—— (1992b) “The structure of stage and individual level predicates,” unpublished manuscript, Cornell University, Ithaca, NY.
Bresnan, J. (1971) “Sentence stress and syntactic transformations,” Language 47.2: 257–81.
—— (1994) “Locative inversion and the architecture of Universal Grammar,” Language 70.1: 72–131.
Brody, M. (1990) “Some remarks on the focus field in Hungarian,” UCL Working Papers in Linguistics 2, London, 201–26.
—— (1995) Lexico-Logical Form: A Radical Minimalist Theory, Cambridge, MA: MIT Press.
Burge, T. (1973) “Reference and proper names,” Journal of Philosophy 70: 425–39.
—— (1974) “Demonstrative constructions, reference, and truth,” Journal of Philosophy 71: 205–23.
—— (1975) “Mass terms, count nouns and change,” Synthese 31, reprinted in F. Pelletier (ed.) (1979) Mass Terms: Some Philosophical Problems, Dordrecht: Reidel.
Burzio, L. (1986) Italian Syntax: A Government-Binding Approach, Dordrecht: Kluwer.
—— (1996) “The role of the antecedent in anaphoric relations,” in R. Freidin (ed.).
—— (2000) “Anatomy of a generalization,” in E. Reuland (ed.) Arguments and Case: Explaining Burzio’s Generalization, Amsterdam: Benjamins.
Campos, H. and P. Kempchinsky (eds) (1995) Evolution and Revolution in Linguistic Theory: Essays in Honor of Carlos Otero, Washington, DC: Georgetown University Press.
Cardinaletti, A. and M. T. Guasti (eds) (1995) Small Clauses, New York: Academic Press.
Carlson, G. (1977) “Reference to kinds in English,” unpublished PhD dissertation, University of Massachusetts, Amherst.
—— (1984) “Thematic roles and their role in semantic interpretation,” Linguistics 22: 259–79.
Castañeda, H. (1967) “Comments,” in N. Rescher (ed.) The Logic of Decision and Action, Pittsburgh: University of Pittsburgh Press.
Castillo, J. C. (1998) “The syntax of container/content relations,” in E. Murgia, A. Pires and L. Quintana (eds) University of Maryland Working Papers in Linguistics 6, College Park, MD.
—— (1999) “From Latin to Romance: the tripartition of pronouns,” in S. Aoshima, J. Drury and T. Neuvonen (eds), 43–65.
—— (2001) “Possessive relations and the syntax of noun phrases,” unpublished PhD dissertation, University of Maryland, College Park.
Castillo, J. C., J. Drury and K. Grohmann (1999) “The status of the merge over move preference,” in S. Aoshima, J. Drury and T. Neuvonen (eds), 66–103.
Cattell, R. (1976) “Constraints on movement rules,” Language 52: 18–50.
Chierchia, G. (1986) “Individual level predicates as inherent generics,” unpublished manuscript, Cornell University.
—— (1992) “Anaphora and dynamic binding,” Linguistics and Philosophy 15: 111–83.
Chomsky, N. (1955) “The logical structure of linguistic theory,” unpublished PhD dissertation, University of Pennsylvania (published (1975), Chicago: University of Chicago Press).
—— (1964) Current Issues in Linguistic Theory, The Hague: Mouton.
—— (1965) Aspects of the Theory of Syntax, Cambridge, MA: MIT Press.
—— (1972) Studies in Semantics in Generative Grammar, The Hague: Mouton.

—— (1977a) Essays on Form and Interpretation, North Holland.
—— (1977b) “On wh-movement,” in P. Culicover, T. Wasow and A. Akmajian (eds) Formal Syntax, New York: Academic Press.
—— (1981) Lectures on Government and Binding, Dordrecht: Foris.
—— (1982) Some Concepts and Consequences of the Theory of Government and Binding, Cambridge, MA: MIT Press.
—— (1986a) Barriers, Cambridge, MA: MIT Press.
—— (1986b) Knowledge of Language: Its Nature, Origin and Use, New York: Praeger.
—— (1993a) Language and Thought, Wakefield, RI: Moyer Bell.
—— (1993b) “A minimalist program for linguistic theory,” in K. Hale and S. J. Keyser (eds), 1–52 (reprinted in Chomsky 1995b: Chapter 3).
—— (1994) “Language and nature,” Mind 104: 1–61.
—— (1995a) “Bare phrase structure,” in G. Webelhuth (ed.) Government and Binding Theory and the Minimalist Program, Oxford: Blackwell, also in H. Campos and P. Kempchinsky (eds).
—— (1995b) The Minimalist Program, Cambridge, MA: MIT Press.
—— (1995c) “Categories and transformations,” in Chomsky 1995b: Chapter 4.
—— (2000) “Minimalist inquiries: the framework,” in R. Martin, D. Michaels and J. Uriagereka (eds) Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, Cambridge, MA: MIT Press, 89–155.
Chomsky, N. and H. Lasnik (1993) “The theory of principles and parameters,” in J. Jacobs, A. von Stechow, W. Sternefeld and T. Vennemann (eds) Syntax: An International Handbook of Contemporary Research, Berlin: Walter de Gruyter (reprinted in Chomsky 1995b: Chapter 1).
Chung, S. (1994) “wh-agreement and referentiality in Chamorro,” Linguistic Inquiry 25: 1–44.
Chung, S. and J. McCloskey (1987) “Government, barriers, and small clauses in Modern Irish,” Linguistic Inquiry 18.
Cinque, G. (1993) “A null theory of phrase and compound stress,” Linguistic Inquiry 24.2: 239–97.
—— (1999) Adverbs and Functional Heads: A Cross-linguistic Perspective, New York: Oxford University Press.
Cole, P. and L. Sung (1994) “Head-movement and long-distance reflexives,” Linguistic Inquiry 25.3: 355–406.
Collins, C. (2001) “Eliminating labels and projections,” unpublished manuscript, Cornell University.
Contreras, H. (1984) “A note on parasitic gaps,” Linguistic Inquiry 15: 704–13.
Corver, N. and D. Delfitto (1993) “Feature asymmetry and the nature of pronoun movement,” paper presented at the GLOW Colloquium, Lund.
Costa, M. (1987) “Causal theories of action,” Canadian Journal of Philosophy 17: 831–54.
Crain, S. and R. Thornton (1998) Investigations in Universal Grammar: A Guide to Experiments on the Acquisition of Syntax and Semantics, Cambridge, MA: MIT Press.
Davidson, D. (1967a) “The logical form of action sentences,” reprinted in D. Davidson 1980.
—— (1967b) “On saying that,” reprinted in Inquiries into Truth and Interpretation, Oxford: Clarendon Press (1984).
—— (1980) Essays on Actions and Events, Oxford: Oxford University Press.
—— (1985) “Adverbs of action,” in B. Vermazen and M. Hintikka (eds).

Davies, W. D. (2000) “Against long movement in Madurese,” paper presented at the 7th Meeting of the Austronesian Formal Linguistics Association, Amsterdam.
De Hoop, H. (1992) “Case configuration and noun phrase interpretation,” unpublished PhD dissertation, University of Groningen.
De Jong, F. (1987) “The compositional nature of (in)definiteness,” in E. Reuland and A. ter Meulen (eds), 270–85.
den Dikken, M. (1992) “Particles,” unpublished PhD dissertation, Holland Institute of Generative Linguistics.
Diesing, M. (1992) Indefinites, Cambridge, MA: MIT Press.
Doherty, C. (1992) “Clausal structure and the Modern Irish copula,” unpublished manuscript, UCSC.
Dowty, D., R. Wall and S. Peters (1981) Introduction to Montague Semantics, Dordrecht: Reidel.
Drury, J. (1998) “Root first derivations: Multiple Spell-Out, atomic merge, and the coresidence theory of movement,” unpublished manuscript, University of Maryland, College Park.
Epstein, S. D. (1999) “Un-principled syntax and the derivation of syntactic relations,” in S. D. Epstein and N. Hornstein (eds), 317–45.
Epstein, S. D. and N. Hornstein (1999) Working Minimalism, Cambridge, MA: MIT Press.
Epstein, S. D. and D. Seely (2002) Transformations and Derivations, Cambridge: Cambridge University Press.
Ernst, T. (2001) The Syntax of Adjuncts, Cambridge: Cambridge University Press.
Etxepare, R. (1997) “The syntax of illocutionary force,” unpublished PhD dissertation, University of Maryland, College Park.
Feinberg, J. (1965) “Action and responsibility,” in M. Black (ed.) Philosophy in America, Ithaca: Cornell University Press.
Fiengo, R. and R. May (1994) Indices and Identity, Cambridge, MA: MIT Press.
Finer, D. (1985) “The syntax of switch reference,” Linguistic Inquiry 16.1: 35–55.
Fodor, J. (1970) “Three reasons for not deriving kill from cause to die,” Linguistic Inquiry 1: 429–38.
—— (1983) The Modularity of Mind, Cambridge, MA: MIT Press.
Fodor, J. and E. Lepore (1998) “The emptiness of the lexicon: reflections on James Pustejovsky’s The Generative Lexicon,” Linguistic Inquiry 29.2: 269–88.
—— (forthcoming) “Morphemes matter,” unpublished manuscript, Rutgers University.
Fodor, J. D. and I. Sag (1982) “Referential and quantificational indefinites,” Linguistics and Philosophy 5.
Freeze, R. (1992) “Existentials and other locatives,” Language 68: 553–95.
Freidin, R. (ed.) (1996) Current Issues in Comparative Grammar, Dordrecht: Kluwer.
—— (1997) “Chomsky: the minimalist program,” Language 73.3: 571–82.
Fukui, N. (1996) “On the nature of economy in language,” Cognitive Studies 3.1: 51–71.
Fukui, N. and M. Speas (1987) “Specifiers and projection,” in N. Fukui, T. Rapoport and E. Sagey (eds) MIT Working Papers in Linguistics 8: Papers in Theoretical Linguistics, 128–72.
García Bellido, A. (1994) “Towards a genetic grammar,” paper presented at the Real Academia de Ciencias Exactas, Físicas, y Naturales, Madrid.
Gemignani, M. (1967) Elementary Topology, New York: Dover.
Gil, D. (1987) “Definiteness, noun phrase configurationality, and the count-mass distinction,” in E. Reuland and A. ter Meulen (eds).

Goldman, A. (1970) A Theory of Human Action, Princeton, NJ: Princeton University Press.
Gould, S. J. (1991) "Exaptation: a crucial tool for evolutionary psychology," Journal of Social Issues 47.3: 43–65.
Guéron, J. (1980) "On the syntax and semantics of PP extraposition," Linguistic Inquiry 11.
Haken, H. (1983) Synergetics: An Introduction, Berlin: Springer.
Hale, K. and S. J. Keyser (1993) "On argument structure and the lexical expression of syntactic relations," in K. Hale and S. J. Keyser (eds) The View from Building 20: Essays in Honor of Sylvain Bromberger, Cambridge, MA: MIT Press, 53–110.
Halle, M. and A. Marantz (1993) "Distributed morphology and the pieces of inflection," in K. Hale and S. J. Keyser (eds), 111–76.
Heim, I. (1982) "The semantics of definite and indefinite noun phrases," unpublished PhD dissertation, University of Massachusetts, Amherst.
Herburger, E. (1993a) "Focus and the LF of NP quantification," paper presented at SALT III.
—— (1993b) "Davidsonian decomposition and focus," unpublished manuscript, UCSC.
—— (1997) "Focus and weak noun phrases," Natural Language Semantics 5.1: 53–78.
—— (2000) What Counts, Cambridge, MA: MIT Press.
Higginbotham, J. (1983a) "The logical form of perceptual reports," Journal of Philosophy 80: 100–27.
—— (1983b) "A note on phrase-markers," Revue Québécoise de Linguistique 13.1: 147–66.
—— (1985) "On semantics," Linguistic Inquiry 16: 547–93.
—— (1987) "Indefiniteness and predication," in E. Reuland and A. ter Meulen (eds), 43–70.
—— (1988) "Contexts, models, and meaning," in R. Kempson (ed.) Mental Representations: The Interface between Language and Reality, Cambridge: Cambridge University Press.
—— (2000) "On events in linguistic semantics," in J. Higginbotham, F. Pianesi and A. Varzi (eds).
Higginbotham, J., F. Pianesi and A. Varzi (eds) (2000) Speaking of Events, Oxford: Oxford University Press.
Hoffman, J. (1996) "Syntactic and paratactic word-order effects," unpublished PhD dissertation, University of Maryland, College Park.
Hofstadter, D. R. (1979) Gödel, Escher, Bach, New York: Vintage.
Honcoop, M. (1998) "Excursions in dynamic binding," unpublished PhD dissertation, Leiden University.
Horn, L. (1989) A Natural History of Negation, Chicago: University of Chicago Press.
Hornstein, N. (1993) "Expletives: a comparative study of English and Icelandic," unpublished manuscript, University of Maryland, College Park.
—— (1995a) Logical Form: From GB to Minimalism, Oxford: Blackwell.
—— (1995b) "Putting truth into Universal Grammar," Linguistics and Philosophy 18.4: 381–400.
—— (2001) Move: A Minimalist Theory of Construal, Oxford: Blackwell.
Hornstein, N. and J. Nunes (1999) "Asymmetries between parasitic gap and across-the-board extraction constructions," unpublished manuscript, University of Maryland, College Park and University of Campinas.
Huang, C.-T. J. (1982) "Logical relations in Chinese and the theory of grammar," unpublished PhD dissertation, MIT.
Iatridou, S. (1990) "About Agr(P)," Linguistic Inquiry 21.4: 551–77.
Jackendoff, R. (1972) Semantic Interpretation in Generative Grammar, Cambridge, MA: MIT Press.
—— (1982) "The universal grinder," in B. Levin and S. Pinker (eds) 1991.
—— (1990) Semantic Structures, Cambridge, MA: MIT Press.
—— (1991) "Parts and boundaries," in B. Levin and S. Pinker (eds) Lexical and Conceptual Semantics, Oxford: Blackwell.
Jaeggli, O. and K. Safir (eds) (1989) The Null Subject Parameter, Dordrecht: Kluwer.
Jelinek, E. (1984) "Empty categories, Case, and configurationality," Natural Language and Linguistic Theory 2: 39–76.
Kahn, D. (1995) Topology: An Introduction to the Point-set and Algebraic Areas, New York: Dover.
Kaisse, E. (1985) Connected Speech: The Interaction of Syntax and Phonology, New York: Academic Press.
Kamiya, M. (2001) "Dimensional approach to derived nominals," generals paper, University of Maryland, College Park.
Kamp, H. (1984) "A theory of truth and semantic interpretation," in J. Groenendijk, T. Janssen and M. Stokhof (eds) Truth, Interpretation, and Information: Selected Papers from the Third Amsterdam Colloquium, Dordrecht: Foris.
Kayne, R. (1984) Connectedness and Binary Branching, Dordrecht: Foris.
—— (1991) "Romance clitics, verb movement, and PRO," Linguistic Inquiry 22.4: 647–86.
—— (1993) "Toward a modular theory of auxiliary selection," Studia Linguistica 47.1.
—— (1994) The Antisymmetry of Syntax, Cambridge, MA: MIT Press.
—— (1997) "Constituent structure and quantification," unpublished manuscript, CUNY.
Keenan, E. (1987) "A semantic definition of 'indefinite NP'," in E. Reuland and A. ter Meulen (eds), 286–317.
Keenan, E. and Y. Stavi (1986) "A semantic characterization of natural language determiners," Linguistics and Philosophy 9: 253–326.
Kempchinsky, P. (1986) "Romance subjunctive clauses and logical form," unpublished PhD dissertation, UCLA.
Kim, K.-S. (1998) "(Anti-)connectivity," unpublished PhD dissertation, University of Maryland, College Park.
Kim, S. W. (1991) "Scope and multiple quantification," unpublished PhD dissertation, Brandeis University, Waltham, MA.
Kiss, K. E. (1987) Configurationality in Hungarian, Dordrecht: Kluwer.
—— (ed.) (1995) Discourse Configurational Languages, Oxford: Oxford University Press.
Kitahara, H. (1993) "Deducing superiority effects from the shortest chain requirement," in H. Thráinsson, S. D. Epstein and S. Kuno (eds) Harvard Working Papers in Linguistics 3, Harvard University, Cambridge, MA.
—— (1994) "Target Alpha: a unified theory of movement and structure-building," unpublished PhD dissertation, Harvard University, Cambridge, MA.
—— (1997) Elementary Operations and Optimal Derivations, Cambridge, MA: MIT Press.
Kratzer, A. (1988) "Stage-level and individual-level predicates," unpublished manuscript, University of Massachusetts, Amherst.
Krauss, L. (1995) The Physics of Star Trek, New York: Basic Books.
Kroch, A. (1989) "Asymmetries in long-distance extraction in tree-adjoining grammar," in M. Baltin and A. Kroch (eds) Alternative Conceptions of Phrase Structure, Chicago: University of Chicago Press.
Kroch, A. and A. Joshi (1985) The Linguistic Relevance of Tree Adjoining Grammar, Philadelphia: University of Pennsylvania Department of Computer and Information Science Technical Report MS-CIS-85-16.
Kuroda, Y. (1972) "The categorical and the thetic judgement: evidence from Japanese syntax," Foundations of Language 9.
Laka, I. (1990) "Negation in syntax: on the nature of functional categories and projections," unpublished PhD dissertation, MIT.
—— (1994) On the Syntax of Negation, New York: Garland.
Laka, I. and J. Uriagereka (1987) "Barriers for Basque and vice-versa," Proceedings of NELS 17, University of Massachusetts, Amherst, 394–408.
Lakarra, J. and J. Ortiz de Urbina (eds) (1992) "Syntactic theory and Basque syntax," Diputación Foral de Gipuzkoa, San Sebastián.
Langacker, R. (1987) Foundations of Cognitive Grammar, Stanford, CA: Stanford University Press.
Langdon, M. and P. Munro (1979) "Subjects and switch reference in Yuman," Folia Linguistica 13.
Langendoen, T. and P. Postal (1984) The Vastness of Natural Languages, Oxford: Blackwell.
Lappin, S., R. Levine and D. Johnson (2000) "The structure of unscientific revolutions," Natural Language and Linguistic Theory 18.3: 665–71.
Larson, R. (1988) "On the double object construction," Linguistic Inquiry 19.3: 335–91.
Larson, R. and G. Segal (1995) Knowledge of Meaning, Cambridge, MA: MIT Press.
Lasnik, H. (1972) "Analyses of negation in English," unpublished PhD dissertation, MIT.
—— (1976) "Remarks on coreference," Linguistic Analysis 2: 1–22.
—— (1990) "Pronouns and non-coreference," paper presented at the Princeton Conference on Linguistic and Philosophical Approaches to Anaphora.
—— (1995) "Verbal morphology: Syntactic Structures meets the Minimalist Program," in H. Campos and J. Kempchinsky (eds).
—— (1999) Minimalist Analyses, Oxford: Blackwell.
Lasnik, H. and J. Kupin (1977) "A restrictive theory of transformational grammar," Theoretical Linguistics 4: 173–96.
Lasnik, H. and M. Saito (1984) "On the proper treatment of proper government," Linguistic Inquiry 15: 235–89.
—— (1992) Move α, Cambridge, MA: MIT Press.
Lasnik, H. and J. Uriagereka (forthcoming) Essential Topics in the Minimalist Program, Oxford: Blackwell.
Lebeaux, D. (1983) "A distributional difference between reciprocals and reflexives," Linguistic Inquiry 14.4: 723–30.
—— (1988) "Language acquisition and the form of the grammar," unpublished PhD dissertation, University of Massachusetts, Amherst.
—— (1991) "Relative clauses, licensing, and the nature of the derivation," in S. Rothstein (ed.) Perspectives on Phrase Structure, New York: Academic Press, 209–39.
—— (1996) "Determining the kernel," in J. Rooryck and L. Zaring (eds).
Lewis, D. (1973) Counterfactuals, Oxford: Blackwell.
Lightfoot, D. (1995) "The evolution of language: adaptationism or the spandrels of San Marcos?", paper presented at Developments in Evolutionary Biology, Istituto di Arte e Scienzia, Venice.
Loebner, S. (1987) "Natural language and generalized quantifier theory," in P. Gärdenfors (ed.) Generalized Quantifiers, Dordrecht: Reidel.
Longobardi, G. (1994) "Reference and proper names," Linguistic Inquiry 25.4: 609–66.
Markman, E. (1989) Categorization and Naming in Children, Cambridge, MA: MIT Press.
Markman, E. and G. Wachtel (1988) "Children's use of mutual exclusivity to constrain the meaning of words," Cognitive Psychology 20: 120–57.
Marr, D. (1982) Vision, San Francisco: W.H. Freeman.
Martin, R. and J. Uriagereka (forthcoming) "Collapsed waves in syntax," unpublished manuscript, Tsukuba University and University of Maryland, College Park.
Martins, A. (1994) "Clíticos na história do Português," unpublished PhD dissertation, University of Lisbon.
May, R. (1977) "The grammar of quantification," unpublished PhD dissertation, MIT.
—— (1985) Logical Form, Cambridge, MA: MIT Press.
Meinzer, K. (1994) Thinking in Complexity, Berlin: Springer.
Milsark, G. (1974) "Existential sentences in English," unpublished PhD dissertation, MIT.
—— (1977) "Toward an explanation of certain peculiarities of the existential construction in English," Linguistic Analysis 3: 1–29.
Mitxelena, L. (1981) "Galdegaia eta mintzagaia euskaraz," in Euskal Linguistika eta Literatura: Bide berriak, University of Deusto, Bilbao, Spain.
Moll, A. (1993) "Estructuras de rección en un texto colonial del siglo XVII," PhD dissertation, University of Maryland, College Park.
Mori, N. (1997) "A syntactic representation for internal aspect," generals paper, University of Maryland, College Park.
—— (forthcoming) Untitled PhD dissertation, University of Maryland, College Park.
Munn, A. (1994) "A minimalist account of reconstruction asymmetries," in Proceedings of NELS 24, University of Massachusetts, Amherst, 397–410.
Muromatsu, K. (1995) "The classifier as a primitive: individuation, referability, and argumenthood," paper presented at GLOW, Tromsø.
—— (1998) "On the syntax of classifiers," unpublished PhD dissertation, University of Maryland, College Park.
Nagel, E. and J. Newman (1958) Gödel's Proof, New York: New York University Press.
Nespor, M. and I. Vogel (1986) Prosodic Phonology, Dordrecht: Foris.
Newmeyer, F. (1980) Linguistic Theory in America: The First Quarter Century of Transformational Generative Grammar, New York: Academic Press.
Nunes, J. (1995) "The copy theory of movement and linearization of chains in the Minimalist Program," unpublished PhD dissertation, University of Maryland, College Park.
—— (1998) "Sideward movement and linearization of chains in the Minimalist Program," unpublished manuscript, University of Campinas.
—— (1999) "Linearization of chains and phonetic realization of chain links," in S. D. Epstein and N. Hornstein (eds), 217–50.
Nunes, J. and E. Thompson (1998) Appendix to Uriagereka 1998.
Ormazabal, J., J. Uriagereka and M. Uribe-Etxebarria (1994) "Word order and wh-movement: towards a parametric account," paper presented at the 17th GLOW Colloquium, Vienna.
Ortiz de Urbina, J. (1989) Parameters in the Grammar of Basque: a GB Approach to Basque Syntax, Dordrecht: Foris.
Otero, C. (1996) "Head movement, cliticization, precompilation, and word insertion," in R. Freidin (ed.).
Parsons, T. (1990) Events in the Semantics of English, Cambridge, MA: MIT Press.
—— (2000) "Underlying states and time travel," in J. Higginbotham, F. Pianesi and A. Varzi (eds).
Perlmutter, D. (1971) Deep and Surface Constraints in Syntax, New York: Holt, Rinehart, and Winston.
Pica, P. (1987) "On the nature of the reflexivization cycle," in Proceedings of NELS 17, University of Massachusetts, Amherst, Vol. 2: 483–99.
Pica, P. and W. Snyder (1995) "Weak crossover, scope, and agreement in a minimalist framework," in R. Aranovich, W. Byrne, S. Preuss and M. Senturia (eds) Proceedings of the 13th West Coast Conference on Formal Linguistics, Stanford, CA: CSLI Publications.
Pietroski, P. (1998) "Actions, adverbs, and agency," Mind 107: 73–112.
—— (1999) "Plural descriptions as existential quantifiers," in S. Aoshima, J. Drury and T. Neuvonen (eds).
—— (2000) Causing Actions, Oxford: Oxford University Press.
—— (forthcoming a) "Small verbs, complex events: analyticity without synonymy," in L. Antony and N. Hornstein (eds) Chomsky and His Critics, Oxford: Blackwell.
—— (forthcoming b) Events and Semantic Architecture, Oxford: Oxford University Press.
Postal, P. (1966) "On so-called 'pronouns' in English," in F. P. Dineen (ed.) Report of the 17th Annual Round Table Meeting on Linguistics and Language Studies, Washington, DC: Georgetown University Press.
Prince, A. and P. Smolensky (1993) "Optimality theory," unpublished manuscript, Rutgers University and University of Colorado.
Pustejovsky, J. (1995) The Generative Lexicon, Cambridge, MA: MIT Press.
Putnam, H. (1975) Mind, Language and Reality: Philosophical Papers, Cambridge: Cambridge University Press.
—— (1983) "Models and reality," in P. Benacerraf and H. Putnam (eds) Philosophy of Mathematics, Cambridge: Cambridge University Press.
Quine, W. V. O. (1960) Word and Object, Cambridge, MA: MIT Press.
—— (1970) Philosophy of Logic, Englewood Cliffs, NJ: Prentice Hall.
Raposo, E. (1988) "Romance inversion, the Minimality Condition and the ECP," in J. Blevins and J. Carter (eds) Proceedings of NELS 18, University of Massachusetts, Amherst, 357–74.
Raposo, E. and J. Uriagereka (1990) "Long-distance Case assignment," Linguistic Inquiry 21.4: 505–37.
—— (1996) "Indefinite se," Natural Language and Linguistic Theory 14: 749–810.
Reid, T. (1785) Essays on the Intellectual Powers of Man, abridged edition by A. D. Woozley (1941) London: Macmillan.
Reinhart, T. (1995) "Interface strategies," OTS Working Papers, Utrecht.
Reinhart, T. and E. Reuland (1993) "Reflexivity," Linguistic Inquiry 24: 657–720.
Reuland, E. (1983) "The extended projection principle and the definiteness effect," in Proceedings of the 2nd West Coast Conference on Formal Linguistics, Stanford, CA.
Reuland, E. and A. ter Meulen (eds) (1987) The Representation of (In)definiteness, Cambridge, MA: MIT Press.
Richards, N. (1997) "What moves where when in which language," unpublished PhD dissertation, MIT.
Rizzi, L. (1982) Issues on Italian Syntax, Dordrecht: Foris.
—— (1986) "On chain formation," in H. Borer (ed.) The Grammar of Pronominal Clitics, New York: Academic Press, 65–95.
—— (1990) Relativized Minimality, Cambridge, MA: MIT Press.
Roberts, I. (1994) "Long head movement, Case, and agreement in Romance," in N. Hornstein and D. Lightfoot (eds) Verb Movement, Cambridge: Cambridge University Press.
Rooryck, J. and L. Zaring (eds) (1996) Phrase Structure and the Lexicon, Dordrecht: Kluwer.
Ross, J. (1967) "Constraints on variables in syntax," unpublished PhD dissertation, MIT.
Russell, B. (1940) An Inquiry into Meaning and Truth, London: Allen and Unwin.
Safir, K. (1987) "What explains the definiteness effect," in E. Reuland and A. ter Meulen (eds), 71–97.
—— (1992) "Implied non-coreference and the pattern of anaphora," Linguistics and Philosophy 15: 1–52.
Schein, B. (1993) Plurals and Events, Cambridge, MA: MIT Press.
Schmitt, C. (1993) "Ser and estar: a matter of aspect," in Proceedings of NELS 22, University of Massachusetts, Amherst.
—— (1996) "Aspect and the syntax of noun phrases," unpublished PhD dissertation, University of Maryland, College Park.
Schwegler, A., B. Tranel and M. Uribe-Etxebarria (eds) (1998) Romance Linguistics: Theoretical Perspectives, Amsterdam: Benjamins.
Selkirk, E. (1984) Phonology and Syntax, Cambridge, MA: MIT Press.
Sportiche, D. (1988) "A theory of floating quantifiers and its corollaries for constituent structure," Linguistic Inquiry 19.
—— (1990) "Movement, agreement and Case," unpublished manuscript, UCLA.
Stevens, C. (1995) The Six Core Theories of Modern Physics, Cambridge, MA: MIT Press.
Stowell, T. (1978) "What was there before there was there," in Proceedings of CLS 14.
—— (1981) "Origins of phrase structure," unpublished PhD dissertation, MIT.
Suh, S. (1992) "The distribution of topic and nominative-marked phrases in Korean: the universality of IP structure," MITWPL 16.
Szabolcsi, A. (1981) "The possessive construction in Hungarian: a configurational category in a nonconfigurational language," Acta Linguistica Academiae Hungaricae 31.
—— (1983) "The possessor that ran away from home," The Linguistic Review 3: 89–102.
Szabolcsi, A. and F. Zwarts (1993) "Weak islands and an algebraic semantics for scope-taking," Natural Language Semantics 1.3: 235–84, reprinted in A. Szabolcsi (ed.) (1997) Ways of Scope-Taking, Dordrecht: Kluwer.
Takahashi, D. (1994) "Minimality of movement," unpublished PhD dissertation, University of Connecticut, Storrs.
Taraldsen, T. (1992) "Agreement as pronoun incorporation," paper presented at the GLOW Colloquium, Lisbon.
Taylor, B. (1985) Modes of Occurrence, Oxford: Blackwell.
Tenny, C. (1994) Aspectual Roles and the Syntax-Semantics Interface, Dordrecht: Kluwer.
Thalberg, I. (1972) Enigmas of Agency, London: Allen and Unwin.
Thompson, D. (1945) On Growth and Form, Cambridge: Cambridge University Press, reprinted 1992.
Thompson, E. (1996) "The syntax of tense," unpublished PhD dissertation, University of Maryland, College Park.
Thomson, J. (1971) "Individuating actions," Journal of Philosophy 68: 771–81.
—— (1977) Acts and Other Events, Ithaca: Cornell University Press.
Torrego, E. (1983) "More effects of successive cyclic movement," Linguistic Inquiry 14.3.
—— (1984) “On inversion in Spanish and some of its effects,” Linguistic Inquiry 15.1:103–29.

—— (1996) “On quantifier float in control clauses,” Linguistic Inquiry 27.1: 111–26.Uriagereka, J. (1988a) “On government,” unpublished PhD dissertation, University of

Connecticut, Storrs.—— (1988b) “Different strategies for eliminating barriers,” in J. Blevins and J. Carter

(eds) Proceedings of NELS 18, University of Massachusetts, Amherst, 509–22.—— (1993) “Specificity and the name constraint,” in University of Maryland Working

Papers in Linguistics 1, College Park.—— (1994) “A note on obviation,” unpublished manuscript, University of Maryland,

College Park.—— (1995a) “An F position in Romance,” in K. E. Kiss (ed.).—— (1995b) “Aspects of clitic placement in Western Romance,” Linguistic Inquiry 25.1:

79–123.—— (1996) “Determiner clitic placement,” in Freidin (ed.).—— (1998) Rhyme and Reason, an Introduction to Minimalist Syntax, Cambridge, MA:

MIT Press.—— (2001a) “Doubling and possession,” in B. Gerlach and J. Grijzenhout (eds) Clitics

in Phonology, Morphology and Syntax, Amsterdam: Benjamins.—— (2001b) “Pure adjuncts,” invited talk delivered at the Coloquio de Gramática Gen-

erativa, to appear in the proceedings.—— (forthcoming) “Remarks on the syntax of nominal reference,” University of Mary-

land, College Park.Vergnaud, J.-R. and M. Zubizarreta (1992) “The definite determiner and the inalienable

constructions in French and in English,” Linguistic Inquiry 23.4: 595–652.Vermazen, B. and M. Hintikka (eds) (1985) Essays on Davidson: Actions and Events,

Oxford: Clarendon Press.Vikner, S. (1985) “Parameters of binder and of binding category in Danish,” Working

Papers in Scandinavian Syntax 23, University of Trondheim.Wanner, D. (1987) The Development of Romance Clitics from Latin to Old Romance,

Berlin: Mouton De Gruyter.Wasow, T. (1972) “Anaphoric relations in English,” unpublished PhD dissertation, MIT.Watanabe, A. (1992) “Subjacency and S-Structure movement of wh-in-situ,” Journal of

East Asian Linguistics 1: 255–91.Weinberg, A. (1999) “A minimalist theory of human sentence processing,” in S. D.

Epstein and N. Hornstein (eds), 283–315.West, G., J. Brown and B. Enquist (1997) “A general model for the origin of allometric

scaling laws in biology,” Science 280: 122–5.Wilder, C., H.-M. Gaertner and M. Bierwisch (eds) (1996) Studia Grammatica 40: The

Role of Economy Principles in Linguistic Theory, Berlin: Akademie Verlag.Zwart, J.-W. (1989) “The first Case,” unpublished MA thesis, University of Groningen.—— (1998) “Review article: The Minimalist Program, Noam Chomsky,” Journal of Lin-

guistics 3.4: 213–26.

INDEX

A-movement: barriers 95, 101–4, 114; infinite regress 281–2; L-relatedness 156–7; successive cyclicity 139
A-position 101, 104
A' position 101, 104
A-reconstruction 128–9
AC see Anchoring Corollary
accessibility, antecedence 57
accordion events 268–71, 272, 274
ACD see Antecedent Contained Deletion
acquisition 311–15
Actualization Principle 233–4
adjuncts: Basque 92–3; Condition on Extraction Domains 66, 70; copy theory 75, 76–7, 79–80; dimensions 266, 278–80, 283–4; infinite regress 280–3; Multiple Spell-Out 51; reprojection 126; sub-events 271–3; thematic roles 277; unboundedness 268; wh-movement in Hungarian 111
admissibility conditions 3, 23
Agents, sub-events 275–6, 277–8
agreement: antecedence and Multiple Spell-Out 58–9; barriers 94–5, 99–100, 102; Basque 87; cliticization 56; expletive constructions 28–32, 34–7, 40–1; integrals 183–4; Multiple Spell-Out 11, 51, 52, 56, 58–9, 60; pro 105; thetic (stage-level) and categorical (individual level) predication 218–19; wh-movement 107–8, 109–10, 111
Agreement Criterion 60
Altube, S. 88
anaphora 167–73
Anchoring Corollary (AC) 230–4
antecedence: Multiple Spell-Out 11, 56–60; reprojection 131
Antecedent Contained Deletion (ACD) 281, 284
arguments: Basque 87–8, 91; determiners 116–19; dimensions 284, 308–10; infinite regress 282–3; ordering 13; reprojection 131–2, 134–5; thematic roles 277; thetic (stage-level) and categorical (individual level) predication 212, 214–15
associate (A) 28–32, 34–7, 40–1, 126–7
Assumption One 270, 272, 274–6
Assumption Two 270, 273–4, 274–6
asymmetry condition 73
atomism 15–18, 20, 284–5, 293; internal make-up of events 274; sub-events 272–3; thematic relations 266
Attract: barriers 96–7, 100; expletive constructions 30–2; Merge 52; reprojection 117
Atutxa, A. 309
Authier, J.-M. 190
bare output conditions 95–6, 148
bare phrase structure (BPS) 10; determiners 117; Linear Correspondence Axiom 45–6, 68; reprojection 134; successive cyclicity 137
barriers 86, 91–105, 109–14
Basque 86–114, 166, 170–1
be 192–211
Bello, Andrés 253
Benveniste, E. 192, 194
binary quantifiers: ordering 13, 14; reprojection 122, 123, 126–9, 131, 134–5
binding: clitics 158–9; dynamic 133–4, 135; L-relatedness 156, 157; parataxis and hypotaxis 255, 265
Binding Condition B 158–9, 161
Bleam, T. 302, 305
body plans 26–7
Bonet, E. 215
Bouchard, D. 162
bound variable binding 255, 265
bounding nodes 136
Bounding Theory 24
Bowers, J. 214
BPS see bare phrase structure
Bresnan, J. 98, 152, 153
Brody, M. 109, 110
Brown, J. 33
Burge, T. 188–9, 224–5, 240–1, 248, 249
Burzio, Luigi 34, 170
c-command 73, 76, 78–9, 84–5
calculus of variations 148–9
Carlson, G. 213
cascades: command paths 153; computational complexity 12; Linear Correspondence Axiom 48–9; Multiple Spell-Out 10–12, 51–8, 64; reprojection 124–5
Case: anaphora 167; barriers 94–5, 105–8; Basque 86–7; clitics 161, 169–73; government 24–5; Government and Binding 23; L-relatedness 156–7; Multiple Spell-Out 11–12, 59–60; obviation 165–7; parasitic gap constructions 74; parataxis and hypotaxis 260; reprojection 119, 128–9; thetic (stage-level) and categorical (individual level) predication 217, 218–20; uninterpretable features 163–5
Case features 23–4, 157
Castillo, Juan Carlos 15, 136–46, 301
Catalan 215
categorical-predication (individual-level predication) 212–34
categories 18–20; warps 288–317
CATEGORIES 212, 218–20, 230–4
Cattell, R. 69
causal relations 292–3
CED see Condition on Extraction Domains
center embedding 38
chains 6; Condition on Extraction Domains 72, 84–5; copy theory 73; fusion 170–3; Last Resort Condition 149–50; Minimal Link Condition 96; parasitic gap constructions 76; reprojection 117, 121, 123–5, 129, 130–1, 134
Chamorro 107–8, 111
change potential 302, 309
Checking Convention 164–5, 169
Chierchia, G. 221, 225
Chinese 107, 238–9, 245, 246, 248
Chomsky, Noam 1; A-positions 104; associates 126; Attract 117; bare phrase structure 10, 68; Barriers framework 91; being and having 192–3; bounding nodes 136; Case 106, 163, 165; chains 123; closeness 112; complementizers 258–9; computational complexity 12; distance 52–3; economy 38, 96; L-relatedness 101; Last Resort Condition 149–50; Linear Correspondence Axiom 45, 152; Merge 50, 52, 137–8; "mind plan" 26–7; minimal domain 230; Minimal Link Condition 96, 112; Minimalist Program 9, 22–8, 34; Multiple Spell-Out 51, 58, 65, 145; names 243; numerations 103; obviation 151; optimality 29, 147–8; parasitic gap constructions 74, 82, 85; parataxis and hypotaxis 253, 263; topics 217–18; wh-islands 112
Chung, S. 107, 213
Cinque, G. 54, 153, 289
classifiers 238–9, 240, 244, 248
clausal typing 144–5
clitic climbing 273–4
clitic doubling 168–9
clitics 158–62; anaphora 168–73; Multiple Spell-Out 55–6, 64; parataxis and hypotaxis 263–5; thetic (stage-level) and categorical (individual level) predication 215–16
closeness 112
command: distance 52–3; Linear Correspondence Axiom 46–8; Multiple Spell-Out 10–12, 54, 64, 151–5; obviation 151
command units (CUs): Linear Correspondence Axiom 46–9; locality 155; merger 152–4; Multiple Spell-Out 49–58, 63–5
como (Spanish) 253–65
complementizers: Basque 88–9; parataxis and hypotaxis 255–65; tuck-in 15
complements: Basque 88–90; Condition on Extraction Domains 68, 72; determiners 116–17; hypotaxis 253; Multiple Spell-Out 51, 53, 54, 69; parataxis and hypotaxis 255–65
Compositionality Thesis 301
computational complexity 8–9, 12
concatenation algebras 1–2, 3, 291, 294–6
Condition C 65
Condition on Extraction Domains (CED): cyclicity 66–85; Multiple Spell-Out 59–64; reprojection 120, 131; successive cyclicity 139
connectionist networks 23–4
constraint violations 23
constraints 24
context 223–9, 231, 273–4
Control theory, government 24
convergence 6, 12, 72, 96, 104
copy deletion 62–4
copy theory of movement 67, 70–2, 73–84
copy traces 5
copying, possession 202
coreference 167–70, 173–4
count nouns 301, 303–6, 315
counterfactual identity statements 235–52
counterparts 251
CUs see command units
cyclicity 5–6, 8–9; barriers 98–9, 101–4; computational complexity 12; Condition on Extraction Domains 70–2; copy theory 76–7, 79–85; extraction domains 66–85; ordering 14–15; reprojection 124–5; wh-movement in Basque 114; see also successive cyclicity
Danish 28, 167–8, 169–70
Davidson, D. 215, 253
DE see definiteness effect
De Hoop, H. 214, 216, 217
decompositional approach 20, 266, 272–3, 284–5, 293
Deep-structure (D-structure): Government and Binding model 5–6; levels of representation 2; Minimalist Program 9, 26; representational systems 5–6; thetic (stage-level) and categorical (individual level) predication 214
definite descriptions: anaphora 167; obviation 151, 165–6; reprojection 125–6, 127–8; split constructions 122–3
definiteness effect (DE) 122–3; Basque 89–90; integrals 181, 184–7; reprojection 126–7, 128; thetic (stage-level) and categorical (individual level) predication 217
demonstratives: names 239, 240–1, 248; reprojection 127–8; split constructions 122–3
"derivation", definition 2
derivational blocks 151–4
derivational entropy 29–33
derivational horizons 103, 149–50
derivational systems 2–20
determiners 115–35; cliticization 55–6; names 240–1, 248, 250
Diesing, M. 128, 214, 215–16, 219
dimensions 18–20, 266–87, 288–317
Disjoint Reference Rule 166
disjuncts 280–3, 284
distance 52–3
Distributed Morphology 98, 99
Doherty, C. 214, 218
Drury, J. 58, 138
Dutch 216
dynamic binding 133–4, 135
dynamic elegance 148–9
dynamically split model 98, 152–3; A-movement 101–2; Linear Correspondence Axiom 48, 54, 57
economy 25–6; barriers 98; Condition on Extraction Domains 72; elegance 147–8; entropy 32, 33; Linear Correspondence Axiom 45–6, 52; MP 34, 37–40, 96; numerations 103
ECP see Empty Category Principle
elegance 147–75
ellipsis, infinite regress 281–2
Elsewhere Condition 208, 312–13
Empty Category Principle (ECP) 25–6, 27, 33
English: Case 166; expletive constructions 28–9, 35–7; integrals 180, 185–6; L-relatedness 104; names 239; nouns 305, 306–7; possession 193–4, 196–8; thetic (stage-level) and categorical (individual level) predication 226–7
Enquist, B. 33
entailments 268; warps 288–317
entropy condition 29–33, 37–41
EPC see external poss construction
Epstein, S. D. 9–10, 47, 289
EST see Extended Standard Theory
events, accordion 268–71, 272, 274
evolution 147–8
Exclusivity Hypothesis 313–14, 315
existentials 126–7, 182–7
expletives 34–42; definiteness effects 185–6; MP 28–32; reprojection 126; sideward movement 82
Extended Projection Feature 206
Extended Projection Principle 24, 108, 113, 129
Extended Standard Theory (EST) 6
external arguments 13, 116–17, 118–19, 122–3
external poss construction (EPC) 189–91
extraction: barriers 94–5; cyclicity 66–85; see also Condition on Extraction Domains
familiarity/novelty condition 59, 64–5
feature attraction: barriers 96–101, 102, 105–7, 113, 114; Case 164–5
feature checking 23, 160–5
FF-bags 164–5, 174
focalization 54, 110–11, 233
focus projection 153
Fodor, Jerry 217, 233; accordion events 269, 270; atomism 293; modularity 23, 299; representational theory of mind 3; sub-events 271, 273, 276
Freeze, R. 201, 203, 205
Freidin, R. 22
French 189–91
Fukui, N. 33, 91, 148, 149
Full Interpretation 6
Galician 55–6, 113, 168–9
GB see Government and Binding
Generalized Transformation 259, 261–2
generics, reprojection 127–8
glitches 9
Gödel 297, 300
Gould, Stephen Jay 48, 147
Government and Binding (GB) model: comparison to MP 22–8; expletive constructions 28–32; last resort 33; modularity 22–4; representation 5–6
Greed 26, 117, 119
Haken, H. 149
Hale, K. 143, 285
Halle, M. 98
have 192–211
head movement, L-relatedness 156–7
heavy categories, barriers 99, 100
Heavy NP shift 35
Hebrew 35–7
Heim, I. 128
Herburger, E. 217, 220–1
hierarchies: dimensions 266–8, 283–4; warps 288–317
Higginbotham, James 1, 68, 165, 186–7; names 240; thetic (stage-level) and categorical (individual level) predication 213, 215, 220, 222, 224–5
Holmberg, Anders 34
Honcoop, M. 121, 122, 123, 126, 133
horizontal syntax 288–317
Horn, L. 314–15
Hornstein, Norbert 14, 16, 74, 115–35, 179–91
Huang, C.-T. S. 120
Hungarian 109–12, 180, 226–7
hypotaxis 253–65
Hypothesis A 272–3, 274, 285
Hypothesis B 272–3, 274, 285
Iatridou, S. 214
identity 84
identity statements, counterfactual 235–52
II see integral interpretation
impenetrability 12
Incompleteness Theorem 300
incorporated quantifiers 125–6
incorporation, reprojection 125–6
indefinite descriptions 238
individual-level predicates 212–34
infinite regress 280–3, 284
inflections, checking 159–60
integral interpretation (II) 179–91
intelligibility 7
internal arguments 13, 116–17, 118–19, 122–3
internal events 271–6
internal poss construction (IPC) 189–91
interpretability, antecedence 57
IPC see internal poss construction
Irish, small clauses 218–20
"is-a" relation 2
islands: impenetrability 12; MP 86; reprojection 14, 120–5, 129–31, 133–4, 134; successive cyclicity 15, 142–6; see also Condition on Extraction Domains
Italian 94–5, 108, 172
iteration 267–8, 283
Jackendoff, R. 98, 152, 293, 298
Jaeggli, O. 100
Japanese 100–1, 107
Jean, Roger 39
Johnson, D. 28, 33–4, 35–42
Joshi, A. 136
Judgement Principle 233–4
Kaisse, E. 55
Kayne, Richard 1; being and having 192–211; Linear Correspondence Axiom 45–6, 62, 67, 120; Multiple Spell-Out 51; relational terms 16–17, 256–7, 259; small clauses 180–1; thetic (stage-level) and categorical (individual level) predication 226–7
Keenan, E. 186, 187
Keyser, S. J.
Kim, S. W. 125–6
kind plurals 128
kind-denoting plurals 122–3
Kiss, K. E. 110, 111
Korean 125–6
Kratzer, A. 214–15, 216
Kroch, A. 136
Kupin, J. 23
Kuroda, Y. 65, 212, 216, 217, 218
L-relatedness: A-movement 101–4; Case 106, 107–8; locality 155–7; wh-movement 111–12, 113, 114
L-relatedness Lemma 101–2, 104, 113
labels 115–35, 152
Laka, I. 90, 91, 93, 94, 260
Langendoen, T. 280, 284
language acquisition 311–15
Language Acquisition Device (LAD) 312–15
Lappin, S. 28, 33–4, 35–42
Larson, R. 118
Lasnik, Howard 1, 23, 25–6, 98, 152, 166
last resort 33, 37–8; Condition on Extraction Domains 84; copy theory and parasitic gap constructions 77, 78, 79; Multiple Spell-Out 53; optimality 147–8
Last Resort Condition (LRC) 149–50, 171
Law of the Conservation of Patterns 14
LCA see Linear Correspondence Axiom
Least Action 33
Lebeaux, D. 55, 152, 168
legibility conditions 6, 7, 66
levels of representation 1–2, 6, 8–9, 10
Levine, R. 28, 33–4, 35–42
Lewis, D. 251
LF: A-movement 101–2; Case 25; command 56–7, 64, 151–5; as a level 174; levels of representation 6; Linear Correspondence Axiom 49; Minimalist Program 9–10; Multiple Spell-Out 11, 56–7, 64, 151–5; reprojection 119–25, 134–5; thetic (stage-level) and categorical (individual level) predication 216–21, 223, 225–6, 234
Lightfoot, D. 147
Lindemayer, Aristide 27
Linear Correspondence Axiom (LCA): A-movement 102; computational complexity 12; Condition on Extraction Domains 67, 68–9; copy theory 73; Multiple Spell-Out 10, 45–52, 62, 151–2; reprojection 120
linearization 45–54; command paths 152–3; Condition on Extraction Domains 68–72; copy theory of movement 73, 84; parasitic gap constructions 76, 78–9
Linearize 68–71, 76
Local Association 137–8
Local Binding Condition 172, 174
local derivational horizons 29
locality: CED 66–7; expletive constructions 30; obviation 155–7, 165–7, 173–4; relational terms 17
LRC see Last Resort Condition
McCloskey, J. 213
Madurese 145
manifolds 242–3, 244–5, 248, 250, 310, 314
mapping hypothesis 128
Marantz, Alec 82, 98, 102–3
Martin, R. 29
Martins, A. 260
mass nouns 301, 303–7, 315
mass term constructions 188–9
matching, Case 164
May, R. 225
Merge 24; arguments 119; Condition on Extraction Domains 68, 69–70; cyclicity 102; Linear Correspondence Axiom 45–8; Multiple Spell-Out 49–53, 60–2; parasitic gap constructions 74; successive cyclicity 136, 137–8, 140–3, 145–6
merger 151–3, 255
Milsark, G. 212, 216, 217, 234
mind plans 26–7
Minimal Domain 160–1, 230
Minimal Link Condition (MLC) 53, 96; Condition on Extraction Domains 67; locality 155; possession 199, 203; wh-islands 112–13
Minimalist Program (MP) 9–10, 22–42; Basque movements 86–114; Condition on Extraction Domains 66–7; cyclic ordering 14–15; as derivational system 290; economy 25–6, 33, 34, 96; elegance 147–75; government 24–5; islands 86; Linear Correspondence Axiom 45–52; mind plan 26–7; modularity 22–4; Multiple Spell-Out 48–52; representation 5–6; successive cyclic movement 136; thetic (stage-level) and categorical (individual level) predication 214
Mitxelena, L. 92
MLC see Minimal Link Condition
modeling, names 245, 250–1
modes 235–52, 276
modification: infinite regress 280–3; sub-events 271–3
modularity thesis 22–4, 299–300
Moll, A. 264
monadic event predicates 277
Mongol 194
monostrings 23–4
Mori, N. 275, 304, 308–10
Move: Attract 96–7; barriers 105; Condition on Extraction Domains 84; copy theory 73–4; determiners 117, 119; successive cyclicity 138, 141–3, 145–6
MP see Minimalist Program
MSO see Multiple Spell-Out
Multiple Spell-Out (MSO) 10–12, 45–65; A-movement 101–4; barriers 97–100; Basque movements 86–114; Case 105–7; command 151–5; Condition on Extraction Domains 67, 68–73; islands 86; Linear Correspondence Axiom 68–9; parasitic gap constructions 74–8; tucking in and successive cyclicity 145
Munn, A. 74
Muromatsu, K. 169, 301, 304, 311
names: counterfactual identity statements 235–52; obviation 151; reprojection 125–6, 127–8; rigidity 314–15; split constructions 122–3; thetic (stage-level) and categorical (individual level) predication 221–3
negation 133–4, 135
negative polarity item (NPI) 121, 124, 133–4
Neo-Davidsonians 17, 283–4
Nespor, M. 55
noncomplements: Condition on Extraction Domains 68; Multiple Spell-Out 51–2, 53–4, 58–64
Norwegian 28–9
Noun-incorporation 125–6
nouns: count/mass distinction 301, 303–7, 315; dimensions 301–11; names 244, 245–51; relational terms 15–18
NPI see negative polarity item
numerations 103–4
Nunes, Jairo 62, 66–85
obviation 151–7, 163, 165–7, 172, 173–5
ontology 192–211, 291–2, 295–6, 301–4
optimality 29–33, 37–42, 147–8; Linear Correspondence Axiom 47–8; representational vs derivational systems 8, 9–10
optimality races 96, 102–4
Optimality Theory (OT) 5, 23–4, 32–3, 37–42
ordering 12–15, 116
Ortiz de Urbina, J. 88–90, 92
OT see Optimality Theory
Otero, C. 55
parasitic gap constructions 66–7, 73–84
parataxis 253–65
participial agreement 28–32, 34–7
participial (P) expressions 28–32, 34–7, 40–1
performance 8, 51, 55–6, 57–8, 64–5
Perlmutter, D. 158
PF: A-movement 101–2; cliticization 64; command paths 151–4; copy theory 73, 76; expletive expressions 41; levels of representation 6, 174; Linear Correspondence Axiom 45–9, 50, 68–9; Minimalist Program 9–10; Multiple Spell-Out 10–11, 54–6, 62, 64, 151–4; wh-movement in Basque 91–2
phonological components: copy theory 73, 76, 78, 81; Linear Correspondence Axiom 68–9; Multiple Spell-Out 54–6
phrase-markers: adjuncts 279–80; command paths 152–4; levels of representation 2; linearization 45–54; sub-events 277
Phrase-structure Grammar 289
Pica, P. 167
pied-piping 63–4, 96, 97, 183
Pietroski, Paul 19, 266–87
pleonastics 41–2, 126–7
plurality markers 159–60
porous modularity 298–300, 302–4
Portuguese 169
possession 193–211; clitic doubling 168–9; dimensionality 310–11; integrals 180–91; parataxis and hypotaxis 255–8; relational terms 16–18
possessor raising 201–3
Postal, P. 165, 280, 284
PREDICABLES 218–20, 230–4
predicates: accordion events 269–71; determiners 116, 119, 132; dimensions 283–4; names 240, 248–50; reprojection 132; sub-events 272–4, 275–6, 277–8; thetic (stage-level) and categorical (individual level) predication 212–34
primitives 289–90, 294–5
Prince, A. 23
Principle of Strict Cyclicity 64
pro: barriers 97, 99–101; Case 105, 108; clitic doubling 169–70; parataxis and hypotaxis 259–63; thetic (stage-level) and categorical (individual level) predication 228–9; wh-movement in Basque 91
pro-drop: barriers 99; Basque 87–8, 91; Case 105; Extended Projection Principle 108; GB 29; subject extraction 95; wh-movement in Hungarian 109–10
PRO 215–16
projections 115–35; bare Phrase-structure 10; Multiple Spell-Out 50–2; see also reprojection
pronouns: coreference 167–70; obviation 165–6; pro-drop languages 105; resumptive 107; sub-events 273
prosodic phrasing 54–6
Pustejovsky, J. 271, 285
QI islands see quantifier induced islands
QR see Quantifier Raising
quantifier induced (QI) islands 121–5, 129–31, 133–4
Quantifier Raising (QR): infinite regress 281–2; ordering 13–14; reprojection 129–31, 134–5; thetic (stage-level) and categorical (individual level) predication 217, 228–9
quantifiers 115–35; binary 13–14; dimensions 304–7; names 240–4, 247; thetic (stage-level) and categorical (individual level) predication 220, 226–9, 234
que (Spanish) 253–65
questions 86–114, 233
R-predication 189–91
Raposo, E. 171, 213, 218
reconstruction, reprojection 128–9
recursion 268, 283
reference: counterfactual identity statements 235–6; names 242–5, 247–9; possession 204–10; relational terms 16, 196–207; representation 3; see also coreference
regress, infinite 280–3
Reid, T. 193–4
Reinhart, T. 168, 173, 174
Relation R 151, 154, 173–4; integrals 179–91; possession 196–205, 208–9, 211
relational terms 15–18, 196–207, 256
representation 1–20
representational theory of mind 3–4
reprojection 14, 117–35
resumptive pronouns 107
Reuland, E. 168, 174
rheme 212–34
rho 2, 3
Richards, N. 136, 137, 138
rigidity 235–52, 314–15
Rizzi, L. 94–5, 99, 108, 109, 172
Romero, Juan 82, 102–3
Rosen, Sara 16, 179–91
rule application 8, 15
rule ordering 5, 8, 12–15
Safir, K. 29, 100
Sag, I. 217, 233
Saito, M. 25–6
SC see small clauses
Schein, B. 225
Schmitt, C. 216
scope: disjuncts 280; possession 211; thetic (stage-level) and categorical (individual level) predication 217–20, 223, 225–9, 234
Scope Principle 225
Second Law of Thermodynamics 32–3, 39
Seely, D. 289
Segal, G. 118
semantics: categorization 288–317; dimensions 298–304, 316; dynamic binding 133–4, 135; Multiple Spell-Out 11–12; thetic (stage-level) and categorical (individual level) predication 220–3, 231
sentential dependencies 253–65
set-theoretic objects 3, 5, 23, 50
SI see standard interpretation
sideward movement 67, 73–84
Sigma 260–3
Slaving Principle 32–3, 149–50
small clauses: integral interpretation 179–91; names 242, 250; parataxis and hypotaxis 256; possession 198, 199–200, 209; relational terms 17; thetic (stage-level) and categorical (individual level) predication 212–34
smallest-number-of-steps-condition 39
Smolensky, P. 23
Spanish: anaphora 171–3; clitics 157, 158–62, 164–5; expletive constructions 28–9; nouns 305–7; parataxis and hypotaxis 253–65; plurality markers 159–60; possession 17, 194–6, 197–8, 200–5, 209–10; thetic (stage-level) and categorical (individual level) predication 213, 221–2, 224, 226–9, 231–4; V2 effect 90, 92; wh-islands 112, 113
Speas, M. 91
specifiers: barriers 96–104, 109–10; Case 105–7; determiners 116–17; Multiple Spell-Out 51; wh-movement in Basque 91–5, 113–14
split constructions: dynamic binding 133–4; QI islands 121–4; reprojection 129–31, 135
stage-level predicates 212–34
standard interpretation (SI) 179–91
Stowell, T. 213, 255
strong determiners 115–16, 117–19, 125–6, 128
strong quantifiers 126–7, 130
structural complexity 300–4
Sub-case Principle 312–13, 315
sub-events 271–6
subject extraction 94–5, 109–10
subjects: Basque wh-movement 94–5; Condition on Extraction Domains 66–7, 68–9, 70; copy theory 77–8; thetic (stage-level) and categorical (individual level) predication 217–18, 220–3, 226–30
successive cyclicity 14–15, 48–9, 136–46
Surface-structure (S-structure) 2, 5–6, 9, 48
Swedish 28–9
symbols 4, 6–8, 9–12
syntactic objects 2, 49–50, 53–4, 57
Szabolcsi, Anna 16–17, 180–1, 192–211, 226–7, 256–7, 259
TAG 136–7
TAG see Tree-Adjoining Grammars
Tenny, C. 270
Thematic Criterion 171
thematic relations 104, 212–34, 266–87, 308–10
there constructions, integrals 181–2, 184–7
Theta-Criterion 77
theta-relations: copy theory 77, 84; MP 26; reprojection 130–1; successive cyclicity 143
θ-roles: determiners 116, 118–19; dimensions 278–9, 283–4; L-relatedness 156–7; Multiple Spell-Out and noncomplements 59–60; successive cyclicity 140
theta-theory: determiners 118–19; reprojection 134
thetic-predication (stage-level predication) 212–34
topic 217–20, 223, 227–8, 229, 230–1, 233–4
Torrego, Esther 90, 92, 112, 253–65
tough-constructions 261
trace copies 62–4
trace deletion 73, 76, 78–9
traces: Basque wh-movement 94–5; Case 105–8; Condition on Extraction Domains 71
Transparency Condition 165–6, 167–8
Transparency Thesis 11–12
tree splitting 137, 140
Tree-Adjoining Grammars (TAG) 15, 136–7, 138–40, 146
tucking in 15, 136–46
Tunica 194
Turkish 194
type-shifting 277–8
unary determiners 118, 122–3, 131–2
unary quantifiers 126–8
uninterpretable features 159–60, 163–5, 170
Vai 194
Verb second (V2) effect 88–93
verbs: classification of 290–1; dimensions 304–11; relational terms 15–18
Vergnaud, J.-R. 189–91
vertical syntax 15–18, 288–317
Visibility Condition 219
Vogel, I. 55
warps 19, 288–317
Weak Cross-over 11, 64–5, 154–5
weak determiners 115–16, 118, 125–6, 128
weak quantifiers 122, 220, 234
West, G. 33
wh-chains 76
wh-copies 75–7, 78, 79–81
wh-elements: Basque 89–91, 94; Case 105–8; Condition on Extraction Domains 66–7, 70–1; copy theory and parasitic gap constructions 77–8; distance 52–3; successive cyclicity 141–2
wh-extraction 264–5
wh-features: Case 105–8; Condition on Extraction Domains 70–1, 72; Multiple Spell-Out 53–4, 59–64; parasitic gap constructions 75–6; successive cyclicity 136, 139, 141–2, 146; wh-islands 112; wh-movement in Chamorro 111
wh-islands 15, 112–13, 142–5
wh-movement: Basque 86–95, 109–11, 113–14; Case 105–8; Condition on Extraction Domains 70–1, 72; cyclic ordering 15; Hungarian 109–12; Multiple Spell-Out 59–64; parasitic gap constructions 75–81; successive cyclicity 141–5, 146; wh-islands 112–13
wh-phrases: Basque 88–95; Case 105–8; copy theory and parasitic gap constructions 77–8; wh-islands 112–13; wh-movement in Hungarian 109–11
wh-traces 76, 105–8, 109
Zubizarreta, M. 189–91
Zwart, J.-W. 22