An Enactive Approach to Digital Musical Instrument Design
Newton Armstrong
A DISSERTATION PRESENTED TO THE FACULTY OF PRINCETON UNIVERSITY IN
CANDIDACY FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
RECOMMENDED FOR ACCEPTANCE BY THE DEPARTMENT OF MUSIC
November 2006
© Copyright by Newton Blaire Armstrong, 2006. All rights reserved.
Table of Contents
Abstract
Acknowledgements
1 Introduction
1.1 The Disconnect
1.2 Flow
1.3 The Criteria of Embodied Activity
1.4 The Computer-as-it-comes
2 The Interface
2.1 Interaction and Indirection
2.2 Representation and Cognitive Steering
2.3 Computationalism
2.4 Sensing and Acting
2.5 Functional and Realizational Interfaces
2.6 Conclusion
3 Enaction
3.1 Two Persistent Dualisms
3.2 Double Embodiment
3.3 Structural Coupling
3.4 Towards an Enactive Model of Interaction
3.5 The Discontinuous Unfolding of Skill Acquisition
3.6 Conclusion
4 Implementation
4.1 Kinds of Resistance
4.2 Mr. Feely: Hardware
4.3 Mr. Feely: Software
4.4 Mr. Feely: Usage Examples
4.5 Prospects
5 Groundlessness
Bibliography
Abstract
Digital musical instruments bring about problems for performance that are
different in kind to those brought about by conventional acoustic instruments. In
this essay, I argue that one of the most significant of these problems is the way
in which conventional computer interfaces preclude embodied modes of
interaction. I examine the theoretical and technological foundations of this
“disconnect” between performer and instrument, and sketch an outline for the
design of embodied or “enactive” digital instruments.
My research builds on recent work in human-computer interaction and “soft”
artificial intelligence, and is informed by the phenomenology of Heidegger and
Merleau-Ponty, as well as the “enactive cognitive science” of Francisco Varela and
others. I examine the ways in which the conventional metaphors of computer
science and “hard” artificial intelligence derive from a mechanistic model of
human reasoning, and I outline how this model has informed the design of
interfaces that inevitably lead to disembodied actional modes. I propose an
alternative model of interaction that draws on various threads from the work of
Heidegger, Merleau-Ponty, and the enactive cognitive scientists. The “enactive
model of interaction” that I propose is concerned with circular chains of embodied
interdependency between performer and instrument, instrumental “resistance” to
human action and intentionality, and an integrative approach to the roles of
sensing, acting and cognitive process in the incremental acquisition of
performative skill.
The final component of the essay is concerned with issues of implementation.
I detail a project in hardware and software that I present as a candidate
“enactive digital musical instrument,” I outline some specific usage examples,
and I discuss prospects for future work.
Acknowledgements
This paper would have been a much bigger mess were it not for the timely
contributions of a number of people. In particular, I have benefited from the very
careful readings and insightful criticisms of my advisor, Barbara White, and my
first reader, Dan Trueman. Paul Lansky has uttered more wise words than I could
count, and he has changed my mind about many things during my time at
Princeton (although, as far as I can tell, that was never really his intention).
Perry Cook has taught me a great deal about interaction, both in his classes and
in the approach to design that he takes in his own projects. He has been an
outstanding role model in terms of bridging the gap between theory and practice,
and knowing when it’s time to just sit down with a soldering iron.
I have also benefited greatly from conversations with other graduate students
while at Princeton. In particular, I’d like to thank Ted Coffey, Paul Audi, Mary
Noble, Seth Cluett, Scott Smallwood and Ge Wang, each of whom has given me
feedback on my work, in the form of both critical readings and more casual
conversations about the core topics. I’m also grateful to the other composers in
my intake year: Paul Botelho, Stefan Weisman and Miriama Young. Together we
represent a diverse group, but there has been a considerable and on-going
interest in each other’s work, and this interest has been borne out in tangible
forms of support for our respective projects and activities.
The history of electronic music performance goes largely without mention in
my paper. But the research would not have been possible in the first place were
it not for those practitioners, from David Tudor to Toshimaru Nakamura, who
would question the hidden nature of electronic media in order to uncover not just
new sounds, but new potentialities of the body. I am indebted to all those
electronic performers whose work I have engaged, whether through written
accounts and recordings, or through personal contact and performance
collaborations. Although my fingers are rusty from typing, I’m looking forward to
rejoining the ranks of the improvising community in a less part-time capacity.
1 Introduction
Electronics for its own sounds’ sake is a resource that one would be stupid to
dismiss, but the implication is irrelevant, even misleading.
It says that music is a pure art of sound, for people with ears, but with little else—
no eyes, no nerve endings anywhere but the ears, no interrelated functions. And
as a matter of fact much electronic music leaves the impression that this IS the
attitude in which sounds are composed.
It says that the functional shape of an instrument is not important as a sculptural
object, and that the techniques developed on it, because of its particular virtues
and its particular defects, are obsolescent. That the physical, sensual vision of the
playing of it is no longer required.
— Harry Partch, Some New and Old Thoughts After and Before “The Bewitched”
1.1 The Disconnect
A wooden wheel placed on the ground is not, for sight, the same thing as a wheel
bearing a load. A body at rest because no force is being exerted upon it is again
for sight not the same thing as a body in which opposing forces are in equilibrium.
— Maurice Merleau-Ponty, The Phenomenology of Perception
The mid-1990s marked a juncture in the short history of computer music. For the
first time, the personal computer was becoming fast enough to be used as a real-
time synthesizer of sound; a capability that had previously been the reserve of
special purpose machines that were for the most part inaccessible to people
working outside an institutional framework. In the years since the mid-1990s,
there has been a rapid proliferation of new software and input devices designed
specifically for musical performance with general purpose computers, and a
burgeoning corpus of new theories, performance practices and musical idioms
has emerged in tandem with the new technologies. But while the widespread
availability of the personal computer to the first world middle class has resulted
in the medium finding its way into any number of new and diverse musical
contexts, the question as to whether the computer should be properly considered
a musical instrument continues, at least in certain quarters, to generate some
controversy.
More often than not, these controversies revolve around the relationship
between the human performer and the performance medium. Or, more
specifically, they revolve around an apparent lack of embodied human presence
and involvement in computer music performance practice. The complainants
argue that the performer is either absorbed in near-motionless contemplation of
the computer screen—the repertoire of performance gestures not substantively
different from those that comprise any routine interaction with a personal
computer—or that there is a high degree of arbitrariness to the performer’s
actions, where the absence of any explicit correlation between motor input and
sonic output results in a disassociation of performer from performance medium.
In both instances, what is witnessed is a disconnect; between performer and
audience, and between performer and instrument.
Those who complain about the current state of computer music performance
practice reveal something of their assumptions and expectations as regards
musical performance: that the involvement of the performer’s body constitutes a
critical dimension of the practice, and that for such an involvement to be tangible
to the audience, it’s necessary that that audience picks up on somatic cues that
signal the point of origin, in real time and real space, of the sounds they are
hearing. Defenders of the “near-motionless” school of computer music
performance have suggested that complaints such as these arise not because
there is something substantive missing from the interaction between performer
and performance medium, but because conventional expectations as regards the
constitutive elements of musical performance have not yet caught up to an
essentially new performance practice (Cascone 2000; Stuart 2003). The
argument has it that the computer, considered as a performance medium, brings
a unique set of issues and concerns to the problem of musical performance, and
that the attributes of the medium necessitate a break with established
instrumental conventions, the modes of performance that are attendant to those
conventions, and the expectations, assumptions and receptive habits of
audiences. It’s been suggested that those who take issue with the apparent lack
of human motor involvement in current computer music performance practice
reveal a mindset “created by constant immersion in pop media (Cascone 2000:
101-102),” and that the emergence of the new performance paradigm signals a
shift away from the locus of the body of the performer; the “object of
performance” is instead transferred to the ears of the audient, who needs to
relearn “active” modes of listening, or “aural performativity (Stuart 2003).”
On both sides of the argument over the state of computer music performance
practice, there is a suggestion that something is missing. What distinguishes one
side from the other is where that missing something is located: with the
performer, or with the audient. It’s difficult to defend either position, based as
they are on speculative assessments of the receptive habits and practices of
listeners. But what can be called into question is the implied corollary to the
apologist’s claim that the burden of responsibility lies with the audient; that is,
that computer music performance practice, as it stands, is already mature. In
this essay, I take the opposite position: that computer music performance
practice remains both theoretically and technologically under-developed, and that
most of the interesting and significant work in the field remains to be done.
In certain respects, then, the present study is a legitimation of the complaints
being uttered against the current state of computer music performance practice.
But more pressingly, it is born of my own frustration as a computer music performer.
Despite investing a number of years in the development of both hardware and
software designed specifically for performance, I’ve found that the performance
medium has in all but a few instances managed to maintain a safe distance,
corroborating (from the shaky perspective of first person phenomenal
experience) the complaint of the disconnect. I’ve come to believe that there is
something intrinsic to the computer, something embedded in the medium itself,
that is the cause of all this; something that necessarily and inevitably brings
about a disconnect. If it turns out that this is in fact the case, then the medium
effectively guarantees that an embodied coupling of human and instrument—a
coupling that creates the possibility of engaged and involved experience—never
quite takes place.
Unlike the apologists for the currently predominant modes of computer music
performance practice, I’m going to suggest that the perceived disconnect, or
“missing dimension,” that certain people have been complaining about, is not due
to a conditioned desire for spectacle, or an ingrained expectation that an
explicitly causal relation is witnessed between performance gesture and sonic
result. Rather, it seems to me that there is something more fundamental to the
issue: that an engaged and embodied mode of performance leads to a more
compelling, dynamic, and significant form of music making; for the performer,
the audience, and for the social space that they co-construct through the
performance ritual. If the attributes of the computer preclude such a mode of
performance, then the medium deserves to be examined, in order to determine
what can be done to engender the technical conditions from which an embodied
performance practice might arise.
1.2 Flow
The matter of music is sound and body motion.
— Aristides, De Musica
Performers of conventional acoustic instruments often talk of the sense of flow
they experience while playing.1 It’s a way of being that consists in the merging of
action and awareness, and the loss of any immediate sense of severance
between agent (the performer) and environment (the instrument, the acoustic
space, the social setting, and other providers of context). It’s the kind of
absorbing experience that can arise in the directed exchange between an
embodied agent and a physical mechanism, and it’s a coupling that happens as a
matter of course with acoustic instruments. Conventional acoustic instruments
offer resistance to the body of the performer, and their responses are tightly
correlated to the variety of inputs from the performer’s body that are afforded by
the mechanism.2 In a sequence of on-going negotiations between performer and
1 For a more complete account of "flow," in the sense that I will use the term, see
Mihaly Csikszentmihalyi, Flow: The Psychology of Optimal Experience (Csikszentmihalyi
1991). For a concise summation of the applicability of Csikszentmihalyi's ideas to
instrumental performance see Burzik's "Go with the flow" (Burzik 2003).
2 The notion of "affordance" was introduced by psychologist James Gibson (Gibson
1977, 1979). In the Gibsonian sense, an affordance is an opportunity for action that the
environment presents to an embodied agent. As such, the term accounts for the particular
instrument, the performer adapts to what is uncovered in the act of playing,
continually developing new forms of embodied knowledge and competence. Over
a sustained period of time, these negotiations lead to a more fully developed
relationship with the instrument, and to a heightened sense of embodiment, or
flow.
Performance with a conventional acoustic instrument serves as a useful
example of an embodied mode of human activity, and of an engaged coupling
with a complex physical mechanism. But in the context of the present study, I’m
not specifically interested in appropriating the conventions of acoustic
instrumental practice for computer music, or in modeling acoustic instruments in
the digital domain. Along with those writers who would proclaim the advent of a
new computer music performance practice, I hold that the computer, considered
as a performance medium, presents new and unique problems and prospects. But
where those writers focus on the shortcomings of the audience, I focus on the
shortcomings of current theory and technologies, and on the body of the
performer—not because of the body’s historical coupling to conventional
instruments, but because I choose to conceive of the body as a site of possibility
and resistance, and because it seems that the computer has a way of both
limiting the body’s possibilities and diminishing its potential for resistance.
physical and perceptual attributes and abilities of the agent, as well as her intentionality.
To borrow an example from Andy Clark: "... to a human a chair affords sitting, but to a
woodpecker it may afford something quite different (Clark 1997:172)."
There are attributes, then, to the experience of playing a conventional
acoustic instrument that are pertinent to thinking about the design of digital
musical instruments that would allow for embodied modes of performance. The
optimal performative experience—this somewhat intangible and elusive notion of
flow—could be characterized as a way of being that is so direct, immediate and
engaging, that the normative senses of time, space and the self are put
temporarily on hold. It amounts to a presence and participation in the world, in
experiential real time and real space, in which meaning and purpose arise not
through abstract contemplation, but directly within the course of action. Such
action involves complex and continuous exchanges and interactions between
senses, the motor system (muscles), the nervous system (including the brain),
and the social and physical environment in which the ritualised act of
performance is embedded. In short, the experience of flow, of a heightened
sense of embodiment, involves an immediately palpable feeling of active
presence in a world that is directly lived and experienced. Traditional though they
may seem, these are qualities that I believe are central, and will remain central,
to musical performance. If the computer is going to figure as a musical
instrument, and if it does not presently lend itself to embodied forms of
interaction, then some work needs to be done.
1.3 The Criteria of Embodied Activity
Over the course of this essay, I’ll return to what I take to be the five key criteria
of embodied musical performance; or, more specifically, the five key criteria of
the particular kind of embodied mode of interaction with digital musical
instruments that I hope to uncover through outlining a philosophically informed
approach to instrument design. Those criteria are:
1. Embodied activity is situated. Embodiment arises contextually, through
an agent’s interactions with her environment. The agent must be able
to adapt to changes in the environment, and in her relationship to it,
without full prior knowledge of the features of the environment, or of
its structure and dynamics.
2. Embodied activity is timely. Real world activity involves real-time
constraints, and the agent must be able to meet these constraints in a
timely manner. This means that it is incumbent on the agent to not
disrupt the flow of activity because her capacity for action is too slow.3
3. Embodied activity is multimodal. A large portion of the agent’s total
sensorimotor capabilities are galvanised in performance. This involves
3 David Sudnow uses a nice example of untimely behavior in Ways of the Hand:
Recall Charlie Chaplin on the assembly line in Modern Times: the conveyor belt
continuously carrying a moving collection of nuts and bolts to be tightened, their
placements at regular intervals on the belt, Chaplin holding these two wrenches,
falling behind the time, rushing to catch up, screwing bolts faster to stay ahead of
the work, missing one or two along the way, because the upcoming flow seems to
gain speed and he gets frantic, or because it actually does speed up, eventually
caught up in the machine and ejected onto the factory floor in his hysterical
epileptic dance. (Sudnow 2001:32-3)
optimising the use of the body’s total available resources for cognition,
action and perception, with an emphasis on the concurrent utilisation
of distinct sensorimotor modalities, as well as the potential for mutual
interaction, or cross-coupling, between those modalities.
4. Embodied activity is engaging. The sense of embodiment arises when
the agent is required by the task domain. That is, the environment is
incomplete without the involvement of the agent, and it presents
challenges to the agent that consume a large portion of her attention.
5. The sense of embodiment is an emergent phenomenon.4 That is,
optimal embodied experience arises incrementally over a history of
sensorimotor performances within a given environment or phenomenal
domain. There is a link between increasing sensorimotor competence
within the task domain and the sense of embodiment.
Borrowing from cognitive scientists Francisco Varela, Evan Thompson and
Eleanor Rosch (Varela, Thompson, and Rosch 1991), I will refer to the embodied
mode of performative activity I’m outlining here as enactive. I’ll address the
concept of enaction in more depth in Chapter 3, but for the time being it’s useful
4 This criterion could perhaps have been condensed into the phrase "embodiment is an
emergent phenomenon." But this is potentially misleading, as embodiment is a given for
biological systems; i.e., living organisms do not emerge into their bodies. The sense of
embodiment, then, is phenomenal, whereas the fact of embodiment is objective. The
implications of this double sense of embodiment—of its "inner" and "outer" aspects—are
explored in Chapter 3.
to emphasize the centrality of the body to the enactive model of cognition. In
contrast to orthodox views of mental process that view cognition as the internal
mirroring of an objective external world, the enactive perspective takes the
repeated sensorimotor interactions between the agent and the environment as
the fundamental locus of cognitive development. This encompasses the dynamics
of the experiential present, i.e., that which is ineluctably the “now,” but it also
encompasses the emergence and development of knowledge and competence,
i.e., the cognitive dimension of activity.
In the enactive view, cognition is fundamentally an embodied phenomenon; it
arises through and within an agent’s physical interactions with her environment.
To that extent, the “now” of lived experience, of an instantaneous conceptual and
corporeal disposition within a given environment, plays a determining role in the
emergence of cognitive systems and structures, and cognitive systems and
structures, in turn, play a determining role in constituting the “now.” It’s an
ongoing, circular, and fully reciprocal process of mutual determination and
specification in which subjectivity and the sense of embodiment are in a
continuous state of flux. This model of cognition, with its emphasis on bodily
involvement in the “bringing forth of a world,”5 provides a template for the
performance practice that I hope will emerge from this study.
5 The expression is borrowed from Varela, Thompson and Rosch's The Embodied Mind
(Varela, Thompson, and Rosch 1991).
1.4 The Computer-as-it-comes
A number of authors (Agre 1995, 1997; Clancey 1997; Dourish 1999, 2001;
Stein 1999; Winograd and Flores 1986) have shown that it is no easy task to
design computing devices that would allow for embodied modes of interaction.
The prevailing guiding metaphors of computer science (CS) and human computer
interaction (HCI)6 are at odds with the embodied/enactive approach, and
routinely preclude modes of interaction that are situated, timely, multimodal, and
engaging, or that lead to a heightened sense of embodiment over a history of
interactions. And while the subset of computing devices that is of specific interest
to this essay—digital musical instruments—these days comprises a vast and
diverse array of implementations, the field as a whole has not been immune to
6 The "prevailing guiding metaphors" of CS and HCI—i.e., the epistemological
underpinnings of what I have labelled "conventional" CS and HCI—will be outlined in terms
of a computationalist ontology in Chapter 2. Lynn Andrea Stein has suggested that it was
a matter of historical contingency that saw the computationalist approach hold sway in the
formative days of computer science:
Cybernetics took seriously the idea of a computation embedded in and coupled to
its environment. These were precisely the issues suppressed by the
computationalist approaches. In the intellectual battles of mid-century, cybernetics
failed to provide the necessary empowerment for the emerging science of
computation and so was lost, dominated by the computational metaphor. The
nascent field of computational science was set on a steady path, but its
connections to the world around it were weakened. (Stein 1999:482)
the guiding metaphors of conventional CS and HCI. This is not to say that all
digital musical instruments have failed to realize the potential for embodied
modes of interaction. But rather, those instruments that have managed to realize
this potential have done so despite the conventional tenets of CS and HCI.
It may be useful to distinguish between two main currents in present day
computer music performance practice. The first of these would take the personal
computer more or less as it comes (with minimal or zero additions to the
standard input devices), and would normally be characterized by the “near-
motionless” mode of performance described earlier in the chapter. This practice is
often encapsulated under the rubric of “laptop music,”7 and has given rise to a
so-called “laptop aesthetic (Jaeger 2003).” The second of the two currents is
defined precisely through its non-acceptance of the “computer-as-it-comes” as a
musical instrument. Rather, the practitioners seek to extend computing devices,
or even completely reconfigure them, through the development and integration
of new technologies designed specifically for musical performance. This is the
field of activity to which my own work belongs, and I will refer to it under the
(intentionally) broad term of “digital musical instruments.” A third current could
also be identified, of “extended acoustic instruments,” in which the computer is
used as a signal processing add-on or improvising partner to a conventional
7 The term "laptop music" surfaced in the second half of the 1990s, at around the
same time that the first "laptop performers" began to appear. For a diverse range of
assessments of laptop performance practice and its reception, see the articles collected in
Contemporary Music Review 22 (4), 2003.
acoustic instrument. But as the presence of the acoustic instrument already
invokes the potential for embodied performance, this area of practice is not of
specific relevance to the present study.
The “computer-as-it-comes” is a term that will appear throughout this essay.
What I intend to denote is not so much a specific device (although it could be),
but rather a general notion of the more or less generic personal computer; the
technological instantiation of the conventional guiding metaphors of CS and HCI.
This is the computer that “laptop music” adopts wholesale into its performance
practice, and the same computer that those working towards “digital musical
instruments” would seek to re-engineer in order to arrive at embodied modes of
performance.
There has been a great deal of activity in recent years in the development of
new digital musical instruments. There has also been a steadily growing corpus of
scholarly articles, research papers and theses on issues in live computer music.
While this has led to numerous innovations in both the theory and technology of
computer music performance, there remains a near total absence of work related
specifically to the philosophical foundations of instrument design. I believe that
the most pressing issues in arriving at designs that allow for embodied forms of
musical interaction with computers are philosophical, and that in order to arrive
at sustainable designs for enactive instruments, the limits and potentialities of
the current computational media—i.e., the defining attributes of the computer-
as-it-comes—need to be examined in philosophical terms.
The tendency in digital musical instrument design has been to focus on the
pragmatic issues of design: specific sensor and actuator technologies, audio
synthesis methods, mapping strategies, and so on. Without addressing these
issues at some point, there will, of course, be no digital musical instruments of
which to speak. But in this essay I focus more on the theoretical and foundational
issues of design, with a view to providing a conceptual touchstone for the
pragmatic stage. While there is a great deal of overlap between the pragmatic
and the foundational issues, it seems to me that the shift of emphasis is
potentially very useful. Without proper attention to the foundational issues, there
is a greater likelihood that designers will unwittingly fall back on the received
tenets of CS and HCI, even though those tenets may (and more often than not
will) work against the bringing into being of enactive instruments.
The personal computer brings with it a sizable repertoire of usage
conventions, and, all too regularly, designers end up drawing on the conventional
patterns of use without proper consideration of the implications of those patterns
for the end user. As I will endeavour to show, these implications are philosophical
in origin, reflecting world views, and models of interaction, behavior and
cognition, that are immanent in designs, and, in turn, in the technological
artifacts that result from those designs. If a medium precludes a desired usage—
an embodied mode of interaction, for example—and if it does so because of the
world models that are embedded in its very mechanism, then that medium needs
to be examined with a philosophical perspective in order to arrive at a better
understanding of the ways in which it determines its patterns of use. This is the
first step towards rethinking and reconfiguring those patterns, and towards
arriving at designs that are more fully and properly geared towards the
requirements and desires of embodied human actors.
2 The Interface
Musical ideas are prisoners, more than one might believe, of musical devices.
— Pierre Schaeffer, Traite des Objets Sonore
2.1 Interaction and Indirection
Interaction takes place when signals are passed back and forth between two or
more entities. Interactions between a human and a computer are conducted
through an interface. The interface provides the human with a means of access to
the programs running on the computer; it consists in providing an appropriate
abstraction of computational data and tasks to the user. Input devices (such as
keyboards and mice) capture signals from the user that are mapped, through the
interface abstraction layer, to changes in the state of computer programs. Output
devices (such as monitors, loudspeakers and printers) transmit human-decodable
representations of the state of the running programs from the computer back to
the user. Human-computer interface design is therefore concerned with providing
the user with a set of usage practices, protocols and procedures appropriate to
the task domain for which the interface, in the first instance, is required.
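To make the shape of this transaction concrete, the chain can be sketched in a few lines of Python. The sketch is purely illustrative (the device, the control name and the mapping are hypothetical, and it describes no implementation discussed later in this essay), but it shows the three stages just outlined: a signal is captured from an input device, mapped through an abstraction layer to a change in program state, and a representation of that state is transmitted back to the user.

    # Illustrative sketch only: the device, control name, and mapping below are
    # hypothetical, and do not describe any system discussed in this essay.

    def read_input_device():
        """Stand-in for a driver call that captures a raw signal from the user,
        e.g. a key press, a mouse delta, or a sensor voltage (faked here)."""
        return {"control": "fader_1", "value": 0.42}

    def map_to_program_state(event, state):
        """The interface abstraction layer: raw signals are translated into
        changes in the state of the running program."""
        if event["control"] == "fader_1":
            state["amplitude"] = event["value"]  # a normalised value, 0.0 to 1.0
        return state

    def render_output(state):
        """Output devices transmit a human-decodable representation of program
        state back to the user (a text rendering stands in for audio/graphics)."""
        print(f"amplitude is now {state['amplitude']:.2f}")

    program_state = {"amplitude": 0.0}
    event = read_input_device()                                  # signal in
    program_state = map_to_program_state(event, program_state)   # signal to state
    render_output(program_state)                                 # state to representation

Every interaction with the medium passes through some version of this chain; the open question, for the purposes of this essay, is what the mapping layer and the output representations ought to look like.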
One thing that distinguishes the computer from tools such as, say, the
canonical example of the hammer,1 is the absence of any direct correlation
between the physical domain in which a computational task is carried out, and
the way in which that task is conceived by the user. The hammer, considered as
an interface, is correlated within the user’s cognitive apparatus to the physical act
of hammering. But a computer user’s interactions with a computer are rarely, if
ever, correlated within the cognitive apparatus to the electrical phenomena that
constitute the physics of computation. Rather, in order to accomplish meaningful
tasks with computers, its physical operations are abstracted, and the task domain
is presented to the user in the form of graphical and auditory representations. It
follows that interactions with a computer are necessarily indirect. This sets the
medium apart not only from the hammer, but from the overwhelming majority of
tools that humans use, including conventional acoustic musical instruments.
Already, then, in the distance that the interface imposes between the human
and the computer, we see the “disconnect” between agent and medium. The
physics of computational media consists in the regulated flow of electrons
through circuits, and the human agent does not interact with those circuits in any
kind of physically direct manner. Rather, in order for significant interactions to
take place, the interactional domain needs to be designed, input devices need to
1 The hammer example has figured large in philosophy of technology and media theory
since its appearance in Heidegger's Being and Time (Heidegger [1927] 1962) and “The
Question Concerning Technology (Heidegger [1949] 1977)." For an interesting analysis of
the role that the hammer has played within these discourses, see Don Ihde's Instrumental
Realism (Ihde 1991).
be mapped to tasks and procedures in software, and software data need to be
transmitted to the user in the form of representations. The overriding goal of
conventional human-computer interface design is to reduce the inevitable
distance between agent and medium, ideally to the extent that the user comes to
conceive of the task domain directly in the terms of the representations that
comprise the interface.2 To a certain extent, reducing the degree of indirection
between agent and medium is also the goal of the present study. But an enactive
model of interaction will require an entirely different approach to that taken by
conventional HCI.
2.2 Representation and Cognitive Steering
Things is what they things.
— π.o. (printed on a coffee mug)
The computer-as-it-comes packages interface abstractions into
representational frameworks—i.e., aggregated metaphorical schema—that are
customarily (though somewhat inaccurately) characterized as software. Users of
2 This is the express goal of so-called "direct manipulation" interface models (see 2.5
below). More radical approaches, such as tangible and ubiquitous computing, would seek
to embed computing devices directly (and invisibly) within the user's environment
(Dourish 2001; Greenfield 2006; Norman 1999; Ullmer and Ishii 2001; Weiser 1988,
1991, 1994; Weiser and Brown 1996).
personal computers are familiar with the now standard interface metaphors for
the routine management and maintenance of their computer systems: files,
folders, desktops, workspaces, trash cans, and the like. It’s a suite of
bureaucratic abstractions, extrapolated from a real-world task environment that
is likely familiar to the user, that serves to facilitate bureaucratic work, keeping
the play of regulated voltages—the physical agency through which that work is
actually accomplished—well out of the user’s immediate zone of awareness. The
interface amounts to a model of the world; an encompassing system of
metaphors that serves to both guide and regulate the agent’s thoughts and
activities through intrinsic correspondences to everyday objects and activities.
It’s an unusual transaction that takes place between the designers of
computer interfaces and the end users of those interfaces. Through the set of
interactions made available by whichever incidental pre-packaged
representational world, the user participates in whichever incidental model of the
world happens to be implicit to the design. Models of the world are born out of
philosophical systems. However well-formulated or defined those philosophical
systems may be, and however conscious a designer may be of the philosophical
underpinnings of the decisions made during the course of design, the transition
from design to artifact nonetheless remains loaded with epistemological
implications for the end user. This is an unavoidable side-effect of indirection,
and, despite a great deal of attention within the fields of interaction design and
technology studies,3 it’s a side-effect that remains beyond the bounds of
consideration for a large number of designers, and an even larger number of end
users. As Philip Agre has put it, “technology at present is covert philosophy (Agre
1997: 240).”
As the interface delineates the conceptual milieu to the user, it orients the
user’s cognitive activity. Through repeated performances, a set of implicit
assumptions as regards the elements and structure of the task domain begins to
solidify, and through a chain of subtle reciprocal influences, the repertoire of
meaningful performance actions becomes more or less fixed in bodily habit. This
is what Merleau-Ponty defines as an incorporating practice; a process in which
actions are literally incorporated—i.e., registered in corporeal memory—through
repeated performances (Merleau-Ponty [1945] 2004). These bodily habits do not
so much comprise a catalogue of discrete and distinct states as they do a
collection of dispositions and inclinations; arrangements within which the agent is
potentially free to move, but which at the same time determine the structure and
dynamics of those movements. In a similar vein to Pierre Bourdieu’s notion of the
habitus, there comes to exist “a durably installed generative principle of
regulated improvisations (Bourdieu 1977: 78).” To the same extent that an
interface encapsulates a model of the world, then, it encapsulates a model of
3 In particular, see Heidegger, “The Question Concerning Technology (Heidegger
[1949] 1977), Feenberg, Critical Theory of Technology (Feenberg 1991), and Agre,
Computation and Human Experience (Agre 1997).
performance. These dual aspects are inextricably intertwined, and over the
history of an agent’s interactions, they are mutually reinforcing.
Merleau-Ponty’s concept of incorporation is consistent with the enactive model
of cognition. In the enactive view, the systems and structures that play a
determining role in the formation of cognitive patterns are in turn determined by
the emergent patterns of interactional dynamics. Or to put it another way, at the
same time that repetitive dispositions towards action and modes of perceiving are
engendered within the agent’s sensorimotor mechanisms, “cognitive structures
emerge from the recurrent sensorimotor patterns that enable action to be
perceptually guided (Varela, Thompson, and Rosch 1991: 172-173).” This
formulation is essentially a latter day reworking of the fully recursive process,
encompassing incorporating practices, that Merleau-Ponty defined as the
intentional arc4 (Merleau-Ponty [1945] 2004).
In this feedback loop at the heart of the enactive view, there is a high degree
of reciprocal determination and specification between perception, action,
cognition, and the contingencies of the environment in which perception, action
and cognitive process are embedded. If we accept that these dependencies are
real, then it will make little sense, when examining an interactional domain with a
view to the emergence of cognitive and performative patterns, to draw a hard
4 In Varela, Thompson and Rosch's The Embodied Mind (Varela, Thompson, and Rosch
1991)—the book in which "enactive cognitive science" is first outlined—the authors
acknowledge their debt to Merleau-Ponty's phenomenology. See in particular the book's
introduction and opening chapter.
dividing line between action and cognition, or between the mind and the body. It
will also make little sense to examine computer interfaces, and the metaphorical
schema that those interfaces encapsulate, without due regard to their
contingencies and particularities, and the potential implications for the thoughts
and actions of the people who will interact with them. These are important
concerns not only when arriving at new designs, but also when looking at the
consequences of existing designs for performance.
It would seem that the more closely we examine the interface in use, the
more quickly the common notion of the interface as a passive and impartial
means to an end begins to break down. We come to see that it is far from
transparent to the task domain to which it is applied, and we begin to understand
it “not as an add-on which allows a human to come into relations with an
underlying structure, but rather as constitutive of that very structure (Hamman
1997: 40).” At the same time that the boundaries of the user’s potential
repertoire of actions and perceptions are determined by the epistemological
underpinnings of the representations that comprise the interface, the interface
reveals itself as embodying a theory of knowledge and performance.
But it’s how this theory of knowledge and performance is embodied in the
interface that is of specific interest to this study. The personal computer arrives
from the vendor prepackaged with a vast collection of programmed responses,
the user adds to these with the installation of new software, and accomplishes
tasks through the agency of the now standard input and display devices—the
keyboard, the mouse, the monitor, and the loudspeakers. The affordances of the
computer-as-it-comes determine the limits of what is possible within any
incidental task domain, and the user comes to learn, from one piece of software
to the next, the kinds of behaviors and outcomes that might be expected to come
about as a result of her regulated interactions with the medium. It may well be
that for the majority of tasks for which personal computers are routinely used,
the computer-as-it-comes is a perfectly adequate medium. But I will endeavor to
show that it is precisely the models of activity that are embedded in the interface
to the computer-as-it-comes that preclude the sense of optimal embodied
experience—the sense of flow—that can arise in complex real-time activities such
as musical performance with conventional acoustic instruments.
The predominant guiding metaphors of human-computer interface design,
through the agency of software abstractions, and input and display devices, are
geared towards routine forms of activity. For complex, situated, embodied and
real-time forms of activity, we are in need of new metaphors, new ways of
thinking about design, and new technologies. Before heading straight to the
drawing board, however, it’s worth considering what it is, exactly, about the
computer-as-it-comes that sways the user into a routine-oriented mode of
activity, and thereby precludes the potential for embodied and enactive modes of
interaction.
2.3 Computationalism
While little has been written about the philosophical basis of interaction design
with specific regard to digital musical instruments—or even, for that matter, with
regard to personal computers in general—it’s nonetheless a topic that has
received some considerable attention, particularly over the last fifteen years, in
artificial intelligence (AI).5 Having accomplished so little of what the pioneers of
the field promised in the 1960s, AI theorists and practitioners have been forced
to critically re-examine the institutionally endorsed models of perception, action
and reasoning that originally appeared to have such vast potential. This has led
to some important questions being raised as regards the traditional foundations
of interaction design, as well as the various philosophical assumptions on which
those foundations are built.
As a succession of AI implementations would bear out, symbolic
representations of real-world task domains must take into account a huge
number of environmental variables if the artificial agent-at-large is to be
endowed with even a sub-insect capacity for sensing and locomotion. As the
complexity of the agent’s environment increases, the number of environmental
variables also increases. In turn, the number of conditions that must be encoded
in the agent’s representation of the environment increases in geometric
proportion. Given an environment of incrementally increasing complexity, it does
not take long before the computational load renders the artificial agent incapable
of the rapid real-time responses that we witness in the
various creatures that inhabit the real world. Moreover, the agent has no capacity
for responding to features or obstacles that appear in the environment
5 In particular see Haugeland (1985), Winograd and Flores (1986), Brooks (1991),
Dreyfus (1992), and Agre (1997).
unexpectedly, as each new object requires that a new representation be added,
by an engineer, to the agent’s model of the world.6
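The scaling problem can be made vivid with a toy calculation. The following sketch is a hypothetical illustration rather than a description of any AI system cited above: it simply counts the distinct world states that a fully explicit symbolic model would have to distinguish, as the product of the value ranges of a handful of made-up environmental variables, and shows how quickly that count grows as variables are added.

    # Toy illustration of the scaling problem: the variables and their value
    # ranges are hypothetical, chosen only to show the rate of growth.

    from math import prod

    def world_states(variables):
        """Number of distinct environment states a fully explicit symbolic
        model must distinguish: the product of each variable's value count."""
        return prod(len(values) for values in variables.values())

    environment = {
        "light":    ["dark", "dim", "bright"],
        "surface":  ["smooth", "rough", "wet"],
        "obstacle": ["none", "left", "right", "ahead"],
    }
    print(world_states(environment))  # 3 * 3 * 4 = 36

    # Adding seven more three-valued variables multiplies the count by 3**7 = 2187:
    environment.update({f"extra_{i}": ["a", "b", "c"] for i in range(7)})
    print(world_states(environment))  # 36 * 2187 = 78732

Every condition the engineer encodes must be defined over some portion of this space, and every genuinely novel object falls outside it altogether.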
It was precisely these kinds of problems that prompted a small faction of AI
researchers to question the very principle of symbolic representation.7 This would
be no simple task: since the advent of the Church-Turing thesis (Church 1932,
1936; Turing 1936), computation has largely been conceived as the
algorithmically codifiable manipulation of symbols, where those symbols stand in
for objects and operations in the world. But even this notion of computation—the
originary notion of computer science—is itself already grounded in an older
notion, namely, the mechanistic explanation of the 17th century. The breakaway
AI researchers would be arguing, then, not only with the accepted wisdom of the
field, but with Descartes, Hobbes, Leibniz, Locke and Newton. They would be
arguing against the guiding rubric of computer science, conventional AI, and so-
called “hard” cognitive science, that has variously been labeled “mentalism (Agre
1995, 1997),” “the computational metaphor (Stein 1999),” and
“computationalism (Dietrich 1990; Scheutz 2002).” Computationalism is the term
that I will use.
6 For an interesting overview of the various problems posed by the symbolic
representation approach in AI, see the introduction to Andy Clark's Being There (Clark
1997).
7 The first viable alternative to the symbolic representation approach is outlined in
Rodney Brooks' "Intelligence without representation" and "New Approaches to Robotics
(Brooks 1991, 1991)."
At the heart of the computationalist perspective is the presumption that we
reason about the world through mechanized procedure; i.e., through the
deductive manipulation of symbols that stand in for objects and operations in the
world. Mental activity, therefore, consists in extrapolating data from the world,
coding abstractions from that data, and reasoning about the representational
domain that those abstractions comprise. This is, by and large, the way in which
we program computers to simulate real-world problems and dynamics, and the
successes of computer science can make it rather easy to anthropomorphize the
process of computation; to see in the mechanical procedure a simulacrum of
human thought. It’s such a view that provided the original impetus of AI
research, and has led to what Agre has termed “a dynamic of mutual
reinforcement … between the technology of computation and the Cartesian view
of human nature, with computational processes inside computers corresponding
to thought processes inside minds (Agre 1997: 2).” Essentially, the
computationalist rubric would have it that computation is synonymous with
cognition.
It’s beyond the scope of the present study to enter into what remains a major
debate in the philosophy of mind and cognitive science over the mechanistic
foundations of thought. I will however argue that the tacit acceptance of the
computationalist approach will prove to be a stumbling block in the design of
computer interfaces for musical performance, in much the same way that it has
already proven to be a stumbling block in the design of artificial agents. If, as
Andy Clark has noted, and as the failings of AI would bear out, “symbol
manipulation is a disembodied activity (Clark 1997: 4),” then the computer-as-it-
comes—a materialization of the computationalist paradigm—already precludes
the possibility of embodied forms of interaction. With the current state of
knowledge about the workings of the nervous system, there is no precise way of
determining whether this is a matter of a symbolic overload, or a
“representational bottleneck” (Brooks 1991), for the human agent. But what can
be seen in the computationalist model of representation is a fundamental
objectivism in which the reasoning of the agent, whether human or artificial, is
situated above and outside the environmental embedding of the agent’s body. In
other words, the agent performs manipulations on symbolic representations of
the task domain in a realm of mental abstraction that is always and necessarily
disconnected from the environmental niche in which activity actually takes place.
The agent is, therefore, a kind of transcendental controller, coding abstractions
and reasoning about a world that forever remains exterior to cognitive process.
There is, then, an essential dualism at the heart of the computationalist
model of cognition. But it’s a specific variety of dualism; one that sets an “inside”
against an “outside.” It corresponds to a manner of thinking about the world that
George Lakoff and Mark Johnson have identified as the container metaphor:
We are physical beings, bounded and set off from the rest of the world by the
surface of our skins, and we experience the rest of the world as outside us. Each
of us is a container, with a bounding surface and an in-out orientation. (Lakoff and
Johnson 1980: 29)
From this ontological grounding, the container metaphor extends to various ways
in which we conceptualize time and space, the elements of visual, aural and
tactile perception, and events, actions, activities and states. In the discourse of
computationalism, such conceptualizations come about as a result of an abstract
inner space, the “mind,” setting itself in contradistinction to both the body—which
is viewed as little more than a transducer of sensory experience—and the outside
world.
The container metaphor is consistent with the mechanistic explanation. The
“bounding surface” of the mind is traversed by sensory stimuli; these stimuli are
converted into representations of the world-as-perceived; these representations—
along with the representations of structure that establish their logical
connections, a kind of propositional calculus—are stored as the contents of the
mind; and these contents form the basis from which plans are constructed, “by
searching through a space of potential future sequences of events, using one’s
world models to simulate the consequences of possible actions (Agre 2002:
132).” When those plans result in behavior, the agent reaches the end of the
sequence of events that characterizes the “in-out orientation” of the mind, and
the relationship between agent and world has, in some way or other, been
altered.
The pertinence of the container metaphor to the present study lies in the
strong separation it enforces between agent and environment, as well as
between mind and body, and also in the sequential model of activity that it
presumes. Implicit to the container metaphor is the assumption that cognition is
fundamentally distinct from perceiving and acting, and that mind and matter—in
the tradition of the Cartesian res cogitans and res extensa—are necessarily
separate. It’s precisely because of the schism between thinking and acting that
activity is sequential—the agent must form an internal representation of the
domain and construct a plan before deciding on appropriate action. There is an
inevitable delay, then, between decision and action. And over the iterative chain
that would characterize extended activity—a chain of actions following decisions
following actions—a sequence of such delays punctuates the flow of activity. This
is a point that I will explore more fully in the next section.
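The sequential picture can be caricatured in a few lines of code. The sketch below is again hypothetical (the functions, timings and loop are stand-ins, not a model of any particular system), but it makes the structural point: because deliberation sits between sensing and acting, every cycle of activity inherits the latency of the planning step, and over an extended exchange those latencies accumulate as interruptions to the flow of activity.

    # Caricature of the sequential sense -> model -> plan -> act cycle implied
    # by the container metaphor. All functions and timings are hypothetical.

    import time

    def sense():
        return {"stimulus": "event"}      # stimuli cross the "bounding surface"

    def update_world_model(model, percept):
        model.append(percept)             # stored as contents of the "mind"
        return model

    def plan(model):
        time.sleep(0.05)                  # deliberation: searching a space of
        return "response"                 # possible future actions takes time

    def act(action):
        pass                              # behaviour finally reaches the world

    world_model = []
    start = time.time()
    for _ in range(10):                   # ten cycles of extended activity
        percept = sense()
        world_model = update_world_model(world_model, percept)
        action = plan(world_model)        # the decision precedes the action...
        act(action)                       # ...so every act inherits the delay
    print(f"10 actions took {time.time() - start:.2f} seconds")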
The transition from textual to graphical modes of interaction with computers
brought with it significant implications not only in terms of how humans and
computers interact, but in terms of the accessibility of computing machinery to
non-specialists. In order to make computers more accessible, the interaction
paradigm would need to be both immediately intuitive to the broadest possible
range of human subjects, and applicable across the widest range of known and
as-yet-unknown task environments. The great success of the WIMP (window,
icon, menu, pointing device) model is due, at least in part, to the way in which it
galvanizes the user’s knowledge of the world; this so-called “direct manipulation”
style of interaction draws explicitly on the user’s capacity to identify symbolic
representations of data (files) and processes (programs), and—through actions
such as dragging, dropping, clicking, double-clicking, etc.—to accomplish the
tasks required by the activity domain.
The graphical interface paradigm is nowadays so pervasive, and so obviously
effective, that few people would think to question the kind of user knowledge on
which it draws. But on examination, it reveals itself to be an instance of the
container metaphor; the workspace is a container for folders, which are in turn
containers for files; the user puts things “in” the trash, “opens” a file or
and so on. The representational domain is functionally isomorphic with the
Cartesian model of mind, and, by virtue of the interface, the user comes to
encounter the virtual environment in much the same way as the Cartesian
subject encounters the world; through an “in-out orientation” to an environment
populated by well-defined objects.
If we consider these encounters with the virtual environment in light of the
constitutive role that the interface plays in determining user activity, we can
discern that the interface, over a history of interactions, will bring about the
“disconnect” between agent and environment that is implicit to the container
metaphor. The model of the world as embodied in the interface will effectively
lead to its own realization.
I take no position on the suitability of objectivist forms of representation to
everyday or mundane computational tasks. When those representations stand in
for the states of a task domain, they bring the user to conceive of the task
domain, and to act within it, in such a way that the focus is directed at changing
or manipulating those states. The objects of the virtual environment provide the
locus for interaction, and the user retains the status of detached controller. All is
(most likely) well for the maintenance of a spreadsheet, or for uploading files to a
server; activities in which it would make sense to have an objective cognizance of
the contents of the task domain. But these are not ordinarily the types of
activities in which the agent’s optimal bodily experience, or the sense of flow,
have significant bearing on the effective accomplishment of the task at hand.
If, however, we are considering the suitability of the computer-as-it-comes as
a musical instrument that would allow for embodied modes of interaction, the
kind of representations that serve as the access points to the medium, and the
world models that those representations embody, constitute an important matter
for consideration. By definition, an enactive model of performance would situate
the agent’s cognitive activities entirely within her environment. An objectivist
model of representational content, which would situate the agent’s cognitive
activities outside her environment, therefore throws up a not inconsiderable
obstacle to arriving at embodied modes of interaction. But there can be no
practicable form of interaction with a computer without an interface, and an
interface requires that the computational activity be represented in some form or
other; even if the form of representation is the physical embodiment of the
computing device itself.8 The crucial point, then, is the form of representation.
More specifically, it is the difference between those forms of representation that
set out to passively encode the state of the task domain, and those that would
seek to structure the agent’s active involvement within the task domain. This is a
point to which I will return throughout the essay. First, however, it’s worth
examining in closer detail the costs to performance of unwittingly adopting the
objectivist/computationalist model of representation that is ingrained in the
methods of conventional CS and HCI.
8 This is precisely the representational strategy behind tangible computing. In terms of
the magnitude of representational abstraction, tangible interfaces are of a very low order.
If, for example, the user of a tangible device manages to put the idea that she is
interacting with a computer out of mind, her cognizance of the interface is of the same
order of abstraction as the Gibsonian affordance ("this chair affords sitting"). For
numerous examples of tangible user interface devices see the website of the Tangible
Media Group at MIT (http://tangible.media.mit.edu; accessed July 25, 2006).
2.4 Sensing and Acting
A movement is learned when the body has understood it, that is, when it has
incorporated it into its ‘world,’ and to move one’s body is to aim at things through
it; it is to allow oneself to respond to their call, which is made upon it
independently of any representation.
Maurice Merleau-Ponty, The Phenomenology of Perception
Although the “disconnect” between agent and environment is intrinsic to the
container metaphor as applied to the computationalist model of mind, the
container metaphor does not in itself account for how we experience or perceive
a disconnect in, for example, musical performance with a laptop computer. And
despite the ways in which the WIMP interface regulates the activities of the user,
and indeed situates her in a specific and highly determined relation to the
medium, it’s nonetheless entirely possible for that user to become seemingly
immersed in the task environment, for the experience to be that of direct
manipulation of the interface contents, and for the medium to effectively
disappear9 from use.
9 Disappearance is an important concept in Heidegger's philosophy of technology,
particularly in Being and Time (Heidegger [1927] 1962). In short, disappearance is an
indicator of the moment at which the tool user ceases to experience the tool as separate
from her body. A state of immersion in the task for which the tool is required leads to the
user, in the midst of activity, experiencing the tool as an extension of her body. To that
extent, the tool disappears as an object of consciousness.
Immersion in the task environment does not, however, provide a guarantee
of an embodied mode of interaction. It’s likely that an immersive activity is
engaging; and as such, it fulfills one of the five key criteria of embodied activity
that I outlined in Chapter 1. It would even seem to follow that an immersive
activity is, by definition, a situated activity. On closer inspection, however, this
would not seem to be the case in the specific example of interaction with the
computer-as-it-comes. The context of embodiment, i.e., the environment within
which the agent’s sense of embodiment arises, is the real world. Immersion in a
virtual environment—e.g., the environment constituted by iconic abstractions of
computational data and tasks, as in the WIMP model—involves situating the
agent’s attention and intentions squarely within that virtual world. As a
consequence, the disconnect with the real world is proportional to the amount of
attention consumed by the objects that populate the virtual world. The agent is
immersed in the activity, but it’s that very immersion that determines that the
activity is not embodied. Immersive activity involving the computer-as-it-comes
is therefore substantively different to immersive activity involving, say, a
conventional acoustic musical instrument.
In itself, this is still a superficial treatment of a very subtle process, and it
would seem that there’s more to the issue than drawing a tidy distinction
between the virtual and the real, or between abstract and direct modes of
interaction. This is where the issues of timeliness and multimodality—another two
of the five key criteria of embodied activity—enter the picture.
I’ve already discussed the ways in which the core elements of the WIMP
model of interaction—the window, icon, menu and pointing device—play a
determining role in the formation of objectivist concepts in the computer user’s
cognizance of activity. But it’s not simply a matter that because the interactional
domain is an instance of the container metaphor, the user will come to think and
act in terms of the objects that populate a world exterior to cognitive process;
there’s also the how of the container metaphor’s instantiation within the WIMP
model. The key consideration here is the modes for the transmission of signals
from computer to user, and from user back to computer; or, more specifically, it
is the sensory and motor mechanisms that are called into use, and the
sensorimotor habits and patterns that are engendered by those modes of
transmission.
There are two key facets to the WIMP model that guarantee that interactions
with the computer-as-it-comes can never be multimodal: 1. at any given
moment, there is only a single and discrete centre of interaction; i.e., the mouse
or text cursor, and 2. the weight of emphasis on visual forms of representation
consumes a large portion of the user’s attention, and in doing so diminishes the
potential for involvement of the other sensory and motor modalities. These are
the two aspects of a mode of interaction—the typical mode of interaction with the
computer-as-it-comes—that are experienced by the user as an ongoing
sequence of pointing, clicking, typing and observing. As the gaze is directed
towards an icon of interest to the task, the hand works in tandem with the eyes
to move the mouse cursor towards that icon. When the cursor and icon converge
on the screen, the fingers click on the mouse button, or press keys on the
keyboard, to elicit a response from the on-screen abstraction.
The immediate cost of the visuocentric approach to the non-visual
sensorimotor modalities is self-evident: the more cognitive resources are
allocated to vision, the fewer remain for the agent’s other sensors and actuators.
But there is another aspect that is perhaps less obvious, and this is where the
mode of interaction coincides with the issue of timeliness. The single point of
interaction that is characteristic of the WIMP model of interaction leads to a mode
of activity that is characterized by a sequential chain of discrete user gestures;
the flow of time is effectively segmented into discrete chunks, where any action
can be taken only after the prior action has been completed. There is no
concurrency of actions, no possibility of operating at two or more interactional
nodes simultaneously, and no potential for the cross-coupling of distinct input
channels.10 With acoustic instrumental performance, it’s not just the concurrent
use of multiple sensorimotor modalities that leads to the sense of embodiment,
it’s the various ways in which these modalities work together and exert influence
upon one another, and the way in which the performer, as a function of the
ongoing accrual of competence at coordinating the sensorimotor assemblage, is
better adapted to meet the real-time constraints of performance.
10 For a comparative analysis of "time-multiplexed" (single point) vs. "space-
multiplexed" (multiple, distributed points) interaction scenarios, see Fitzmaurice and
Buxton's "An empirical evaluation of graspable user interfaces (Fitzmaurice and Buxton
1997)."
With regard to this notion of timeliness, it may be useful to draw a distinction
between planning and agency. The WIMP model of interaction presumes that the
user has plans “in mind,” and that these plans are to be executed, step-by-step,
until the objective of the task-at-hand is met. The system of abstractions and
representations that typify the WIMP model are not geared to the demands of
real time. Rather, building on a model of behavior in which reasoning about
representations formed from sense impressions must always take place prior to
action, each step towards accomplishing the plan will simply take as long as it
takes to sense, infer, and act. Agency, on the other hand, is indicative of
behavior that is adaptive to environmental demands and constraints, where those
constraints encompass the necessity of a timely response. Agency, in this specific
sense, might more properly be defined as “embodied agency.” But whatever the
designation, it points to a mode of performance that is bluntly precluded by the
representational infrastructure of the WIMP paradigm. That infrastructure
presumes a model of reality in which the contents of the world come prior to our
behavioral engagement with the world; a sequence that the enactive approach
would seek to reverse.
It may be interesting to consider whether there are potential misuses of the
computer-as-it-comes that could lead to embodied interactional modes. By
“misuse,” I mean a kind of usage that in one way or another does not correspond
to the usage scenarios presumed by the WIMP paradigm. As “laptop music” has
already figured in the discussion, let’s assume that the computer-as-it-comes is,
in this case, an off-the-shelf laptop computer.
The standard input devices of the generic laptop computer are the keyboard
and trackpad.11 According to conventional WIMP practice, inputs at these devices
are coordinated by the position of the cursor on the computer screen. As I type
these words (at my generic laptop computer), the text cursor blinks at the
current text position, indicating the point at which the next character in this
sequence of discrete characters is anticipated. When I’m done typing, I’ll move
my second finger to the trackpad. Upon contact, a mouse cursor will appear on
screen. My finger will guide the mouse cursor to a point at the top-left region of
the screen, where prior experience tells me I will find the word “File.” When the
cursor is over that word, I’ll use my pointer finger to click the trackpad button. A
menu will appear, and as I use my third finger to move the cursor over its
contents, each item will be highlighted in turn. My third finger will stop when the
menu item “Save” is highlighted, and again I’ll use my pointer finger to click the
trackpad button. The structure of the interface determines that my motions will
follow a type-point-click sequence, and that at each step in that sequence my
attention will be directed towards the single point of interaction that the interface
affords.
As I’ve argued, this kind of determination on the part of the interface will
preclude embodied modes of interaction. But with different mappings from the
11 These devices vary from one model of laptop computer to the next; e.g., many
laptops substitute a trackpoint for a trackpad, and the number and type of trackpad
buttons may also vary. For the purposes of this example, I will assume a trackpad with a
single trackpad button.
input devices to programs—e.g., mappings that would subvert the inherent
sequentiality of WIMP—the interface acquires new affordances. That is, it solicits
new modes of activity from the user.
Suppose that some piece of sound synthesis software is written, and that it’s
written expressly to be used without graphical or textual feedback from the
computer screen. The cursor, then, is done away with altogether. To minimize
unnecessary distractions in performance, the computer screen could be entirely
dimmed. Interestingly enough, it’s in doing away with the cursor that entirely
new interactional possibilities for the keyboard and trackpad become apparent.
We see that the keyboard does in fact afford multiple points of interaction, and
that these points might be engaged concurrently; an affordance that the blinking
text cursor—along with the accumulated usage history of QWERTY technologies—
had somehow hidden from view. We also see that the trackpad affords
continuous input with two degrees of freedom; an affordance that was not
apparent when trackpad usage was bound up with the task of directing the cursor
to discrete points on the computer screen.
Could this amount to an interface that affords embodied modes of interaction?
The short answer is, I think, perhaps. These misuses of the keyboard and
trackpad would seem to circumvent the impediments to embodied activity that
characterize the WIMP paradigm: singularity and sequentiality. Mappings from
keyboard events to software could be arbitrarily complex, or as simple as the
mapping from piano keyboard to hammer and string (one sound event per key
event). Either way, the interface affords chording; the formation of composite
events from distributed points of interaction. Mappings from trackpad input to
software could afford the continuous modification of the sound events thus
triggered by the keyboard; and it’s in the continuity of these modifications that
the inherent sequentiality of pointing and clicking would be circumvented. The
asymmetry of “handedness” would likely determine that, because of the fine
granularity of action required of keyboard input, chording actions would be
performed by the dominant hand, while continuous modificatory actions at the
trackpad would be performed by the nondominant hand.12 To situate the hands in
optimal position, we might set the base of the laptop at a 30-45° angle to the
standard typing position. We would almost certainly push the (blank) screen to as
flat a position as possible, to put it out of the way of the hands. We may, then,
have the beginnings of an expressive instrument; even, perhaps, of an embodied
performance practice.
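To make the shape of such a mapping concrete, here is a minimal sketch of the remapped laptop described above; it is my illustration rather than a reconstruction of any particular software, and all of the function and parameter names in it are invented for the example. Keyboard keys act as discrete, chordable triggers of sound events, while the trackpad’s two continuous axes modify every sounding event at once; the synthesis layer is stubbed out with print statements, where a working instrument would call into a real-time synthesis engine.

```python
# A minimal sketch (not from this essay) of the remapped laptop: keys trigger
# discrete sound events and afford chording; the trackpad's two axes apply
# continuous modification to all sounding events, with no cursor involved.

active_events = {}  # currently held keys -> sounding event identifiers
KEY_TO_PITCH = {"a": 220.0, "s": 247.0, "d": 262.0, "f": 294.0}  # one event per key

def start_tone(pitch):                 # stub for a synthesis voice
    print(f"start tone at {pitch} Hz")
    return pitch

def stop_tone(event):                  # stub
    print(f"stop tone {event}")

def set_filter(event, cutoff, depth):  # stub for continuous modification
    print(f"tone {event}: cutoff {cutoff:.0f} Hz, depth {depth:.2f}")

def on_key_down(key):                  # chording: any number of keys may be held
    if key in KEY_TO_PITCH and key not in active_events:
        active_events[key] = start_tone(KEY_TO_PITCH[key])

def on_key_up(key):
    if key in active_events:
        stop_tone(active_events.pop(key))

def on_trackpad(x, y):                 # x, y normalized to 0.0-1.0; no cursor
    for event in active_events.values():
        set_filter(event, cutoff=200.0 + 5000.0 * x, depth=y)

# Example: two keys held as a chord, then shaped continuously by the trackpad.
on_key_down("a"); on_key_down("d")
on_trackpad(0.3, 0.8)
on_key_up("a"); on_key_up("d")
```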
What’s interesting about this example is that we have not changed the
physical structure of the interface; i.e., we continue to use the same keyboard
and trackpad that serve as the input devices in the WIMP model. What we have
changed, however, is the potential for interaction that the interface affords; and
the example shows that these affordances are immanent to the map from input
devices to programs. At the same time, then, that we substitute a new map for
the WIMP map, we construct a new model of performance.
Of course, to any regular user of a laptop computer, these new affordances
would need to be learned. And they would need to be learned in spite of the
activities the laptop has previously afforded in everyday use. This is not an
insurmountable task, especially given that users of general purpose computers
12 The role of bimanual asymmetry in interface design is discussed in 4.4.
are, to a certain, limited degree, accustomed to learning new patterns of
interaction with each new piece of software. But when I suggested that this
reconfigured laptop would perhaps afford embodied modes of interaction, I did so
out of a hesitation as regards the physical structure of the interface. That is,
while the affordances of the interface have been fundamentally altered by new
mappings from hardware to software, and while it’s entirely feasible that the
performer could develop a timely and multimodal mode of interaction with this
new interface, there nonetheless remains some physical property of the interface
that would seem to be opposed to the development of an embodied performance
practice. This may be an issue of the limited potential for resistance in the
keyboard’s pushbutton mechanism, of the arrangement of keys not being
conducive to chording, of the limited surface area of the trackpad, of the
trackpad’s proximity to the keyboard, and so on. Or, it may simply be an issue of
the instrument’s failure to be properly indicative of use (a topic I will discuss in
Chapter 4). Whatever the explanation, there seems a reasonable possibility that
the instrument will not be engaging over a sustained period of practice. And this
possibility provides enough incentive to turn attention towards the design of
special purpose devices, and to leave unanswered the question as to whether this
general purpose device might, under certain circumstances, afford embodied
modes of interaction.
I’ve been concerned in this section with outlining the ways in which the
standard interaction model of the computer-as-it-comes precludes embodied
activity. One of the hazards of design is the weight of convention on current
practice; a force that often goes entirely unnoticed in design practice. It seems to
me that it’s this very force—and the widespread failure to notice it—that has led
to numerous music software applications that buy unwittingly into the model of interaction
that is implicit to the WIMP paradigm. In doing so, these applications also buy
unwittingly into a model of performance that places abstract reasoning prior to
action; a model that inevitably leads to a disembodied mode of interaction. An
enactive, embodied agent-based model of interaction, then, will need to arrive at
an alternative interactional paradigm to that of the computer-as-it-comes. One of
the main objectives of this study is to outline a sketch of one such alternative.
2.5 Functional and Realizational Interfaces
Something in the world forces us to think. This something is an object, not of
recognition, but of a fundamental encounter.
— Gilles Deleuze, Difference and Repetition
Andrew Feenberg draws a distinction between a “primary” and a “secondary
instrumentalization,” which respectively consist in “the functional constitution of
technical objects and subjects,” and “the realization of the constituted objects
and subjects in actual technical networks and devices (Feenberg 1999: 202).”13
In terms of the implementation of interfaces, the core difference between the
primary and the secondary instrumentalization lies in the way that the task
13 In Feenberg's scheme, primary and secondary instrumentalization respectively
correspond to "essentialist" and "constructivist" orientations of human to medium
(Feenberg 1999, 2000).
domain is structured. The functional interface (primary instrumentalization)
serves a predetermined function; it is structured around a finite set of
interactions which are known in advance of the task’s execution. The well-
designed functional interface conceals the specific mechanics of the task, and
presents the user with possibilities for action that draw on familiar and often
rehearsed patterns of experience and use. The realizational interface (secondary
instrumentalization), on the other hand, brings with it the possibility of
continuously realizing new encounters and uses, and, in the process, of re-
determining the relationship between technical objects and their human subjects.
The realizational domain encompasses the contexts of meaning and signification
in which human and medium are embedded, and is conducive to dynamic and
indeterminate forms of interaction. In short, realization is a form of play.
While Feenberg correlates the secondary instrumentalization with a broadly
socialist utopian project, he is nonetheless careful to point out that the primary
instrumentalization, or functionalism, still has its uses. There are a great many
task environments in which it makes sense to facilitate, as transparently as
possible, the accomplishment of the task. Landing an airplane, for example,
presents a situation in which human agency is best served by an immutable
function-relation between the elements of the interface and the range of possible
outcomes that the interface represents; the representational correspondence of
the interface to the world—i.e. the correlation between the system of interface
metaphors and the system of real-world objects and operations for which those
metaphors stand—should, in the interest of maximizing the potential for
continued existence, be static.
Efficiency is key to the functionalist approach. In terms of meeting the various
constraints and demands of the task environment, it’s of no use to have the user
waste time on the parsing of a complex metaphorical system, and it’s of no use
to involve her in forms of play. Functionalism aims to minimize the cognitive
load. To that end, the well-designed functionalist interface is comprised of
representations that are immediately familiar to the user. The cognitive effort is
at its optimal minimum when the representations have a directly recognizable
counterpart in the user’s prior experience of the world. Indeed, the ideal
functionalist interface would have the user convinced that it consists of no
representations at all; it takes on an artificial transparency through its very
leveraging of the user’s experience. That is, as the task environment obtains its
coherence through the system of representations that comprise the interface, the
user comes to conceptualize the task directly in terms of what is represented; the
representations cease to be denotative, and instead become the intrinsic
elements of the task itself. At that moment the task is conflated with the
metaphorical domain in which it is represented, and the interface effectively
disappears in use; it becomes equipment.14
This situates the user in an interesting position. She is immersed in what
would appear to be the im-mediacy of the task, but the medium is still very much
present, and continues to be constitutive of the structural relation of technical
object and human subject. And while the interface is evidently not at all
14 In Heidegger's terminology, the tool becomes "equipment" at the moment of its
disappearance in use.
transparent to the task domain, the more it seems to be transparent, the more
effectively it corresponds to the ideal of functionalist efficiency. In leveraging the
user’s experience of the world, the interface directs her towards a set of
predetermined expectations as regards performance. It minimizes the cognitive
demand and, at the same time, defines an interactional context in which
significance—at least ideally—is invariable. In contradistinction to the domain of
realization, then, the functionalist domain does not encompass the contexts of
meaning and signification in which human and medium are embedded, and is not
conducive to dynamic and indeterminate forms of interaction.
Functionalism has become a standard metric in the evaluation of the
successes and shortcomings of computer interfaces. The idea of leveraging
experience in order to minimize the strain that the interface places on the user’s
cognitive apparatus is a hallmark of “user-centered design” (Norman 1986, 1999;
Norman and Draper 1986), and the extent to which the interface disappears from
the user’s attention constitutes the key criterion for the success of such
approaches. The model of computer interface design known as “direct
manipulation” (Norman, Holland, and Hutchins 1986)15—the model in which the
user drags graphical representations of files into graphical representations of
folders, among other things—already has the aim of the usage enterprise built
15 For an implementation guide to the "direct manipulation" model of computer
interface design, see "The Macintosh Human Interface Guidelines"
(http://developer.apple.com/documentation/mac/HIGuidelines/HIGuidelines-2.html;
accessed March 20, 2006).
into the blanketing term; i.e., things work best when the user believes that,
rather than manipulating symbolic abstractions, she is in fact working directly
with the objects of the task domain.
It’s entirely possible that the functionalist approach is optimally effective
across a broad range of routine computational task environments. In much the
same way that it makes little sense to employ dynamic and indeterminate forms
of interaction when landing an airplane, it makes little sense to do so when
balancing a computerized bank account or uploading a file to a server. These are
tasks in which the activity is better served by invariable representations, and in
which the degree of efficiency with which the task may be accomplished is
inversely proportional to the amount of user attention that is consumed by the
interface.
But there is a danger, with functionalism becoming something of a de facto
standard in interaction design, that the functionalist approach is adopted in task
environments to which it is not well suited. That is, in task environments where the
task-at-hand is better served by a realizational approach. In thinking about
designing interfaces for musical performance, we are dealing with such a task
environment.
Where Donald Norman and other key figures in “user-centered design”
champion the disappearance of the interface, the realizational approach would
suggest that the interface should offer some form of resistance to the user; i.e., that it
should be irrevocably present. At first glance, this would seem to be at odds with
the notion of flow. One of the key aspects of this paradigmatically embodied form
of activity is its immediacy; and it would seem self-evident that the more the
medium obtrudes in use, the less im-mediate the activity. It’s at this point that
it’s useful to draw a distinction between embodied action and enaction. While the
sense of embodiment may be enhanced, or even optimal, when the agent
successfully responds to cognitive challenges, such cognitive challenges are not
prerequisite to embodied activity. For example, the sense of im-mediacy
experienced when the agent is immersed in the act of hammering—the sense
that the hammer is not a distinct object, but an extension of the agent’s
sensorimotor mechanism—is indicative of that agent’s embodiment in action. But
once the agent has acquired a sufficient degree of performative competence at
hammering, the task ceases to present her with cognitive challenges. Hammering
may be immediate and immersive, but it is not necessarily engaging. And it is in
this that the hammer is not a realizational interface, and hammering is not
enactive. To return to Francisco Varela’s formulation, enaction involves the
“bringing forth of a world.” The cognitive dimension is central to the process, and
it is precisely where enaction and realization coincide.
This raises an obvious question: if performance with conventional acoustic
musical instruments is enactive, exactly how is the potential for realization
embedded in the instrumental interface? Or, how is it that, say, a violin is
substantively different to a hammer? The short answer is in the way in which the
musician’s intentionality is coupled to the instrument’s specific and immanent
kinds of resistance. As the musician transmits kinetic energy into the mechanism,
the instrument responds with proportionate energy; energy that is experienced
by the musician as sound, haptic resistance, weight, and so on. There is a “push
and pull” between musician and instrument. Over a sustained period of time, the
musician adapts her bodily dispositions to the ways in which the instrument
resists; i.e., to the instrument’s dynamical responsiveness. It’s important to note
that these adaptations, as much as they are determined by the resistance offered
by the instrument, are also determined by the musician’s intentionality. It’s
because the musician sets out to realize something—to actively participate in
embodied practices of signification—that her adaptation follows a unique
trajectory, and the cognitive dimension continues to be central to the process of
adaptation.
But this still doesn’t provide a satisfactory explanation of how the potential for
realization is somehow embodied in one interface but not another. The hammer,
like the violin, offers resistance to the agent. At one level, then, it would seem
meaningless to talk of functional and realizational interfaces, and instead to view
the entire process as a matter of the agent’s intentionality; functionalism would
correspond to a “functional attitude,” and realization would likewise correspond to
a “realizational attitude.” But this view does not consider the specific dynamic
properties of resistance that are embodied in the interface. Rather, it presumes a
neutrality of the interface to human intentionality, and ignores the constitutive
role that the interface plays in the emergence of intentional and behavioral
patterns. An agent could very well set about developing a musical performance
practice with a hammer, carefully adapting her bodily dispositions to its dynamic
properties of resistance over a period of many years of thoughtful rehearsal. But
it’s likely that, at some point, she will either abandon the instrument for a
medium that offers greater potential for realization, or she will make
modifications to the instrument that would better serve that realizational
potential.
To return to Feenberg’s specification, both technical objects and subjects—
i.e., artifacts and humans—are constituted through an ongoing process of mutual
specification and determination. The hammer has been constituted to serve a
largely predetermined functional agenda: hammering. As such, it is
advantageous that the hammer, considered as interface, presents minimal
cognitive demands on the agent. Although music has its obvious functional uses
in late capitalist society, the model of musical performance that is of specific
interest to the present study is realizational; it assumes open-ended, fluid, and at
least partly indeterminate processes of signification, and as such requires the
ongoing cognitive involvement of the musician. A majority of conventional
acoustic musical instruments have been constituted in such a way that the
dynamic properties of their resistance are sufficiently complex, and at the same
time sufficiently coherent, that they coincide optimally with the musician’s
intentionality. In requiring that the musician’s ongoing cognitive involvement is
central to the process of adaptation to the instrument’s dynamics, the potential
for realization—for embodied forms of signification, and for the “bringing forth of
a world”—is effectively maximized.
Approaches to digital musical instrument design that set out to model the
dynamics of conventional acoustic instruments by and large circumvent the
pitfalls of de facto functionalism. In the simulation of the various networks of
excitors and resonators that constitute the physical mechanisms of acoustic
instruments, and in the carefully considered mapping of the parameters of those
synthesis primitives to tactile controllers, the integration of force feedback within
the controller apparatus, and so on, an interface is constituted that comes close
to the realizational potential of the real world instrument that it models. But the
main focus of this study is to outline a foundation for the design of digital musical
instruments that is more general than the physical modeling of existing
instruments. And while there is much to be learned through analyzing the
dynamical properties of conventional instruments, the basic idea is nonetheless
to arrive at a practice that fully engages the new prospects for performance that
are indigenous to computing media. This is why I have considered it important to
distinguish between functional and realizational modes of interaction. The
discourse of functionalism is implicit to the discourse of conventional human-
computer interaction design. As I have attempted to show, this can only be an
impediment to arriving at technologies that maximize the potential for
realization. In the specific case of musical performance, this means interfaces
that embody the prospect of enaction. This calls for an alternative discourse, and
alternative approaches to design.
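As a point of reference for the excitor-resonator vocabulary used above, the following is a minimal sketch of my own, not an account of any instrument or software discussed in this essay: a brief noise burst (the excitor) drives a single two-pole resonator, whose decay and centre frequency are exactly the kinds of synthesis parameters that might be mapped to a tactile controller. All names and constants are illustrative.

```python
# A minimal excitor-resonator sketch (mine, not from the dissertation): a short
# noise burst excites a two-pole resonant filter, producing a decaying tone.
import math, random

SR = 44100  # sample rate in Hz

def resonator(excitation, freq, decay):
    """Two-pole resonator: y[n] = 2r*cos(w)*y[n-1] - r^2*y[n-2] + x[n]."""
    w = 2.0 * math.pi * freq / SR
    a1, a2 = 2.0 * decay * math.cos(w), -decay * decay
    y1 = y2 = 0.0
    out = []
    for x in excitation:
        y = a1 * y1 + a2 * y2 + x
        out.append(y)
        y1, y2 = y, y1
    return out

def excitor(length, strike_ms=2.0):
    """A brief noise burst standing in for a strike or pluck."""
    strike = int(SR * strike_ms / 1000.0)
    return [random.uniform(-1, 1) if n < strike else 0.0 for n in range(length)]

# A tactile controller might be mapped to these two parameters: pressure to the
# decay (length of ring), position to the pitch of the resonance.
signal = resonator(excitor(SR), freq=440.0, decay=0.999)
print(max(abs(s) for s in signal))  # peak amplitude of the resulting tone
```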
2.6 Conclusion
To reiterate my key criteria from Chapter 1: embodied activity is situated,
timely, multimodal, and engaging. The sense of embodiment over a history of
interactions within a phenomenal domain emerges at the point where these
various constraints intersect. This is not just a matter of action, but rather a
matter of the various and complex dependencies between action, perception, and
cognition. Or, in a purely enactivist sense, of the inseparability of action,
perception, and cognition.
What I have set out to show in this chapter is the various ways in which the
computer-as-it-comes is a far from ideal medium in terms of meeting the criteria
of embodied activity. The objectivist foundations of conventional HCI presume a
strong separation between user and device, and situate the user squarely
“outside” the interactional domain. The WIMP model serves to enforce this
separation, and at the same time to regulate the actions of the user in such a
way that time is discretized into repeating units of sensing and acting, where the
locus of interaction is almost invariably unimodal. Further, the predominant
notion of human-computer interaction design, which would aim to reduce the
cognitive load on the agent and make the interface disappear from use,
presumes a model of activity that is anything but engaging or challenging to the
agent. When all or most of these criteria fail to be met, there is, in my view, no
possibility for the kind of interactive and circular processes of emergence that are
characteristic of enaction, or embodied cognition.
To arrive at an enactive model of musical interaction, then, we will need to
systematically rethink the world models that are embedded in the interface to the
computer-as-it-comes. Overcoming the disconnect that the computer-as-it-comes
enforces between human and instrument will require elaborating an alternative
world model, and then looking at the ways in which such a model could be
materialized in an instrument that would necessarily be something other than
the computer-as-it-comes. This will be my task for the remainder of the essay.
3 Enaction
The body is our general medium for having a world.
— Maurice Merleau-Ponty, The Phenomenology of Perception
3.1 Two Persistent Dualisms
In Chapter 2, I suggested that it would make little sense, when examining an
interactional context with a view to enactive process, to draw hard dividing lines
between action and cognition, or between body and mind. But these hard dividing
lines persist in our language, and therefore also in any provisional description of
the elements and processes of enaction. I also suggested that it makes little
sense to discuss agent and environment in isolation, and instead stressed the
inseparability of one from the other, particularly when attempting to discern the
adaptive process that sees a complex set of ever-more refined skills, dispositions
and behaviors emerge over a history of interactions. But in any attempt to
describe such interactions, our descriptions inevitably land squarely at the
boundary between agent and environment. And so in the same way that we
insert a hard dividing line between body and mind, we tacitly delineate a neat
separation between body and world. On the face of it, it would seem that our
language, permeated as it is by the inherent dualisms of Western philosophical
and scientific discourse, will ultimately lead us back to a primary disconnect. Or
rather, it will lead us to two disconnects: between mind and body, and between
body and world. This presents a problem. As long as the body is opposed to both
mind and world, it’s difficult to describe, much less defend, any notion of “direct
experience.” Disconnection would seem to be the order of the day.
But while philosophical language may be geared in such a way that describing
experience necessarily involves dualist, abstract and objectivist terms, it does not
necessarily follow that “direct experience”—however that may be defined—does
not factor among those varieties of human experience for which we may or may
not already have an adequate terminology. The specific variety of experience that
I’ve set out to describe—this paradigmatically embodied, immersive and engaged
experience—is fundamentally about activities that are always in a state of
becoming, and which are therefore not at all easy to define in dualist, abstract
and objectivist terms. Enaction involves a temporality in which relations are
constantly in flux, and in which new systems and structures continuously emerge
and disappear in the midst of interactional unfolding. In other words, it involves
“the processual transformation of the past into the future through the
intermediary of transitional forms that in themselves have no permanent
substance (Varela, Thompson, and Rosch 1991: 116).” The directness of
experience, then, resides in the “nowness” of the experiential present. It is a
variety of experience that comes prior to description, and prior to any clear
determination of the subject, or of those objects and opportunities for action that
make up that (transitional) subject’s environment.
In attempting to define “direct experience,” then, we encounter a paradox.
Direct experience implies a provisional and temporary state of being that is
always and necessarily resistant to ontological reduction. I would even go so far
as to say that the “nowness” of the lived present is that which makes direct
experience, by definition, preontological. But as soon as we attempt to describe
the systems and structures of direct experience, we introduce ontological
categories. It’s in this that we see the intrinsic paradox of the description: there
can be no notion of that which is direct without casting experience in abstract
terms.
This is likely to be the source of some confusion. And given that one of the
primary motivations behind the present study is to outline a philosophical
foundation for design, it will not help if the key philosophical concepts are poorly
defined or potentially misleading. Fortunately, questions such as these are not
without precedent; there is a branch of philosophy that has dealt systematically
with direct experience, and it has done so within the context of a well-defined
dualist discourse. In the transcendental phenomenology of Husserl,1 the
existential phenomenology of Heidegger and Merleau-Ponty, and in the latter day
1 Although Husserl does not figure very significantly in this study, I mention him
because he is acknowledged as the founding figure of European phenomenology, and had
a direct influence on the thinking of both Heidegger and Merleau-Ponty.
reworking of both European and Buddhist phenomenology2 in enactive cognitive
science and so-called postphenomenology,3 the apparent paradox of a dualistic
description of unreflective behavior is dealt with comprehensively.
Phenomenology, in its various manifestations, is a vast and complex field, and
it’s beyond the scope of this essay to cover any of its myriad branches of inquiry
in any significant manner. However, there are two key concepts, from two quite
different moments in the phenomenological tradition, which are particularly
useful to the model of interaction that I am attempting to describe. Double
embodiment and structural coupling—both of which terms already point to a
fundamental dualism prior to their elaboration—respectively address the
mind/body and body/world problems in direct experience. In outlining them here,
I hope to clear up any confusion as to how the dualism that resides in any
description of embodied action is substantively different from the disembodied
dualism that lies at the heart of the computationalist perspective. This should
bring us to a point where, after having established a disconnect in our
descriptions, we come to see how that disconnect ceases to exist in the flux of
2 The philosophy of Nagarjuna, for example, and of the Madhyamika tradition in
Buddhist thought, figures significantly in Varela, Thompson, and Rosch's outline of
"codependent arising," and its implications for subjectivity (Varela, Thompson, and Rosch
1991).
3 Postphenomenology is a term introduced by, and most often associated with,
philosopher Don Ihde (Ihde 1983, 1990, 1991, 1993, 2002).
embodied action, and in the experiential merging of self and world. I should note
that I am not attempting to construct a new theory of the mind/body problem
here, or even to weigh into the debate. Rather, the objective is pragmatic: to
outline some core theoretical issues with a view to opening up a space for new
digital musical instrument design scenarios.
3.2 Double Embodiment
As long as the body is defined in terms of existence in-itself, it functions uniformly
like a mechanism, and as long as the mind is defined in terms of pure existence
for-itself, it knows only objects arrayed before it.
— Maurice Merleau-Ponty, The Phenomenology of Perception
In his analysis of tool use in Being and Time (Heidegger [1927] 1962), Heidegger
draws a famous distinction between the ready-to-hand and the present-at-hand.
The ready-to-hand indicates an essentially pragmatic relation between user and
tool. It is when the tool disappears, i.e., when it has the status of equipment,
that the user engages the task environment via the ready-to-hand. The relation,
then, is not about a human subject and an “object” of perception. Rather, it is
about that object’s “withdrawal” into the experiential unity of the actional
context:
The peculiarity of what is proximally ready-to-hand is that, it must, as it were,
withdraw in order to be ready-to-hand quite authentically. That with which our
everyday dealings proximally dwell is not the tools themselves. On the contrary,
that with which we concern ourselves primarily is the work. (Heidegger [1927]
1962: 99)
The ready-to-hand implies an engaged and embodied flow of activity. The human
is caught up in what Hubert Dreyfus has called “absorbed coping (Dreyfus 1993:
27).” It’s only when this flow of activity is disturbed by some kind of technological
breakdown that the apparently seamless continuity between user and tool is
broken.
In the moment of breaking down the tool becomes un-ready-to-hand, or, in
Heidegger’s more often used term, present-at-hand:
Anything which is un-ready-to-hand … is disturbing to us, and enables us to see
the obstinacy of that with which we must concern ourselves in the first instance
before we do anything else. With this obstinacy, the presence-at-hand of the
ready-to-hand makes itself known in a new way as the Being of that which lies
before us and calls for our attending to it. (Heidegger [1927] 1962: 102)
The hammer appears as an object of consciousness, i.e., it acquires
“hammerness,” only “if it breaks or slips from grasp or mars the wood, or if there
is a nail to be driven and the hammer cannot be found (Winograd and Flores
1986: 36).” Prior to the technological breakdown, then, the hammer is invisibly
folded into the continuum of direct experience. It has no objectness in itself, but
rather disappears into the purposefulness of action. The moment of its acquiring
the status of object coincides with a disturbance to the accomplishment of the
purpose for which the activity, in the first instance, was undertaken:
When an assignment has been disturbed—when something is unusable for some
purpose—then the assignment becomes explicit. (Heidegger [1927] 1962: 105)
Hubert Dreyfus recasts Heidegger’s distinction between the ready-to-hand
and the present-at-hand in psychological terms. He suggests that it is only when
purposeful activity is disturbed that “a conscious subject with self-referential
mental states directed toward determinate objects with properties gradually
emerges (Dreyfus 1991: 71).” That is, direct, immediate experience is
supplanted by abstract and reflective experience when a breakdown forces the
tool user to perceive the tool in abstract terms, and to reflect
on the context in which action and intention are embedded. There is a back-and-
forth in experience, then, between direct and abstract modes of engaging the
world. That both modes are experienced by the same body points to a
fundamental duality of embodied experience, or a double embodiment.4
4 I borrow the term "double embodiment" from Varela, Thompson and Rosch's The
Embodied Mind (Varela, Thompson, and Rosch 1991), who in turn base their coinage on
Merleau-Ponty's notion of embodiment:
We hold with Merleau-Ponty that Western scientific culture requires that we see
our bodies both as physical structures and as lived, experiential structures—in
short, as both "outer" and "inner," biological and phenomenological. These two
sides of embodiment are obviously not opposed. Instead, we continuously circulate
back and forth between them. Merleau-Ponty recognized that we cannot
understand this circulation without a detailed investigation of its fundamental axis,
namely, the embodiment of knowledge, cognition, and experience. (Varela,
Thompson, and Rosch 1991: xv-xvi)
At first glance, it would seem contradictory to speak of abstract reflection as a
subset of embodied experience. It does not, for example, satisfy the criteria of
embodied activity that I laid out in Chapter 1. Further, abstract reflection would
seem to be more or less identical in function to the disembodied reasoning of the
computationalist model of cognition that I outlined in Chapter 2. There are two
critical points here in arriving at a fairly subtle, and inherently paradoxical,
distinction. First, by locating cognitive process entirely within the mechanisms of
the body as lived, the body must necessarily “contain” cognition. To the extent
that abstract reflection forms part of lived experience—at the moment of a
technological breakdown, for example—the experience of disembodiment is quite
literally embodied by the reflective subject. Second, the computationalist model
of cognition does not account for unreflective experience. According to the
computationalist perspective, all activity is mediated by internal representations
of the task domain, and reasoning about potential courses of action. With double
embodiment, such a state of affairs arises only when the flow of unreflective
activity is interrupted. An enactive model of cognition does not, then, dismiss the
reflective state of disembodied reason. Rather, it encompasses it within the lived
experience of the doubly embodied agent at large in the world.
This seemingly paradoxical state of affairs is captured in Merleau-Ponty’s
concept of the “practical cogito (Merleau-Ponty [1945] 2004);” an idea that, in a
single turn of phrase, encompasses both direct action and abstract reflection. For
Merleau-Ponty, as for Heidegger, the phenomenological project is in the first
instance concerned with reversing the Cartesian axiom; with the substitution of
practical understanding for abstract understanding, and with the placement of an
“I can” prior to the “I think (Merleau-Ponty [1945] 2004: 137).” The crucial factor
in addressing the apparent contradiction between direct action and abstract
reflection is to situate both within the context of the unfolding of activity and
cognitive skill in a temporal context:
There is, indeed, a contradiction, as long as we operate within being, but the
contradiction disappears…if we operate in time, and if we manage to understand
time as the measure of being. (Merleau-Ponty [1945] 2004: 330)
Embodied being, then, encompasses both reflective and unreflective experience.
And in the unfolding of being that conforms to the enactive model of cognition,
“These two sides of embodiment are obviously not opposed. Instead, we
continuously circulate back and forth between them (Varela, Thompson, and
Rosch 1991: xv).” Indeed, it is through this circulating back and forth, through
what Varela et al. have termed “a fundamental circularity (Varela, Thompson,
and Rosch 1991),” that perceptual, actional and cognitive skills develop, hand in
hand. Enaction does admit a mind/body dualism, then: it “encompasses both the
body as a lived, experiential structure and the body as the context or milieu of
cognitive mechanisms (Varela, Thompson, and Rosch 1991: xvi).” But the
moment in which the agent becomes subjectively conscious of her body, and of
her body’s objective relations to the objects arrayed before it, is only ever
transitory. At the moment that activity resumes, the body recedes into the
background, and its objects withdraw into the immediacy of the task.
As I argued in Chapter 2, the computer-as-it-comes precludes embodied
forms of activity. It does not allow for a motility that is situated, timely,
multimodal, and engaging. In short, it keeps the user in a state of disconnection from
the tool; a disconnect that is reinforced by the symbolic representationalist
underpinnings of conventional computer interfaces. What I have endeavored to
show here is that this disconnect is a factor in experience, and so when turning to
design, it should not be discounted. But that form of direct experience that
Heidegger termed the ready-to-hand—a notion that is more or less synonymous
with the notion of embodied activity that I outlined in Chapter 1, and is our
natural way of galvanizing tools and working within our everyday environments—
is missing from the conventional interactional paradigms with the computer-as-it-
comes.5 While some authors have suggested that we should explicitly factor the
Heideggerean breakdown into our music interface models (Di Scipio 1997;
Hamman 1997, 1999), they also place emphasis on non-real-time music
production (composition), rather than the processes of real-time music
production (performance) with which I am specifically concerned. I suggest,
rather, that with a view to designing enactive instruments, attention should be
5 Winograd and Flores present an extensive analysis of the conventional metaphors of
computer science in relation to a Heideggerean ontology in Understanding Computers and
Cognition (Winograd and Flores 1986).
directed at maximizing the potential for fully engaged and direct experience. As
with the hammer, or with any other tool, we can expect that breakdowns will
happen in the course of everyday practice. Such breakdowns are essential, for
example, to the incremental adaptive process of learning to play a conventional
acoustic instrument.6 My focus, then, when turning to issues of design, will not be
directed at engineering breakdowns, but rather at engineering the potential for
the desired kind of breakdowns. In terms of the technical implementation, the
measure will be resistance.
3.3 Structural Coupling
The world is inseparable from the subject, but from a subject which is nothing but
a project of the world, and the subject is inseparable from the world, but from a
world which the subject itself projects.
— Maurice Merleau-Ponty, The Phenomenology of Perception
Although I’ve already suggested that double embodiment and structural coupling
address, respectively, mind/body and body/world dualisms, it would be more
accurate to say that both double embodiment and structural coupling address the
mind/body/world continuum with an emphasis on different processes. The world
6 Later in the chapter (3.5) I outline this adaptive process in detail with specific
reference to the role of breakdowns.
obviously figures in the double embodiment analysis: it is the context in which
action is embedded. In much the same way, the mind figures in structural
coupling: it is the locus of cognitive emergence over a history of interactions
between body and world. But where the emphasis in double embodiment is on
the oscillatory nature of mental engagement in an interactional context, the
emphasis in structural coupling is on the circular processes of causation and
specification that pertain between the agent and the environment. More
specifically, structural coupling draws a dividing line between body and world in
description and schematization—i.e., it enforces a separation—in order to
demonstrate the inseparability of one from the other in the unfolding of a
coextensive interactional milieu, and in the emergence of performative and
cognitive patterns and competencies.
In early formulations (Maturana and Varela 1980, 1987), the concept of
structural coupling was applied to evolutionary biology. It presented an analysis
of the interactions between an organism and its environment (where the
environment may include other organisms), with a view to their mutual
adaptation and coevolution. More specifically, it addressed the circular and
reciprocal nature of these interactions. The coupling between organism and
environment is “structural” because, as the organism and the environment
exchange matter and energy, their respective structures, and hence the structure
of their interactions, are changed as a function of the exchange. The process is
captured neatly in Maturana and Varela’s definition of an autopoietic machine:
An autopoietic machine is a machine organized (defined as a unity) as a network
of processes of production (transformation and destruction) of components that
produce the components which: (i) through their interactions and transformations
continuously regenerate and realize the network of processes (relations) that
produced them; and (ii) constitute it (the machine) as a concrete unity in the
space in which they (the components) exist by specifying the topological domain
of its realization as such a network. (Maturana and Varela 1980: 78-79)
Over a history of exchanges between organism and environment, there is an
increasing regularization of structure, i.e., a continuous realization of “the
network of processes,” such that both organism and environment are more viably
adapted to productive exchange, and such that those exchanges strengthen the
conditions for continued interaction.
Structural coupling is a key component of the enactivist model of cognition. In
Varela, Thompson, and Rosch’s formulation, it is the very mechanism by which
cognitive properties emerge:
Question 1: What is cognition?
Answer: Enaction: A history of structural coupling that brings forth a world.
(Varela, Thompson, and Rosch 1991: 206)
The world that is brought forth, or enacted, by the agent, traverses the divide
between agent and environment. In contrast to the computationalist subject—
who reasons about an external world in an internal domain of symbolic
representation—the enactive subject actively realizes the world through the
connection of the nervous system to the sensory and motor surfaces which, in
turn, connect the embodied agent to the environment within the course of action.
The fully developed notion of structural coupling, then, emphasizes the
inseparability of agent and environment in embodied cognition, but at the same
time locates the points at which agent and environment intersect, and offers an
explanation as to how repetitive contacts at these points of intersection can lead
to incrementally more complex states of functioning on the part of the cognitive
system.
There is a certain push and pull of physical forces between agent and
environment that constitutes a critical aspect of their structural coupling. In other
words, structural coupling implies physical constraints and feedback. The
contingencies and specificities of the agent’s embodiment form one such
constraint, and it is a constraint that is in an ongoing state of transformation as
the agent acquires and develops motor skills, or finds herself in new or changing
environments with new or changing actional priorities. Physical constraints also
exist within the environment, and these forces act upon the agent’s body within
the course of activity, and so play a critical role in the emergence of embodied
practices and habits. This push and pull between agent and environment has a
dynamic contour, and this is where the “hard dividing line” that we may draw
between them must necessarily be qualified. The dividing line is rather more
pliable; a quality that is tidily encapsulated in a schematization by Hillel Chiel and
Randall Beer (Figure 3.1).
Figure 3.1. Interactions between the nervous system, the body
(sensorimotor surfaces), and the environment (from Chiel and Beer
(1997)). Chiel and Beer’s commentary: The nervous system (NS) is
embedded within a body, which in turn is embedded within the
environment. The nervous system, the body, and the environment are
each rich, complicated, highly structured dynamical systems, which are
coupled to one another, and adaptive behavior emerges from the
interactions of all three systems.
In Chiel and Beer’s diagram, the dividing lines between body and environment,
and between nervous system and body, are clearly distinct, but they are not
rigid. The push and pull between each of the components in the interactional
domain is indicated by projecting triangular regions. It’s clear that a “push” on
one side of the body-environment divide results in a proportionate “pull” on the
other, and vice versa. The “body” consists of sensory inputs and motor outputs,
and contains the nervous system, which is connected to the sensorimotor surface
through the same dynamical “push-pull” patterns that connect the body to the
environment. There is, then, a fluid complementarity between environment,
body, and nervous system, of which Chiel and Beer’s diagram provides an
instantaneous snapshot. To capture the properly dynamical nature of this
complementarity, the diagram would need to be animated. We would then see
the projecting triangular regions extend and contract in regular (though not
necessarily periodic) oscillatory patterns, and these motions would provide a view
of the continuous balancing of energies between agent and environment as the
play of physically constrained action unfolds over time.
These kinds of exchanges may be more or less stable in terms of the impact
of environmental dynamics on agent dynamics, and vice versa. And they may
demand more or less of the agent’s cognitive resources, depending on the
potential complexity of balancing the intentionality of the agent with the
environmental contingencies. What we see is a transfer function—a map—from
agent to environment and back again, that, from one interaction to the next, may
exhibit linear, nonlinear, or even random behavior. Beer has suggested that when
embodied agent and environment are coupled through interaction, they form a
nonautonomous dynamical system (Beer 1996, 1997). It’s a perspective that has
also been adopted by a handful of cognitive scientists as an explanatory
mechanism for the emergence of cognitive structures through interactional
dynamics (Hutchins 1995; Thelen 1994). Although it doesn’t form an explicit part
of Varela and Maturana’s original formulation, the dynamical systems approach
provides a potentially useful way of both understanding and schematizing
structural coupling. I will return to this point in my outline of implementational
models in Chapter 4.
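By way of a minimal illustration (offered here as a sketch rather than as Beer’s own formulation), the coupling can be written as two simple state equations integrated forward in time, neither of which is autonomous: the agent’s rate of change depends on a sensory map of the environment’s state, and the environment’s rate of change depends on a motor map of the agent’s state. All of the variables, coupling functions and coefficients below are arbitrary placeholders.

```python
import math

def simulate(steps=1000, dt=0.01):
    """Toy sketch of agent-environment structural coupling.

    The agent's dynamics are driven by a sensory map S of the environment's
    state, and the environment's dynamics are driven by a motor map M of the
    agent's state.  Taken alone, neither equation is self-contained.
    """
    agent, env = 0.1, 0.0                 # arbitrary initial states
    trajectory = []
    for _ in range(steps):
        s = math.tanh(env)                # S: environment -> agent's sensory surface
        m = math.tanh(agent)              # M: agent's motor surface -> environment
        d_agent = -agent + 2.0 * s        # agent's change depends on what it senses
        d_env = -env + 1.5 * m            # environment's change depends on the agent's action
        agent += dt * d_agent
        env += dt * d_env
        trajectory.append((agent, env))
    return trajectory
```

Coupled in this way, the long-term behavior of either subsystem can only be understood by tracking the pair together, which is one way of stating why structural coupling resists decomposition into two autonomous systems.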
There are two fundamental and seemingly contradictory aims in viewing
interactions between an embodied agent and its environment as a process of
structural coupling: 1. to emphasize the inseparability of agent and environment,
and 2. to locate the points at which agent and environment intersect, i.e. their
bounding surfaces. The danger with the analytic part of this formulation is that,
as soon as we’ve drawn the dividing line between agent and environment, it’s
rather easy to view them in isolation, and to understand their respective
behaviors as self-contained properties of autonomous systems. This lands us,
more or less, back within the computationalist model of rationally guided action.
It is therefore precisely at this point that the mechanics of the agent-environment
connection need to be described.
We will see a disconnect in schematizations of both the computationalist and
the enactive models of action. On one side, the agent; on the other, the world.
But what distinguishes the enactive model from the computationalist model is the
formation of a larger unity between agent and world through dynamical
processes of embodied interaction and adaptation. These processes are
characterized by crossings of the divide, by the “push and pull” between coupled
physical systems, and by a form of experience that, rather than being lived
through a world of abstract inner contemplation, is lived directly at the points
where the sensorimotor system coincides with the environment in which it is
embedded. Although we can delineate the boundary between agent and
environment in an abstract diagram of their interactional milieu, such a diagram
will not capture the experiential aspect of embodied interaction. The agent does
not feel herself to be separate from the world in which she is acting but, rather,
is intimately folded into its dynamics and processes. The “bringing forth of a
world,” that is, of an organismic continuity between agent and environment,
amounts to the moment at which the original severance, or disconnect, ceases to
factor in the agent’s experience. The body is not “as it in fact is, as a thing in
objective space,” but rather constitutes “a system of possible actions, a virtual
body with its phenomenal ‘place’ defined by its task and situation (Merleau-Ponty
[1945] 2004: 250).”
I would argue that, as a matter of definition, when the five criteria of
embodied activity (Chapter 1) are met, a structurally coupled system is inevitably
formed.7 To this extent (and in keeping with Varela, Thompson and Rosch’s
formulation), structural coupling implies enaction, and vice versa. Structural
coupling between performer and instrument will, therefore, be key to the model
of enactive musical performance that I am proposing, and an essential criterion
in design.
7 To be more precise, the first four criteria of embodied activity would form a
structurally coupled system, and the fifth (that embodiment is an emergent phenomenon)
would come for free. As Thelen and Smith point out (Thelen 1994), the emergence of
cognitive, perceptual and actional abilities constitutes the teleological dimension of
structural coupling.
3.4 Towards an Enactive Model of Interaction
The key theoretical components of the essay have now been presented. But
before turning to issues of the design and implementation of enactive digital
musical instruments, it may prove useful to outline the various models of
interaction that I’ve discussed to this point in the form of diagrams. The leap
from theory to implementation is almost always a shaky endeavor, and the
models that I present here may serve as a provisional and necessarily speculative
bridging of the gap between theory and praxis. To that end, the diagrams focus
specifically on human-computer interaction, but remain general, without specifying
hardware and software implementation details (i.e., the interface). It’s the dynamics
of the various models of interaction between human
and computer that form the key concern, with a view to distinguishing their
various implications for the development of human cognition and action. The
underlying rationale, then, is to arrive at a candidate model of enactive
interaction, with the intention of holding this model in view when shifting the
focus to implementation.
There is a basic model of human-computer interaction (figure 3.2) that can be
taken to hold for all subsequent models. The human performs actions at the
computer’s inputs, and these actions cause changes to the state of the computer’s
programs. In turn, the computer transmits output signals representing the state
of its programs, and these signals are perceived by the human.
[Figure 3.2 diagram: HUMAN (PERCEPTION, ACTION) coupled to COMPUTER (INPUT, PROGRAMS, OUTPUT) via the maps S and M.]
Figure 3.2. The basic model of the human-computer interaction loop. S
represents the map from the state of the computer’s output devices to the
human’s sensory inputs, and M represents the map from the human’s
motor activities to the state of the computer’s input devices. Together, the
input and output devices constitute the interface to the programs running
on the computer.
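The loop can also be expressed as a short illustrative sketch in code. None of the names below refer to real software; the point is simply that the basic model specifies the maps S and M and the computer’s programs, while leaving whatever connects perception to action on the human side entirely open.

```python
def run_loop(human, programs, M, S, steps=100):
    """Illustrative sketch of the basic interaction loop of figure 3.2.

    M maps the human's motor activity onto the computer's input devices;
    S maps the computer's output devices onto the human's sensory inputs;
    `programs` advances the state of the computer's programs; `human`
    stands in for whatever links perception to action -- the link the
    basic model leaves unspecified.
    """
    percept, state = None, None
    for _ in range(steps):
        action = human(percept)          # human side: perception -> action
        device_input = M(action)         # map M: motor activity -> input devices
        state, output = programs(state, device_input)
        percept = S(output)              # map S: output devices -> sensory input
    return state
```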
The basic model is, however, incomplete. The human perceives and acts, and
therefore demonstrates intentionality. But there is nothing to link perception to
action. That is, although a cognitive dimension is implied, the model does not
account for it. In fact, the usefulness of the model lies solely in specifying the
basic mechanics of human-computer interaction, and as these mechanics can be
assumed to be unchanging for all subsequent models,8 it can also be assumed
8 To say that the basic mechanics is unchanging is not to say that the interfaces will be
identical. The basic mechanics can be taken to mean the maps from output to perception,
and from action to input. Different interfaces will result in different map dynamics, and
these dynamics will in turn carry different sets of implications for cognition, perception and
action.
that the subsequent models will be distinguished solely by cognitive
considerations. For present purposes, this means the map between perception
and action.
To make the step from the basic model to the conventional model of human-
computer interaction, we need only insert human reasoning between perceiving
and acting (figure 3.3).
[Figure 3.3 diagram: HUMAN (PERCEPTION, REASONING, ACTION) coupled to COMPUTER (INPUT, PROGRAMS, OUTPUT) via the maps S and M.]
Figure 3.3. The basic model extended to include the model of human
activity in conventional HCI. Human actions follow after inner reasoning
about sensory inputs, resulting in a sequential chain of actions, and a
segmentation of the flow of time (see Chapter 2.4). This model is
paradigmatic of what I have termed “the computer-as-it-comes.”
We now have a schematization of the Cartesian subject in the midst of
interaction, and it’s interesting to note the upside-down symmetry on either side
of the human/computer divide.9 The conventional model presumes that the
human reasons about her interactions with the computer in an inner world of
mental abstraction. There is therefore an inevitable time delay between
perception and action, the duration of which is simply as long as it takes to
perform the necessary mental computations. Although they are not detailed in
figure 3.3, it can be assumed that the input and output devices of conventional
HCI serve to reinforce the computationalist ontology from which conventional HCI
derives. To this extent, input devices are ordinarily monomodal and geared to a
single focal point of motor activity (from one moment to the next, either the
mouse or the keyboard), and output devices are ordinarily visuocentric and
geared to a single focal display point (the cursor). When these factors combine in
the form of a device, we have what I have termed “the computer-as-it-comes.” It
can effectively be guaranteed that interactions with the computer-as-it-comes
will be disembodied, at least according to the minimal criteria I set down for
embodied activity in Chapter 1.
9 In the spirit of mechanistic philosophy, we could even relabel “Perception,”
“Reasoning,” and “Action,” respectively as “Input,” “Programs,” and “Output.”
In Chapter 2, I drew a distinction between functional and realizational
interfaces. The distinction rests on the manner in which the interface elicits
particular varieties of action and thought from the human user. While the
terminology places explicit emphasis on the interface and how it is constituted,
the immediate concern lies with the implications of the interface for the
emergence of cognitive, perceptual and actional patterns. In schematizing the
respective interactional paradigms of the functional and realizational interfaces,
then, I have added a further cognitive dimension to the human side of the
computer-as-it-comes model, while the computer side has remained unchanged.
In the diagram of the functional model of interaction (figure 3.4), the added
dimension is labelled “Knowledge.” This knowledge can be considered offline with
regard to activity. That is, it’s an abstract quantity that exists prior to interactions
with the computer, and while it directly informs the ways in which the human
subject perceives and reasons, knowledge is “accessed,” rather than
“constituted,” within the course of action. It can also be assumed that there are
no real-time constraints on the accessing of this knowledge, and that this aspect
reinforces the sense, in user experience, that the knowledge being galvanized is
offline.
[Figure 3.4 diagram: HUMAN (PERCEPTION, REASONING, ACTION, KNOWLEDGE) coupled to COMPUTER (INPUT, PROGRAMS, OUTPUT) via the maps S and M.]
Figure 3.4. The human-computer interaction loop with the functional
interface (see 2.5). The human’s knowledge is leveraged by the
abstractions that comprise the computer’s interface, and this knowledge is
galvanized to guide perception and reasoning, leading to appropriate
action. The functional interface is deterministic; i.e., the goal of the task-
at-hand is known in advance, and the interface is designed to lead to the
accomplishment of this goal while placing minimal cognitive demands on
the human.
I noted in Chapter 2 that functionalism is something of a standard in conventional
interaction design. Through leveraging existing user knowledge, and thereby
minimizing the cognitive load, the task domain and its end goals are made as
transparent as possible. While the approach has a great many advantages for
routine activities with computers, it is not advantageous to activities that are
dynamic or nondeterministic by nature.
In figure 3.5, “Knowledge” is relabelled “Realization,” and the links between
“Realization” and “Perception,” and “Realization” and “Reasoning,” are now
bidirectional.
[Figure 3.5 diagram: HUMAN (PERCEPTION, REASONING, ACTION, REALIZATION) coupled to COMPUTER (INPUT, PROGRAMS, OUTPUT) via the maps S and M.]
Figure 3.5. The human-computer interaction loop with the realizational
interface (see 2.5). The key difference between the realizational and the
functional interface lies in the cognitive demands they place on the human.
Whereas human knowledge can be considered static in functional
interactions, it is dynamic in realizational interactions. The realizational
interface is nondeterministic; i.e., it brings with it a continuing potential
for new encounters and uses, and human knowledge continues to expand
over a history of interactions. Because the term “knowledge” implies a
fixed state of knowing, it is substituted in the diagram by the more
dynamic and fluid “realization.”
The term “knowledge” implies a static corpus of known facts. It’s precisely this
corpus of “knowns” on which the functional interface draws. The realizational
interface, on the other hand, offers resistance to the user, deliberately prompting
her to new modes of thinking about the task domain. Hence the substitution of
the more dynamic and fluid term “realization.” In figure 3.5, a reasoning stage
still intervenes between the perceiving and acting stages. According to the
criteria of embodied activity, then, the model represents a disembodied mode of
interaction. Nonetheless, an important step has been taken towards the enactive
model. By introducing resistance to the interface—a resistance that requires the
human to fully engage in the activity—the shift is effected from a static and
deterministic model of activity to one that is dynamic and nondeterministic. While
realization is offline to the activity, it still requires that the human commit
continuous and significant cognitive resources to the task, and thus opens the
possibility for the on-going generation of new meanings and modes of thought.
I have defined embodied activity as a state of being that consists in a
merging of action and awareness. That is to say, there is a seamless continuity
between perceiving and acting, experienced as flow. In figure 3.6, the boundaries
between perception, reasoning and action are collapsed, and the continuity
between perceiving and acting is indicated by the label “Perceptually Guided
Action.”
[Figure 3.6 diagram: HUMAN (PERCEPTUALLY GUIDED ACTION) coupled to COMPUTER (INPUT, PROGRAMS, OUTPUT) via the maps S and M.]
Figure 3.6. Embodied Interaction. The perceiving/reasoning/acting
sequence has been collapsed into a fully integrated model of activity.
Perception and action constitute a unity, labelled here as “Perceptually
Guided Action.” This corresponds to the flow of embodied activity (see
1.2), and to Heidegger’s ready-to-hand (see 3.2); there is a merging of
action and awareness, and the sense of disconnect between human and
computer ceases to factor in experience.
This is the first of the schematizations in which the human is represented as a
unity, and it can be assumed that the experience of “oneness” involves the loss
of any sense of disconnect with the computer. I’ve argued that such a mode of
activity is precluded by the computer-as-it-comes, and that this has proven a
major stumbling block in arriving at designs for digital musical instruments that
allow for embodied modes of interaction. The model of activity corresponds to
Heidegger’s ready-to-hand, or in Hubert Dreyfus’ paraphrase, “absorbed coping,”
or, in the rubric that I’ve used throughout the essay, embodied action. As with
the standard model of human-computer interaction (figure 3.2), there is no
explicit focus on conscious mechanisms. Indeed, the distinguishing aspect of the
ready-to-hand is that it is an unconscious, unreflective mode of behavior.
In Chapter 2, I suggested that what distinguishes embodied action from
enaction is the realizational dimension. That is, while the sense of
embodiment may be optimal when cognitive challenges are placed upon the
human agent, such challenges are not prerequisite to embodiment. Cognitive
realization is, however, prerequisite to enaction. To make the step from
embodied action to enaction, then, “Realization” is connected to “Perceptually
Guided Action” through a bidirectional path (figure 3.7).
[Figure 3.7 diagram: HUMAN (PERCEPTUALLY GUIDED ACTION, REALIZATION) coupled to COMPUTER (INPUT, PROGRAMS, OUTPUT) via the maps S and M.]
Figure 3.7. Enaction. Human and computer are structurally coupled
systems (see 3.3). Enaction implies an embodied model of interaction with
a view to cognitive and actional realization. In the enactivist view,
cognition is an embodied phenomenon. It arises through physical
interactions, and in turn shapes the trajectory of future interactions.
There’s a symmetry between the enactive model and that of the realizational
interface (figure 3.5): both include a realizational dimension that is tied, through
reciprocal patterns of determination, to perception and action. And in both
instances, realization is tightly correlated to the resistance that the interface
offers to the human user, and to the cognitive challenges this resistance
presents. But where the realizational interface solicits a mode of activity that is
disembodied and offline, the enactive interface solicits time-constrained
improvised responses that are embodied and online. Another way to view this is
as the difference between, in Elizabeth Preston’s terminology, “representational
and non-representational intentionality (Preston 1988).” Where the realizational
interface is concerned with engineering a representational breakdown—i.e.
deliberately causing a reappraisal of the representations that comprise the
interface; an activity that necessarily involves reasoning, and is therefore
disembodied and offline—the enactive interface is concerned with soliciting new
responses without recourse to inner representations.10 That is, the interface is
10 There are continuing disagreements among cognitive scientists and philosophers of
mind as to whether "inner representations" play a part in direct experience. Although I
take no position in the debate, for the purposes of the present study I assume that inner
representations play no part in direct experience, as this makes it easier to distinguish
between direct and abstract experience. If we were to stick with the idea that humans are
storing the contents of their environment as inner representations at all times, then we
could potentially draw the distinction between abstract and direct experience in terms of
objective and deictic intentionality. Deictic representations were discussed in Chapter 2,
but I will reiterate here. According to Philip Agre, "a deictic ontology ... can be defined
only in indexical and functional terms, that is, in relations to an agent's spatial location,
social position, or current and typical goals or projects (Agre 1997: 242)." With deictic
intentionality, then, we do not relate to an object in terms of its objectness, but in terms
of the role it plays in our activities. And it is because the object is so directly folded into
the actional midst that we encounter it directly rather than abstractly.
encountered directly rather than abstractly, in real time and real space, and
human activity is embodied and online. In the enactive model, then, realization is
an incremental process of cognitive regularization and awareness, stemming from
forces that are directly registered through the body, and at the same time
determining the emergent contour of the body’s unfolding patterns and
trajectories.
The enactive model of interaction represents the ideal performative outcome
of the class of digital musical instruments that I am setting out to define and
describe in this study. Before turning to design, however, it’s important to note
that while the enactive model of interaction represents an idealized “way of
being” in the performative moment, it does not represent the sum total of the
performance practice. Rather, in keeping with Merleau-Ponty’s theory of “double
embodiment,” that performance practice, in addition to the enactive model of
interaction (figure 3.7), would also at various moments involve embodied action
(figure 3.6), and offline realization (figure 3.5). Each of these modalities would
constitute different ways of engaging the same instrument, and the human
performer would routinely cross the lines that distinguish one modality from the
next. There will be “breakdowns,” particularly in the learning stage, which shift
awareness to the “objectness” of the instrument. The instrument will become
present-at-hand; i.e., it will be encountered through a representational
intentionality. Additionally, in the midst of embodied activity, it cannot be
assumed that the instrument will provide endless novelty to the performer;
particularly as, over the course of practice, she becomes more finely adapted to
the instrumental dynamics. At such moments—again, to borrow terminology from
Heidegger—the instrument effectively disappears from use, and becomes ready-
to-hand. In everyday embodied practices, it’s not unusual for these experiential
modalities to be engaged simultaneously. For example, a violinist breaks a string
in the middle of performance, drawing the focus of her attention to the
objectness of the instrument. At the same time, however, she continues playing
on the remaining three strings. With the greater portion of available cognitive
resources allocated to the instrumental breakdown, it’s likely that the act of
playing proceeds without a great deal of reflective thought. We see then a
coincidence of the present-at-hand and the ready-to-hand, as the intentionality
of the performer is divided across different components of the same instrument.
That the same human is able to divide the instantaneous allocation of
cognitive resources into representational and nonrepresentational subcomponents
is nothing extraordinary for a practiced, multi-tasking, doubly embodied
performer. It is also something that happens as a matter of course in the
development of any form of embodied practice, and therefore need not factor in
design. In the particular case of what I have termed enactive digital musical
instruments, then, it can be assumed that if the instrumental implementation
engenders suitable conditions for the enactive model of interaction, the other
modalities—embodied action and offline realization—will invariably follow. The
practical implication for instrument design, then, is that the enactive model is the
only one that need be kept in view.
3.5 The Discontinuous Unfolding of Skill Acquisition
In Merleau-Ponty’s phenomenology, human intentionality is fundamentally
concerned with the body’s manner of relating to objects in the course of
purposive activity. In the broadest sense of the term, it encompasses both
representational and nonrepresentational intentional modes. In using the
umbrella term “intentionality,” then, we can condense the enactive, embodied
action, and offline realization models into a single integrated model, which I have
termed “enactive performance practice (figure 3.8).” The model encompasses the
interdependencies between perception, action and cognitive unfolding within the
circumscribed interactional domain of instrumental practice.
[Figure 3.8 diagram: HUMAN BODY coupled to INSTRUMENT via the maps I and R, with COGNITION unfolding continuously over time.]
Figure 3.8. Enactive performance practice. Human body and instrument
are unities, and cognitive abilities emerge over time through the
continuous and embodied circular interactions between them. As these
cognitive abilities develop, there is an incremental regularization of the
performative patterns of the body, and of the dynamics of the body-
instrument interactions. I represents the map from human intentionality to
the instrument, while R represents the map from the instrument’s
reactions back to the human.
While the enactive performance practice model is too general to be useful in
design, it does serve to encapsulate all the key facets of the interaction paradigm
I’ve set out to describe. The human acts purposefully through her body,
exemplifying an intentionality. Her bodily actions are transduced by the
instrument and lead to a reaction. The instrumental reactions are perceived by
the human, and these perceptions, as they are registered in the body, modulate
her intentionality, and thus her ongoing reactions and bodily dispositions. The
process could be schematized as a bidirectional exchange, but we get closer to
the flux of the performance experience if the interactions are viewed as circular
and continuous. The cognitive dimension is not independent of these interactions,
but rather is folded into them through realization. Over time, cognitive abilities
continue to develop, as the body continues to adapt to the dynamics of the
interactional domain. Although cognition and the body are indicated as distinct
entities in figure 3.8, this is solely for the purposes of clarity. It should be kept in
mind that cognition is an embodied phenomenon, realized at the connections
between the nervous system, the sensorimotor surfaces, and the environment.11
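For readers who prefer the dynamical systems vocabulary introduced in 3.3, the circular and continuous character of this loop might be notated, purely as an illustrative sketch, as a pair of coupled equations. Only the maps I and R come from the diagram; the intrinsic dynamics Φ and Ψ are assumed placeholders for whatever the instrument and the body actually do:

\[
\dot{x}_{\mathrm{instrument}} = \Phi\bigl(x_{\mathrm{instrument}},\, I(x_{\mathrm{body}})\bigr),
\qquad
\dot{x}_{\mathrm{body}} = \Psi\bigl(x_{\mathrm{body}},\, R(x_{\mathrm{instrument}})\bigr)
\]

Neither equation stands alone; on this reading, cognition is not a third term added to the pair, but a regularity that emerges in their joint unfolding over time.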
Enactive performance practice as I’ve outlined it here is consistent with
Merleau-Ponty’s notion of the intentional arc (see Chapter 2). The “arc” metaphor
is interesting, as it implies a continuity in the acquisition of perceptual, actional
and cognitive skills; a continuity that is also implied in the unbroken trajectory of
cognitive unfolding in figure 3.8. As long as enactive performance practice—and
also the intentional arc—can be said to encompass representational and
nonrepresentational modes, this doesn’t present a problem. At the same time,
however, the model does not accurately reflect the ways in which the modes of
bodily relation to an instrument are transformed over the course of cognitive
11 In this essay, the environment can be taken to comprise the instrument, and an
idealized physical space in which the instrument's outputs might be optimally perceived by
the human performer. In real practice, of course, the environment may include any
manner of physical spaces, humans, other animals, etc. While such features of the
environment will inevitably play a part in the emergence and formation of performer
intentionality, it's beyond the scope of this study to factor them into consideration.
unfolding; i.e. it does not account for the intrinsically discontinuous back-and-
forth between the present-at-hand and the ready-to-hand that characterizes the
acquisition of skill. This is an especially important point when considering the
acquisition of realizational skills, such as learning to play a musical instrument.
Before moving on to issues of implementation, then, it’s worth considering the
ways in which human bodily ways of being are transformed within the process of
acquiring a specific skill. I’ll do this by drawing out some correspondences
between two texts, Hubert Dreyfus’ “The Current Relevance of Merleau-Ponty’s
Phenomenology of Embodiment (Dreyfus 1996),” and David Sudnow’s Ways of
the Hand (Sudnow 2001).
Dreyfus sets out in his article to “lay out more fully than Merleau-Ponty does,
how our relation to the world is transformed as we acquire a skill (Dreyfus
1996:6).”12 He does this by dividing the temporal unfolding of skill acquisition
into five distinct stages—“Novice,” “Advanced beginner,” “Competence,”
“Proficient,” and “Expertise”—where each stage is characterized by specific bodily
ways of relating to the task environment in question. Dreyfus assumes “the case
of an adult acquiring a skill by instruction (Dreyfus 1996:6),” and illustrates his
argument with two examples: learning to drive a car, and learning to play chess.
In the discussion that follows, I will borrow from Dreyfus’ decomposition of the
intentional arc into five distinct stages, but will illustrate the argument with an
12 The numbering system in citations of Dreyfus' article refers to the paragraph number
of the online text.
example that is more immediately pertinent to the present study: learning to
improvise with a musical instrument. Sudnow’s Ways of the Hand—a detailed first
person “production account” of the gradual acquisition of skill as a jazz pianist—is
in this regard an ideal candidate.
Dreyfus’ “Novice” stage begins with the reduction of the task environment
into explicit representations of the elements of which the environment is
composed:
Normally, the instruction process begins with the instructor decomposing the task
environment into context-free features which the beginner can recognize without
benefit of experience in the task domain. The beginner is then given rules for
determining actions on the basis of these features, like a computer following a
program. (Dreyfus 1996:7)
That the features of the environment are “context-free” implies that the focus of
activity is directed towards connecting the body to the instrument—i.e.,
establishing a “grip”—in the proper place and with the proper alignments, but
without any explicit regard as to how these alignments will eventually fold into
the context of embodied, time-constrained performance. For Sudnow, the
features of the task environment were chords, and the proper alignments were
the voicing of those chords:
In early lessons with my new teacher the topic was chord construction, or voicing,
playing a chord’s tones in nicely distributed ways. (Sudnow 2001:12)
The proper “place” of the chords was determined by the specific configuration of
piano keys that the hand would need to engage. It’s interesting to note the
“substantial initial awkwardness” that Sudnow describes in the complex of
lookings and graspings that characterize this stage:
I would find a particular chord, groping to put each finger into a good spot,
arranging the individual fingers a bit to find a way for the hand to feel
comfortable, and, having gained a hold on the chord, getting a good grasp, I’d let
it go, then look back to the keyboard—only to find the visual and manual hold
hadn’t yet been well established. I had to take up the chord again in terms of its
constitution, find the individual notes again, build it up from the scratch of its
broken parts. (Sudnow 2001:12)
The mode of engagement here is clearly that of the present-at-hand. Each note
of the chord is mentally associated with an individual finger before the hand gains
a hold on the chord as a whole. The chord, then—the initial “context-free” feature
of the environment—is itself decomposed into individual features. And this
decomposition demands an on-going coordination between an abstract mental
image of the task at hand and the accomplishment of the task. As Sudnow notes,
“lots of searching and looking are first required (Sudnow 2001:12).”
In Dreyfus’ taxonomy, the “Advanced beginner” stage is characterized by the
emergence of a degree of contextual recognition:
As the novice gains experience actually coping with real situations, he begins to
note, or an instructor points out, perspicuous examples of meaningful additional
aspects of the situation. After seeing a sufficient number of examples, the student
learns to recognize them. Instructional maxims now can refer to these new
situational aspects, recognized on the basis of experience, as well as to the
objectively defined non-situational features recognizable by the novice. (Dreyfus
1996:10)
The “situational aspects” here point to an initial emergence of gestalts; i.e. of the
tendency to regard coordinated actions—such as the playing of a chord—not as
the combined motions of individual fingers, but as a single, integrated motion of
the hands:
As my hands began to form constellations, the scope of my looking
correspondingly grasped the chord as a whole, seeing not its note-for-noteness
but its configuration against the broader visual field of the terrain. (Sudnow
2001:13)
It’s important to note, however, that such gestalts remain limited to isolated and
non-time-pressured events. The context that the performer is beginning to
glimpse, then, remains offline. The perceptual recognition of places and
alignments is beginning to occur at a higher level of scale, but this recognition is
neither situated (in the sense that one place and alignment might lead to a next
place and alignment, or that it might be solicited by some other pressing
constraint in the environment, or both) nor timely (in the sense that the
transition from one place and alignment to a next must satisfy timing constraints
in the broader context of a performance). It is at the next stage of skill
acquisition that such factors enter the equation.
Dreyfus’ designation for the third stage of skill acquisition—“Competence”—is
potentially misleading. It would perhaps be more accurate to say that
competence emerges towards the end of the third stage, where the stage as a
whole is characterized by a gradually increasing capacity for dealing with the
online aspects of performance; i.e. for situated and timely musical utterances.
The beginning of the third stage is marked, however, by anything but a sense of
performative competence. Rather, the disparity between the level of skill
accomplished thus far and a newly gained understanding of the larger context of
performance—i.e., its online aspects—leads to a sense of frustration. This
frustration is born specifically of the body’s inability to adequately respond to
the seemingly overwhelming online demands of performance:
With more experience, the number of potentially relevant elements of a real-world
situation that the learner is able to recognize becomes overwhelming. At this
point, since a sense of what is important in any particular situation is missing,
performance becomes nerve-wracking and exhausting, and the student might
wonder how anybody ever masters the skill. (Dreyfus 1996:13)
Interestingly enough, Sudnow’s first public performance took place at precisely
this stage in his development. It’s worth quoting his account in full:
The music wasn’t mine. It was going on all around me. I was in the midst of a
music the way a lost newcomer finds himself suddenly in the midst of a Mexico
City traffic circle, with no humor in the situation, for I was up there trying to do
this jazz I’d practiced nearly all day, there were friends I’d invited to join me, and
the musicians I’d begun to know. I was on a bucking bronco of my own body’s
doings, situated in the midst of these surrounding affairs. Between the chord-
changing beat of my left hand at more or less regular intervals according to the
chart, the melodic movements of the right, and the rather more smoothly
managed and securely pulsing background of the bass player and drummer, there
obtained the most alienative relations. (Sudnow 2001:33)
The gap between motor intentionality and motor ability led to a music that “was
literally out of hand (Sudnow 2001:35).” It also led to Sudnow shying away from
further public performances for a period of several years.
Dreyfus notes that the performer normally responds to the newly discovered
enormity of the task at hand by adopting a “hierarchical perspective,” and by
deciding upon a route that “determines which elements of the situation are to be
treated as important and which ones can be ignored (Dreyfus 1996:14).” In
short, the task is again reduced to individual components. But unlike the concrete
components of activity that constitute the “context-free” features of the “Novice”
stage, the components of the “Competence” stage are rather more context-
bound:
The competent performer thus seeks new rules and reasoning procedures to
decide upon a plan or perspective. But these rules are not as easily come by as
the rules given beginners in texts and lectures. The problem is that there are a
vast number of different situations that the learner may encounter, many differing
from each other in subtle, nuanced, ways. There are, in fact, more situations than
can be named or precisely defined so no one can prepare for the learner a list of
what to do in each possible situation. Competent performers, therefore, have to
decide for themselves what plan to choose without being sure that it will be
appropriate in the particular situation. (Dreyfus 1996:15)
For Sudnow, the plan was to work towards a “melodic intentionality” by
extending in practice his acquired embodied knowledge of isolated chords to
patterned sequences of chords, as well as sequences comprised of the individual
notes that those chords contain. Not coincidentally, this plan was decided upon
without input from his teacher, or guidance from “texts and lectures”:
At first, and for some time, this was a largely conceptual process. I’d think: “major
triad on the second note of the scale, now again,” then “diminished on the third
and a repeat for the next,” doing hosts of calculating and guidance operations of
this sort in the course of play. (Sudnow 2001:43)
And in due course, gestalts began to emerge at the level of the sequence, rather
than appearing solely at the level of the event:
A small sequence of notes was played, then a next followed. As the abilities of my
hand developed, I found myself for the first time coming into position to begin to
do such melodic work with respect to these courses. (Sudnow 2001:43)
The emergence of these gestalts is more or less equivalent to what Sudnow
describes as “the emergence of a melodic intentionality”:
... an express aiming for sounds, was dependent in my experience upon the
acquisition of facilities that made it possible, and it wasn’t as though in my prior
work I had been trying and failing to make coherent note-to-note melodies.
Motivated so predominantly toward the rapid course, frustrated in my attempts to
reproduce recorded passages, I had left dormant whatever skills for melodic
construction I may have had. The simplest sorts of melody-making entailed a
note-to-note intentionality that had been extraordinarily deemphasized by virtue
of the isolated ways in which I’d been learning.
It’s precisely in this emerging capacity to form fully articulated phrases that the
performer achieves a degree of competence. Though the performer is not yet a native speaker of
the language, there is nonetheless a fledgling facility for forming coherent
sentences.
Dreyfus’ characterization of the “Proficient” stage is particularly interesting in
terms of the Heideggerean opposition between the present-at-hand and the
ready-to-hand:
Suppose that events are experienced with involvement as the learner practices his
skill, and that, as the result of both positive and negative experiences, responses
are either strengthened or inhibited. Should this happen, the performer’s theory of
the skill, as represented by rules and principles will gradually be replaced by
situational discriminations accompanied by associated responses. Proficiency
seems to develop if, and only if, experience is assimilated in this atheoretical way
and intuitive behavior replaces reasoned responses. (Dreyfus 1996:20)
These “situational discriminations” of “intuitive behavior” point explicitly to the
mode of “absorbed coping” that is definitive of the ready-to-hand. And it’s
precisely in the ready-to-hand that “experience is assimilated”; i.e., it is
embodied by the experiencing subject. With an increase in embodied skill, then,
there is also an increase in the ratio of ready-to-hand to present-at-hand modes
of engagement:
As the brain of the performer acquires the ability to discriminate between a variety
of situations entered into with concern and involvement, plans are intuitively
evoked and certain aspects stand out as important without the learner standing
back and choosing those plans or deciding to adopt that perspective. Action
becomes easier and less stressful as the learner simply sees what needs to be
achieved rather than deciding, by a calculative procedure, which of several
possible alternatives should be selected. There is less doubt that what one is
trying to accomplish is appropriate when the goal is simply obvious rather than the
winner of a complex competition. In fact, at the moment of involved intuitive
response there can be no doubt, since doubt comes only with detached evaluation
of performance. (Dreyfus 1996:21)
The “Proficient” stage is, however, still comprised of a generous quota of
moments characterized by a mode of “detached evaluation”; i.e., the present-at-
hand. And it’s interesting to note the way in which this can directly conflict with
“intuitive behavior”:
No sooner did I try to latch onto a piece of good-sounding jazz that would seem
just to come out in the midst of my improvisations, than it would be undermined,
as, when one first gets the knack of a complex skill like riding a bicycle or skiing,
the very first attempt to sustain an easeful management undercuts it. You struggle
to stay balanced, keep failing, then several revolutions of the pedals occur, the
bicycle seems to go off on its own, you try to keep it up, and it disintegrates. Yet
there’s no question but that the hang of it was glimpsed, the bicycle seemed to do
the riding by itself, and essence of the experience was tasted with a “this is it”
feeling, like a revelation. (Sudnow 2001:76)
What we see is the paradigmatic Heideggerean “breakdown”; the catalyst that
effects the shift from a ready-to-hand to a present-at-hand mode of perceiving
the task environment. The occurrence of such breakdowns is directly related to
the number and type of skills the performer has managed to assimilate in the
course of interactions with the environment up to the moment in question. Or,
more specifically, the occurrence of breakdowns is directly related to the number
and type of skills the performer has not managed to assimilate:
The proficient performer simply has not yet had enough experience with the wide
variety of possible responses to each of the situations he or she can now
discriminate to have rendered the best response automatic. For this reason, the
proficient performer, seeing the goal and the important features of the situation,
must still decide what to do. To decide, he falls back on detached, rule-based
determination of actions. (Dreyfus 1996:22)
What distinguishes the “Proficient” stage from the “Competent” stage is a
shift to a yet higher level of articulational scale. That is, from the level of the
individual phrase or sentence to the level of, perhaps, a discussion or argument.
What distinguishes the “Proficient” stage from the “Expertise” stage, however, is
the continuity of the discourse. A continuity that—in the case of proficiency—is
rendered discontinuous by the intrusion of breakdowns. Sudnow also uses a
linguistic analogy:
From a virtual hodgepodge of phonemes and approximate paralinguistics, a
sentence structure was slowly taking form, sayings now being attempted, themes
starting to achieve some cogent management. But at the same time, courses of
action were being sustained that faded and disintegrated into stammerings and
stutterings, connectives yet to become integrally part of the process. (Sudnow
2001:56)
It’s these “connectives”—“a way of making the best of things continuously
(Sudnow 2001:59)”—that gradually fall into place over the course of sustained
practice. With this falling into place, and with the embodiment of ever more
refined responses to the dynamical contingencies of the environment, the
occurrence of breakdowns—i.e. the solicitation of self-conscious thought, and the
catalyst of “stammerings and stutterings”—becomes increasingly infrequent.
I’ve already suggested that a capacity for continuous intuitive interactional
response to environmental dynamics is definitive of what Dreyfus describes as
the “Expertise” stage. But Dreyfus also points to a greater refinement in these
responses than is found in the responses typical of the
“Proficient” stage:
The expert not only knows what needs to be achieved, based on mature and
practiced situational discrimination, but also knows how to achieve the goal. A
more subtle and refined discrimination ability is what distinguishes the expert from
the proficient performer, with further discrimination among situations all seen as
similar with respect to plan or perspective distinguishing those situations requiring
one action from those demanding another. (Dreyfus 1996:25)
More specifically, he suggests that discriminating ability and a continuity of
response are necessarily linked criteria of expertise:
With enough experience with a variety of situations, all seen from the same
perspective but requiring different tactical decisions, the proficient performer
gradually decomposes this class of situations into subclasses, each of which share
the same decision, single action, or tactic. This allows the immediate intuitive
response to each situation which is characteristic of expertise. (Dreyfus 1996:25)
The lessons learned from breakdowns during the “Proficient” stage, then, have
enabled the expert performer to respond to the same conditions from which
those breakdowns emerged in a timely and unselfconscious manner. Actions are
perceptually guided, the performer is immersed in the activity, and the “I think” is
supplanted by an “I can”:
I’d see a stretch of melody suddenly appear, unlike others I’d seen, seemingly
because of something I was doing, though my fingers went to places to which I
didn’t feel I’d specifically taken them. Certain right notes played in certain right
ways appeared just to get done, in a little strip of play that’d go by before I got a
good look at it. (Sudnow 2001:76)
With the refinement of dispositional abilities, there also emerges a parallel
refinement of articulational fluency:
I could hear it. I could hear a bit of that language being well spoken, could
recognize that I’d done a saying in that language, in fact for the very first time, a
saying particularly said in all of its detail: its pitches, intensities, pacings,
durations, accentings—a saying said just so. (Sudnow 2001:78)
At this point in the discontinuous unfolding of skill acquisition, the performer
embodies perceptual, actional and cognitive capacities that, in suitable
performance circumstances, enable the experience of flow.
In light of the apparent discontinuities of skill acquisition, it may be worth
revising the diagram of figure 3.8, in which cognitive unfolding is indicated as
continuous over time. In figure 3.9, the temporal dimension is segmented into
discrete blocks corresponding to Dreyfus’ five stages of skill acquisition.
[Figure 3.9 diagram: HUMAN BODY coupled to INSTRUMENT via the maps I and R, with SKILL unfolding over time across the five stages: 1. Novice, 2. Advanced beginner, 3. Competence, 4. Proficient, 5. Expertise.]
Figure 3.9. A detailed view of enactive performance practice,
encompassing the discontinuous unfolding of skill acquisition. “Skill” is
indicative of cognitive, motor and perceptual skills. It is also indicative of
the developing capacity for coordination between all three. I represents
the map from human intentionality to the instrument, while R represents
the map from the instrument’s reactions back to the human.
“Skill” replaces “Cognition” in this diagram, where “skill” can be said to
encompass cognitive, motor and perceptual skills, as well as the capacity for
coordination among the three components in both reflective and unreflective
behavior. A more accurate model yet might indicate the changing nature of
human body/instrument relations over each of the five stages of skill acquisition,
but as it stands, the diagram of the continuous and circular human/instrument
interaction loop is sufficiently general to be applicable at each of the stages.
Sudnow’s account in Ways of the Hand is representative of what I have
termed an enactive performance practice. But there is nothing particularly
extraordinary about the way in which his skills were acquired. Given an able body
(and therefore an innate capacity for perception, action and cognition), an
intentionality (e.g. to become an improvising jazz pianist, to produce coherent
sequences of notes, etc.), and a sufficiently responsive instrument (e.g. a piano),
any human subject might follow an analogous course. In Sudnow’s case, these
three prerequisites to enactive performance practice came for free. But my
argument has been that in the case of performance with digital musical
instruments, something fundamental is missing; i.e. a sufficiently responsive
instrument. A sufficient responsiveness is synonymous with what I have referred
to as resistance. And it’s precisely the kind of resistance that an instrument
affords to the intentioned, embodied agent that will determine whether or not
that instrument has the kind of immanent potential that would lead to an
enactive performance practice. Kinds of instrumental resistance, then, will be a
major focus when the discussion turns to issues of implementation in Chapter 4.
3.6 Conclusion
I began this chapter with a discussion of the inevitable paradox in any description
of direct experience. The model of enactive performance practice—an attempt at
such a description—brings the discussion squarely back to this fundamental,
instinctive, and largely unreflective way in which humans, through the agency of
their bodies, relate to the world. This raises the question: if unreflective behavior
is so fundamental to human experience, why go to the trouble of detailing so
many of its particularities? Why not let that which will happen as a matter of
course, happen as a matter of course?
Both Heidegger and Merleau-Ponty viewed their work as opposed to the
mechanistic underpinnings of canonical Western philosophy. In their respective
analyses of mundane, everyday, unreflective activity, there is an agenda to
replace the Cartesian model of subjectivity with that of the embodied agent at
large in the world. I suggested, earlier in the chapter, that a reversal of the
Cartesian axiom constitutes the first concern of the phenomenological project.
The mechanistic and the phenomenological discourses, then, are fundamentally
at odds. And to the extent that technical discourse continues to hinge on the
discourse of mechanistic philosophy, it also continues to be resistant to
phenomenology. My concern, then, has been with outlining a model of human
experience and activity that serves as an alternative to the model routinely
adopted by technical designers, i.e. that of the perpetually disembodied Cartesian
subject. If it is in fact possible to design and build digital musical instruments
that allow for enactive processes to be realized, then we will have done nothing
other than arrive right back at the most fundamental form of human agency.
4 Implementation
4.1 Kinds of Resistance
There are two key assumptions that underlie the enactive model of interaction: 1.
that human activity and behavior have rich, structured dynamics, and 2. that the
kinds of resistance that objects offer to humans in the course of activity are key
to the on-going dynamical structuring of interactional patterns. In the previous
chapter, I was concerned with describing the interactional patterns of an enactive
performance practice with a view to the implications of those patterns for
cognition. Focus was directed at the dynamics of human activity and behavior. In
this chapter, focus is directed at the kinds of resistance that a candidate digital
musical instrument might offer to a human performer in the midst of
performative activity. The underlying concern, then, shifts from theory to
implementation.
I have suggested previously in the essay that conventional acoustic
instruments, because of the resistance they offer to the performer, serve as
useful examples of technical objects that embody the potential for enaction. But
in the huge diversity of mechanisms that we see across the range of acoustic
instruments, there is a proportionate diversity in kinds of resistance. The physical
feedback to the performer that arises in the encounter between bow and string,
for example, is of a different kind to that which comes of the projection of breath
into a length of tubing. We can assume, then, that in much the same way that
the contingencies of human embodiment play a determining role in the dynamical
emergence of performative patterns, so too do the contingencies of instrumental
embodiment. This makes the task of arriving at a universal template for the
design of enactive musical instruments a profoundly complex, if not obviously
impractical undertaking.
In the various models of interaction that I schematized in the previous
chapter, the maps from human motor function to computer input devices, and
from computer output devices to human sensory input, are non-specific in terms
of the particular sensorimotor mechanisms that are activated in the course of
interaction—the models are intended to be as general and universal as possible.
But as soon as we move from interaction diagrams to real world
implementations, a higher degree of specificity is required. If, for example, a
candidate model for an enactive digital musical instrument were to remain
general, there would need to be an account of the myriad ways in which human
energy might be transduced as signals at the computer inputs. In the context of
the present study, rather than attempting to compile a comprehensive catalogue
of implementational possibilities, I will focus on one particular real world
implementation: a digital musical instrument that also happens to represent my
first serious attempt at engaging the essay’s key theoretical issues in the form of
an actual device. This device, as with any musical instrument, offers unique kinds
of resistance to the performer. The final component of the study sets out, then,
to detail the instrument’s implementational specifics, with a view to the various
ways in which its indigenous and particular kinds of resistance may or may not
lend themselves to the development of an enactive performance practice.
Standard human-computer interaction models partition the computer into
three distinct layers: input devices, programs and output devices. This is the
model I employed in the interaction diagrams of Chapter 3, and I will stick with
that model here. It would seem likely, when the core concern is how the
candidate instrument is resistant to the human performer, that the greater
portion of attention would be directed towards input and output devices, i.e.,
hardware. It is at the level of hardware, after all, that the performer actually
physically engages the instrument. But as I pointed out in Chapter 1, digital
instruments constitute a special class of musical devices: their sonic behavior is
not immanent in their material embodiment, but rather, must be programmed.
So, while hardware certainly constitutes more than a passing concern, the
dynamical behavior and resistance of the instrument is to a large degree
encapsulated in its programs. In the pages that follow, I will, therefore, direct a
significant amount of attention to issues of software.
In persisting with the standard division between hardware and software, I
hope also to demonstrate the utility of keeping the two layers separate in the
design process. While I shall be discussing just one specific implementation, I
nonetheless hope to make it apparent that in maintaining a loose coupling
between hardware and software components, the potential for reusing those
components is increased. This is particularly true of software components, which
may at any time in the future need to be integrated into different
implementational contexts, such as a new hardware framework.1 In this way,
any one particular software framework brings with it a certain modest degree of
generality. And to the extent that the framework continues to evolve across
distinct implementations, we may also see the beginnings of—if not a universal
approach to the design of enactive digital instruments—one that is at least
suitably general and robust.
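By way of a deliberately simplified sketch of what such a loose coupling might look like in code (the interfaces and names below are hypothetical, and stand in for whatever sensor-acquisition and synthesis components an implementation actually uses), the mapping and synthesis layers can be written against abstract interfaces, so that a new hardware framework can be substituted without rewriting them.

```python
from typing import Dict, Protocol

class ControlSurface(Protocol):
    """Hypothetical hardware-facing interface: any concrete input hardware
    (a sensor board, a MIDI controller, etc.) need only provide named,
    normalized control values."""
    def read(self) -> Dict[str, float]: ...

class Synth(Protocol):
    """Hypothetical sound-producing component."""
    def set_parameters(self, params: Dict[str, float]) -> None: ...

def mapping_step(surface: ControlSurface, synth: Synth) -> None:
    """One pass of the mapping layer.  Because it depends only on the two
    abstract interfaces above, the same mapping and synthesis code can be
    reused when the hardware layer changes."""
    synth.set_parameters(surface.read())
```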
4.2 Mr. Feely: Hardware
Overview
A device that goes under the name of Mr. Feely represents my first attempt at
the implementation of an enactive digital musical instrument (figure 4.1).
1 For an interesting counter example to this approach, where hardware and software
may in a certain variety of cases be inextricable, see Cook (2004).
Figure 4.1. Mr. Feely.
Mr. Feely’s computational nucleus resides on a miniature x86 compatible
motherboard, running the Linux 2.6 kernel, with patches applied for low latency
audio throughput and for granting scheduling priority to real-time audio threads.
Eight channel audio A/D and D/A hardware, MIDI A/D and D/A boards, and power
conversion modules are located in the same enclosure as the motherboard. One
of the design goals was to create a silent instrument with no moving parts inside
the enclosure. For that reason, the operating system resides on flash memory,
and a specific motherboard/chipset combination was chosen because of its
capacity for fanless operation.
Integration and Instrumentality
Sukandar Kartadinata has used the term “integrated electronic instruments” to
denote a class of devices characterized by an encompassing approach to their
material realization (Kartadinata 2003). “Encompassing” is used here in its most
literal sense: all of the components of which the instrument is comprised—the
input devices, the output devices, and the internal circuitry—are encompassed
within a single physical entity. Kartadinata notes that total integration is not
ubiquitous among conventional acoustic instruments—e.g., the bow is a distinct
physical entity from the body of the violin—but “total integration” is not really the
point of an integrated approach. Rather, emphasis is placed on the coherence of
the instrument; that is, how the material embodiment affords a performative
encounter with a unity. This is in sharp contrast to the sprawl of individual
devices and cables that characterizes “the often lab-like stage setups built around
general purpose computers (Kartadinata 2003:180).”
Integration and coherence of the instrumental embodiment were important
factors in the design of Mr. Feely. From the outset, I had in mind that it was of
critical importance that the instrument should have an instrumentality. This is
suggestive of two different interpretations, both of which figured in my approach
to design, and both of which factor in the perceived coherence of the instrument
to the performer: 1. that the instrument in its material embodiment should be
indicative of a specific purpose, and 2. that the instrument should have the feel
of a musical instrument. It may appear redundant to suggest that an instrument
should be instrumental, but it seemed to me a useful way of distinguishing the
project from those in which the instrument comprises a general purpose, off-the-
shelf computer (with or without an attendant array of peripheral input devices).
Figure 4.2 shows Mr. Feely in the playing position. Because of the
instrument’s weight, it is secured on a stand, but designed to rest in the lap of
the performer.
Figure 4.2. Mr. Feely in the playing position.
The playing position ensures that there is constant physical contact between
performer and instrument. This aspect of the design is tied in—in the most literal
sense—with the aim that the instrument should feel like a musical instrument. In
the act of playing, the contact with the instrumental body is intensified by hand
actions at the control surface, as weight is transferred from the upper body to the
thighs. The sense of the instrument’s physically “being there” is, then,
proportional to the amplitude of the human’s motor energy output. But there is
another aspect to this “being there,” and this is tied in with the way in which the
instrument is indicative of its use. The control surface is situated at the
performer’s centre of gravity, and it is angled (with respect to the performer) in
such a way that it presents itself optimally to the hands, and occupies the focal
ground in the field of vision. It’s not just that the instrument is “there,” but—to
paraphrase Michael Hamman—that it is “so very there” that the opportunity for
action, for physically engaging the controls, makes itself more than readily
apparent.
Control Surface
Unlike the computer-as-it-comes—a general purpose device—Mr. Feely is a
special purpose device. This means that the instrument is intended to be nothing
but a musical instrument, and that it therefore need not accommodate the
multiple representational paradigms required of a multiplicity of possible usages.
An important aspect, then, of the instrument’s instrumentality, is that the
interface is devoid of representational abstractions. In keeping with Rodney
Brooks’ dictum that “the world is its own best model (Brooks 1991),” I avoided
any graphical representations of the sound or its generating mechanisms at the
interface, giving preference to the performer’s perceptions of the sound itself,
and the cross-coupling of these perceptions with the tactile and visual
engagement of the instrument and its input devices. The way in which Mr. Feely’s
interface is different to that of the computer-as-it-comes, then, is equivalent to
the “considerable difference between using the real world as a metaphor for
interaction and using it as a medium for interaction (Dourish 2001:101).”
Three classes of input device are used on Mr. Feely’s control surface: knobs,
buttons and joysticks. The control surface is partitioned into distinct regions
(figure 4.3), which are distinguished by the points in the audio synthesis system
to which they are linked. I will detail the specific functional behaviors and
mapping strategies used to connect the input devices to the audio system in 4.3.
It is, however, worth noting the control surface’s basic partitioning scheme in this
section. Although this unavoidably touches on software issues, the functional
layout of the panel is a hardware concern.
Figure 4.3. Mr. Feely: Control surface partitioning scheme.
Of the eight distinct regions that comprise the control surface, four would
ordinarily be utilized only between periods of performative activity: those labelled
“Display & Patch Control,” “Mute Buttons,” “Global Volume,” and “Power On/Off”
in figure 4.3. The Display and Patch Control section is described under Visual
Display below; the functions of the other three sections are self-explanatory. The
four remaining control surface regions—labelled “Channel Section,” “Global
Section,” “Joysticks,” and “Variants” in figure 4.3—indicate the areas in which
activity is focused during performance.
The Channel Section is partitioned into five discrete channels of three knobs
and one button each; these are respectively mapped to five discrete audio
synthesis networks in the software system. The Global Section is divided into two
subsections, which respectively comprise nine knobs, and three knobs combined
with three buttons. These controllers are mapped to a global audio processing
network, and in certain cases to points in the five discrete channels. Signals from
each of the five discrete synthesis channels are passed as inputs to this
processing network. The Joystick Section is comprised of two x-y joysticks, one
of which springs back to its centre position when not in use. These joysticks are
considered freely assignable to any and multiple input points in the discrete
synthesis channels or the global processing network. The Variants Section is
comprised of six backlit buttons. When one of these buttons is toggled on, all
other buttons will be in their off state. These buttons are used to switch between
pre-stored variants in the synthesis network. These variants may differ by
synthesis parameter settings, by mapping functions, or by synthesis network
topologies.
With all these individual input devices and multiple mapping systems, it would
seem that the performer has rather a lot to remember during performance. And if
the performer is required to store such data in conscious memory, then the
instrument is not, in itself, properly or sufficiently indicative of its use. This is not,
however, how things work in practice. Firstly, by partitioning the control surface
into functional regions, the user quickly adapts to the relationship between a
cluster of controls and clusterings of associated behavioral patterns at the
instrument’s output. The physical layout of the control surface, then, reinforces
the relationship between specific functional regions and specific functional
behaviors to both the visual and tactile senses. Secondly, by employing a static
functional structure across different patches—that is, across varying
implementations of the underlying audio synthesis networks—the patterning of
the instrument’s behavior remains relatively constant. This means that motor
patterns do not need to be relearned from scratch from one patch to the next; in
fact they should be optimally adaptable, from a base set of functional
correspondences, across even radically divergent implementations of the sound
generating subsystem.
The performer, then, is not required to store a catalogue of controller
functions and mappings in conscious memory, but rather learns through
performing. The layout of the control panel is designed to facilitate this learning
process. The emphasis is placed on motor memory as opposed to the conscious
storing of data, and the underlying software system is designed in such a way
that motor memory should be transferable and adaptable across varying audio
subsystem implementations. The control surface is still, as a whole, sufficiently
complex and multifaceted as to offer resistance to learning. It was my aim that
the degree of resistance should be neither so minimal that the interface would
quickly become transparent to motor memory and activity, nor so great that, even
after a significant amount of practice, it would remain beyond grasp.
Visual Display
In chapter 2, I discussed the cost to the nonvisual senses of the visuocentric
approach to interaction as typified by the computer-as-it-comes. This is
something that I tried to avoid in the design of Mr. Feely, not only with a view to
minimizing the cognitive demands of visual attention, but with a view to
rendering the interface as free of abstraction as possible. It proved useful,
however, to integrate a character display with the control surface, which is used
to navigate a patch bank between performances, and to monitor data in the case
of “breakdowns” (e.g. program exceptions, memory errors, CPU overload, etc.).
The display is not intended to be used during performance, except as a
notification mechanism in the case of such a breakdown. It does not, therefore,
make any demands on the performer’s attention, and to the extent that vision is
required for the performance task, it may be directed to the guidance of motor
activities.
Audio Display
An important aspect of the “feel” of many conventional acoustic instruments is
the haptic feedback to the performer from the instrument’s vibrating body as it
radiates sonic energy. Unlike conventional acoustic instruments, electronic
instruments require the use of amplifiers and loudspeakers in order to propagate
sound in space. Except in the case that the amplifier/loudspeaker system is built
into the instrumental body, electronic instruments are lacking in the haptic
vibrational feedback that is characteristic of their acoustic counterparts.
This issue was taken into consideration in the design of Mr. Feely, but
unfortunately, in deciding upon an amplifier/loudspeaker system, it was
outweighed by other constraints: 1. that the amplifier be powerful enough for the
instrument to be used without further amplification (e.g. through a P.A. system),
and 2. that the loudspeaker should have a wide radiation pattern. This limited the
options among available technologies, and resulted in the choice of a combined
amplifier/loudspeaker system that, because of its size and weight, could not
practically be integrated with the body of the instrument. Nonetheless, by careful
positioning of the amplifier/loudspeaker in performance, it’s possible to go a
certain way towards the “feel” of a conventional instrument. By placing the
amplifier/loudspeaker on the floor, as close as is practical to the body of the
instrument, the radiation of vibrational energy can be felt through the feet and,
to a lesser extent, the torso. The effect varies with the character of the sound, its
frequency and loudness, the type of floor surface, and the type and number of
reflective and absorbtive material in proximity to the loudspeaker. This speaker
placement has one other advantage: the location of the point source of the
sound—which, in the case of the great majority of acoustic instruments, is the
instrument’s body—is as close as is practical to the body of the instrument. The
perceptual localisation of the origin of the sound is an important indicator of the
instrument's phenomenal presence, for the performer, for fellow performers, and
for the audience alike.
Summary
It would be premature to evaluate the ways in which Mr. Feely offers resistance
to the performer without having paid due attention to software. Nonetheless, it
may be useful to recap on the key aspects of the hardware implementation, and
to point to some implications for embodiment, and for the emergence of an
enactive performance practice.
Firstly, the instrument is integrated and instrumental. This means that the
performer engages an instrument that has a functional coherence to its material
embodiment as well as a tangible physical presence in performance. These
factors contribute to the potential for an encounter with the instrument that is
engaging (one of the five criteria of embodied activity from Chapter 1). Secondly,
the instrumental interface affords distributed motor activities without the burden
of representational abstractions. The interface is, then, motocentric rather than
visuocentric, and encompasses multiple distributed points of interaction. This
stands in contrast to the visuocentric, representation-hungry, singular (as
opposed to distributed), and sequential (as opposed to parallel) mode of
interaction that is idiosyncratic to the computer-as-it-comes. At the same time,
then, that the hardware interface to Mr. Feely avoids the interface model of the
computer-as-it-comes, it also avoids the associated costs of that model for
interaction. Whereas the computer-as-it-comes would situate the user’s attention
in a world of metaphorical abstraction and would provide no guarantee of
meeting timing constraints (see 2.4), Mr. Feely situates the user’s attention
directly within the activity, encourages the parallel distribution of the activity
across distinct sensorimotor modalities (touch and proprioception, hearing,
vision), and—because of the distributed and multiply parallel nature of the
performative mode—offers a reasonable chance that the real-time constraints of
musical performance might be met. These factors again correspond to certain of
the five criteria of embodiment; specifically, that embodied activity be situated,
multimodal, and timely.
When the focus is shifted from the instantaneous aspects of embodied activity
to embodiment as an emergent phenomenon, we touch on issues of adaptation
and cognition. Such issues are tied in with the instrument’s behavior; i.e., with
the resistance that it offers to the performer, and the unique dynamical
patterning of thought and activity that comes of that resistance. As a piece of
hardware, Mr. Feely affords embodied modes of interaction. But to get from
interaction to realization—i.e., to the emergence of an enactive performance
practice—the instrument will be required to offer resistance to the performer
through the medium of sound. This brings the discussion around to the
implementation of the instrument’s sonic behavior in software.
4.3 Mr. Feely: Software
Overview
Mr. Feely’s software system is written in the SuperCollider programming
language.2 The language was chosen for three main reasons: 1. it is mature and
offers a rich set of built-in features, 2. it is easily extensible with user-defined
modules, primitives, and plug-ins, and 3. it is object-oriented. As the main focus
of my work has been directed at the creation of a system that would allow for
dynamical behaviors, much of the task of programming has involved the
incremental development of a framework—an integrated library of extensions to
the language—that augments the base audio synthesis architecture with modules
that allow for complex dynamical mappings between system entities. The
implementational possibilities of these extensions to the language will comprise
the main focus of this and the next section. First, however, it will be useful to
describe the base architecture on which the framework is built.
SuperCollider Server Architecture
The SuperCollider audio synthesis engine passes signals between nodes on a
server, where those nodes represent instances of user-defined synthesis and
processing functions. A sample signal flow diagram would look familiar to
anybody who has worked with modular synthesis systems (figure 4.4).
2 http://www.audiosynth.com.
Figure 4.4. SuperCollider synthesis server: Signal flow.
A node on the synthesis server may contain parameter slots. For example, a
node that represents an oscillator function may contain slots for frequency, phase
and amplitude parameters. The values of a parameter slot may be set by sending
messages to the node to which the slot belongs, or by mapping the parameter
slot to the output of a bus (figure 4.5).
Figure 4.5. Writing values to a node’s parameter slots by 1. sending a
message, and 2. mapping the slot to the output of a bus.
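To make the two routes concrete, here is a minimal SuperCollider sketch (illustrative only, not Mr. Feely's actual code) in which a node with a freq parameter slot is first set by a message and then mapped to a control bus:

    // Minimal sketch: two ways of writing to a node's parameter slot.
    // Assumes a booted server (s = Server.default; s.boot;).
    (
    SynthDef(\osc, { |out = 0, freq = 440, amp = 0.1|
        Out.ar(out, SinOsc.ar(freq, 0, amp));
    }).add;
    )
    x = Synth(\osc);        // a node on the synthesis server
    x.set(\freq, 330);      // 1. set the slot by sending a message to the node
    b = Bus.control(s, 1);  // a control-rate bus
    b.set(220);
    x.map(\freq, b);        // 2. map the slot to the output of the bus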
A bus is a virtual placeholder for a signal. It’s possible, for example, to tap an
output signal from any node in the synthesis network and route it to a bus, from
which the signal could be rerouted as an audio signal input to any other node, or
mapped to a parameter slot belonging to any other node (figure 4.6).
Figure 4.6. Signal routing between parallel synthesis networks using
busses. Bus 1 taps an output signal from a node in the first channel and
routes it to the audio input of a node in the second channel. Bus 2 taps an
output signal from a node in the second channel and maps it to a
parameter slot of a node in the first channel.
SuperCollider’s bussing architecture allows for the flexible routing of signals
within the synthesis network. This flexibility is exploited and extended in the
extensions to the language that form the basis of Mr. Feely’s mapping
framework.
Mapping Framework
The mapping framework that I have developed for Mr. Feely is primarily
concerned with providing a flexible and intuitive mechanism for routing signals
between components of the audio synthesis network, and for defining functional
mappings between them. A functional mapping can be taken to mean the
transfer function from the output of one component to the input of another. That
is, a function that is applied to the signal such that the signal’s characteristics are
transformed between output at the source component and input at the receiver
component. The mapping framework consists of a hierarchical library of such
functions encapsulated within discrete software objects. The behavior of the
instrument as a whole is in large part determined by these functions and their
various mappings and routings within the audio synthesis network.
As I noted in the previous section, any signal within the audio synthesis
network may be routed to a bus, and rerouted from that bus to any other point in
the network. In Mr. Feely’s mapping framework, the functional transformation of
the signal takes place between the bus and the signal’s destination. The objects
that perform these transformations comprise the mapping layer. The mapping
layer allows for the flexibility to route the signal at a single bus to multiple
destinations with multiple functional mappings (figure 4.7). This is an example of
a “one-to-many (Wanderley 2001)” mapping model.
Figure 4.7. The signal at a bus is split into three signals. These signals are
routed to three different parameter slots, effecting a one-to-many
mapping. Each signal is subject to a functional transformation (those
transformations denoted here as x, y and z) between the bus and their
respective parameter slot destinations. The software objects that perform
these transformations comprise the mapping layer.
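As a hedged sketch of how such a one-to-many mapping might be realized on the SuperCollider server (the names below are illustrative, not the framework's actual objects), each mapping-layer function can be implemented as a small control-rate synth that reads the source bus, applies its transfer function, and writes the result to a destination bus, which is then mapped to a parameter slot:

    (
    SynthDef(\mapLinear, { |in, out, mul = 1, add = 0|
        // read the source bus, apply a linear transfer function,
        // and write the result to a destination control bus
        Out.kr(out, (In.kr(in, 1) * mul) + add);
    }).add;
    )
    ~src = Bus.control(s, 1);
    ~dests = { Bus.control(s, 1) } ! 3;
    // one source signal, three differently transformed copies (x, y and z)
    ~x = Synth(\mapLinear, [\in, ~src, \out, ~dests[0], \mul, 2]);
    ~y = Synth(\mapLinear, [\in, ~src, \out, ~dests[1], \mul, 0.5, \add, 100]);
    ~z = Synth(\mapLinear, [\in, ~src, \out, ~dests[2], \mul, -1, \add, 1]);
    // each destination bus would in turn be mapped to a different parameter slot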
The mapping framework also allows for the “cross-coupling (Hunt, Wanderley,
and Paradis 2003)” of bus signals, or “many-to-one (Wanderley 2001)” mappings
(figure 4.8).
Figure 4.8. The signals at two busses are subject to functional
transformations (x and y). The transformed signals are summed, or
“cross-coupled,” resulting in a mapping from multiple signal sources to a
single parameter slot.
Additionally, the mapping framework allows for what I have termed “function-
parameter” mappings, where the output of one functional mapping may be
mapped into a parameter slot in another (figure 4.9).
Figure 4.9. The signals at two busses are subject to functional
transformations (x and y). The output of function x is mapped into a
parameter slot in function y. The output of function y is mapped to a
parameter slot in an audio synthesis network component.
For example, function x in figure 4.9 might scale the output of the signal at BUS
1 into the range [1,10]. Function y might multiply the output value of the signal
at BUS 2 by the value of an argument, where that argument is set at a
parameter slot. When the output of x is mapped into the parameter slot that
corresponds to the multiplicand argument of y, the signal at BUS 2 is multiplied
by the scaled signal at BUS 1. The output of the dependent function y is then
mapped to a parameter slot in an audio synthesis network component. This is a
simple example, but it makes clear the kinds of complex interdependencies
between system components that “function-parameter” mappings allow.
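A hedged SuperCollider sketch of this particular example (illustrative names only; the two functions are collapsed here into a single control-rate synth, with the output of x supplying the multiplicand argument of y):

    (
    SynthDef(\funcParam, { |bus1, bus2, out|
        var x = In.kr(bus1, 1).linlin(0, 1, 1, 10);  // function x: scale BUS 1 into [1, 10]
        var y = In.kr(bus2, 1) * x;                  // function y: BUS 2 multiplied by x's output
        Out.kr(out, y);  // routed on to a parameter slot in the audio synthesis network
    }).add;
    )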
Mr. Feely’s hardware controls are connected to the audio synthesis network
through busses (figure 4.10).
Figure 4.10. The map from hardware to software. Analog signals are read
by an analog-to-digital converter (ADC) and written to a bus in the audio
synthesis network. The signal at the bus may be treated as though it were
any other signal.
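The control path is not detailed here beyond the MIDI A/D boards mentioned in 4.2. Assuming, purely for illustration, that a knob arrives as a MIDI continuous controller, its values could be written to a control bus on the language side as follows (the controller number and scaling are hypothetical):

    MIDIClient.init;
    MIDIIn.connectAll;
    ~knobBus = Bus.control(s, 1);
    MIDIdef.cc(\knob1, { |val|
        // scale the 7-bit controller value to [0, 1] and write it to the bus
        ~knobBus.set(val / 127);
    }, ccNum: 1);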
While all busses in the audio synthesis system are instances of a single class of
bus, and therefore have identical implementations, they are nonetheless
classified as having either local or global scope. All busses that are placeholders
for signals routed from audio signals have global scope, and can be routed to any
point in the synthesis network. Busses that are placeholders for signal arriving
from Mr. Feely’s hardware controls, however, are accorded either local or global
scope, depending on the particular input device to which they are connected. In
this scheme, the scope of a bus corresponds to the function of the input device as
defined by the partitioning of Mr. Feely’s control surface into functional regions.
Channel Section controllers, for example, are connected to busses that have local
scope within each of the five discrete audio synthesis network channels, while
Global Section controllers are connected to busses that have global scope (figure
4.11).
Figure 4.11. Local and global scope of busses. Busses L1.1-3 and L2.1-3
are connected to Channel Section controllers on Mr. Feely’s control panel.
Their scope is local; i.e., they may only be routed to the corresponding
audio synthesis network channels, 1 and 2. The output of these audio
synthesis channels is summed and sent to a global processing network.
Global busses G1 and G2 are connected to Global Section controllers on
Mr. Feely’s control panel. The scope of these busses is global; i.e., they
may be routed to the global processing network, or to any of the discrete
audio synthesis channels.
Busses have a special status in the mapping framework. They are
placeholders for signals that originate both outside and inside the audio synthesis
network, and therefore represent the points at which human action and internal
mechanism coincide. It was a deliberate design choice to accord busses this dual
role, as a transparency to the source of signals within the system effectively blurs
the implementational boundary between human and instrumental behaviors. That
is to say, signals are treated as equivalent whether their origins are external or
internal to the system, and this equivalency of signals implies that all signal flow
networks are formed at the same level of structure. The “push-and-pull” of
dynamical forces that is key to the instrument’s resistance, then, is encapsulated
in the structure and behavior of a single integrated signal flow network.
To this point, the simple mapping schemes I have illustrated have not
demonstrated models of dynamical behavior. The only difference, for example,
between the mapping scheme of figure 4.11 and that of a linear summing mixer
is that the bussing architecture in the figure shows the possibility of a flexible
routing of control signals to individual parameter slots in the various mixer
channels. The dynamical behavior of the system as a whole would, nonetheless,
appear to be relatively flat. Consider a system, however, where the outputs from
two discrete audio synthesis networks are routed to global busses, and then back
to parameter slots within the discrete networks (figure 4.12).
Figure 4.12. Discrete audio synthesis networks are coupled to form an
interacting composite network. Global busses A1 and A2 serve as
placeholders for the output signals of channels 1 and 2. These signals are
transformed by the functions x and y, and the continuous outputs of those
functions are routed to parameter slots in the discrete channels. The
output of channel 1, after undergoing functional transformation, is used to
regulate the internal behavior of channel 2, and vice versa.
In this example, the output of channel 1 is routed back to a parameter slot in
channel 2, and vice versa. The output signals from the two channels, then, rather
than being summed (as in figure 4.11), could be used to regulate one another’s
behavior. The structure of the network—i.e., its topology—creates a coupling
between the two discrete audio synthesis networks; where they had previously
formed uncoupled autonomous systems, they now form coupled nonautonomous
systems. The way in which the bussed signals act as regulatory mechanisms in
the respective synthesis networks is defined by the mapping functions, indicated
in figure 4.12 as x and y. These functions might encapsulate any number of
behaviors. They might, for example, map the audio signal unaltered into the
parameter slot, scale the audio signal to an effective range, track the signal’s
frequency or amplitude characteristics, scale it to an effective range and map the
resulting signal to the slot, and so on. Any of these choices would create the
possibility for complex behavioral dependencies between the two synthesis
networks, and at the same time, the possibility for nonlinear dynamical behaviors
in the composite (coupled) system.
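A hedged sketch of this kind of coupling (illustrative, and far simpler than the actual channel implementations) might use an amplitude follower as the mapping function, so that each channel's output level is regulated by the other channel's signal:

    (
    SynthDef(\coupledChan, { |out = 0, ownBus, otherBus, freq = 200|
        // read the other channel's output (from the previous block)
        var other = InFeedback.ar(otherBus, 1);
        // mapping function: a louder other channel lowers this channel's level
        var gain = 1 - Amplitude.kr(other, 0.05, 0.5).clip(0, 1);
        var sig = SinOsc.ar(freq) * gain * 0.2;
        Out.ar(ownBus, sig);  // write own output to a global bus
        Out.ar(out, sig);
    }).add;
    )
    ~a1 = Bus.audio(s, 1);
    ~a2 = Bus.audio(s, 1);
    ~chan1 = Synth(\coupledChan, [\ownBus, ~a1, \otherBus, ~a2, \freq, 200]);
    ~chan2 = Synth(\coupledChan, [\ownBus, ~a2, \otherBus, ~a1, \freq, 300]);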
Summary
From the perspective of either of the discrete networks in figure 4.12, internal
behavior is nonautonomous; i.e., behavioral patterns are determined in part by
signals that originate outside the network. From the perspective of a human
observer, however, the composite network (comprised of the two interacting
subnetworks) could be said to be autonomous, as it operates, and exhibits
behavior, without human intervention. This presents an interesting design
problem: we want the instrument to have rich, structured dynamics, but at the
same time, we want those dynamics to emerge in the coupling of the instrument
to a human performer. So, although we could engineer a system that exhibits
dynamical behavior without human involvement, the kind of system that is more
compelling with a view to enactive performance practice would be one that,
rather than exhibiting autonomous dynamical behavior, embodies the potential
for dynamical behavior when coupled to a human performer. This does not rule
out the kind of model encapsulated in figure 4.12. In fact, this model forms the
basis of the first usage example I will outline in the next section. It does,
however, call for calibration of the system—a “tuning” of the system’s dynamical
responsiveness—when human action enters the equation.
In summary, then, the mapping framework allows for the creation of complex
interdependencies between system components. And these interdependencies
are key to the “push-and-pull” dynamics that define the instrument’s kinds of
resistance. But the question remains as to how one might go about calibrating
the system in such a way that it requires human action; i.e., such that when
there is a “push-and-pull” of physical forces at the hardware layer, the
instrument responds and resists with proportionately rich and varied sonic
behavior. I’ll take up this issue by outlining two specific usage examples.
4.4 Mr. Feely: Usage Examples
Overview
In this section I outline two examples of Mr. Feely in use. At the present writing,
the first model is in an early stage of development, while the second is relatively
mature. I have chosen these specific examples because of their differences. Or,
more specifically, because their differences illustrate the ways in which diverse
implementations might highlight distinct facets of a single basic concern: enactive
performance practice. The two usage examples are interesting, then, because
they point to different kinds of resistance, to different modes of embodied
activity, and to different realizational potentialities.
Example 1: Pushing the envelope
Figure 4.13 illustrates an extension of the interacting composite network of figure
4.12. As in figure 4.12, the output of channel 1 is mapped via a global bus to a
parameter slot in channel 2, and vice versa, in such a way that the two discrete
audio synthesis networks regulate one another’s behavior in a manner
determined by the output of the functions x and y. The example in figure 4.13
departs from that of figure 4.12, however, through the addition of two local
busses, L1.1 and L2.1.
Figure 4.13. Functional covariance. Local busses L1.1 and L2.1 are
placeholders for signals from Mr. Feely’s Channel Section. These signals
are mapped to parameter slots of mapping functions internal to the
composite audio synthesis network. This is an instance of “function-
parameter” mapping, where the output of function a serves as a
continuous input, or argument, to function x, and the output of function b
serves as a continuous input to function y.
The local busses L1.1 and L2.1 provide the effective point of access to the
system for human action. Rather than being mapped to parameter slots in the
nodes that comprise the synthesis network, these busses are mapped to
parameter slots of mapping functions that are internal to the system; i.e., they
represent “function-parameter” mappings. The way in which the output signals of
the coupled channels regulate one another’s behavior, then, is largely determined
by the functional mapping from the local busses to the parameter slots of the
mapping functions attached to the global busses, and is covariant with human action.
This network of mappings forms the basis of a performance scenario I’ve
developed for Mr. Feely that goes under the working title “pushing the envelope.”
The mappings illustrated in figure 4.13 represent just a partial view of the entire
system, which utilizes five discrete audio synthesis networks and assigns three
local busses to each network, corresponding to the five channels of three knobs
that comprise Mr. Feely’s Channel Section. The two busses per channel that are
not shown in figure 4.13 are mapped to various parameter nodes in the
respective discrete audio synthesis networks. These mappings vary across
different implementations of the basic system, but in all instances map into
continuous ranges as suitable to the synthesis parameter in question. The
functional mappings from the local busses L1.1 and L2.1—the busses that are
shown in figure 4.13—are key to the dynamical responsiveness of this particular
network. It’s their role that I will focus on here.
In the “pushing the envelope” model, the functions x and y (figure 4.13)
represent composite functions: amplitude followers (on the signals at A2 and A1
respectively) modulated by the output of a logistic mapping function:
x_{n+1} = \mu x_n (1 - x_n)
The outputs of x and y are connected as level controls at the output stage of
channels 1 and 2 respectively, effecting a coupling between the two channels.
The logistic mapping function is interesting because the trajectory of its orbit
varies with different values of the variable µ. It represents a simple nonlinear
system, the response of which becomes increasingly chaotic when the value of µ
is greater than 3, and is entirely unstable when µ is greater than 3.87 (assuming
values of x in the range [-1, 1]). The mapping functions x and y, then, already
embody the potential for complex dynamical behavior, where the dynamical
contour of the modulated signals derived from A2 and A1 may be more or less
chaotic or “flat” depending on the assignment of a constant value to µ.
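To give a feel for that range, the following language-side sketch (illustrative only) iterates the logistic map at a few values of µ; at lower values the orbit settles towards a fixed point or a short cycle, and it becomes increasingly irregular as µ grows:

    (
    [2.9, 3.2, 3.5, 3.8].do { |mu|
        var x = 0.5, orbit = Array.new;
        // iterate x[n+1] = mu * x[n] * (1 - x[n]) and collect the orbit
        30.do { x = mu * x * (1 - x); orbit = orbit.add(x.round(0.001)) };
        ("mu = " ++ mu ++ ": " ++ orbit.keep(-6)).postln;
    };
    )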
The functions a and b in figure 4.13 represent the slope (rate of change) of
the signals at busses L1.1 and L2.1 respectively. The amplitude of each function's
output will vary proportionately with the rate of performer activity at the
hardware controls—i.e. the corresponding knobs in Mr. Feely’s Channel Section—
that are connected to the bus. This means, essentially, that the more “active” the
activity, the greater the amplitude of the resulting signal.
The parameter slots in the mapping functions x and y (figure 4.13) represent
the variable µ in the logistic mapping function. This makes for a potentially very
interesting mapping. As the outputs of a and b are effectively plugged into µ, the
dynamical contour of the outputs of x and y is directly proportional to the rate
of performer activity. The effective ranges of a and b are scaled to a dynamically
rich range in µ (between 2.9 and 3.87; a range that encompasses the
discontinuous transition from flat to chaotic dynamics through successive period
doublings), which results in the system as a whole having response
characteristics that vary dynamically with the “push-and-pull” of human motor
actions. For example, an increase in the rate of left-right knob “twiddling” with
respect to time (figure 4.14) will result in a proportionate increase in the “degree
of chaos” in the outputs of functions x and y.
Figure 4.14. Left-right knob manipulation with respect to time.
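A hedged sketch of this part of the mapping (the actual functions a and b are more involved): a slope follower tracks the rate of change of the knob signal, and its magnitude is scaled into the dynamically rich region of µ described above. The input range assumed here is illustrative.

    (
    SynthDef(\activityToMu, { |knobBus, muBus|
        var knob = In.kr(knobBus, 1);
        // magnitude of the knob's rate of change: faster "twiddling" gives a larger value
        var rate = Slope.kr(knob).abs;
        // scale an assumed activity range of [0, 4] into mu's rich region [2.9, 3.87]
        Out.kr(muBus, rate.clip(0, 4).linlin(0, 4, 2.9, 3.87));
    }).add;
    )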
In practice, the “pushing the envelope” model has certain interesting
implications for performance. Firstly, because of the way the system is
calibrated—specifically the “tuning” of the logistic map variable µ in relation to
the rate of change of motor activity—it requires a performer; i.e., without
performer action, the response of the system is dynamically flat. Secondly, the
system requires considerable physical effort on the part of the performer to elicit
dynamically rich responses from the software system. To that extent, the system
doesn’t just require the performer, it requires a considerable investment of
performative energy. Thirdly, the behavior of the system as a whole is far from
transparent at first use, and in fact demands significant experimentation before
certain consistent patterns and responses begin to reveal themselves. The
complexity of the system’s dynamical responsiveness is effectively guaranteed by
the interdependencies of the five discrete audio synthesis networks, as
encapsulated in the functional mappings from outputs in one channel to
parameter nodes in another. The key implication of these interdependencies is
that performative actions directed toward a single channel of controls will have
consequences beyond the scope of the discrete audio synthesis network to which
those controls are connected. That is to say, although the performer may place
the focus of activity at any one moment within a specific channel—and the
human anatomical constraint of two-handedness tends to determine this kind of
pattern in performance—the effects of that activity will nonetheless be felt
throughout the composite network comprised of all five channels.
In my experience thus far with this system, I’ve found that it’s not possible to
get an overall conceptual grasp on its range of behavior, and particularly on the
way that dynamical changes propagate through the composite network.
Nonetheless, certain recurrent patterns of motor activity have begun to emerge,
and these patterns are yielding varieties of sonic responsiveness that, even as
they become more closely aligned with certain expectations, continue to produce
new and often surprising dynamical contours.
Example 2: Surfing the fractal wave (at the end of history)
In certain respects, there are parallels in the dynamics of the “pushing the
envelope” network to the dynamics of many conventional acoustic instruments.
When there is no input of human energy, for example, the instrument’s response
is “flat.” And when human energy is transmitted to the system, the system’s
dynamical responsiveness is proportionate to the amplitude of that energy. There
is, then, a particular way in which the model requires the performer: it requires a
“pushing”—a directed expenditure of kinetic energy—to actualize the dynamic
potential that is immanent to the network.
The model I outline in this section—“surfing the fractal wave (at the end of
history)”3—embodies an altogether different kind of resistance and affords an
altogether different variety of motor activity. Where performance with
conventional acoustic instruments ordinarily requires a “pushing” of kinetic
energy into the instrumental mechanism in order to set things in motion, in the
“surfing the fractal wave” model, things are already in motion in the instrumental
mechanism. The mode of performance, then, is more concerned with giving
dynamical shape and contour to these motions; an “absorbed coping” that is
about the timely navigation of energy flows in the environment, rather than the
directed transmission of energy flows that originate in the body. Hence the
distinction between “surfing” and “pushing” analogies.
Patterns of motor activity in “surfing the fractal wave” are designed around
the asymmetry of “handedness” (Guiard 1987); i.e., dominant and non-dominant
3 The name is borrowed from the title of a 1997 Terence McKenna lecture
(http://www.abrupt.org/LOGOS/tm970423.html). My appropriation, however, has very
little to do with McKenna's original intention.
hands are afforded independent sub-tasks, but they cooperate in the
accomplishment of the larger task that those sub-tasks comprise. Kabbash,
Buxton and Sellen describe three characteristic ways in which the two hands are
asymmetrically dependent in select everyday tasks:
1. The left hand sets the frame of reference for action of the right. For example, in
hammering a nail, the left hand holds the nail while the right does the hammering.
2. The sequence of motion is left then right. For example, the left hand grips the
paper, then the right starts to write with the pen.
3. The granularity of action of the left hand is coarser than that of the right. For
example the left hand brings the painter’s palette in and out of range, while the
right hand holds the brush and does the fine strokes onto the canvas.
(Kabbash, Buxton, and Sellen 1994:418)
Each of these examples could be viewed as aspects of a single embodied
tendency; a tendency that is self-reinforcing across a wide range of activities and
over repeated performances. Kabbash et al. advocate the design of human-
computer interfaces that exploit the habitual ways in which humans tend to use
their hands in skillful activity. The “surfing the fractal wave” model heads in this
direction.
Figure 4.15 represents a partial view of the “surfing the fractal wave” network
model.
Figure 4.15. “Surfing the fractal wave” network model. The x and y
outputs of a joystick with global scope (JSX, JSY) are mapped to
parameter slots of a chaotic sequencer function (SEQ). The sequencer
sends a stream of timed triggers to parameters in each of five discrete
audio synthesis networks (for clarity, only two are shown). Local busses
(C1.1-3 and C2.1-3) read signals from the knobs in Mr. Feely’s Channel
Section. These controls “filter” the results of the mapping from the
sequencer stream to each of the discrete audio synthesis networks.
Joystick manipulations are always performed by the left hand. Knob
manipulations are in most instances performed by the right hand. Some
feedback networks, mapping functions and audio synthesis network
schemata have been omitted for clarity.
The diagram divides the network space into left hand and right hand regions. In
performance, the pads of the left hand fingers tend to “ride” the joystick, where
certain gestural patterns emerge in response to the dynamical properties of the
“function-parameter” mappings of the global busses JSX and JSY (placeholders
for continuous signals from the x and y axes of the joystick, respectively) into the
output of a chaotic sequencer (SEQ).4 The sequencer is calibrated in such a way
that its output is more or less stable when the values of the mapping functions a
and b are close to the centre of their effective ranges. In practice this means that
when the joystick is in its centre position (the resting position for a “spring-back”
style joystick), the sequencer clock outputs a steady stream of pulses, at a
medium tempo, with a regular and stable amplitude pattern (figure 4.16).
Figure 4.16. Sequencer pulse stream when the joystick is in centre
(“resting”) position.
4 The "chaotic" sequencer function is not technically chaotic (in mathematical terms).
The designation can be taken to be qualitative.
The mapping functions a and b determine, however, that deviations in the x and
y axes of the joystick result in more complex behaviors in the pulse stream. The
parameter slot to which a is mapped represents a multiplication argument for the
sequencer’s clock frequency and base amplitude. An increase in the signal at
JSX, then—corresponding to a left-to-right movement across the joystick’s x
axis—results in an increase in the pulse stream’s frequency and amplitude (figure
4.17).
Figure 4.17. Sequencer pulse stream when there is a left-to-right
movement across the joystick’s x axis.
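As a hedged sketch of this portion of the mapping (the real sequencer is considerably more elaborate), the joystick's x bus can act as a multiplication argument on both the clock frequency and the base amplitude of a simple pulse stream:

    (
    SynthDef(\pulseStream, { |jsxBus, out = 0|
        var jsx = In.kr(jsxBus, 1);          // joystick x axis, assumed in [0, 1]
        var mul = jsx.linlin(0, 1, 0.5, 4);  // multiplication argument (function a)
        var trig = Impulse.ar(4 * mul);      // base clock of four pulses per second
        Out.ar(out, trig * (0.2 * mul));     // amplitude grows with the same argument
    }).add;
    )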
The parameter slot to which the mapping function b is connected represents a
chaotic variable in the sequencer function. In short, this single variable
determines two aspects of the sequencer’s behavior: 1. the degree of pulse
“nestedness,” and 2. the probability that successive values read from an internal
finite state machine are mapped to the amplitude of the pulse stream. An
increase in the value of both of these parameters (corresponding to a bottom-to-
top movement in the joystick’s y axis) results in an increase in the system’s
entropy, where pulse “nestedness” implies a greater likelihood of frequency
multiplication from one pulse to the next (and therefore a greater likelihood of
extra pulses being “nested” into the pulse stream), and where the irregularly
patterned output of the internal finite state machine incrementally encroaches on
the otherwise linear behavior of the amplitude mapping in the mapping function a
(corresponding to the left-to-right movement across the joystick’s x axis). Figure
4.18 adds a bottom-to-top movement in the joystick’s y axis to the left-to-right
movement in the x axis illustrated in figure 4.17. The output of the pulse stream
shows the trajectory towards a higher “degree of chaos” over time.
Figure 4.18. Sequencer pulse stream when there is a left-to-right
movement across the joystick’s x axis, and a bottom-to-top movement
across the y axis. The increase in the signal at JSY results in a greater
likelihood of “nestedness” in the pulse stream, and a greater likelihood of
irregularities in amplitude patterns.
The perceptual guiding of left-hand actions in “surfing the fractal wave” is
more integrated than figure 4.18 would suggest. While the joystick operates
across two degrees of freedom—the x and y axes—the performer does not break
the activity down into separate movements in two dimensions (as figure 4.18
would indicate). Rather, the performer guides the left-hand through singular
trajectories across a two-dimensional space. And it’s in these motions that a
“feel” develops for the sequencer’s stable and chaotic regions, the transitions
between them, and for the shift from greater-to-lesser and lesser-to-greater
degrees of event density with respect to time. But these motor patterns
constitute only one part of the coordinated left hand/right hand movements that
amount to “surfing the fractal wave.” And while it’s useful to break the activity
down into left and right hand sub-tasks, there can be no complete picture without
considering how these sub-tasks coordinate and cooperate.
The output of the chaotic sequencer is mapped to parameters in each of the
five discrete audio synthesis networks. While each of these networks
encapsulates different dynamical responses, there are strong symmetries
between their behaviors, and between the kinds of responses that right hand
actions might elicit from each of the networks. Each of the five synthesis
networks implements a resonator function, where the pulses that are mapped
into each network serve as exciters. These resonators embody different
resonance models (with different dynamical responses), but there are certain
perceptual constants from one network to the next. Figure 4.19 shows the
mapping from local busses to two of the five discrete audio synthesis channels.
Figure 4.19. Perceptual symmetries in the functional mapping from
busses to the audio networks across distinct channels. Percepts (“Gate,”
“Width,” “Resonance”) are assigned to corresponding busses across each
channel. The symmetry holds at the level of hardware, where rows of
knobs in Mr. Feely’s Channel Section correspond to rows of busses in the
diagram.
High level “percepts” are symmetrical across each of the five channels, where
each of those percepts corresponds to the same bus number assignment in each
channel. That is, “Gate” corresponds to busses C1-5.1, “Width” corresponds to
busses C1-5.2, and “Resonance” corresponds to busses C1-5.3. This has the
effect of similar classes of response being elicited from corresponding knobs in
each of the five channels of Mr. Feely's Channel Section.
Of course, these “percepts” require a symmetry in terms of the effect of
functional mappings into each of the discrete audio synthesis networks if their
particular perceptual qualities are to be discerned and distinguished. The “Gate”
mechanism is functionally identical across all five channels: turning the
corresponding knob from left to right has the effect of allowing a greater number
of pulses to pass through a gated input to each resonator. It acts, then, as an
event filter on the pulse stream, where no pulses are passed to the resonator
system when the gate’s value is zero, all pulses are passed when the gate’s value
is one, and each pulse in the stream has a 0.5 probability of passing when the
gate’s value is 0.5.
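A hedged sketch of the Gate mechanism (illustrative only): each incoming pulse is passed on to the resonator with a probability equal to the value read from the corresponding Channel Section bus.

    (
    SynthDef(\gateFilter, { |pulseBus, gateBus, out|
        var trig = In.ar(pulseBus, 1);
        var prob = In.kr(gateBus, 1).clip(0, 1);  // knob value: 0 = block all, 1 = pass all
        Out.ar(out, CoinGate.ar(prob, trig));     // pass each pulse with probability prob
    }).add;
    )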
The implementation of the “Width” mechanism varies slightly from one
channel to the next, but its effect is symmetrical: turning the corresponding knob
from left to right has the effect of “loosening the elasticity” of each resonator; i.e.
a tighter “elasticity” (implemented as a shorter impulse response in the delay
lines in the resonator’s filterbank) will result in shorter output events, whereas
these events will take on longer durations (correlating to the perception of having
a greater temporal width) as the resonator’s “elasticity” is slackened.
The “Resonance” mechanism is the most varied in terms of implementation
across the five channels. It is tied in specifically to parameter nodes in the
resonator that change the resonator’s dynamical responsiveness; i.e., the
resonant frequencies, their bandwidths, and the ways in which the filters that
comprise the resonator’s internal filterbank interact. Across all five channels,
turning the “Resonance” knob from left to right tends to shift the dynamical
response of the resonator increasingly towards distortion, self-oscillation and
nonlinear behavior.
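The following is a hedged sketch of how Width and Resonance might act on a toy single-filter resonator (the actual resonance models differ from channel to channel and are considerably richer): Width lengthens the ring time, while Resonance drives the output towards saturation and quasi-self-oscillating behavior.

    (
    SynthDef(\toyResonator, { |pulseBus, widthBus, resBus, out = 0|
        var exc = In.ar(pulseBus, 1);
        var width = In.kr(widthBus, 1).linlin(0, 1, 0.05, 2.0);  // ring time in seconds
        var res = In.kr(resBus, 1).clip(0, 1);
        var sig = Ringz.ar(exc, 300, width) * 0.2;
        // push the resonator towards distortion as Resonance increases
        sig = XFade2.ar(sig, (sig * (1 + (res * 8))).tanh, (res * 2) - 1);
        Out.ar(out, sig);
    }).add;
    )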
In the breakdown of right hand and left hand tasks in “surfing the fractal
wave,” there is a correspondence to each of the three characteristic behaviors of
bimanual asymmetric action that Kabbash et al. point out. It’s worth addressing
each point in turn:
1. The left hand sets the frame of reference for action of the right.
In the “surfing the fractal wave” model, left hand movements give contour to the
dynamical unfolding of the pulse stream, while the right hand acts as an event
filter on the stream, and a modifier of the dynamical properties of the events that
emerge from pulses hitting the resonator functions. The pulse stream, as it
unfolds, is the frame of reference for the “picking” and “shaping” of discrete
events that characterizes right hand actions.
2. The sequence of motion is left then right.
This follows from the first point: the right hand modifies the event stream only
after the left hand has given the stream its dynamical contour. But unlike
Kabbash et al.’s corresponding example (“the left hand grips the paper, then the
right starts to write with the pen”), the respective actions form a continuous
interplay of complementary motions—as opposed to a sequence of isolated
events—and the transference from left-handed to right-handed motions takes
place at a much finer granularity of temporal scale.
3. The granularity of action of the left hand is coarser than that of the right.
In “surfing the fractal wave” the left hand is designated to control the joystick.
These joystick manipulations do not require the hand to reposition itself across
discrete points on the control surface, and they do not require grasping, turning,
or other finger motions that are performed at a fine granularity of scale. I’ve
found that in playing with the model, my left hand will often span the distance
from the joystick to the top row of knobs in Mr. Feely’s Channel Section, leaving
the little finger to move the joystick through the two dimensional plane while the
thumb and pointer finger turn the knobs. But even this action is of a coarser
granularity than the actions designated to the right hand; actions that involve a
constant “hopping” between the fifteen knobs that comprise the Channel Section,
and finely detailed turnings and twiddlings of those knobs.
It’s interesting to note that in the act of playing, left hand activities do not
seem to require any conscious attention, while the right hand activities demand
on-going and focused attention. That the dominant hand should be at the centre
of attention in the midst of bimanual action is not a point that Kabbash et al.
discuss, but it seems that my experience of this phenomenon with “surfing the
fractal wave” might also apply to other activities, such as Kabbash et al.’s
corresponding example: “the left hand brings the painter’s palette in and out of
range, while the right hand holds the brush and does the fine strokes onto the
canvas.”
The two key aspects to the model of activity in “surfing the fractal wave” are
the “surfing” aspect, and the engineering of the interface around habitual
embodied patterns of “handedness.” It’s these aspects—or, more specifically, the
entangling of these aspects in the midst of performance—that give the model its
idiosyncratic kind of resistance. In contrast to the “pushing the envelope” model,
in which events are initiated when the performer transmits kinetic energy to the
instrumental mechanism, the “surfing the fractal wave” model is built around a
persistent stream of events. And these events can go by very fast. Motor
patterns, then, emerge not only in the interdependencies between the two hands,
but in the coordination of the hands with respect to timing constraints.
It’s been interesting to note that, over the period of time that I’ve worked
with this model, and as my hands have become both better coordinated and
more individually dexterous, I’ve had a better capacity to deal with the system’s
unfolding in a timely manner, and this in turn has led to a higher level of detail
and nuance in both the shaping of individual sounds at the event level, and in the
elaboration of larger scale events, such as phrases and gestures. This seems to
me indicative of the coevolution of sensory, motor and cognitive competencies
that is definitive of enaction.
Summary
The conventional metaphors of computer science tend to regard computation as
an inherently sequential process. That is, as a function from input to output
comprised of a series of discrete and causally related steps, where the desired
outcome of the function is known in advance of its execution. This is at odds with
the enactive model of interaction, where activity takes place across a network of
interacting components, and where the behavior of those components, and
therefore of the network as a whole, is adaptive and emergent with respect to
the ongoing push-and-pull of interactional dynamics. An enactive digital musical
instrument, then, will depend on a fundamentally different view of computation
to that of conventional computer science. Rather than falling back on the
“computation-as-calculation” model, computation would be viewed as a process
in which “the pieces of the model are persistent entities coupled together by their
ongoing interactive behavior (Stein 1999:483).”
This model of “computation-as-interaction” underlies the design of Mr. Feely’s
software system. The system allows for human action to be folded into the
dynamical processes of interacting network components, and to that extent it
also allows for a structural coupling of performer and instrument. Structural
coupling is not, however, a given property of the system; while the software
system provides the required technical infrastructure, the kinds of resistance that
the instrument affords to the human remains a matter of how the infrastructure
is utilized; i.e., a matter of design. And the “right” kinds of resistances—at least
with a view to structural coupling, realization and enaction—will be those that are
neither so transparent to human action that they demand little thought or effort,
nor so ungraspable that they forever remain beyond motor and cognitive
capability.
I suggested in chapter 2 that while there is much to be learned from the
physical modeling of conventional acoustic instruments, the focus of my work is
directed more towards the development of instrumental behaviors that are
indigenous to computing media. The examples I’ve outlined in this section point,
however, to a kind of physical model, in that they embody networks of dynamical
dependencies in which human action is resisted by forces that are immanent to
the software network. But in contrast to physical models of conventional
instruments, the virtual physics of these systems is speculative; i.e., the models
are not based on data from real world measurements, or on differential equations
that describe well known physical systems. Rather, they are evolved interactively
through experimentation with various mapping and calibration schemes. So,
while the components of the audio synthesis network certainly continue to play a
critical role in the instrument’s behavior, the focus of development is shifted to
the mapping framework. I’m suggesting that it’s through this shift that we see
the potential arise for what I have called an “indigenous” computer music.
In the approach I’ve taken, design choices as to “kinds of resistance”—i.e.,
classes of behavior—are effectively decoupled from audio synthesis
implementations, leaving the designer free to experiment with any manner of
sound-producing and processing components. I’ve found that the “right” kinds of
resistances, however, continue to be those that are resonant with phenomenal
experience and past practices of embodiment. Essentially, this means that the
simulated physics of resistance will be—in some way or other—functionally
related to physical descriptions of real world behavior. The design of these
systems, then, takes a middle course between normative and speculative modes
of interactivity; between that which is familiar and that which is other to everyday
phenomenal experience. If the balance between these two poles is apposite,
then the kinds of resistance that the systems afford will be sufficiently rich in
dynamical potential that, over a sustained period of time, the performer will
continue to realize new practices, and new ways of encountering the instrument.
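The decoupling described above can be pictured, again very schematically, as a resistance layer that exposes only a small generic state vector, with separate mapping functions translating that vector into the parameters of whatever synthesis components happen to be in use. The synthesis parameters named below are placeholders for the purposes of illustration, not components of Mr. Feely's actual network.

class Resistance:
    """A stand-in for any dynamical resistance model; all it has to do is
    expose a small, generic state vector."""

    def __init__(self):
        self.pos, self.vel = 0.0, 0.0

    def step(self, target, k=4.0, c=1.2, dt=0.01):
        force = k * (target - self.pos) - c * self.vel
        self.vel += force * dt
        self.pos += self.vel * dt

    def state(self):
        return {"pos": self.pos, "vel": self.vel}

# Two interchangeable mappings onto two placeholder synthesis engines.
def map_to_granular(s):
    return {"grain_rate": 5.0 + 40.0 * abs(s["vel"]),
            "grain_pitch": 1.0 + 0.5 * s["pos"]}

def map_to_filter(s):
    return {"cutoff_hz": 200.0 + 4000.0 * abs(s["pos"]),
            "resonance": min(0.95, 0.2 + abs(s["vel"]))}

# The mapping stage, not the synthesis component, carries the design of the
# instrument's "feel"; either mapping can be swapped out at any time.
model = Resistance()
for _ in range(100):
    model.step(target=1.0)
    granular_params = map_to_granular(model.state())
    filter_params = map_to_filter(model.state())

Swapping one mapping for another, or one synthesis engine for another, leaves the resistance untouched; this is the sense in which classes of behavior are decoupled from audio synthesis implementations.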
4.5 Prospects
The two usage examples I’ve outlined in this chapter demonstrate just a small
number of possible approaches to engineering the kinds of resistance that digital
musical instruments might store in potentia. In my work with Mr. Feely, as both
designer and performer, it seems I’m still just scratching at the surface of these
matters, and that there are a great many implementational possibilities yet to be
uncovered. It also seems that at a certain point, these “uncoverings” will
necessarily require the development of patterns, in both design and performance,
that are of a higher order than those I’ve outlined to this point.
For design, this will likely be a matter of evolving a body of general principles
through which design knowledge can be accumulated incrementally. At first
glance, this may appear to contradict my observation at
the beginning of this chapter that the task of arriving at a universal template for
the design of enactive instruments may be ultimately impracticable. But the issue
I’m raising here is more directly concerned with general principles that operate at
a higher level of abstraction than purely implementational concerns. The focus,
rather, would lie with the way in which models might be generated
from a consistent but open-ended application of principles that emerge from the
interaction between philosophical and technical problematics. This would be a
kind of meta-design. Rather than persistently hopping back and forth between
philosophical and technical discourses, there would exist an evolving metric for
balancing the constraints of one against the other in an integrated framework. At
this point in my work, I can’t say for certain how one would go about putting
such a framework together. But problems such as these are not without
precedent in the history of design,5 and it seems to me a potentially very
productive avenue of investigation.
The development of higher order patterns in performance is also a matter of
balancing opposing constraints. As I’ve been careful to make clear, the two usage
examples I’ve outlined in this chapter embody very different kinds of resistance,
and therefore afford very different varieties of human action. It’s interesting to
consider, though, how these models might be interleaved in the context of the
same performance. This kind of multitasking is part and parcel of expert
musicianship. And while the two usage examples I outlined in the previous
section might involve a certain degree of multitasking in and of themselves, there
is a higher order of multitasking that could potentially encompass both models
simultaneously. At the same time that this may eventually lead to more complex
and diverse sonic utterances, it may also lead to a heightened sense of flow—of
performative embodiment.
In considering the merging of the two models into a single integrated model,
it would seem that they are in fact so different in playing technique as to be
incompatible. Again, the issue comes back to design. Multitasking must
necessarily involve some degree of compatibility between the actional patterns
that comprise the sub-tasks. In designing for multitasking, then, it may prove
useful to have in store some metric of actional distance between the kinds of
motor activities that different models afford. The balancing of these constraints
may prove to be difficult. But again, such approaches are not without precedent
in design.6
5 For example, see Alexander ([1964] 1997).
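One naive form such a metric of actional distance might take, offered here only as an assumption of my own and not as the approach proposed by Wild, Johnson and Johnson (2004), is to characterize each interaction model by a small vector of hand-estimated motor-activity features and to take a weighted distance between those vectors. The features, weights and values below are invented for the sake of illustration.

import math

# Hand-estimated motor-activity features, each in the range [0, 1].
FEATURES = ("continuous_control", "bimanual_symmetry",
            "gesture_rate", "attentional_demand")

# Invented characterizations of the two usage examples; not measurements.
surfing_the_fractal_wave = {"continuous_control": 0.9, "bimanual_symmetry": 0.3,
                            "gesture_rate": 0.7, "attentional_demand": 0.8}
pushing_the_envelope = {"continuous_control": 0.4, "bimanual_symmetry": 0.8,
                        "gesture_rate": 0.3, "attentional_demand": 0.5}

# Weights reflect how strongly each feature is assumed to interfere with
# multitasking; the values here are arbitrary.
WEIGHTS = {"continuous_control": 1.0, "bimanual_symmetry": 0.5,
           "gesture_rate": 1.0, "attentional_demand": 1.5}

def actional_distance(a, b):
    # Weighted Euclidean distance over the feature vectors.
    return math.sqrt(sum(WEIGHTS[f] * (a[f] - b[f]) ** 2 for f in FEATURES))

print(actional_distance(surfing_the_fractal_wave, pushing_the_envelope))

A small distance would suggest that two models could be interleaved without excessive interference between their motor demands; a large distance would flag the kind of incompatibility described above.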
With or without these higher order design methods, the products of design
will invariably afford opportunities for action that were at no point factored into
the design process. This has certainly been the case with conventional acoustic
instruments—and is perhaps definitive of so-called “extended” techniques—and
there’s no reason to assume that the situation should be any different for digital
musical instruments. It’s been interesting for me to note that, the more I play
with the “surfing the fractal wave” model, the more I’m able to isolate certain
quirks and glitches in the system.7 These kinds of discoveries constitute an
important aspect of the learning process; not just because they can be
assimilated into the accumulating motor and sonic vocabulary, but because in
certain cases they can lead to entirely new avenues of investigation—avenues
that would have remained closed had the system been insulated from random
environmental inputs in the first instance. There is a stochastic element in
enactive process, and this element is accounted for in the contingencies of
environmental dynamics. The glitch, then, is simply folded into the enactive
model of interaction. Its appearance or suppression in performance becomes a
matter for human intentionality. Either choice will lead to the appropriate
refinement of actional dispositions.
6 For example, see Wild, Johnson and Johnson (2004).
7 It's also interesting to note that, at least to this point, the "pushing the envelope"
model has yielded no such interesting anomalies.
5 Groundlessness
Whatever comes into being dependent on another
Is not identical to that thing.
Nor is it different from it.
Therefore it is neither nonexistent in time nor permanent.
— Nagarjuna, Mūlamadhyamakakārikā XVIII:10
The important thing is to understand life, each living individuality, not as a form,
or a development of form, but as a complex relation between differential
velocities, between deceleration and acceleration of particles.
— Gilles Deleuze, Spinoza: Practical Philosophy
The main thing is that you forget yourself.
— Barbara McClintock
The structure of our language typically leads us to characterizations of interaction
that focus on one side or the other of the interactional loop. “Humans use
technologies,” “technologies determine humans,” and so on. These are, of
course, the unavoidable products of a subject/object syntax, and my writing in
this essay has not been immune to the lopsided characterizations of interaction
that such products embody. But despite the inevitable linguistic constraints, I’ve
sought to describe the inherent circularity of the continuous interactional
unfolding that is definitive of enactive process; a process that is not concerned
with subjects and objects, but with relations, linkages, heterogeneity, and the
dynamic momentum of the emergent system that arises in the relations and
linkages between heterogeneous elements.
One of the more radical outcomes of Varela, Thompson and Rosch’s outline of
an enactive cognitive science is the model of subjectivity that necessarily follows
from enactive process. It’s precisely because enactive process concerns “the
processual transformation of the past into the future through the intermediary of
transitional forms that in themselves have no permanent substance (Varela,
Thompson, and Rosch 1991:116),” that enactive theory necessarily implies a
“groundless” or “selfless” self—i.e., a self with “no permanent substance;” a
“subjectless subjectivity (Deleuze and Guattari 1987).” This is the non-self that
appears in the experience of flow—in an unselfconscious, active and embodied
participation in the dynamical unfolding of real time and space—and it’s the same
non-self that vanishes the moment that attention is turned inward, and
perception is geared towards abstract contemplation of the objectness of things
in the world.
In this essay, I have not dealt with the epistemological or ontological
implications of an enactive approach to design in any significant manner. But to
my mind (however that may now be defined), it’s precisely these implications
that are most critical when thinking about design, or when implementing
implementations. If an implementation might afford the potential to undermine
essentialist ways of being—i.e., if the performative way of being that it brings
about is concerned with the unfolding of relations rather than the ordering of
things—then I would say that the implementation in question has utility, and that
the epistemological and ontological qualities that it embodies necessarily imply an
ethics.
At various points throughout the essay, I’ve invoked Heidegger’s use of the
term “equipment.” In Heidegger’s terminology, an equipment is a tool that
presents itself to human perception and intentionality as something-in-order-to.
That is, it affords a particular utility. It would be easy enough to arrive at the
conclusion that, in designing a digital musical instrument, we are designing
something-in-order-to-perform-music. While the statement is obviously true, it is
not, I think, a conclusion. An enactive approach to digital musical instrument
design would necessarily account for the realizational potential of the instrument;
i.e., a potential which would lead to an incremental unfolding of relationality, and
which at the same time would serve as the measure of the instrument’s
resistance. The concern for design, then, is directed towards designing an
encounter. Or, it’s directed towards designing something-in-order-to-not-be-
some-thing. In this respect, computers have a significant potential.
Bibliography
Agre, P. 1995. “Computation and embodied agency.” Informatica 19 (4):527-535.
———. 1996. “Computational research on interaction and agency.” In P. Agre and S.
Rosenschein, ed. Computational theories of interaction and agency. Cambridge, MA: MIT
Press, pp.1-52.
———. 1997. Computation and human experience. Cambridge, UK: Cambridge University
Press.
———. 2002. “The practical logic of computer work.” In M. Scheutz, ed. Computationalism: New
directions. Cambridge, MA: MIT Press, pp.130-142.
Alexander, C. [1964] 1997. Notes on the synthesis of form. Cambridge, MA: Harvard
University Press.
Anderson, C. 2005. “Dynamic networks of sonic interactions: An interview with Agostino Di
Scipio.” Computer Music Journal 29 (3):11-28.
Arbib, M., and J.-S. Liaw. 1996. “Sensorimotor transformations in the worlds of frogs and
robots.” In P. Agre and S. Rosenschein, ed. Computational theories of interaction and
agency. Cambridge, MA: MIT Press, pp.53-79.
Ashby, W. R. [1952] 1960. Design for a brain: The origin of adaptive behaviour. New
York: John Wiley & Sons.
———. [1956] 1965. An introduction to cybernetics. London: William Clowes and Sons.
Bahn, C., T. Hahn, and D. Trueman. 2001. “Physicality and feedback: A focus on the body
in the performance of electronic music.” In Proceedings of the 2001 International
Computer Music Conference. San Francisco, CA: International Computer Music
Association, pp.44-51.
Bailey, D. 1992. Improvisation: Its nature and practice in music. New York: Da Capo
Press.
Barbaras, R. 1999. “The movement of the living as the originary foundation of perceptual
intentionality.” In B. Pachoud, J. Petitot, J.-M. Roy and F. Varela, ed. Naturalizing
phenomenology: Issues in contemporary phenomenology and cognitive science. Stanford,
CA: Stanford University Press, pp.525-538.
Bateson, G. 1980. Mind and nature: A necessary unity. Toronto; New York: Bantam
Books.
Beer, R. D. 1990. Intelligence as adaptive behavior: an experiment in computational
neuroethology. San Diego, CA: Academic Press Professional, Inc.
———. 1996. “A dynamical systems perspective on agent-environment interaction.” In P.
Agre and S. Rosenschein, ed. Computational theories of interaction and agency.
Cambridge, MA: MIT Press, pp.173-216.
———. 1997. “The dynamics of adaptive behavior: A research program.” Robotics and
Autonomous Systems 20:257-289.
———. 2000. “Dynamical approaches to cognitive science.” Trends in Cognitive Sciences 4
(3):91-99.
———. 2004. “Autopoiesis and cognition in the game of life.” Artificial Life 10:309-326.
Blum, T. 1979. “Herbert Brün: Project sawdust.” Computer Music Journal 3 (1):6-7.
Bongers, B. 2000. “Physical interfaces in the electronic arts: Interaction theory and
interfacing techniques for real-time performance.” In M. Wanderley and M. Battier, ed.
Trends in gestural control of music. Paris: IRCAM, pp.41-70.
Borgo, D. 2005. Sync or swarm: Improvising music in a complex age. New York:
Continuum.
Bourdieu, P. 1977. Outline of a theory of practice. R. Nice, trans. Cambridge, U.K.:
Cambridge University Press.
———. 1990. The logic of practice. Cambridge, U.K.: Polity.
———. 1991. Language and symbolic power. G. Raymond and M. Adamson, trans.
Cambridge, MA: Harvard University Press.
———. 1993. The field of cultural production: Essays on art and literature. New York:
Columbia University Press.
Brooks, R. 1991. “Intelligence without representation.” Artificial Intelligence 47:139-159.
———. 1991. “New approaches to robotics.” Science 253 (5025):1227-1232.
———. 1992. “Artificial life and real robots.” In F. Varela and P. Bourgine, ed. Toward a
practice of autonomous systems: Proceedings of the first European conference on artificial
life. Cambridge, MA: MIT Press, pp.3-10.
Brün, H. 1969. “Infraudibles.” In H. Von Foerster and J. Beauchamp, ed. Music by
computer. New York: John Wiley and Sons.
Bruner, J. 1987. Actual minds, possible worlds. Cambridge, MA: Harvard University Press.
———. 1991. Acts of meaning. Cambridge, MA: Harvard University Press.
Burzik, A. 2003. “Go with the flow.” The Strad:714-718.
Buxton, W. 1986. “There’s more to interaction than meets the eye: Some issues in manual
input.” In D. Norman and S. W. Draper, ed. User centered system design: New
perspectives on human-computer interaction. Hillsdale, NJ: Lawrence Erlbaum Associates,
pp.319-337.
Capra, F. 1996. The web of life: A new scientific understanding of living systems. New
York: Anchor Books.
Casati, R. 1999. “Formal structures in the phenomenology of motion.” In B. Pachoud, J.
Petitot, J.-M. Roy and F. Varela, ed. Naturalizing phenomenology: Issues in contemporary
phenomenology and cognitive science. Stanford, CA: Stanford University Press, pp.372-
384.
Cascone, K. 2000. “The aesthetics of failure: ‘Post-digital’ tendencies in contemporary
computer music.” Computer Music Journal 24 (4):12-18.
———. 2002. “Laptop music - counterfeiting aura in the age of infinite reproduction.”
Parachute (107):56.
———. 2003. “Grain, sequence, system: Three levels of reception in the performance of
laptop music.” Contemporary Music Review 22 (4):101-104.
———. 2003. “Introduction.” Contemporary Music Review 22 (4):1-2.
Casserley, L. 2001. “Plus ça change: Journeys, instruments and networks, 1966-2000.”
Leonardo Music Journal 11:43-49.
Chabot, X. 1994. “To listen and to see: Making and using electronic instruments.”
Leonardo Music Journal 3:11-16.
Chadabe, J. 1997. Electric sound: The past and promise of electronic music. Upper Saddle
River, NJ: Prentice Hall.
———. 2002. “The limitations of mapping as a structural descriptive in electronic
instruments.” In E. Brazil, ed. Proceedings of the 2002 New Interfaces for Musical
Expression Conference. pp.197-201.
Chiel, H. J., and R. D. Beer. 1997. “The brain has a body: adaptive behavior emerges from
interactions of nervous system, body and environment.” Trends in Neurosciences 20
(12):553-557.
Choi, I. 1995. “A manifold interface for a high dimensional control space.” In Proceedings
of the 1995 International Computer Music Conference. San Francisco, CA: International
Computer Music Association, pp.385-392.
———. 2003. “A component model of gestural primitive throughput.” In Proceedings of
the 2003 New Interfaces for Musical Expression Conference. pp.201-204.
Church, A. 1932. “A set of postulates for the foundation of logic.” Annals of Mathematics,
second series 33:346-366.
———. 1936. “An unsolvable problem of elementary number theory.” American Journal of
Mathematics 58:345-363.
Clancey, W. 1997. Situated cognition: On human knowledge and computer
representations. Cambridge, U.K.: Cambridge University Press.
Clark, A. 1995. “Moving minds: Situating content in the service of real-time success.”
Philosophical Perspectives 9:89-104.
———. 1997. Being there: Putting brain, body and world together again. Cambridge, MA:
MIT Press.
———. 2003. Natural-born cyborgs: Minds, technologies, and the future of human
intelligence. Oxford: Oxford University Press.
Clark, A., and D. Chalmers. 1998. “The extended mind.” Analysis 58 (1):7-19.
Clarke, E. F. 1993. “Generativity, mimesis and the human body in music performance.”
Contemporary Music Review 9:207-220.
Collins, N. 2003. “Generative music and laptop performance.” Contemporary Music Review
22 (4):67-79.
Cook, P. 2001. “Principles for designing computer music controllers.” In Proceedings of
the 2001 New Interfaces for Musical Expression Conference.
Cook, P. R. 2004. “Remutualizing the musical instrument: Co-design of synthesis
algorithms and controllers.” Journal of New Music Research 33 (3):315-320.
Csikszentmihalyi, M. 1991. Flow: The psychology of optimal experience. New York: Harper
Perennial.
Cull, J. 2000. “The circularity of living systems: The movement and direction of behavior.”
Journal of Applied Systems Studies 1 (1):51-65.
Damasio, A. R. 1994. Descartes’ error: Emotion, reason and the human brain. New York:
Putnam.
De Certeau, M. 1984. The practice of everyday life. Berkeley, CA: University of California
Press.
Dedieu, E., and E. Mazer. 1992. “An approach to sensorimotor relevance.” In F. Varela and
P. Bourgine, ed. Toward a practice of autonomous systems: Proceedings of the first
European conference on artificial life. Cambridge, MA: MIT Press, pp.88-95.
Deleuze, G. 1988. Spinoza: Practical philosophy. R. Hurley, trans. San Francisco: City
Lights.
———. 1990. Expressionism in philosophy: Spinoza. M. Joughin, trans. New York: Zone
Books.
———. 1991. Bergsonism. H. Tomlinson and B. Habberjam, trans. New York: Zone Books.
———. [1968] 1994. Difference and repetition. P. Patton, trans. New York: Columbia
University Press.
Deleuze, G., and F. Guattari. 1983. Anti-Oedipus: Capitalism and schizophrenia.
Minneapolis: University of Minnesota Press.
———. 1987. A thousand plateaus: Capitalism and schizophrenia. B. Massumi, trans.
Minneapolis: University of Minnesota Press.
Di Scipio, A. 1994. “Formal processes of timbre composition: Challenging the dualistic
paradigm of computer music.” In Proceedings of the 1994 International Computer Music
Conference. San Francisco, CA: International Computer Music Association, pp.202-208.
———. 1997. “Interpreting music technology: From Heidegger to subversive
rationalization.” Sonus 18 (1):63-80.
———. 2000. “Ecological modeling of textural sound events by iterated nonlinear
functions.” In Proceedings of the 2000 Colloquium on Musical Informatics. pp.33-36.
Dietrich, E. 1990. “Computationalism.” Social Epistemology 4:135-154.
Dourish, P. 1999. Embodied interaction: Exploring the foundations of a new approach to
HCI. http://www.ics.uci.edu/~jpd/publications/misc/embodied.pdf.
———. 2001. Where the action is: Foundations of embodied interaction. Cambridge, MA:
MIT Press.
Dreyfus, H. 1991. Being-in-the-world: A commentary on Heidegger’s ‘Being and Time,
Division I’. Cambridge, MA: MIT Press.
———. 1992. What computers still can’t do: A critique of artificial reason. Cambridge, MA:
MIT Press.
———. 1993. “Heidegger’s critique of the Husserl/Searle account of intentionality.” Social
Research 60 (1):17-38.
———. 1996. The current relevance of Merleau-Ponty’s phenomenology of embodiment.
Electronic Journal of Analytic Philosophy (4),
http://ejap.louisiana.edu/EJAP/1996.spring/dreyfus.1996.spring.html.
Emmerson, S. 2000. “’Losing touch?’: The human performer and electronics.” In S.
Emmerson, ed. Music, Electronic Media and Culture. Aldershot: Ashgate Publishing,
pp.194-216.
Evens, A. 2005. Sound ideas: music, machines, and experience. Minneapolis: University
of Minnesota Press.
Feenberg, A. 1991. Critical theory of technology. Oxford: Oxford University Press.
———. 1999. Questioning technology. London: Routledge.
———. 2000. “From essentialism to constructivism: Philosophy of technology at the
crossroads.” In E. Higgs, A. Light and D. Strong, ed. Technology and the good life?
Chicago: University of Chicago Press, pp.294-315.
———. 2002. Transforming technology. Oxford: Oxford University Press.
Fishkin, K., A. Gujar, B. Harrison, T. Moran, and R. Want. 2000. “Embodied user interfaces
for really direct manipulation.” Communications of the ACM 43 (9):75-80.
Fitzmaurice, G., and W. Buxton. 1997. “An empirical evaluation of graspable user
interfaces: towards specialized, space-multiplexed input.” In Proceedings of the 1997
SIGCHI Conference on Human Factors in Computing Systems. pp.43-50.
Fodor, J. A. 1983. The modularity of mind. Cambridge, MA: MIT Press.
Gallagher, S. 2003. How the body shapes the mind. Oxford: Oxford University Press.
Garnett, G. E. 2001. “The aesthetics of interactive computer music.” Computer Music
Journal 25 (1):21-33.
Gibson, J. J. 1977. “The theory of affordances.” In R. E. Shaw and J. Bransford, ed.
Perceiving, acting, and knowing: Toward an ecological psychology. Hillsdale, NJ: Lawrence
Erlbaum Associates.
———. 1979. The ecological approach to visual perception. New York: Houghton-Mifflin.
Gillespie, B. 1999. “Haptics.” In P. Cook, ed. Music, cognition and computerized sound.
Cambridge, MA: MIT Press, pp.247-260.
———. 1999. “Haptics in manipulation.” In P. Cook, ed. Music, cognition and computerized
sound. Cambridge, MA: MIT Press, pp.261-276.
Giunti, M. 1997. Computation, dynamics and cognition. Oxford: Oxford University Press.
Goudeseune, C. 2001. Composing with parameters for synthetic instruments, University of
Illinois at Urbana-Champaign, Urbana-Champaign, IL.
Greenfield, A. 2006. Everyware: The dawning age of ubiquitous computing. Berkeley, CA:
New Riders Press.
Guiard, Y. 1987. “Asymmetric division of labor in human skilled bimanual action: The
kinematic chain as a model.” Journal of Motor Behavior 19 (4):486-517.
Gunther, E., G. Davenport, and S. O’Modhrain. 2002. “Cutaneous grooves: Composing for
the sense of touch.” In E. Brazil, ed. Proceedings of the 2002 New Interfaces for Musical
Expression Conference. pp.37-43.
Hamman, M. 1997. “Interaction as composition: Toward the paralogical in computer
music.” Sonus 18 (1):26-44.
———. 1999. “From symbol to semiotic: Representation, signification, and the composition
of music interaction.” Journal of New Music Research 28 (2):90-104.
———. 1999. “Structure as performance: Cognitive musicology and the objectification of
procedure.” In J. Tabor, ed. Otto Laske: Navigating new musical horizons. Westport:
Greenwood Press.
———. 2000. “Priming computer-assisted music composition through design of
human/computer interaction.” In N. Mastorakis, ed. Mathematics and computers in
modern science: Acoustics and music, biology and chemistry, business and economics.
Athens: World Scientific Engineering Society.
———. 2000. “From technical to technological: Interpreting technology through
composition.” In Proceedings of the 2000 Colloquium on Musical Informatics.
———. 2002. The technical as aesthetic: Technology, composition, interpretation.
http://www.shout.net/~mhamman/papers/montpelier_2000.pdf.
———. 2002. “From technical to technological: The imperative of technology in
experimental music composition.” Perspectives of New Music 40 (1):92-120.
Haraway, D. 1991. “A cyborg manifesto: Science, technology, and socialist-feminism in
the late twentieth century.” In Simians, cyborgs, and women: The reinvention of nature.
New York: Routledge, pp.149-181.
Haugeland, J. 1985. Artificial intelligence: The very idea. Cambridge, MA: MIT Press.
———. 2002. “Authentic intentionality.” In M. Scheutz, ed. Computationalism: new
directions. Cambridge, MA: MIT Press, pp.159-174.
Hayles, N. K. 1999. How we became posthuman: Virtual bodies in cybernetics, literature
and informatics. Chicago: University of Chicago Press.
Heidegger, M. 1988. Basic problems of phenomenology. A. Hofstadter, trans.
Bloomington: Indiana University Press.
———. [1927] 1962. Being and time. J. Macquarrie and E. Robinson, trans. London: SCM
Press.
———. [1949] 1977. “The question concerning technology.” In The question concerning
technology and other essays. New York: Harper and Row, pp.3-35.
Hendriks-Jansen, H. 1996. Catching ourselves in the act: Situated activity, interactive
emergence, evolution, and human thought. Cambridge, MA: MIT Press.
Hinckley, K., R. Pausch, D. Proffitt, J. Patten, and N. Kassell. 1997. “Cooperative bimanual
action.” In Proceedings of the 1997 SIGCHI Conference on Human Factors in Computing
Systems. pp.27-34.
Holland, J. 1992. Adaptation in natural and artificial systems: An introductory analysis
with applications to biology, control, and artificial intelligence. Cambridge, MA: MIT Press.
———. 1995. Hidden order: How adaptation builds complexity. Cambridge, MA: Perseus
Book.
———. 1998. Emergence: From chaos to order. Cambridge, MA: Perseus Books.
Honing, H. 2003. Some comments on the relation between music and motion. Music
Theory Online 9 (1), http://smt.ucsb.edu/mto/issues/mto.03.9.1/mto.03.9.1.honing.html.
Horkheimer, M., and T. W. Adorno. 1972. Dialectic of enlightenment. J. Cumming, trans.
New York: Continuum.
Horswill, I. 1996. “Analysis of adaptation and environment.” In P. Agre and S.
Rosenschein, ed. Computational theories of interaction and agency. Cambridge, MA: MIT
Press, pp.367-396.
Hunt, A., M. Wanderley, and R. Kirk. 2000. “Towards a model for instrumental mapping in
expert musical interaction.” In Proceedings of the 2000 International Computer Music
Conference. San Francisco, CA: International Computer Music Association, pp.209-212.
Hunt, A., M. Wanderley, and M. Paradis. 2003. “The importance of parameter mapping in
electronic instrument design.” Journal of New Music Research 32 (4):429-440.
Hutchins, E. 1995. Cognition in the wild. Cambridge, MA: MIT Press.
Iazzetta, F. 1996. “Formalization of computer music interaction through a semiotic
approach.” Journal of New Music Research 25 (3):212-230.
———. 2000. “Meaning in musical gesture.” In M. Wanderley and M. Battier, ed. Trends in
gestural control of music. Paris: IRCAM.
Ihde, D. 1983. Existential technics. Albany, NY: State University of New York Press.
———. 1990. Technology and the lifeworld: From garden to earth. Bloomington, IN:
Indiana University Press.
———. 1991. Instrumental realism: The interface between philosophy of science and
philosophy of technology. Bloomington, IN: Indiana University Press.
———. 1993. Philosophy of technology: An introduction. New York: Paragon House.
———. 2002. Bodies in technology. Minneapolis: University of Minnesota Press.
Jackendoff, R. 1987. Consciousness and the computational mind. Cambridge, MA: MIT
Press.
Jacob, R., L. Sibert, D. Mcfarlane, and M. Mullen. 1994. “Integrality and separability of
input devices.” ACM Transactions on Computer-Human Interaction (1):3-26.
Jaeger, T. 2003. “The (anti-)laptop aesthetic.” Contemporary Music Review 22 (4).
Johnson, M. 1987. The body in the mind: The bodily basis of meaning, imagination, and
reason. Chicago: University of Chicago Press.
Jordà, S. 2002. “Improvising with computers: A personal survey (1989-2001).” Journal of
New Music Research 31 (1):1-10.
———. 2002. “FMOL: Toward user-friendly, sophisticated new musical instruments.”
Computer Music Journal 26 (3):23-39.
———. 2003. “Interactive music systems for everyone: Exploring visual feedback as a way
for creating more intuitive, efficient and learnable instruments.” In Proceedings of the
2003 Stockholm Music Acoustics Conference.
Jordan, S. 2003. “The embodiment of intentionality.” In W. Tschacher, ed. Dynamical
systems approach to cognition: concepts and empirical paradigms based on self-
organization, embodiment, and coordination dynamics. Cambridge, MA: MIT Press,
pp.201-228.
Kabbash, P., B. Buxton, and A. Sellen. 1994. “Two handed input in a compound task.” In
Proceedings of the 1994 SIGCHI Conference on Human Factors in Computing Systems.
pp.417-423.
Karmiloff-Smith, A. 1992. Beyond modularity: A developmental perspective on cognitive
science. Cambridge, MA: MIT Press.
Kartadinata, S. 2003. “The gluiph: A nucleus for integrated instruments.” In Proceedings
of the 2003 New Interfaces for Musical Expression Conference. pp.180-183.
Kauffman, S. 1993. The origins of order: Self-organization and selection in evolution.
Oxford: Oxford University Press.
Krefeld, V. 1990. “The Hand in the web: An interview with Michel Waisvisz.” Computer
Music Journal 7 (7):43-55.
Lakoff, G., and M. Johnson. 1980. Metaphors we live by. Chicago: University of Chicago
Press.
———. 1999. Philosophy in the flesh: The embodied mind and its challenge to western
thought. New York: Basic Books.
Lakoff, G., and R. Núñez. 2000. Where mathematics comes from: How the embodied
mind brings mathematics into being. New York: Basic Books.
Lansky, P. 1990. “A view from the bus: When machines make music.” Perspectives of
New Music 28 (2):102-110.
Laske, O. E. 1991. “Toward an epistemology of composition.” Interface (20):235-269.
Latour, B. 1993. We have never been modern. Hemel Hempstead: Harvester Wheatsheaf.
Leppert, R. 1993. The sight of sound: Music, representation, and the history of the body.
Berkeley, CA: University of California Press.
Lidov, D. 1987. “Mind and body in music.” Semiotica 1 (3):69-97.
Loren, L., E. Dietrich, C. Morrison, and J. Beskin. 1998. “What it means to be ‘situated’.”
Cybernetics and Systems 29:751-777.
Lyons, D. M., and A. J. Hendriks. 1996. “Exploiting patterns of interaction to achieve
reactive behavior.” In P. Agre and S. Rosenschein, ed. Computational theories of interaction
and agency. Cambridge, MA: MIT Press, pp.483-514.
Maes, P. 1992. “Learning behavior networks from experience.” In F. Varela and P.
Bourgine, ed. Toward a practice of autonomous systems: Proceedings of the first European
conference on artificial life. Cambridge, MA: MIT Press, pp.48-57.
Maturana, H. R., and F. J. Varela. 1980. Autopoiesis and cognition: The realization of the
living. Dordrecht, Holland: D. Reidel Publishing Company.
———. 1987. The tree of knowledge: The biological roots of human understanding.
Boston: New Science Library.
Merleau-Ponty, M. 1968. The visible and the invisible. A. Lingis, trans. Evanston, IL:
Northwestern University Press.
———. [1945] 2004. The phenomenology of perception. C. Smith, trans. London:
Routledge.
Minsky, M. 1986. The society of mind. New York: Simon and Schuster.
Mulder, A. 1999. Radical user interfaces for real-time musical control, University of York.
Mumma, G. 1967. “Creative aspects of live electronic music technology.” In Proceedings
of the 33rd National Convention of the American Audio Engineering Society. New York:
American Audio Engineering Society.
———. 1974. “Notes on cybersonics: Artificial intelligence in live musical performance.” In.
London: Guildhall Music and Drama Annual.
———. 1974. “Live electronic music.” In J. Appleton and R. C. Perera, ed. The
Development and Practice of Electronic Music. Englewood Cliffs, NJ: Prentice Hall, pp.286-
335.
Nardi, B. 1996. Context and consciousness: Activity theory and human-computer
interaction. Cambridge, MA: MIT Press.
Nardi, B., and V. O’Day. 1999. Information ecologies: Using technology with heart.
Cambridge, MA: MIT Press.
Ng, K. 2002. “Interactive gesture music performance interface.” Paper read at Proceedings
of the 2002 New Interfaces for Musical Expression Conference.
Noë, A. 2004. Action in perception. Cambridge, MA: MIT Press.
Norman, D. 1986. “Cognitive engineering.” In D. Norman and S. Draper, ed. User centered
system design: New perspectives on human-computer interaction. Hillsdale, NJ: Lawrence
Erlbaum Associates.
———. 1999. “Affordances, conventions and design.” Interactions 6 (3):38-43.
———. 1999. The invisible computer: Why good products can fail, the personal computer
is so complex, and information appliances are the solution. Cambridge, MA: MIT Press.
———. 2002. The design of everyday things. New York: Basic Books.
Norman, D., and S. Draper, eds. 1986. User centered system design: New perspectives on
human-computer interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.
Norman, D., J. D. Holland, and E. L. Hutchins. 1986. “Direct manipulation interfaces.” In
D. Norman and S. Draper, ed. User centered system design: New perspectives on
human-computer interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.
Ostertag, B. 2002. “Human bodies, computer music.” Leonardo Music Journal 12:11-14.
Pacherie, E. 1999. “’Leibhaftigkeit’ and representational theories of perception.” In B.
Pachoud, J. Petitot, J.-M. Roy and F. Varela, ed. Naturalizing phenomenology: Issues in
contemporary phenomenology and cognitive science. Stanford, CA: Stanford University
Press, pp.148-160.
Pachoud, B. 1999. “The teleological dimension of perceptual and motor intentionality.” In
B. Pachoud, J. Petitot, J.-M. Roy and F. Varela, ed. Naturalizing phenomenology: Issues in
contemporary phenomenology and cognitive science. Stanford, CA: Stanford University
Press, pp.196-219.
Pachoud, B., J. Petitot, J.-M. Roy, and F. Varela. 1999. “Beyond the gap: An introduction
to naturalizing phenomenology.” In B. Pachoud, J. Petitot, J.-M. Roy and F. Varela, ed.
Naturalizing phenomenology: Issues in contemporary phenomenology and cognitive
science. Stanford, CA: Stanford University Press, pp.1-80.
Paradiso, J. A. 1997. “Electronic music: New ways to play.” IEEE Spectrum 34 (12):18-30.
———. 2003. “Current trends in electronic music interfaces.” Journal of New Music
Research 32 (4):345-349.
Pask, G. 1962. An approach to cybernetics. New York, NY: Harper.
Pask, G., and S. Curran. 1982. Micro man: Computers and the evolution of consciousness.
New York: Macmillan.
Perkis, T. 1996. “Bringing digital music to life.” Computer Music Journal 20 (2):28-32.
Pfeifer, R., and P. Verschure. 1992. “Distributed adaptive control: A paradigm for
designing autonomous agents.” In F. Varela and P. Bourgine, ed. Toward a practice of
autonomous systems: Proceedings of the first European conference on artificial life.
Cambridge, MA: MIT Press, pp.21-30.
Pinker, S. 1997. How the mind works. New York: W. W. Norton.
Prem, E. 1997. “Epistemic autonomy in models of living systems.” In P. Husbands and I.
Harvey, ed. Fourth European conference on artificial life. pp.2-9.
Preston, E. F. 1988. Representational and non-representational intentionality: Husserl,
Heidegger, and artificial intelligence. Ph.D., Philosophy, Boston University, Boston.
———. 1993. “Heidegger and artificial intelligence.” Philosophy and Phenomenological
Research 53 (1):43-69.
Prévost, E. 1995. No sound is innocent. Matching Tye, Essex: Copula.
Reddell, T. 2003. “Laptopia: The spatial poetics of networked laptop performance.”
Contemporary Music Review 22 (4):11-22.
Riethmüller, A. 1994. “The matter of music is sound and body-motion.” In H. U.
Gumbrecht and K. L. Pfeiffer, ed. Materialities of communication. Stanford, CA: Stanford
University Press, pp.148-156.
Roads, C. 1985. “Improvisation with George Lewis.” In. Los Altos, CA: William Kaufmann
Inc.
———. 1989. “Active music representations.” In Proceedings of the 1989 International
Computer Music Conference. San Francisco, CA: International Computer Music
Association, pp.257-259.
Rosenschein, S., and L. P. Kaelbling. 1996. “A situated view of representation and
control.” In P. Agre and S. Rosenschein, ed. Computational theories of interaction and
agency. Cambridge, MA: MIT Press, pp.541-596.
Rowe, R. 1993. Interactive music systems: Machine listening and composing. Cambridge,
MA: MIT Press.
———. 2001. Machine musicianship. Cambridge, MA: MIT Press.
Roy, J.-M. 1999. “Saving intentional phenomena: Intentionality, representation and
symbol.” In B. Pachoud, J. Petitot, J.-M. Roy and F. Varela, ed. Naturalizing
phenomenology: Issues in contemporary phenomenology and cognitive science. Stanford,
CA: Stanford University Press, pp.111-147.
Ryan, J. 1991. “Some remarks on musical instrument design at STEIM.” Contemporary
Music Review 6 (1):3-17.
———. 1992. “Effort and expression.” In A. Strange, ed. Proceedings of the 1992
International Computer Music Conference. San Francisco, CA: International Computer
Music Association, pp.414-416.
Sapir, S. 2002. “Gestural control of digital audio environments.” Journal of New Music
Research 31 (2):119-129.
Scheutz, M. 2002. “Computationalism - The next generation.” In Computationalism: New
directions. Cambridge, MA: MIT Press, pp.1-21.
Schloss, W. A. 2003. “Using contemporary technology in live performance: The dilemma of
the performer.” Journal of New Music Research 32 (3):239-242.
Shannon, C. 1949. The mathematical theory of communication. Urbana, IL: University of
Illinois Press.
Sheets-Johnstone, M. 2000. The primacy of movement. Philadelphia: John Benjamins.
Shove, P. 1995. “Musical motion and performance: Theoretical and empirical
perspectives.” In J. Rink, ed. Cambridge, UK: Cambridge University Press, pp.55-83.
Simon, H. A. 2001. The sciences of the artificial. Cambridge, MA: MIT Press.
Small, C. 1998. Musicking: The meanings of performing and listening. Hanover, NH:
Wesleyan University Press.
Smith, B. C. 1996. On the origin of objects. Cambridge, MA: MIT Press.
———. 2002. “The foundations of computing.” In M. Scheutz, ed. Computationalism: New
directions. Cambridge, MA: MIT Press, pp.23-58.
Smith, D. W. 1999. “Intentionality naturalized?” In B. Pachoud, J. Petitot, J.-M. Roy and F.
Varela, ed. Naturalizing phenomenology: Issues in contemporary phenomenology and
cognitive science. Stanford, CA: Stanford University Press, pp.83-110.
Smyth, T., and J. Smith. 2002. “Creating sustained tones with the cicada’s rapid
sequential buckling mechanism.” Paper read at Proceedings of the 2002 New Interfaces for
Musical Expression Conference.
Spiegel, L. 1992. “An alternative to a standard taxonomy for electronics and computer
instruments.” Computer Music Journal 16 (3).
Stein, L. A. 1998. “What we’ve swept under the rug: Radically rethinking CS1.” Computer
Science Education 8 (2):118-129.
———. 1999. “Challenging the computational metaphor: Implications for how we think.”
Cybernetics and Systems 30 (6):473-507.
Stuart, C. 2003. “The object of performance: Aural performativity in contemporary laptop
music.” Contemporary Music Review 22 (4):59-65.
Suchman, L. 1987. Plans and situated actions: The problem of human-machine
communication. Cambridge, U.K.: Cambridge University Press.
Sudnow, D. 2001. Ways of the hand: A rewritten account. Cambridge, MA: MIT Press.
Tanaka, A., and R. B. Knapp. 2002. “Multimodal interaction in music using the
electromyogram and relative position sensing.” In E. Brazil, ed. Proceedings of the 2002
New Interfaces for Musical Expression Conference. pp.43-48.
Temprado, J. J. 2003. “Cognition in action: The interplay of attention and bimanual
coordination dynamics.” In W. Tschacher, ed. Dynamical systems approach to cognition:
Concepts and empirical paradigms based on self-organization, embodiment, and
coordination dynamics. Cambridge, MA: MIT Press, pp.93-132.
Thelen, E. 1994. A dynamic systems approach to the development of cognition and action.
Cambridge, MA: MIT Press.
———. 1995. “Time scale dynamics and the development of an embodied cognition.” In R.
Port and T. Van Gelder, ed. Mind as motion: Explorations in the dynamics of cognition.
Cambridge, MA: MIT Press, pp.69-99.
———. 2003. “Grounded in the world: Developmental origins of the embodied mind.” In
W. Tschacher, ed. Dynamical systems approach to cognition: Concepts and empirical
paradigms based on self-organization, embodiment and coordination dynamics.
Singapore: World Scientific Publishing Company, pp.17-44.
Thompson, E., A. Noë, and L. Pessoa. 1999. “Perceptual completion: A case study in
phenomenology and cognitive science.” In B. Pachoud, J. Petitot, J.-M. Roy and F. Varela,
ed. Naturalizing phenomenology: Issues in contemporary phenomenology and cognitive
science. Stanford, CA: Stanford University Press, pp.161-195.
Todd, N. P. M. 1992. “The dynamics of dynamics: A model of musical expression.” Journal
of the Acoustical Society of America 91 (6):3540-3550.
———. 1999. “Motion in music: A neurobiological perspective.” Music Perception 17
(1):115-126.
Todes, S. 2001. Body and world. Cambridge, MA: MIT Press.
Trueman, D. 1999. Reinventing the violin. Ph.D., Music, Princeton University, Princeton,
NJ.
———. 2000. “BoSSA: The deconstructed violin reconstructed.” Journal of New Music
Research 31 (2):119-129.
Trueman, D., C. Bahn, and P. Cook. 2000. “Alternative voices for electronic sound:
Spherical speakers and sensor-speaker arrays (SenSAs).” In Proceedings of the 2000
International Computer Music Conference. San Francisco, CA: International Computer
Music Association, pp.248-251.
Turing, A. M. 1936. “On computable numbers, with an application to the
Entscheidungsproblem.” Proceedings of the London Mathematical Society, Series 2
42:230-265.
———. 1950. “Computing machinery and intelligence.” Mind LIX (236):433-460.
Turkle, S. 1984. The second self: Computers and the human spirit. New York: Simon and
Schuster.
Turner, T. 2003. “The resonance of the cubicle: Laptop performance in post-digital
musics.” Contemporary Music Review 22 (4):81-92.
Ullmer, B., and H. Ishii. 2001. “Emerging frameworks for tangible user interfaces.” In J. M.
Carroll, ed. Human-computer interaction in the new millennium. Addison-Wesley, pp.579-
601.
Ungvary, T., and R. Vertegaal. 2000. “Designing musical cyberinstruments with body and
soul in mind.” Journal of New Music Research 19 (3):245-255.
Van Gelder, T. 1998. “The dynamical hypothesis in cognitive science.” Behavioral and
Brain Sciences 21:615-628.
Van Nort, D., M. Wanderley, and P. Depalle. 2004. “On the choice of mappings based on
geometric properties.” In Proceedings of the 2004 New Interfaces for Musical Expression
Conference. pp.87-91.
Varela, F. 1979. Principles of biological autonomy. Amsterdam: Elsevier (North Holland).
———. 1980. “Describing the logic of the living: The adequacy and limitations of the idea
of autopoiesis.” In M. Zeleny, ed. Autopoiesis: A theory of living organization. New York:
North Holland, pp.36-48.
———. 1992. “Making it concrete: Before, during and after breakdowns.” In J. Ogilvy, ed.
Revisioning Philosophy. Albany, NY: State University of New York Press, pp.97-109.
———. 1999. “The specious present: A neurophenomenology of time consciousness.” In B.
Pachoud, J. Petitot, J.-M. Roy and F. Varela, ed. Naturalizing phenomenology: Issues in
contemporary phenomenology and cognitive science. Stanford, CA: Stanford University
Press, pp.266-314.
Varela, F., and E. Thompson. 2001. “Radical embodiment: Neural dynamics and
consciousness.” Trends in Cognitive Sciences 5 (10):418-425.
Varela, F., E. Thompson, and E. Rosch. 1991. The embodied mind: Cognitive science and
human experience. Cambridge, MA: MIT Press.
Verplank, W. 2001. “A course on controllers.” In Proceedings of the 2001 New Interfaces
for Musical Expression Conference.
Von Neumann, J. 1958. The computer and the brain. New Haven, CT: Yale University
Press.
Wanderley, M. 2001. Performer-instrument interaction: applications to gestural control of
sound synthesis, University Paris 6, Paris.
Webb, B., and T. Smithers. 1992. “The connection between AI and biology in the study of
behavior.” In F. Varela and P. Bourgine, ed. Toward a practice of autonomous systems:
Proceedings of the first European conference on artificial life. Cambridge, MA: MIT Press,
pp.421-428.
Wegner, P. 1997. “Why interaction is more important than algorithms.” Communications
of the ACM 40 (5):80-91.
Weinberg, G. 2005. “Interconnected musical networks: Toward a theoretical framework.”
Computer Music Journal 29 (2):23-39.
Weiser, M. 1988. Ubiquitous computing #1 and #2. Palo Alto, CA: Xerox PARC.
———. 1991. “The computer for the twenty-first century.” Scientific American 265 (3):94-
104.
———. 1994. “The world is not a desktop.” Interactions 1 (1):7-8.
Weiser, M., and S. Brown. 1996. The coming age of calm technology. Palo Alto, CA: Xerox
PARC.
Wessel, D., and M. Wright. 2001. “Problems and prospects for intimate musical control of
computers.” In Proceedings of the 2001 New Interfaces for Musical Expression
Conference.
Whitelaw, M. 2003. “Sound particles and microsonic materialism.” Contemporary Music
Review 22 (4):93-100.
Wiener, N. 1961. Cybernetics; Or, control and communication in the animal and the
machine. Cambridge, MA: MIT Press.
Wild, P., P. Johnson, and H. Johnson. 2004. “Towards a composite modelling approach for
multitasking.” In Proceedings of the 3rd International Workshop on Task Models and
Diagrams for User Interface Design. Prague: ACM International Conference Proceeding
Series, pp.17-24.
Wilden, A. 1977. System and structure: Essays in communication and exchange. London:
Tavistock Publications.
Winkler, T. 1995. “Making motion musical: Gesture mapping strategies for interactive
computer music.” In Proceedings of the 1994 International Computer Music Conference.
San Francisco, CA: International Computer Music Association, pp.261-264.
Winograd, T., and F. Flores. 1986. Understanding computers and cognition: A new
foundation for design. Norwood, NJ: Ablex Publishing.
Zumthor, P. 1994. “Body and performance.” In H. U. Gumbrecht and K. L. Pfeiffer, ed.
Materialities of communication. Stanford, CA: Stanford University Press, pp.218-226.