An Enactive Approach to Digital Musical Instrument Design
Newton Armstrong
A DISSERTATION PRESENTED TO THE FACULTY OF PRINCETON UNIVERSITY IN
CANDIDACY FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
RECOMMENDED FOR ACCEPTANCE BY THE DEPARTMENT OF MUSIC
November 2006
© Copyright by Newton Blaire Armstrong, 2006. All rights reserved.
Table of Contents
Abstract
Acknowledgements
1 Introduction
1.1 The Disconnect
1.2 Flow
1.3 The Criteria of Embodied Activity
1.4 The Computer-as-it-comes
2 The Interface
2.1 Interaction and Indirection
2.2 Representation and Cognitive Steering
2.3 Computationalism
2.4 Sensing and Acting
2.5 Functional and Realizational Interfaces
2.6 Conclusion
3 Enaction
3.1 Two Persistent Dualisms
3.2 Double Embodiment
3.3 Structural Coupling
3.4 Towards an Enactive Model of Interaction
3.5 The Discontinuous Unfolding of Skill Acquisition
3.6 Conclusion
4 Implementation
4.1 Kinds of Resistance
4.2 Mr. Feely: Hardware
4.3 Mr. Feely: Software
4.4 Mr. Feely: Usage Examples
4.5 Prospects
5 Groundlessness
Bibliography
Abstract
Digital musical instruments bring about problems for performance that are
different in kind to those brought about by conventional acoustic instruments. In
this essay, I argue that one of the most significant of these problems is the way
in which conventional computer interfaces preclude embodied modes of
interaction. I examine the theoretical and technological foundations of this
“disconnect” between performer and instrument, and sketch an outline for the
design of embodied or “enactive” digital instruments.
My research builds on recent work in human-computer interaction and “soft”
artificial intelligence, and is informed by the phenomenology of Heidegger and
Merleau-Ponty, as well as the “enactive cognitive science” of Francisco Varela and
others. I examine the ways in which the conventional metaphors of computer
science and “hard” artificial intelligence derive from a mechanistic model of
human reasoning, and I outline how this model has informed the design of
interfaces that inevitably lead to disembodied actional modes. I propose an
alternative model of interaction that draws on various threads from the work of
Heidegger, Merleau-Ponty, and the enactive cognitive scientists. The “enactive
model of interaction” that I propose is concerned with circular chains of embodied
interdependency between performer and instrument, instrumental “resistance” to
human action and intentionality, and an integrative approach to the roles of
sensing, acting and cognitive process in the incremental acquisition of
performative skill.
The final component of the essay is concerned with issues of implementation.
I detail a project in hardware and software that I present as a candidate
“enactive digital musical instrument,” I outline some specific usage examples,
and I discuss prospects for future work.
Acknowledgements
This paper would have been a much bigger mess were it not for the timely
contributions of a number of people. In particular, I have benefited from the very
careful readings and insightful criticisms of my advisor, Barbara White, and my
first reader, Dan Trueman. Paul Lansky has uttered more wise words than I could
count, and he has changed my mind about many things during my time at
Princeton (although, as far as I can tell, that was never really his intention).
Perry Cook has taught me a great deal about interaction, both in his classes and
in the approach to design that he takes in his own projects. He has been an
outstanding role model in terms of bridging the gap between theory and practice,
and knowing when it’s time to just sit down with a soldering iron.
I have also benefited greatly from conversations with other graduate students
while at Princeton. In particular, I’d like to thank Ted Coffey, Paul Audi, Mary
Noble, Seth Cluett, Scott Smallwood and Ge Wang, each of whom has given me
feedback on my work, in the form of both critical readings and more casual
conversations about the core topics. I’m also grateful to the other composers in
my intake year: Paul Botelho, Stefan Weisman and Miriama Young. Together we
represent a diverse group, but there has been a considerable and on-going
interest in each other’s work, and this interest has been borne out in tangible
forms of support for our respective projects and activities.
The history of electronic music performance goes largely without mention in
my paper. But the research would not have been possible in the first place were
it not for those practitioners, from David Tudor to Toshimaru Nakamura, who
would question the hidden nature of electronic media in order to uncover not just
new sounds, but new potentialities of the body. I am indebted to all those
electronic performers whose work I have engaged, whether through written
accounts and recordings, or through personal contact and performance
collaborations. Although my fingers are rusty from typing, I’m looking forward to
rejoining the ranks of the improvising community in a less part-time capacity.
1 Introduction
Electronics for its own sounds’ sake is a resource that one would be stupid to
dismiss, but the implication is irrelevant, even misleading.
It says that music is a pure art of sound, for people with ears, but with little else—
no eyes, no nerve endings anywhere but the ears, no interrelated functions. And
as a matter of fact much electronic music leaves the impression that this IS the
attitude in which sounds are composed.
It says that the functional shape of an instrument is not important as a sculptural
object, and that the techniques developed on it, because of its particular virtues
and its particular defects, are obsolescent. That the physical, sensual vision of the
playing of it is no longer required.
— Harry Partch, Some New and Old Thoughts After and Before “The Bewitched”
1.1 The Disconnect
A wooden wheel placed on the ground is not, for sight, the same thing as a wheel
bearing a load. A body at rest because no force is being exerted upon it is again
for sight not the same thing as a body in which opposing forces are in equilibrium.
— Maurice Merleau-Ponty, The Phenomenology of Perception
The mid-1990s marked a juncture in the short history of computer music. For the
first time, the personal computer was becoming fast enough to be used as a real-
time synthesizer of sound; a capability that had previously been the reserve of
special purpose machines that were for the most part inaccessible to people
working outside an institutional framework. In the years since the mid-1990s,
there has been a rapid proliferation of new software and input devices designed
specifically for musical performance with general purpose computers, and a
burgeoning corpus of new theories, performance practices and musical idioms
has emerged in tandem with the new technologies. But while the widespread
availability of the personal computer to the first world middle class has resulted
in the medium finding its way into any number of new and diverse musical
contexts, the question as to whether the computer should be properly considered
a musical instrument continues, at least in certain quarters, to generate some
controversy.
More often than not, these controversies revolve around the relationship
between the human performer and the performance medium. Or, more
specifically, they revolve around an apparent lack of embodied human presence
and involvement in computer music performance practice. The complainants
argue that the performer is either absorbed in near-motionless contemplation of
the computer screen—the repertoire of performance gestures not substantively
different from those that comprise any routine interaction with a personal
computer—or that there is a high degree of arbitrariness to the performer’s
actions, where the absence of any explicit correlation between motor input and
sonic output results in a disassociation of performer from performance medium.
In both instances, what is witnessed is a disconnect; between performer and
audience, and between performer and instrument.
Those who complain about the current state of computer music performance
practice reveal something of their assumptions and expectations as regards
musical performance: that the involvement of the performer’s body constitutes a
critical dimension of the practice, and that for such an involvement to be tangible
to the audience, it’s necessary that that audience picks up on somatic cues that
signal the point of origin, in real time and real space, of the sounds they are
hearing. Defenders of the “near-motionless” school of computer music
performance have suggested that complaints such as these arise not because
there is something substantive missing from the interaction between performer
and performance medium, but because conventional expectations as regards the
constitutive elements of musical performance have not yet caught up to an
essentially new performance practice (Cascone 2000; Stuart 2003). The
argument has it that the computer, considered as a performance medium, brings
a unique set of issues and concerns to the problem of musical performance, and
that the attributes of the medium necessitate a break with established
instrumental conventions, the modes of performance that are attendant to those
conventions, and the expectations, assumptions and receptive habits of
audiences. It’s been suggested that those who take issue with the apparent lack
of human motor involvement in current computer music performance practice
reveal a mindset “created by constant immersion in pop media (Cascone 2000:
101-102),” and that the emergence of the new performance paradigm signals a
shift away from the locus of the body of the performer; the “object of
performance” is instead transferred to the ears of the audient, who needs to
relearn “active” modes of listening, or “aural performativity (Stuart 2003).”
On both sides of the argument over the state of computer music performance
practice, there is a suggestion that something is missing. What distinguishes one
side from the other is where that missing something is located: with the
performer, or with the audient. It’s difficult to defend either position, based as
they are on speculative assessments of the receptive habits and practices of
listeners. But what can be called into question is the implied corollary to the
apologist’s claim that the burden of responsibility lies with the audient; that is,
that computer music performance practice, as it stands, is already mature. In
this essay, I take the opposite position: that computer music performance
practice remains both theoretically and technologically under-developed, and that
most of the interesting and significant work in the field remains to be done.
In certain respects, then, the present study is a legitimation of the complaints
being uttered against the current state of computer music performance practice.
But more pressingly, it is born of my own frustration as a computer music performer.
Despite investing a number of years in the development of both hardware and
software designed specifically for performance, I’ve found that the performance
medium has in all but a few instances managed to maintain a safe distance,
corroborating (from the shaky perspective of first person phenomenal
experience) the complaint of the disconnect. I’ve come to believe that there is
something intrinsic to the computer, something embedded in the medium itself,
that is the cause of all this; something that necessarily and inevitably brings
about a disconnect. If it turns out that this is in fact the case, then the medium
effectively guarantees that an embodied coupling of human and instrument—a
coupling that creates the possibility of engaged and involved experience—never
quite takes place.
Unlike the apologists for the currently predominant modes of computer music
performance practice, I’m going to suggest that the perceived disconnect, or
“missing dimension,” that certain people have been complaining about, is not due
to a conditioned desire for spectacle, or an ingrained expectation that an
explicitly causal relation is witnessed between performance gesture and sonic
result. Rather, it seems to me that there is something more fundamental to the
issue: that an engaged and embodied mode of performance leads to a more
compelling, dynamic, and significant form of music making; for the performer,
the audience, and for the social space that they co-construct through the
performance ritual. If the attributes of the computer preclude such a mode of
performance, then the medium deserves to be examined, in order to determine
what can be done to engender the technical conditions from which an embodied
performance practice might arise.
1.2 Flow
The matter of music is sound and body motion.
— Aristides, De Musica
Performers of conventional acoustic instruments often talk of the sense of flow
they experience while playing.1 It’s a way of being that consists in the merging of
action and awareness, and the loss of any immediate sense of severance
between agent (the performer) and environment (the instrument, the acoustic
space, the social setting, and other providers of context). It’s the kind of
absorbing experience that can arise in the directed exchange between an
embodied agent and a physical mechanism, and it’s a coupling that happens as a
matter of course with acoustic instruments. Conventional acoustic instruments
offer resistance to the body of the performer, and their responses are tightly
correlated to the variety of inputs from the performer’s body that are afforded by
the mechanism.2 In a sequence of on-going negotiations between performer and
1 For a more complete account of "flow," in the sense that I will use the term, see
Mihaly Csikszentmihalyi, Flow: The Psychology of Optimal Experience (Csikszentmihalyi
1991). For a concise summation of the applicability of Csikszentmihalyi's ideas to
instrumental performance see Burzik's "Go with the flow" (Burzik 2003).
2 The notion of "affordance" was introduced by psychologist James Gibson (Gibson
1977, 1979). In the Gibsonian sense, an affordance is an opportunity for action that the
environment presents to an embodied agent. As such, the term accounts for the particular
instrument, the performer adapts to what is uncovered in the act of playing,
continually developing new forms of embodied knowledge and competence. Over
a sustained period of time, these negotiations lead to a more fully developed
relationship with the instrument, and to a heightened sense of embodiment, or
flow.
Performance with a conventional acoustic instrument serves as a useful
example of an embodied mode of human activity, and of an engaged coupling
with a complex physical mechanism. But in the context of the present study, I’m
not specifically interested in appropriating the conventions of acoustic
instrumental practice for computer music, or in modeling acoustic instruments in
the digital domain. Along with those writers who would proclaim the advent of a
new computer music performance practice, I hold that the computer, considered
as a performance medium, presents new and unique problems and prospects. But
where those writers focus on the shortcomings of the audience, I focus on the
shortcomings of current theory and technologies, and on the body of the
performer—not because of the body’s historical coupling to conventional
instruments, but because I choose to conceive of the body as a site of possibility
and resistance, and because it seems that the computer has a way of both
limiting the body’s possibilities and diminishing its potential for resistance.
physical and perceptual attributes and abilities of the agent, as well as her intentionality.
To borrow an example from Andy Clark: "... to a human a chair affords sitting, but to a
woodpecker it may afford something quite different (Clark 1997:172)."
There are attributes, then, to the experience of playing a conventional
acoustic instrument that are pertinent to thinking about the design of digital
musical instruments that would allow for embodied modes of performance. The
optimal performative experience—this somewhat intangible and elusive notion of
flow—could be characterized as a way of being that is so direct, immediate and
engaging, that the normative senses of time, space and the self are put
temporarily on hold. It amounts to a presence and participation in the world, in
experiential real time and real space, in which meaning and purpose arise not
through abstract contemplation, but directly within the course of action. Such
action involves complex and continuous exchanges and interactions between
senses, the motor system (muscles), the nervous system (including the brain),
and the social and physical environment in which the ritualised act of
performance is embedded. In short, the experience of flow, of a heightened
sense of embodiment, involves an immediately palpable feeling of active
presence in a world that is directly lived and experienced. Traditional though they
may seem, these are qualities that I believe are central, and will remain central,
to musical performance. If the computer is going to figure as a musical
instrument, and if it does not presently lend itself to embodied forms of
interaction, then some work needs to be done.
1.3 The Criteria of Embodied Activity
Over the course of this essay, I’ll return to what I take to be the five key criteria
of embodied musical performance; or, more specifically, the five key criteria of
the particular kind of embodied mode of interaction with digital musical
instruments that I hope to uncover through outlining a philosophically informed
approach to instrument design. Those criteria are:
1. Embodied activity is situated. Embodiment arises contextually, through
an agent’s interactions with her environment. The agent must be able
to adapt to changes in the environment, and in her relationship to it,
without full prior knowledge of the features of the environment, or of
its structure and dynamics.
2. Embodied activity is timely. Real world activity involves real-time
constraints, and the agent must be able to meet these constraints in a
timely manner. This means that it is incumbent on the agent to not
disrupt the flow of activity because her capacity for action is too slow.3
3. Embodied activity is multimodal. A large portion of the agent’s total
sensorimotor capabilities are galvanised in performance. This involves
3 David Sudnow uses a nice example of untimely behavior in Ways of the Hand:
Recall Charlie Chaplin on the assembly line in Modern Times: the conveyor belt
continuously carrying a moving collection of nuts and bolts to be tightened, their
placements at regular intervals on the belt, Chaplin holding these two wrenches,
falling behind the time, rushing to catch up, screwing bolts faster to stay ahead of
the work, missing one or two along the way, because the upcoming flow seems to
gain speed and he gets frantic, or because it actually does speed up, eventually
caught up in the machine and ejected onto the factory floor in his hysterical
epileptic dance. (Sudnow 2001:32-3)
optimising the use of the body’s total available resources for cognition,
action and perception, with an emphasis on the concurrent utilisation
of distinct sensorimotor modalities, as well as the potential for mutual
interaction, or cross-coupling, between those modalities.
4. Embodied activity is engaging. The sense of embodiment arises when
the agent is required by the task domain. That is, the environment is
incomplete without the involvement of the agent, and it presents
challenges to the agent that consume a large portion of her attention.
5. The sense of embodiment is an emergent phenomenon.4 That is,
optimal embodied experience arises incrementally over a history of
sensorimotor performances within a given environment or phenomenal
domain. There is a link between increasing sensorimotor competence
within the task domain and the sense of embodiment.
Borrowing from cognitive scientists Francisco Varela, Evan Thompson and
Eleanor Rosch (Varela, Thompson, and Rosch 1991), I will refer to the embodied
mode of performative activity I’m outlining here as enactive. I’ll address the
concept of enaction in more depth in Chapter 3, but for the time being it’s useful
4 This criterion could perhaps have been condensed into the phrase "embodiment is an
emergent phenomenon." But this is potentially misleading, as embodiment is a given for
biological systems; i.e., living organisms do not emerge into their bodies. The sense of
embodiment, then, is phenomenal, whereas the fact of embodiment is objective. The
implications of this double sense of embodiment—of its "inner" and "outer" aspects—are
explored in Chapter 3.
to emphasize the centrality of the body to the enactive model of cognition. In
contrast to orthodox views of mental process that view cognition as the internal
mirroring of an objective external world, the enactive perspective takes the
repeated sensorimotor interactions between the agent and the environment as
the fundamental locus of cognitive development. This encompasses the dynamics
of the experiential present, i.e., that which is ineluctably the “now,” but it also
encompasses the emergence and development of knowledge and competence,
i.e., the cognitive dimension of activity.
In the enactive view, cognition is fundamentally an embodied phenomenon; it
arises through and within an agent’s physical interactions with her environment.
To that extent, the “now” of lived experience, of an instantaneous conceptual and
corporeal disposition within a given environment, plays a determining role in the
emergence of cognitive systems and structures, and cognitive systems and
structures, in turn, play a determining role in constituting the “now.” It’s an
ongoing, circular, and fully reciprocal process of mutual determination and
specification in which subjectivity and the sense of embodiment are in a
continuous state of flux. This model of cognition, with its emphasis on bodily
involvement in the “bringing forth of a world,”5 provides a template for the
performance practice that I hope will emerge from this study.
5 The expression is borrowed from Varela, Thompson and Rosch's The Embodied Mind
(Varela, Thompson, and Rosch 1991).
1.4 The Computer-as-it-comes
A number of authors (Agre 1995, 1997; Clancey 1997; Dourish 1999, 2001;
Stein 1999; Winograd and Flores 1986) have shown that it is no easy task to
design computing devices that would allow for embodied modes of interaction.
The prevailing guiding metaphors of computer science (CS) and human computer
interaction (HCI)6 are at odds with the embodied/enactive approach, and
routinely preclude modes of interaction that are situated, timely, multimodal, and
engaging, or that lead to a heightened sense of embodiment over a history of
interactions. And while the subset of computing devices that is of specific interest
to this essay—digital musical instruments—these days comprises a vast and
diverse array of implementations, the field as a whole has not been immune to
6 The "prevailing guiding metaphors" of CS and HCI—i.e., the epistemological
underpinnings of what I have labelled "conventional" CS and HCI—will be outlined in terms
of a computationalist ontology in Chapter 2. Lynn Andrea Stein has suggested that it was
a matter of historical contingency that saw the computationalist approach hold sway in the
formative days of computer science:
Cybernetics took seriously the idea of a computation embedded in and coupled to
its environment. These were precisely the issues suppressed by the
computationalist approaches. In the intellectual battles of mid-century, cybernetics
failed to provide the necessary empowerment for the emerging science of
computation and so was lost, dominated by the computational metaphor. The
nascent field of computational science was set on a steady path, but its
connections to the world around it were weakened. (Stein 1999:482)
the guiding metaphors of conventional CS and HCI. This is not to say that all
digital musical instruments have failed to realize the potential for embodied
modes of interaction. But rather, those instruments that have managed to realize
this potential have done so despite the conventional tenets of CS and HCI.
It may be useful to distinguish between two main currents in present day
computer music performance practice. The first of these would take the personal
computer more or less as it comes (with minimal or zero additions to the
standard input devices), and would normally be characterized by the “near-
motionless” mode of performance described earlier in the chapter. This practice is
often encapsulated under the rubric of “laptop music,”7 and has given rise to a
so-called “laptop aesthetic (Jaeger 2003).” The second of the two currents is
defined precisely through its non-acceptance of the “computer-as-it-comes” as a
musical instrument. Rather, the practitioners seek to extend computing devices,
or even completely reconfigure them, through the development and integration
of new technologies designed specifically for musical performance. This is the
field of activity to which my own work belongs, and I will refer to it under the
(intentionally) broad term of “digital musical instruments.” A third current could
also be identified, of “extended acoustic instruments,” in which the computer is
used as a signal processing add-on or improvising partner to a conventional
7 The term "laptop music" surfaced in the second half of the 1990s, at around the
same time that the first "laptop performers" began to appear. For a diverse range of
assessments of laptop performance practice and its reception, see the articles collected in
Contemporary Music Review 22 (4), 2003.
acoustic instrument. But as the presence of the acoustic instrument already
invokes the potential for embodied performance, this area of practice is not of
specific relevance to the present study.
The “computer-as-it-comes” is a term that will appear throughout this essay.
What I intend to denote is not so much a specific device (although it could be),
but rather a general notion of the more or less generic personal computer; the
technological instantiation of the conventional guiding metaphors of CS and HCI.
This is the computer that “laptop music” adopts wholesale into its performance
practice, and the same computer that those working towards “digital musical
instruments” would seek to re-engineer in order to arrive at embodied modes of
performance.
There has been a great deal of activity in recent years in the development of
new digital musical instruments. There has also been a steadily growing corpus of
scholarly articles, research papers and theses on issues in live computer music.
While this has led to numerous innovations in both the theory and technology of
computer music performance, there remains a near total absence of work related
specifically to the philosophical foundations of instrument design. I believe that
the most pressing issues in arriving at designs that allow for embodied forms of
musical interaction with computers are philosophical, and that in order to arrive
at sustainable designs for enactive instruments, the limits and potentialities of
the current computational media—i.e., the defining attributes of the computer-
as-it-comes—need to be examined in philosophical terms.
The tendency in digital musical instrument design has been to focus on the
pragmatic issues of design: specific sensor and actuator technologies, audio
synthesis methods, mapping strategies, and so on. Without addressing these
issues at some point, there will, of course, be no digital musical instruments of
which to speak. But in this essay I focus more on the theoretical and foundational
issues of design, with a view to providing a conceptual touchstone for the
pragmatic stage. While there is a great deal of overlap between the pragmatic
and the foundational issues, it seems to me that the shift of emphasis is
potentially very useful. Without proper attention to the foundational issues, there
is a greater likelihood that designers will unwittingly fall back on the received
tenets of CS and HCI, even though those tenets may (and more often than not
will) work against the bringing into being of enactive instruments.
The personal computer brings with it a sizable repertoire of usage
conventions, and, all too regularly, designers end up drawing on the conventional
patterns of use without proper consideration of the implications of those patterns
for the end user. As I will endeavour to show, these implications are philosophical
in origin, reflecting world views, and models of interaction, behavior and
cognition, that are immanent in designs, and, in turn, in the technological
artifacts that result from those designs. If a medium precludes a desired usage—
an embodied mode of interaction, for example—and if it does so because of the
world models that are embedded in its very mechanism, then that medium needs
to be examined with a philosophical perspective in order to arrive at a better
understanding of the ways in which it determines its patterns of use. This is the
first step towards rethinking and reconfiguring those patterns, and towards
arriving at designs that are more fully and properly geared towards the
requirements and desires of embodied human actors.
2 The Interface
Musical ideas are prisoners, more than one might believe, of musical devices.
— Pierre Schaeffer, Traite des Objets Sonore
2.1 Interaction and Indirection
Interaction takes place when signals are passed back and forth between two or
more entities. Interactions between a human and a computer are conducted
through an interface. The interface provides the human with a means of access to
the programs running on the computer; it consists in providing an appropriate
abstraction of computational data and tasks to the user. Input devices (such as
keyboards and mice) capture signals from the user that are mapped, through the
interface abstraction layer, to changes in the state of computer programs. Output
devices (such as monitors, loudspeakers and printers) transmit human-decodable
representations of the state of the running programs from the computer back to
the user. Human-computer interface design is therefore concerned with providing
the user with a set of usage practices, protocols and procedures appropriate to
the task domain for which the interface, in the first instance, is required.
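To make the shape of this transaction concrete, the chain can be sketched in a few lines of Python. The sketch is purely illustrative (the device, the control name and the mapping are hypothetical, and it describes no implementation discussed later in this essay), but it shows the three stages just outlined: a signal is captured from an input device, mapped through an abstraction layer to a change in program state, and a representation of that state is transmitted back to the user.

    # Illustrative sketch only: the device, control name, and mapping below are
    # hypothetical, and do not describe any system discussed in this essay.

    def read_input_device():
        """Stand-in for a driver call that captures a raw signal from the user,
        e.g. a key press, a mouse delta, or a sensor voltage (faked here)."""
        return {"control": "fader_1", "value": 0.42}

    def map_to_program_state(event, state):
        """The interface abstraction layer: raw signals are translated into
        changes in the state of the running program."""
        if event["control"] == "fader_1":
            state["amplitude"] = event["value"]  # a normalised value, 0.0 to 1.0
        return state

    def render_output(state):
        """Output devices transmit a human-decodable representation of program
        state back to the user (a text rendering stands in for audio/graphics)."""
        print(f"amplitude is now {state['amplitude']:.2f}")

    program_state = {"amplitude": 0.0}
    event = read_input_device()                                  # signal in
    program_state = map_to_program_state(event, program_state)   # signal to state
    render_output(program_state)                                 # state to representation

Every interaction with the medium passes through some version of this chain; the open question, for the purposes of this essay, is what the mapping layer and the output representations ought to look like.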
One thing that distinguishes the computer from tools such as, say, the
canonical example of the hammer,1 is the absence of any direct correlation
between the physical domain in which a computational task is carried out, and
the way in which that task is conceived by the user. The hammer, considered as
an interface, is correlated within the user’s cognitive apparatus to the physical act
of hammering. But a computer user’s interactions with a computer are rarely, if
ever, correlated within the cognitive apparatus to the electrical phenomena that
constitute the physics of computation. Rather, in order to accomplish meaningful
tasks with computers, its physical operations are abstracted, and the task domain
is presented to the user in the form of graphical and auditory representations. It
follows that interactions with a computer are necessarily indirect. This sets the
medium apart not only from the hammer, but from the overwhelming majority of
tools that humans use, including conventional acoustic musical instruments.
Already, then, in the distance that the interface imposes between the human
and the computer, we see the “disconnect” between agent and medium. The
physics of computational media consists in the regulated flow of electrons
through circuits, and the human agent does not interact with those circuits in any
kind of physically direct manner. Rather, in order for significant interactions to
take place, the interactional domain needs to be designed, input devices need to
1 The hammer example has figured large in philosophy of technology and media theory
since its appearance in Heidegger's Being and Time (Heidegger [1927] 1962) and “The
Question Concerning Technology (Heidegger [1949] 1977)." For an interesting analysis of
the role that the hammer has played within these discourses, see Don Ihde's Instrumental
Realism (Ihde 1991).
be mapped to tasks and procedures in software, and software data need to be
transmitted to the user in the form of representations. The overriding goal of
conventional human-computer interface design is to reduce the inevitable
distance between agent and medium, ideally to the extent that the user comes to
conceive of the task domain directly in the terms of the representations that
comprise the interface.2 To a certain extent, reducing the degree of indirection
between agent and medium is also the goal of the present study. But an enactive
model of interaction will require an entirely different approach to that taken by
conventional HCI.
2.2 Representation and Cognitive Steering
Things is what they things.
— π.o. (printed on a coffee mug)
The computer-as-it-comes packages interface abstractions into
representational frameworks—i.e., aggregated metaphorical schema—that are
customarily (though somewhat inaccurately) characterized as software. Users of
2 This is the express goal of so-called "direct manipulation" interface models (see 2.5
below). More radical approaches, such as tangible and ubiquitous computing, would seek
to embed computing devices directly (and invisibly) within the user's environment
(Dourish 2001; Greenfield 2006; Norman 1999; Ullmer and Ishii 2001; Weiser 1988,
1991, 1994; Weiser and Brown 1996).
personal computers are familiar with the now standard interface metaphors for
the routine management and maintenance of their computer systems: files,
folders, desktops, workspaces, trash cans, and the like. It’s a suite of
bureaucratic abstractions, extrapolated from a real-world task environment that
is likely familiar to the user, that serves to facilitate bureaucratic work, keeping
the play of regulated voltages—the physical agency through which that work is
actually accomplished—well out of the user’s immediate zone of awareness. The
interface amounts to a model of the world; an encompassing system of
metaphors that serves to both guide and regulate the agent’s thoughts and
activities through intrinsic correspondences to everyday objects and activities.
It’s an unusual transaction that takes place between the designers of
computer interfaces and the end users of those interfaces. Through the set of
interactions made available by whichever incidental pre-packaged
representational world, the user participates in whichever incidental model of the
world happens to be implicit to the design. Models of the world are born out of
philosophical systems. However well-formulated or defined those philosophical
systems may be, and however conscious a designer may be of the philosophical
underpinnings of the decisions made during the course of design, the transition
from design to artifact nonetheless remains loaded with epistemological
implications for the end user. This is an unavoidable side-effect of indirection,
and, despite a great deal of attention within the fields of interaction design and
technology studies,3 it’s a side-effect that remains beyond the bounds of
consideration for a large number of designers, and an even larger number of end
users. As Philip Agre has put it, “technology at present is covert philosophy (Agre
1997: 240).”
As the interface delineates the conceptual milieu to the user, it orients the
user’s cognitive activity. Through repeated performances, a set of implicit
assumptions as regards the elements and structure of the task domain begins to
solidify, and through a chain of subtle reciprocal influences, the repertoire of
meaningful performance actions becomes more or less fixed in bodily habit. This
is what Merleau-Ponty defines as an incorporating practice; a process in which
actions are literally incorporated—i.e., registered in corporeal memory—through
repeated performances (Merleau-Ponty [1945] 2004). These bodily habits do not
so much comprise a catalogue of discrete and distinct states as they do a
collection of dispositions and inclinations; arrangements within which the agent is
potentially free to move, but which at the same time determine the structure and
dynamics of those movements. In a similar vein to Pierre Bourdieu’s notion of the
habitus, there comes to exist “a durably installed generative principle of
regulated improvisations (Bourdieu 1977: 78).” To the same extent that an
interface encapsulates a model of the world, then, it encapsulates a model of
3 In particular, see Heidegger, “The Question Concerning Technology (Heidegger
[1949] 1977), Feenberg, Critical Theory of Technology (Feenberg 1991), and Agre,
Computation and Human Experience (Agre 1997).
performance. These dual aspects are inextricably intertwined, and over the
history of an agent’s interactions, they are mutually reinforcing.
Merleau-Ponty’s concept of incorporation is consistent with the enactive model
of cognition. In the enactive view, the systems and structures that play a
determining role in the formation of cognitive patterns are in turn determined by
the emergent patterns of interactional dynamics. Or to put it another way, at the
same time that repetitive dispositions towards action and modes of perceiving are
engendered within the agent’s sensorimotor mechanisms, “cognitive structures
emerge from the recurrent sensorimotor patterns that enable action to be
perceptually guided (Varela, Thompson, and Rosch 1991: 172-173).” This
formulation is essentially a latter day reworking of the fully recursive process,
encompassing incorporating practices, that Merleau-Ponty defined as the
intentional arc4 (Merleau-Ponty [1945] 2004).
In this feedback loop at the heart of the enactive view, there is a high degree
of reciprocal determination and specification between perception, action,
cognition, and the contingencies of the environment in which perception, action
and cognitive process are embedded. If we accept that these dependencies are
real, then it will make little sense, when examining an interactional domain with a
view to the emergence of cognitive and performative patterns, to draw a hard
4 In Varela, Thompson and Rosch's The Embodied Mind (Varela, Thompson, and Rosch
1991)—the book in which "enactive cognitive science" is first outlined—the authors
acknowledge their debt to Merleau-Ponty's phenomenology. See in particular the book's
introduction and opening chapter.
dividing line between action and cognition, or between the mind and the body. It
will also make little sense to examine computer interfaces, and the metaphorical
schema that those interfaces encapsulate, without due regard to their
contingencies and particularities, and the potential implications for the thoughts
and actions of the people who will interact with them. These are important
concerns not only when arriving at new designs, but also when looking at the
consequences of existing designs for performance.
It would seem that the more closely we examine the interface in use, the
more quickly the common notion of the interface as a passive and impartial
means to an end begins to break down. We come to see that it is far from
transparent to the task domain to which it is applied, and we begin to understand
it “not as an add-on which allows a human to come into relations with an
underlying structure, but rather as constitutive of that very structure (Hamman
1997: 40).” At the same time that the boundaries of the user’s potential
repertoire of actions and perceptions are determined by the epistemological
underpinnings of the representations that comprise the interface, the interface
reveals itself as embodying a theory of knowledge and performance.
But it’s how this theory of knowledge and performance is embodied in the
interface that is of specific interest to this study. The personal computer arrives
from the vendor prepackaged with a vast collection of programmed responses,
the user adds to these with the installation of new software, and accomplishes
tasks through the agency of the now standard input and display devices—the
keyboard, the mouse, the monitor, and the loudspeakers. The affordances of the
computer-as-it-comes determine the limits of what is possible within any
incidental task domain, and the user comes to learn, from one piece of software
to the next, the kinds of behaviors and outcomes that might be expected to come
about as a result of her regulated interactions with the medium. It may well be
that for the majority of tasks for which personal computers are routinely used,
the computer-as-it-comes is a perfectly adequate medium. But I will endeavor to
show that it is precisely the models of activity that are embedded in the interface
to the computer-as-it-comes that preclude the sense of optimal embodied
experience—the sense of flow—that can arise in complex real-time activities such
as musical performance with conventional acoustic instruments.
The predominant guiding metaphors of human-computer interface design,
through the agency of software abstractions, and input and display devices, are
geared towards routine forms of activity. For complex, situated, embodied and
real-time forms of activity, we are in need of new metaphors, new ways of
thinking about design, and new technologies. Before heading straight to the
drawing board, however, it’s worth considering what it is, exactly, about the
computer-as-it-comes that sways the user into a routine-oriented mode of
activity, and thereby precludes the potential for embodied and enactive modes of
interaction.
2.3 Computationalism
While little has been written about the philosophical basis of interaction design
with specific regard to digital musical instruments—or even, for that matter, with
regard to personal computers in general—it’s nonetheless a topic that has
received some considerable attention, particularly over the last fifteen years, in
artificial intelligence (AI).5 Having accomplished so little of what the pioneers of
the field promised in the 1960s, AI theorists and practitioners have been forced
to critically re-examine the institutionally endorsed models of perception, action
and reasoning that originally appeared to have such vast potential. This has led
to some important questions being raised as regards the traditional foundations
of interaction design, as well as the various philosophical assumptions on which
those foundations are built.
As a succession of AI implementations would bear out, symbolic
representations of real-world task domains must take into account a huge
number of environmental variables if the artificial agent-at-large is to be
endowed with even a sub-insect capacity for sensing and locomotion. As the
complexity of the agent’s environment increases, the number of environmental
variables also increases. In turn, the number of conditions that must be encoded
in the agent’s representation of the environment increases in geometric
proportion. Given an environment of incrementally increasing complexity, it does
not take long before the computational load renders the artificial agent incapable
of the rapid real-time responses that we witness in the
various creatures that inhabit the real world. Moreover, the agent has no capacity
for responding to features or obstacles that appear in the environment
5 In particular see Haugeland (1985), Winograd and Flores (1986), Brooks (1991),
Dreyfus (1992), and Agre (1997).
unexpectedly, as each new object requires that a new representation be added,
by an engineer, to the agent’s model of the world.6
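The scaling problem can be made vivid with a toy calculation. The following sketch is a hypothetical illustration rather than a description of any AI system cited above: it simply counts the distinct world states that a fully explicit symbolic model would have to distinguish, as the product of the value ranges of a handful of made-up environmental variables, and shows how quickly that count grows as variables are added.

    # Toy illustration of the scaling problem: the variables and their value
    # ranges are hypothetical, chosen only to show the rate of growth.

    from math import prod

    def world_states(variables):
        """Number of distinct environment states a fully explicit symbolic
        model must distinguish: the product of each variable's value count."""
        return prod(len(values) for values in variables.values())

    environment = {
        "light":    ["dark", "dim", "bright"],
        "surface":  ["smooth", "rough", "wet"],
        "obstacle": ["none", "left", "right", "ahead"],
    }
    print(world_states(environment))  # 3 * 3 * 4 = 36

    # Adding seven more three-valued variables multiplies the count by 3**7 = 2187:
    environment.update({f"extra_{i}": ["a", "b", "c"] for i in range(7)})
    print(world_states(environment))  # 36 * 2187 = 78732

Every condition the engineer encodes must be defined over some portion of this space, and every genuinely novel object falls outside it altogether.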
It was precisely these kinds of problems that prompted a small faction of AI
researchers to question the very principle of symbolic representation.7 This would
be no simple task: since the advent of the Church-Turing thesis (Church 1932,
1936; Turing 1936), computation has largely been conceived as the
algorithmically codifiable manipulation of symbols, where those symbols stand in
for objects and operations in the world. But even this notion of computation—the
originary notion of computer science—is itself already grounded in an older
notion, namely, the mechanistic explanation of the 17th century. The breakaway
AI researchers would be arguing, then, not only with the accepted wisdom of the
field, but with Descartes, Hobbes, Leibniz, Locke and Newton. They would be
arguing against the guiding rubric of computer science, conventional AI, and so-
called “hard” cognitive science, that has variously been labeled “mentalism (Agre
1995, 1997),” “the computational metaphor (Stein 1999),” and
“computationalism (Dietrich 1990; Scheutz 2002).” Computationalism is the term
that I will use.
6 For an interesting overview of the various problems posed by the symbolic
representation approach in AI, see the introduction to Andy Clark's Being There (Clark
1997).
7 The first viable alternative to the symbolic representation approach is outlined in
Rodney Brooks' "Intelligence without representation" and "New Approaches to Robotics
(Brooks 1991, 1991)."
At the heart of the computationalist perspective is the presumption that we
reason about the world through mechanized procedure; i.e., through the
deductive manipulation of symbols that stand in for objects and operations in the
world. Mental activity, therefore, consists in extrapolating data from the world,
coding abstractions from that data, and reasoning about the representational
domain that those abstractions comprise. This is, by and large, the way in which
we program computers to simulate real-world problems and dynamics, and the
successes of computer science can make it rather easy to anthropomorphize the
process of computation; to see in the mechanical procedure a simulacrum of
human thought. It’s such a view that provided the original impetus of AI
research, and has led to what Agre has termed “a dynamic of mutual
reinforcement … between the technology of computation and the Cartesian view
of human nature, with computational processes inside computers corresponding
to thought processes inside minds (Agre 1997: 2).” Essentially, the
computationalist rubric would have it that computation is synonymous with
cognition.
It’s beyond the scope of the present study to enter into what remains a major
debate in the philosophy of mind and cognitive science over the mechanistic
foundations of thought. I will however argue that the tacit acceptance of the
computationalist approach will prove to be a stumbling block in the design of
computer interfaces for musical performance, in much the same way that it has
already proven to be a stumbling block in the design of artificial agents. If, as
Andy Clark has noted, and as the failings of AI would bear out, “symbol
manipulation is a disembodied activity (Clark 1997: 4),” then the computer-as-it-
comes—a materialization of the computationalist paradigm—already precludes
the possibility of embodied forms of interaction. With the current state of
knowledge about the workings of the nervous system, there is no precise way of
determining whether this is a matter of a symbolic overload, or a
“representational bottleneck” (Brooks 1991), for the human agent. But what can
be seen in the computationalist model of representation is a fundamental
objectivism in which the reasoning of the agent, whether human or artificial, is
situated above and outside the environmental embedding of the agent’s body. In
other words, the agent performs manipulations on symbolic representations of
the task domain in a realm of mental abstraction that is always and necessarily
disconnected from the environmental niche in which activity actually takes place.
The agent is, therefore, a kind of transcendental controller, coding abstractions
and reasoning about a world that forever remains exterior to cognitive process.
There is, then, an essential dualism at the heart of the computationalist
model of cognition. But it’s a specific variety of dualism; one that sets an “inside”
against an “outside.” It corresponds to a manner of thinking about the world that
George Lakoff and Mark Johnson have identified as the container metaphor:
We are physical beings, bounded and set off from the rest of the world by the
surface of our skins, and we experience the rest of the world as outside us. Each
of us is a container, with a bounding surface and an in-out orientation. (Lakoff and
Johnson 1980: 29)
From this ontological grounding, the container metaphor extends to various ways
in which we conceptualize time and space, the elements of visual, aural and
tactile perception, and events, actions, activities and states. In the discourse of
computationalism, such conceptualizations come about as a result of an abstract
inner space, the “mind,” setting itself in contradistinction to both the body—which
is viewed as little more than a transducer of sensory experience—and the outside
world.
The container metaphor is consistent with the mechanistic explanation. The
“bounding surface” of the mind is traversed by sensory stimuli; these stimuli are
converted into representations of the world-as-perceived; these representations—
along with the representations of structure that establish their logical
connections, a kind of propositional calculus—are stored as the contents of the
mind; and these contents form the basis from which plans are constructed, “by
searching through a space of potential future sequences of events, using one’s
world models to simulate the consequences of possible actions (Agre 2002:
132).” When those plans result in behavior, the agent reaches the end of the
sequence of events that characterizes the “in-out orientation” of the mind, and
the relationship between agent and world has, in some way or other, been
altered.
The pertinence of the container metaphor to the present study lies in the
strong separation it enforces between agent and environment, as well as
between mind and body, and also in the sequential model of activity that it
presumes. Implicit to the container metaphor is the assumption that cognition is
fundamentally distinct from perceiving and acting, and that mind and matter—in
the tradition of the Cartesian res cogitans and res extensa—are necessarily
separate. It’s precisely because of the schism between thinking and acting that
activity is sequential—the agent must form an internal representation of the
domain and construct a plan before deciding on appropriate action. There is an
inevitable delay, then, between decision and action. And over the iterative chain
that would characterize extended activity—a chain of actions following decisions
following actions—a sequence of such delays punctuates the flow of activity. This
is a point that I will explore more fully in the next section.
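The sequential picture can be caricatured in a few lines of code. The sketch below is again hypothetical (the functions, timings and loop are stand-ins, not a model of any particular system), but it makes the structural point: because deliberation sits between sensing and acting, every cycle of activity inherits the latency of the planning step, and over an extended exchange those latencies accumulate as interruptions to the flow of activity.

    # Caricature of the sequential sense -> model -> plan -> act cycle implied
    # by the container metaphor. All functions and timings are hypothetical.

    import time

    def sense():
        return {"stimulus": "event"}      # stimuli cross the "bounding surface"

    def update_world_model(model, percept):
        model.append(percept)             # stored as contents of the "mind"
        return model

    def plan(model):
        time.sleep(0.05)                  # deliberation: searching a space of
        return "response"                 # possible future actions takes time

    def act(action):
        pass                              # behaviour finally reaches the world

    world_model = []
    start = time.time()
    for _ in range(10):                   # ten cycles of extended activity
        percept = sense()
        world_model = update_world_model(world_model, percept)
        action = plan(world_model)        # the decision precedes the action...
        act(action)                       # ...so every act inherits the delay
    print(f"10 actions took {time.time() - start:.2f} seconds")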
The transition from textual to graphical modes of interaction with computers
brought with it significant implications not only in terms of how humans and
computers interact, but in terms of the accessibility of computing machinery to
non-specialists. In order to make computers more accessible, the interaction
paradigm would need to be both immediately intuitive to the broadest possible
range of human subjects, and applicable across the widest range of known and
as-yet-unknown task environments. The great success of the WIMP (window,
icon, menu, pointing device) model is due, at least in part, to the way in which it
galvanizes the user’s knowledge of the world; this so-called “direct manipulation”
style of interaction draws explicitly on the user’s capacity to identify symbolic
representations of data (files) and processes (programs), and—through actions
such as dragging, dropping, clicking, double-clicking, etc.—to accomplish the
tasks required by the activity domain.
The graphical interface paradigm is nowadays so pervasive, and so obviously
effective, that few people would think to question the kind of user knowledge on
which it draws. But on examination, it reveals itself to be an instance of the
container metaphor; the workspace is a container for folders, which are in turn
containers for files; the user puts things “in” the trash, “opens” a file or
and so on. The representational domain is functionally isomorphic with the
Cartesian model of mind, and, by virtue of the interface, the user comes to
encounter the virtual environment in much the same way as the Cartesian
subject encounters the world; through an “in-out orientation” to an environment
populated by well-defined objects.
If we consider these encounters with the virtual environment in light of the
constitutive role that the interface plays in determining user activity, we can
discern that the interface, over a history of interactions, will bring about the
“disconnect” between agent and environment that is implicit to the container
metaphor. The model of the world as embodied in the interface will effectively
lead to its own realization.
I take no position on the suitability of objectivist forms of representation to
everyday or mundane computational tasks. When those representations stand in
for the states of a task domain, they bring the user to conceive of the task
domain, and to act within it, in such a way that the focus is directed at changing
or manipulating those states. The objects of the virtual environment provide the
locus for interaction, and the user retains the status of detached controller. All is
(most likely) well for the maintenance of a spreadsheet, or for uploading files to a
server; activities in which it would make sense to have an objective cognizance of
the contents of the task domain. But these are not ordinarily the types of
activities in which the agent’s optimal bodily experience, or the sense of flow,
have significant bearing on the effective accomplishment of the task at hand.
If, however, we are considering the suitability of the computer-as-it-comes as
a musical instrument that would allow for embodied modes of interaction, the
kind of representations that serve as the access points to the medium, and the
world models that those representations embody, constitute an important matter
for consideration. By definition, an enactive model of performance would situate
the agent’s cognitive activities entirely within her environment. An objectivist
model of representational content, which would situate the agent’s cognitive
activities outside her environment, therefore throws up a not inconsiderable
obstacle to arriving at embodied modes of interaction. But there can be no
practicable form of interaction with a computer without an interface, and an
interface requires that the computational activity be represented in some form or
other; even if the form of representation is the physical embodiment of the
computing device itself.8 The crucial point, then, is the form of representation.
More specifically, it is the difference between those forms of representation that
set out to passively encode the state of the task domain, and those that would
seek to structure the agent’s active involvement within the task domain. This is a
point to which I will return throughout the essay. First, however, it’s worth
examining in closer detail the costs to performance of unwittingly adopting the
objectivist/computationalist model of representation that is ingrained in the
methods of conventional CS and HCI.
8 This is precisely the representational strategy behind tangible computing. In terms of
the magnitude of representational abstraction, tangible interfaces are of a very low order.
If, for example, the user of a tangible device manages to put the idea that she is
interacting with a computer out of mind, her cognizance of the interface is of the same
order of abstraction as the Gibsonian affordance ("this chair affords sitting"). For
numerous examples of tangible user interface devices see the website of the Tangible
Media Group at MIT (http://tangible.media.mit.edu; accessed July 25, 2006).
2.4 Sensing and Acting
A movement is learned when the body has understood it, that is, when it has
incorporated it into its ‘world,’ and to move one’s body is to aim at things through
it; it is to allow oneself to respond to their call, which is made upon it
independently of any representation.
Maurice Merleau-Ponty, The Phenomenology of Perception
Although the “disconnect” between agent and environment is intrinsic to the
container metaphor as applied to the computationalist model of mind, the
container metaphor does not in itself account for how we experience or perceive
a disconnect in, for example, musical performance with a laptop computer. And
despite the ways in which the WIMP interface regulates the activities of the user,
and indeed situates her in a specific and highly determined relation to the
medium, it’s nonetheless entirely possible for that user to become seemingly
immersed in the task environment, for the experience to be that of direct
manipulation of the interface contents, and for the medium to effectively
disappear9 from use.
9 Disappearance is an important concept in Heidegger's philosophy of technology,
particularly in Being and Time (Heidegger [1927] 1962). In short, disappearance is an
indicator of the moment at which the tool user ceases to experience the tool as separate
from her body. A state of immersion in the task for which the tool is required leads to the
user, in the midst of activity, experiencing the tool as an extension of her body. To that
extent, the tool disappears as an object of consciousness.
Immersion in the task environment does not, however, provide a guarantee
of an embodied mode of interaction. It’s likely that an immersive activity is
engaging; and as such, it fulfills one of the five key criteria of embodied activity
that I outlined in Chapter 1. It would even seem to follow that an immersive
activity is, by definition, a situated activity. On closer inspection, however, this
would not seem to be the case in the specific example of interaction with the
computer-as-it-comes. The context of embodiment, i.e., the environment within
which the agent’s sense of embodiment arises, is the real world. Immersion in a
virtual environment—e.g., the environment constituted by iconic abstractions of
computational data and tasks, as in the WIMP model—involves situating the
agent’s attention and intentions squarely within that virtual world. As a
consequence, the disconnect with the real world is proportional to the amount of
attention consumed by the objects that populate the virtual world. The agent is
immersed in the activity, but it’s that very immersion that determines that the
activity is not embodied. Immersive activity involving the computer-as-it-comes
is therefore substantively different to immersive activity involving, say, a
conventional acoustic musical instrument.
In itself, this is still a superficial treatment of a very subtle process, and it
would seem that there’s more to the issue than drawing a tidy distinction
between the virtual and the real, or between abstract and direct modes of
interaction. This is where the issues of timeliness and multimodality—another two
of the five key criteria of embodied activity—enter the picture.
I’ve already discussed the ways in which the core elements of the WIMP
model of interaction—the window, icon, menu and pointing device—play a
determining role in the formation of objectivist concepts in the computer user’s
cognizance of activity. But it’s not simply a matter that because the interactional
domain is an instance of the container metaphor, the user will come to think and
act in terms of the objects that populate a world exterior to cognitive process;
there’s also the how of the container metaphor’s instantiation within the WIMP
model. The key consideration here is the modes for the transmission of signals
from computer to user, and from user back to computer; or, more specifically, it
is the sensory and motor mechanisms that are called into use, and the
sensorimotor habits and patterns that are engendered by those modes of
transmission.
There are two key facets to the WIMP model that guarantee that interactions
with the computer-as-it-comes can never be multimodal: 1. at any given
moment, there is only a single and discrete centre of interaction; i.e., the mouse
or text cursor, and 2. the weight of emphasis on visual forms of representation
consumes a large portion of the user’s attention, and in doing so diminishes the
potential for involvement of the other sensory and motor modalities. These are
the two aspects of a mode of interaction—the typical mode of interaction with the
computer-as-it-comes—that are experienced by the user as an ongoing
sequence of pointing, clicking, typing and observing. As the gaze is directed
towards an icon of interest to the task, the hand works in tandem with the eyes
to move the mouse cursor towards that icon. When the cursor and icon converge
on the screen, the fingers click on the mouse button, or press keys on the
keyboard, to elicit a response from the on-screen abstraction.
The immediate cost of the visuocentric approach to the non-visual
sensorimotor modalities is self-evident: the more cognitive resources are
allocated to vision, the fewer remain for the agent’s other sensors and actuators.
But there is another aspect that is perhaps less obvious, and this is where the
mode of interaction coincides with the issue of timeliness. The single point of
interaction that is characteristic of the WIMP model of interaction leads to a mode
of activity that is characterized by a sequential chain of discrete user gestures;
the flow of time is effectively segmented into discrete chunks, where any action
can be taken only after the prior action has been completed. There is no
concurrency of actions, no possibility of operating at two or more interactional
nodes simultaneously, and no potential for the cross-coupling of distinct input
channels.10 With acoustic instrumental performance, it’s not just the concurrent
use of multiple sensorimotor modalities that leads to the sense of embodiment,
it’s the various ways in which these modalities work together and exert influence
upon one another, and the way in which the performer, as a function of the
ongoing accrual of competence at coordinating the sensorimotor assemblage, is
better adapted to meet the real-time constraints of performance.
10 For a comparative analysis of "time-multiplexed" (single point) vs. "space-
multiplexed" (multiple, distributed points) interaction scenarios, see Fitzmaurice and
Buxton's "An empirical evaluation of graspable user interfaces (Fitzmaurice and Buxton
1997)."
With regard to this notion of timeliness, it may be useful to draw a distinction
between planning and agency. The WIMP model of interaction presumes that the
user has plans “in mind,” and that these plans are to be executed, step-by-step,
until the objective of the task-at-hand is met. The system of abstractions and
representations that typify the WIMP model are not geared to the demands of
real time. Rather, building on a model of behavior in which reasoning about
representations formed from sense impressions must always take place prior to
action, each step towards accomplishing the plan will simply take as long as it
takes to sense, infer, and act. Agency, on the other hand, is indicative of
behavior that is adaptive to environmental demands and constraints, where those
constraints encompass the necessity of a timely response. Agency, in this specific
sense, might more properly be defined as “embodied agency.” But whatever the
designation, it points to a mode of performance that is bluntly precluded by the
representational infrastructure of the WIMP paradigm. That infrastructure
presumes a model of reality in which the contents of the world come prior to our
behavioral engagement with the world; a sequence that the enactive approach
would seek to reverse.
It may be interesting to consider whether there are potential misuses of the
computer-as-it-comes that could lead to embodied interactional modes. By
“misuse,” I mean a kind of usage that in one way or another does not correspond
to the usage scenarios presumed by the WIMP paradigm. As “laptop music” has
already figured in the discussion, let’s assume that the computer-as-it-comes is,
in this case, an off-the-shelf laptop computer.
The standard input devices of the generic laptop computer are the keyboard
and trackpad.11 According to conventional WIMP practice, inputs at these devices
are coordinated by the position of the cursor on the computer screen. As I type
these words (at my generic laptop computer), the text cursor blinks at the
current text position, indicating the point at which the next character in this
sequence of discrete characters is anticipated. When I’m done typing, I’ll move
my second finger to the trackpad. Upon contact, a mouse cursor will appear on
screen. My finger will guide the mouse cursor to a point at the top-left region of
the screen, where prior experience tells me I will find the word “File.” When the
cursor is over that word, I’ll use my pointer finger to click the trackpad button. A
menu will appear, and as I use my third finger to move the cursor over its
contents, each item will be highlighted in turn. My third finger will stop when the
menu item “Save” is highlighted, and again I’ll use my pointer finger to click the
trackpad button. The structure of the interface determines that my motions will
follow a type-point-click sequence, and that at each step in that sequence my
attention will be directed towards the single point of interaction that the interface
affords.
As I’ve argued, this kind of determination on the part of the interface will
preclude embodied modes of interaction. But with different mappings from the
11 These devices vary from one model of laptop computer to the next; e.g., many
laptops substitute a trackpoint for a trackpad, and the number and type of trackpad
buttons may also vary. For the purposes of this example, I will assume a trackpad with a
single trackpad button.
input devices to programs—e.g., mappings that would subvert the inherent
sequentiality of WIMP—the interface acquires new affordances. That is, it solicits
new modes of activity from the user.
Suppose that some piece of sound synthesis software is written, and that it’s
written expressly to be used without graphical or textual feedback from the
computer screen. The cursor, then, is done away with altogether. To minimize
unnecessary distractions in performance, the computer screen could be entirely
dimmed. Interestingly enough, it’s in doing away with the cursor that entirely
new interactional possibilities for the keyboard and trackpad become apparent.
We see that the keyboard does in fact afford multiple points of interaction, and
that these points might be engaged concurrently; an affordance that the blinking
text cursor—along with the accumulated usage history of QWERTY technologies—
had somehow hidden from view. We also see that the trackpad affords
continuous input with two degrees of freedom; an affordance that was not
apparent when trackpad usage was bound up with the task of directing the cursor
to discrete points on the computer screen.
Could this amount to an interface that affords embodied modes of interaction?
The short answer is, I think, perhaps. These misuses of the keyboard and
trackpad would seem to circumvent the impediments to embodied activity that
characterize the WIMP paradigm: singularity and sequentiality. Mappings from
keyboard events to software could be arbitrarily complex, or as simple as the
mapping from piano keyboard to hammer and string (one sound event per key
event). Either way, the interface affords chording; the formation of composite
events from distributed points of interaction. Mappings from trackpad input to
software could afford the continuous modification of the sound events thus
triggered by the keyboard; and it’s in the continuity of these modifications that
the inherent sequentiality of pointing and clicking would be circumvented. The
asymmetry of “handedness” would likely determine that, because of the fine
granularity of action required of keyboard input, chording actions would be
performed by the dominant hand, while continuous modificatory actions at the
trackpad would be performed by the nondominant hand.12 To situate the hands in
optimal position, we might set the base of the laptop at a 30-45° angle to the
standard typing position. We would almost certainly push the (blank) screen to as
flat a position as possible, to put it out of the way of the hands. We may, then,
have the beginnings of an expressive instrument; even, perhaps, of an embodied
performance practice.
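To make the shape of such a mapping concrete, here is a minimal sketch of the remapped laptop described above; it is my illustration rather than a reconstruction of any particular software, and all of the function and parameter names in it are invented for the example. Keyboard keys act as discrete, chordable triggers of sound events, while the trackpad’s two continuous axes modify every sounding event at once; the synthesis layer is stubbed out with print statements, where a working instrument would call into a real-time synthesis engine.

```python
# A minimal sketch (not from this essay) of the remapped laptop: keys trigger
# discrete sound events and afford chording; the trackpad's two axes apply
# continuous modification to all sounding events, with no cursor involved.

active_events = {}  # currently held keys -> sounding event identifiers
KEY_TO_PITCH = {"a": 220.0, "s": 247.0, "d": 262.0, "f": 294.0}  # one event per key

def start_tone(pitch):                 # stub for a synthesis voice
    print(f"start tone at {pitch} Hz")
    return pitch

def stop_tone(event):                  # stub
    print(f"stop tone {event}")

def set_filter(event, cutoff, depth):  # stub for continuous modification
    print(f"tone {event}: cutoff {cutoff:.0f} Hz, depth {depth:.2f}")

def on_key_down(key):                  # chording: any number of keys may be held
    if key in KEY_TO_PITCH and key not in active_events:
        active_events[key] = start_tone(KEY_TO_PITCH[key])

def on_key_up(key):
    if key in active_events:
        stop_tone(active_events.pop(key))

def on_trackpad(x, y):                 # x, y normalized to 0.0-1.0; no cursor
    for event in active_events.values():
        set_filter(event, cutoff=200.0 + 5000.0 * x, depth=y)

# Example: two keys held as a chord, then shaped continuously by the trackpad.
on_key_down("a"); on_key_down("d")
on_trackpad(0.3, 0.8)
on_key_up("a"); on_key_up("d")
```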
What’s interesting about this example is that we have not changed the
physical structure of the interface; i.e., we continue to use the same keyboard
and trackpad that serve as the input devices in the WIMP model. What we have
changed, however, is the potential for interaction that the interface affords; and
the example shows that these affordances are immanent to the map from input
devices to programs. At the same time, then, that we substitute a new map for
the WIMP map, we construct a new model of performance.
Of course, to any regular user of a laptop computer, these new affordances
would need to be learned. And they would need to be learned in spite of the
activities the laptop has previously afforded in everyday use. This is not an
insurmountable task, especially given that users of general purpose computers
12 The role of bimanual asymmetry in interface design is discussed in 4.4.
are, to a certain, limited degree, accustomed to learning new patterns of
interaction with each new piece of software. But when I suggested that this
reconfigured laptop would perhaps afford embodied modes of interaction, I did so
out of a hesitation as regards the physical structure of the interface. That is,
while the affordances of the interface have been fundamentally altered by new
mappings from hardware to software, and while it’s entirely feasible that the
performer could develop a timely and multimodal mode of interaction with this
new interface, there nonetheless remains some physical property of the interface
that would seem to be opposed to the development of an embodied performance
practice. This may be an issue of the limited potential for resistance in the
keyboard’s pushbutton mechanism, of the arrangement of keys not being
conducive to chording, of the limited surface area of the trackpad, of the
trackpad’s proximity to the keyboard, and so on. Or, it may simply be an issue of
the instrument’s failure to be properly indicative of use (a topic I will discuss in
Chapter 4). Whatever the explanation, there seems a reasonable possibility that
the instrument will not be engaging over a sustained period of practice. And this
possibility provides enough incentive to turn attention towards the design of
special purpose devices, and to leave unanswered the question as to whether this
general purpose device might, under certain circumstances, afford embodied
modes of interaction.
I’ve been concerned in this section with outlining the ways in which the
standard interaction model of the computer-as-it-comes precludes embodied
activity. One of the hazards of design is the weight of convention on current
practice; a force that often goes entirely unnoticed in design practice. It seems to
me that it’s this very force—and the widespread failure to notice it—that has led
to numerous music software applications that buy unwittingly into the model of interaction
that is implicit to the WIMP paradigm. In doing so, these applications also buy
unwittingly into a model of performance that places abstract reasoning prior to
action; a model that inevitably leads to a disembodied mode of interaction. An
enactive, embodied agent-based model of interaction, then, will need to arrive at
an alternative interactional paradigm to that of the computer-as-it-comes. One of
the main objectives of this study is to outline a sketch of one such alternative.
2.5 Functional and Realizational Interfaces
Something in the world forces us to think. This something is an object, not of
recognition, but of a fundamental encounter.
— Gilles Deleuze, Difference and Repetition
Andrew Feenberg draws a distinction between a “primary” and a “secondary
instrumentalization,” which respectively consist in “the functional constitution of
technical objects and subjects,” and “the realization of the constituted objects
and subjects in actual technical networks and devices (Feenberg 1999: 202).”13
In terms of the implementation of interfaces, the core difference between the
primary and the secondary instrumentalization lies in the way that the task
13 In Feenberg's scheme, primary and secondary instrumentalization respectively
correspond to "essentialist" and "constructivist" orientations of human to medium
(Feenberg 1999, 2000).
domain is structured. The functional interface (primary instrumentalization)
serves a predetermined function; it is structured around a finite set of
interactions which are known in advance of the task’s execution. The well-
designed functional interface conceals the specific mechanics of the task, and
presents the user with possibilities for action that draw on familiar and often
rehearsed patterns of experience and use. The realizational interface (secondary
instrumentalization), on the other hand, brings with it the possibility of
continuously realizing new encounters and uses, and, in the process, of re-
determining the relationship between technical objects and their human subjects.
The realizational domain encompasses the contexts of meaning and signification
in which human and medium are embedded, and is conducive to dynamic and
indeterminate forms of interaction. In short, realization is a form of play.
While Feenberg correlates the secondary instrumentalization with a broadly
socialist utopian project, he is nonetheless careful to point out that the primary
instrumentalization, or functionalism, still has its uses. There are a great many
task environments in which it makes sense to facilitate, as transparently as
possible, the accomplishment of the task. Landing an airplane, for example,
presents a situation in which human agency is best served by an immutable
function-relation between the elements of the interface and the range of possible
outcomes that the interface represents; the representational correspondence of
the interface to the world—i.e. the correlation between the system of interface
metaphors and the system of real-world objects and operations for which those
metaphors stand—should, in the interest of maximizing the potential for
continued existence, be static.
Efficiency is key to the functionalist approach. In terms of meeting the various
constraints and demands of the task environment, it’s of no use to have the user
waste time on the parsing of a complex metaphorical system, and it’s of no use
to involve her in forms of play. Functionalism aims to minimize the cognitive
load. To that end, the well-designed functionalist interface is comprised of
representations that are immediately familiar to the user. The cognitive effort is
at its optimal minimum when the representations have a directly recognizable
counterpart in the user’s prior experience of the world. Indeed, the ideal
functionalist interface would have the user convinced that it consists of no
representations at all; it takes on an artificial transparency through its very
leveraging of the user’s experience. That is, as the task environment obtains its
coherence through the system of representations that comprise the interface, the
user comes to conceptualize the task directly in terms of what is represented; the
representations cease to be denotative, and instead become the intrinsic
elements of the task itself. At that moment the task is conflated with the
metaphorical domain in which it is represented, and the interface effectively
disappears in use; it becomes equipment.14
This situates the user in an interesting position. She is immersed in what
would appear to be the im-mediacy of the task, but the medium is still very much
present, and continues to be constitutive of the structural relation of technical
object and human subject. And while the interface is evidently not at all
14 In Heidegger's terminology, the tool becomes "equipment" at the moment of its
disappearance in use.
transparent to the task domain, the more it seems to be transparent, the more
effectively it corresponds to the ideal of functionalist efficiency. In leveraging the
user’s experience of the world, the interface directs her towards a set of
predetermined expectations as regards performance. It minimizes the cognitive
demand and, at the same time, defines an interactional context in which
significance—at least ideally—is invariable. In contradistinction to the domain of
realization, then, the functionalist domain does not encompass the contexts of
meaning and signification in which human and medium are embedded, and is not
conducive to dynamic and indeterminate forms of interaction.
Functionalism has become a standard metric in the evaluation of the
successes and shortcomings of computer interfaces. The idea of leveraging
experience in order to minimize the strain that the interface places on the user’s
cognitive apparatus is a hallmark of “user-centered design” (Norman 1986, 1999;
Norman and Draper 1986), and the extent to which the interface disappears from
the user’s attention constitutes the key criterion for the success of such
approaches. The model of computer interface design known as “direct
manipulation” (Norman, Holland, and Hutchins 1986)15—the model in which the
user drags graphical representations of files into graphical representations of
folders, among other things—already has the aim of the usage enterprise built
15 For an implementation guide to the "direct manipulation" model of computer
interface design, see "The Macintosh Human Interface Guidelines"
(http://developer.apple.com/documentation/mac/HIGuidelines/HIGuidelines-2.html;
accessed March 20, 2006).
into the blanketing term; i.e., things work best when the user believes that,
rather than manipulating symbolic abstractions, she is in fact working directly
with the objects of the task domain.
It’s entirely possible that the functionalist approach is optimally effective
across a broad range of routine computational task environments. In much the
same way that it makes little sense to employ dynamic and indeterminate forms
of interaction when landing an airplane, it makes little sense to do so when
balancing a computerized bank account or uploading a file to a server. These are
tasks in which the activity is better served by invariable representations, and in
which the degree of efficiency with which the task may be accomplished is
inversely proportional to the amount of user attention that is consumed by the
interface.
But there is a danger, with functionalism becoming something of a de facto
standard in interaction design, that the functionalist approach is adopted in task
environments to which it is not well suited. That is, in task environments where the
task-at-hand is better served by a realizational approach. In thinking about
designing interfaces for musical performance, we are dealing with such a task
environment.
Where Donald Norman and other key figures in “user-centered design”
champion the disappearance of the interface, the realizational approach would
suggest that the interface should offer some form of resistance to the user; i.e., that it
should be irrevocably present. At first glance, this would seem to be at odds with
the notion of flow. One of the key aspects of this paradigmatically embodied form
of activity is its immediacy; and it would seem self-evident that the more the
medium obtrudes in use, the less im-mediate the activity. It’s at this point that
it’s useful to draw a distinction between embodied action and enaction. While the
sense of embodiment may be enhanced, or even optimal, when the agent
successfully responds to cognitive challenges, such cognitive challenges are not
prerequisite to embodied activity. For example, the sense of im-mediacy
experienced when the agent is immersed in the act of hammering—the sense
that the hammer is not a distinct object, but an extension of the agent’s
sensorimotor mechanism—is indicative of that agent’s embodiment in action. But
once the agent has acquired a sufficient degree of performative competence at
hammering, the task ceases to present her with cognitive challenges. Hammering
may be immediate and immersive, but it is not necessarily engaging. And it is in
this that the hammer is not a realizational interface, and hammering is not
enactive. To return to Francisco Varela’s formulation, enaction involves the
“bringing forth of a world.” The cognitive dimension is central to the process, and
it is precisely where enaction and realization coincide.
This raises an obvious question: if performance with conventional acoustic
musical instruments is enactive, exactly how is the potential for realization
embedded in the instrumental interface? Or, how is it that, say, a violin is
substantively different to a hammer? The short answer is in the way in which the
musician’s intentionality is coupled to the instrument’s specific and immanent
kinds of resistance. As the musician transmits kinetic energy into the mechanism,
the instrument responds with proportionate energy; energy that is experienced
by the musician as sound, haptic resistance, weight, and so on. There is a “push
and pull” between musician and instrument. Over a sustained period of time, the
musician adapts her bodily dispositions to the ways in which the instrument
resists; i.e., to the instrument’s dynamical responsiveness. It’s important to note
that these adaptations, as much as they are determined by the resistance offered
by the instrument, are also determined by the musician’s intentionality. It’s
because the musician sets out to realize something—to actively participate in
embodied practices of signification—that her adaptation follows a unique
trajectory, and the cognitive dimension continues to be central to the process of
adaptation.
But this still doesn’t provide a satisfactory explanation of how the potential for
realization is somehow embodied in one interface but not another. The hammer,
like the violin, offers resistance to the agent. At one level, then, it would seem
meaningless to talk of functional and realizational interfaces, and instead to view
the entire process as a matter of the agent’s intentionality; functionalism would
correspond to a “functional attitude,” and realization would likewise correspond to
a “realizational attitude.” But this view does not consider the specific dynamic
properties of resistance that are embodied in the interface. Rather, it presumes a
neutrality of the interface to human intentionality, and ignores the constitutive
role that the interface plays in the emergence of intentional and behavioral
patterns. An agent could very well set about developing a musical performance
practice with a hammer, carefully adapting her bodily dispositions to its dynamic
properties of resistance over a period of many years of thoughtful rehearsal. But
it’s likely that, at some point, she will either abandon the instrument for a
medium that offers greater potential for realization, or she will make
modifications to the instrument that would better serve that realizational
potential.
To return to Feenberg’s specification, both technical objects and subjects—
i.e., artifacts and humans—are constituted through an ongoing process of mutual
specification and determination. The hammer has been constituted to serve a
largely predetermined functional agenda: hammering. As such, it is
advantageous that the hammer, considered as interface, presents minimal
cognitive demands on the agent. Although music has its obvious functional uses
in late capitalist society, the model of musical performance that is of specific
interest to the present study is realizational; it assumes open-ended, fluid, and at
least partly indeterminate processes of signification, and as such requires the
ongoing cognitive involvement of the musician. A majority of conventional
acoustic musical instruments have been constituted in such a way that the
dynamic properties of their resistance are sufficiently complex, and at the same
time sufficiently coherent, that they coincide optimally with the musician’s
intentionality. In requiring that the musician’s ongoing cognitive involvement is
central to the process of adaptation to the instrument’s dynamics, the potential
for realization—for embodied forms of signification, and for the “bringing forth of
a world”—is effectively maximized.
Approaches to digital musical instrument design that set out to model the
dynamics of conventional acoustic instruments by and large circumvent the
pitfalls of de facto functionalism. In the simulation of the various networks of
excitors and resonators that constitute the physical mechanisms of acoustic
instruments, and in the carefully considered mapping of the parameters of those
synthesis primitives to tactile controllers, the integration of force feedback within
the controller apparatus, and so on, an interface is constituted that comes close
to the realizational potential of the real world instrument that it models. But the
main focus of this study is to outline a foundation for the design of digital musical
instruments that is more general than the physical modeling of existing
instruments. And while there is much to be learned through analyzing the
dynamical properties of conventional instruments, the basic idea is nonetheless
to arrive at a practice that fully engages the new prospects for performance that
are indigenous to computing media. This is why I have considered it important to
distinguish between functional and realizational modes of interaction. The
discourse of functionalism is implicit to the discourse of conventional human-
computer interaction design. As I have attempted to show, this can only be an
impediment to arriving at technologies that maximize the potential for
realization. In the specific case of musical performance, this means interfaces
that embody the prospect of enaction. This calls for an alternative discourse, and
alternative approaches to design.
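As a point of reference for the excitor-resonator vocabulary used above, the following is a minimal sketch of my own, not an account of any instrument or software discussed in this essay: a brief noise burst (the excitor) drives a single two-pole resonator, whose decay and centre frequency are exactly the kinds of synthesis parameters that might be mapped to a tactile controller. All names and constants are illustrative.

```python
# A minimal excitor-resonator sketch (mine, not from the dissertation): a short
# noise burst excites a two-pole resonant filter, producing a decaying tone.
import math, random

SR = 44100  # sample rate in Hz

def resonator(excitation, freq, decay):
    """Two-pole resonator: y[n] = 2r*cos(w)*y[n-1] - r^2*y[n-2] + x[n]."""
    w = 2.0 * math.pi * freq / SR
    a1, a2 = 2.0 * decay * math.cos(w), -decay * decay
    y1 = y2 = 0.0
    out = []
    for x in excitation:
        y = a1 * y1 + a2 * y2 + x
        out.append(y)
        y1, y2 = y, y1
    return out

def excitor(length, strike_ms=2.0):
    """A brief noise burst standing in for a strike or pluck."""
    strike = int(SR * strike_ms / 1000.0)
    return [random.uniform(-1, 1) if n < strike else 0.0 for n in range(length)]

# A tactile controller might be mapped to these two parameters: pressure to the
# decay (length of ring), position to the pitch of the resonance.
signal = resonator(excitor(SR), freq=440.0, decay=0.999)
print(max(abs(s) for s in signal))  # peak amplitude of the resulting tone
```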
2.6 Conclusion
To reiterate my key criteria from Chapter 1: embodied activity is situated,
timely, multimodal, and engaging. The sense of embodiment over a history of
interactions within a phenomenal domain emerges at the point where these
various constraints intersect. This is not just a matter of action, but rather a
matter of the various and complex dependencies between action, perception, and
cognition. Or, in a purely enactivist sense, of the inseparability of action,
perception, and cognition.
What I have set out to show in this chapter is the various ways in which the
computer-as-it-comes is a far from ideal medium in terms of meeting the criteria
of embodied activity. The objectivist foundations of conventional HCI presume a
strong separation between user and device, and situate the user squarely
“outside” the interactional domain. The WIMP model serves to enforce this
separation, and at the same time to regulate the actions of the user in such a
way that time is discretized into repeating units of sensing and acting, where the
locus of interaction is almost invariably unimodal. Further, the predominant
notion of human-computer interaction design, which would aim to reduce the
cognitive load on the agent and make the interface disappear from use,
presumes a model of activity that is anything but engaging or challenging to the
agent. When all or most of these criteria fail to be met, there is, in my view, no
possibility for the kind of interactive and circular processes of emergence that are
characteristic of enaction, or embodied cognition.
To arrive at an enactive model of musical interaction, then, we will need to
systematically rethink the world models that are embedded in the interface to the
computer-as-it-comes. Overcoming the disconnect that the computer-as-it-comes
enforces between human and instrument will require elaborating an alternative
world model, and then looking at the ways in which such a model could be
materialized in an instrument that would necessarily be something other than
the computer-as-it-comes. This will be my task for the remainder of the essay.
3 Enaction
The body is our general medium for having a world.
— Maurice Merleau-Ponty, The Phenomenology of Perception
3.1 Two Persistent Dualisms
In Chapter 2, I suggested that it would make little sense, when examining an
interactional context with a view to enactive process, to draw hard dividing lines
between action and cognition, or between body and mind. But these hard dividing
lines persist in our language, and therefore also in any provisional description of
the elements and processes of enaction. I also suggested that it makes little
sense to discuss agent and environment in isolation, and instead stressed the
inseparability of one from the other, particularly when attempting to discern the
adaptive process that sees a complex set of ever-more refined skills, dispositions
and behaviors emerge over a history of interactions. But in any attempt to
describe such interactions, our descriptions inevitably land squarely at the
boundary between agent and environment. And so in the same way that we
insert a hard dividing line between body and mind, we tacitly delineate a neat
separation between body and world. On the face of it, it would seem that our
language, permeated as it is by the inherent dualisms of Western philosophical
and scientific discourse, will ultimately lead us back to a primary disconnect. Or
rather, it will lead us to two disconnects: between mind and body, and between
body and world. This presents a problem. As long as the body is opposed to both
mind and world, it’s difficult to describe, much less defend, any notion of “direct
experience.” Disconnection would seem to be the order of the day.
But while philosophical language may be geared in such a way that describing
experience necessarily involves dualist, abstract and objectivist terms, it does not
necessarily follow that “direct experience”—however that may be defined—does
not factor among those varieties of human experience for which we may or may
not already have an adequate terminology. The specific variety of experience that
I’ve set out to describe—this paradigmatically embodied, immersive and engaged
experience—is fundamentally about activities that are always in a state of
becoming, and which are therefore not at all easy to define in dualist, abstract
and objectivist terms. Enaction involves a temporality in which relations are
constantly in flux, and in which new systems and structures continuously emerge
and disappear in the midst of interactional unfolding. In other words, it involves
“the processual transformation of the past into the future through the
intermediary of transitional forms that in themselves have no permanent
substance (Varela, Thompson, and Rosch 1991: 116).” The directness of
experience, then, resides in the “nowness” of the experiential present. It is a
variety of experience that comes prior to description, and prior to any clear
determination of the subject, or of those objects and opportunities for action that
make up that (transitional) subject’s environment.
In attempting to define “direct experience,” then, we encounter a paradox.
Direct experience implies a provisional and temporary state of being that is
always and necessarily resistant to ontological reduction. I would even go so far
as to say that the “nowness” of the lived present is that which makes direct
experience, by definition, preontological. But as soon as we attempt to describe
the systems and structures of direct experience, we introduce ontological
categories. It’s in this that we see the intrinsic paradox of the description: there
can be no notion of that which is direct without casting experience in abstract
terms.
This is likely to be the source of some confusion. And given that one of the
primary motivations behind the present study is to outline a philosophical
foundation for design, it will not help if the key philosophical concepts are poorly
defined or potentially misleading. Fortunately, questions such as these are not
without precedent; there is a branch of philosophy that has dealt systematically
with direct experience, and it has done so within the context of a well-defined
dualist discourse. In the transcendental phenomenology of Husserl,1 the
existential phenomenology of Heidegger and Merleau-Ponty, and in the latter day
1 Although Husserl does not figure very significantly in this study, I mention him
because he is acknowledged as the founding figure of European phenomenology, and had
a direct influence on the thinking of both Heidegger and Merleau-Ponty.
reworking of both European and Buddhist phenomenology2 in enactive cognitive
science and so-called postphenomenology,3 the apparent paradox of a dualistic
description of unreflective behavior is dealt with comprehensively.
Phenomenology, in its various manifestations, is a vast and complex field, and
it’s beyond the scope of this essay to cover any of its myriad branches of inquiry
in any significant manner. However, there are two key concepts, from two quite
different moments in the phenomenological tradition, which are particularly
useful to the model of interaction that I am attempting to describe. Double
embodiment and structural coupling—both of which terms already point to a
fundamental dualism prior to their elaboration—respectively address the
mind/body and body/world problems in direct experience. In outlining them here,
I hope to clear up any confusion as to how the dualism that resides in any
description of embodied action is substantively different from the disembodied
dualism that lies at the heart of the computationalist perspective. This should
bring us to a point where, after having established a disconnect in our
descriptions, we come to see how that disconnect ceases to exist in the flux of
2 The philosophy of Nagarjuna, for example, and of the Madhyamika tradition in
Buddhist thought, figures significantly in Varela, Thompson, and Rosch's outline of
"codependent arising," and its implications for subjectivity (Varela, Thompson, and Rosch
1991).
3 Postphenomenology is a term introduced by, and most often associated with,
philosopher Don Ihde (Ihde 1983, 1990, 1991, 1993, 2002).
embodied action, and in the experiential merging of self and world. I should note
that I am not attempting to construct a new theory of the mind/body problem
here, or even to weigh into the debate. Rather, the objective is pragmatic: to
outline some core theoretical issues with a view to opening up a space for new
digital musical instrument design scenarios.
3.2 Double Embodiment
As long as the body is defined in terms of existence in-itself, it functions uniformly
like a mechanism, and as long as the mind is defined in terms of pure existence
for-itself, it knows only objects arrayed before it.
— Maurice Merleau-Ponty, The Phenomenology of Perception
In his analysis of tool use in Being and Time (Heidegger [1927] 1962), Heidegger
draws a famous distinction between the ready-to-hand and the present-at-hand.
The ready-to-hand indicates an essentially pragmatic relation between user and
tool. It is when the tool disappears, i.e., when it has the status of equipment,
that the user engages the task environment via the ready-to-hand. The relation,
then, is not about a human subject and an “object” of perception. Rather, it is
about that object’s “withdrawal” into the experiential unity of the actional
context:
The peculiarity of what is proximally ready-to-hand is that, it must, as it were,
withdraw in order to be ready-to-hand quite authentically. That with which our
everyday dealings proximally dwell is not the tools themselves. On the contrary,
that with which we concern ourselves primarily is the work. (Heidegger [1927]
1962: 99)
The ready-to-hand implies an engaged and embodied flow of activity. The human
is caught up in what Hubert Dreyfus has called “absorbed coping (Dreyfus 1993:
27).” It’s only when this flow of activity is disturbed by some kind of technological
breakdown that the apparently seamless continuity between user and tool is
broken.
In the moment of breaking down the tool becomes un-ready-to-hand, or, in
Heidegger’s more often used term, present-at-hand:
Anything which is un-ready-to-hand … is disturbing to us, and enables us to see
the obstinacy of that with which we must concern ourselves in the first instance
before we do anything else. With this obstinacy, the presence-at-hand of the
ready-to-hand makes itself known in a new way as the Being of that which lies
before us and calls for our attending to it. (Heidegger [1927] 1962: 102)
The hammer appears as an object of consciousness, i.e., it acquires
“hammerness,” only “if it breaks or slips from grasp or mars the wood, or if there
is a nail to be driven and the hammer cannot be found (Winograd and Flores
1986: 36).” Prior to the technological breakdown, then, the hammer is invisibly
folded into the continuum of direct experience. It has no objectness in itself, but
rather disappears into the purposefulness of action. The moment of its acquiring
the status of object coincides with a disturbance to the accomplishment of the
purpose for which the activity, in the first instance, was undertaken:
When an assignment has been disturbed—when something is unusable for some
purpose—then the assignment becomes explicit. (Heidegger [1927] 1962: 105)
Hubert Dreyfus recasts Heidegger’s distinction between the ready-to-hand
and the present-at-hand in psychological terms. He suggests that it is only when
purposeful activity is disturbed that “a conscious subject with self-referential
mental states directed toward determinate objects with properties gradually
emerges (Dreyfus 1991: 71).” That is, direct, immediate experience is
supplanted by abstract and reflective experience when a breakdown forces the
tool user to perceive the tool in abstract terms, and to reflect
on the context in which action and intention are embedded. There is a back-and-
forth in experience, then, between direct and abstract modes of engaging the
world. That both modes are experienced by the same body points to a
fundamental duality of embodied experience, or a double embodiment.4
4 I borrow the term "double embodiment" from Varela, Thompson and Rosch's The
Embodied Mind (Varela, Thompson, and Rosch 1991), who in turn base their coinage on
Merleau-Ponty's notion of embodiment:
We hold with Merleau-Ponty that Western scientific culture requires that we see
our bodies both as physical structures and as lived, experiential structures—in
short, as both "outer" and "inner," biological and phenomenological. These two
sides of embodiment are obviously not opposed. Instead, we continuously circulate
back and forth between them. Merleau-Ponty recognized that we cannot
understand this circulation without a detailed investigation of its fundamental axis,
namely, the embodiment of knowledge, cognition, and experience. (Varela,
Thompson, and Rosch 1991: xv-xvi)
At first glance, it would seem contradictory to speak of abstract reflection as a
subset of embodied experience. It does not, for example, satisfy the criteria of
embodied activity that I laid out in Chapter 1. Further, abstract reflection would
seem to be more or less identical in function to the disembodied reasoning of the
computationalist model of cognition that I outlined in Chapter 2. There are two
critical points here in arriving at a fairly subtle, and inherently paradoxical,
distinction. First, by locating cognitive process entirely within the mechanisms of
the body as lived, the body must necessarily “contain” cognition. To the extent
that abstract reflection forms part of lived experience—at the moment of a
technological breakdown, for example—the experience of disembodiment is quite
literally embodied by the reflective subject. Second, the computationalist model
of cognition does not account for unreflective experience. According to the
computationalist perspective, all activity is mediated by internal representations
of the task domain, and reasoning about potential courses of action. With double
embodiment, such a state of affairs arises only when the flow of unreflective
activity is interrupted. An enactive model of cognition does not, then, dismiss the
reflective state of disembodied reason. Rather, it encompasses it within the lived
experience of the doubly embodied agent at large in the world.
This seemingly paradoxical state of affairs is captured in Merleau-Ponty’s
concept of the “practical cogito (Merleau-Ponty [1945] 2004);” an idea that, in a
single turn of phrase, encompasses both direct action and abstract reflection. For
Merleau-Ponty, as for Heidegger, the phenomenological project is in the first
instance concerned with reversing the Cartesian axiom; with the substitution of
practical understanding for abstract understanding, and with the placement of an
“I can” prior to the “I think (Merleau-Ponty [1945] 2004: 137).” The crucial factor
in addressing the apparent contradiction between direct action and abstract
reflection is to situate both within the context of the unfolding of activity and
cognitive skill in a temporal context:
There is, indeed, a contradiction, as long as we operate within being, but the
contradiction disappears…if we operate in time, and if we manage to understand
time as the measure of being. (Merleau-Ponty [1945] 2004: 330)
Embodied being, then, encompasses both reflective and unreflective experience.
And in the unfolding of being that conforms to the enactive model of cognition,
“These two sides of embodiment are obviously not opposed. Instead, we
continuously circulate back and forth between them (Varela, Thompson, and
Rosch 1991: xv).” Indeed, it is through this circulating back and forth, through
what Varela et al. have termed “a fundamental circularity (Varela, Thompson,
and Rosch 1991),” that perceptual, actional and cognitive skills develop, hand in
hand. Enaction does admit a mind/body dualism, then: it “encompasses both the
body as a lived, experiential structure and the body as the context or milieu of
cognitive mechanisms (Varela, Thompson, and Rosch 1991: xvi).” But the
moment in which the agent becomes subjectively conscious of her body, and of
her body’s objective relations to the objects arrayed before it, is only ever
transitory. At the moment that activity resumes, the body recedes into the
background, and its objects withdraw into the immediacy of the task.
As I argued in Chapter 2, the computer-as-it-comes precludes embodied
forms of activity. It does not allow for a motility that is situated, timely,
multimodal, and engaging. In short, it keeps the user in a state of disconnection from
the tool; a disconnect that is reinforced by the symbolic representationalist
underpinnings of conventional computer interfaces. What I have endeavored to
show here is that this disconnect is a factor in experience, and so when turning to
design, it should not be discounted. But that form of direct experience that
Heidegger termed the ready-to-hand—a notion that is more or less synonymous
with the notion of embodied activity that I outlined in Chapter 1, and is our
natural way of galvanizing tools and working within our everyday environments—
is missing from the conventional interactional paradigms with the computer-as-it-
comes.5 While some authors have suggested that we should explicitly factor the
Heideggerean breakdown into our music interface models (Di Scipio 1997;
Hamman 1997, 1999), they also place emphasis on non-real-time music
production (composition), rather than the processes of real-time music
production (performance) with which I am specifically concerned. I suggest,
rather, that with a view to designing enactive instruments, attention should be
5 Winograd and Flores present an extensive analysis of the conventional metaphors of
computer science in relation to a Heideggerean ontology in Understanding Computers and
Cognition (Winograd and Flores 1986).
directed at maximizing the potential for fully engaged and direct experience. As
with the hammer, or with any other tool, we can expect that breakdowns will
happen in the course of everyday practice. Such breakdowns are essential, for
example, to the incremental adaptive process of learning to play a conventional
acoustic instrument.6 My focus, then, when turning to issues of design, will not be
directed at engineering breakdowns, but rather at engineering the potential for
the desired kind of breakdowns. In terms of the technical implementation, the
measure will be resistance.
3.3 Structural Coupling
The world is inseparable from the subject, but from a subject which is nothing but
a project of the world, and the subject is inseparable from the world, but from a
world which the subject itself projects.
— Maurice Merleau-Ponty, The Phenomenology of Perception
Although I’ve already suggested that double embodiment and structural coupling
address, respectively, mind/body and body/world dualisms, it would be more
accurate to say that both double embodiment and structural coupling address the
mind/body/world continuum with an emphasis on different processes. The world
6 Later in the chapter (3.5) I outline this adaptive process in detail with specific
reference to the role of breakdowns.
obviously figures in the double embodiment analysis: it is the context in which
action is embedded. In much the same way, the mind figures in structural
coupling: it is the locus of cognitive emergence over a history of interactions
between body and world. But where the emphasis in double embodiment is on
the oscillatory nature of mental engagement in an interactional context, the
emphasis in structural coupling is on the circular processes of causation and
specification that pertain between the agent and the environment. More
specifically, structural coupling draws a dividing line between body and world in
description and schematization—i.e., it enforces a separation—in order to
demonstrate the inseparability of one from the other in the unfolding of a
coextensive interactional milieu, and in the emergence of performative and
cognitive patterns and competencies.
In early formulations (Maturana and Varela 1980, 1987), the concept of
structural coupling was applied to evolutionary biology. It presented an analysis
of the interactions between an organism and its environment (where the
environment may include other organisms), with a view to their mutual
adaptation and coevolution. More specifically, it addressed the circular and
reciprocal nature of these interactions. The coupling between organism and
environment is “structural” because, as the organism and the environment
exchange matter and energy, their respective structures, and hence the structure
of their interactions, are changed as a function of the exchange. The process is
captured neatly in Maturana and Varela’s definition of an autopoietic machine:
An autopoietic machine is a machine organized (defined as a unity) as a network
of processes of production (transformation and destruction) of components that
produce the components which: (i) through their interactions and transformations
continuously regenerate and realize the network of processes (relations) that
produced them; and (ii) constitute it (the machine) as a concrete unity in the
space in which they (the components) exist by specifying the topological domain
of its realization as such a network. (Maturana and Varela 1980: 78-79)
Over a history of exchanges between organism and environment, there is an
increasing regularization of structure, i.e., a continuous realization of “the
network of processes,” such that both organism and environment are more viably
adapted to productive exchange, and such that those exchanges strengthen the
conditions for continued interaction.
Structural coupling is a key component of the enactivist model of cognition. In
Varela, Thompson, and Rosch’s formulation, it is the very mechanism by which
cognitive properties emerge:
Question 1: What is cognition?
Answer: Enaction: A history of structural coupling that brings forth a world.
(Varela, Thompson, and Rosch 1991: 206)
The world that is brought forth, or enacted, by the agent, traverses the divide
between agent and environment. In contrast to the computationalist subject—
who reasons about an external world in an internal domain of symbolic
representation—the enactive subject actively realizes the world through the
connection of the nervous system to the sensory and motor surfaces which, in
turn, connect the embodied agent to the environment within the course of action.
The fully developed notion of structural coupling, then, emphasizes the
inseparability of agent and environment in embodied cognition, but at the same
time locates the points at which agent and environment intersect, and offers an
explanation as to how repetitive contacts at these points of intersection can lead
to incrementally more complex states of functioning on the part of the cognitive
system.
There is a certain push and pull of physical forces between agent and
environment that constitutes a critical aspect of their structural coupling. In other
words, structural coupling implies physical constraints and feedback. The
contingencies and specificities of the agent’s embodiment form one such
constraint, and it is a constraint that is in an ongoing state of transformation as
the agent acquires and develops motor skills, or finds herself in new or changing
environments with new or changing actional priorities. Physical constraints also
exist within the environment, and these forces act upon the agent’s body within
the course of activity, and so play a critical role in the emergence of embodied
practices and habits. This push and pull between agent and environment has a
dynamic contour, and this is where the “hard dividing line” that we may draw
between them must necessarily be qualified. The dividing line is rather more
pliable; a quality that is tidily encapsulated in a schematization by Hillel Chiel and
Randall Beer (Figure 3.1).
Figure 3.1. Interactions between the nervous system, the body
(sensorimotor surfaces), and the environment (from Chiel and Beer
(1997)). Chiel and Beer’s commentary: The nervous system (NS) is
embedded within a body, which in turn is embedded within the
environment. The nervous system, the body, and the environment are
each rich, complicated, highly structured dynamical systems, which are
coupled to one another, and adaptive behavior emerges from the
interactions of all three systems.
In Chiel and Beer’s diagram, the dividing lines between body and environment,
and between nervous system and body, are clearly distinct, but they are not
rigid. The push and pull between each of the components in the interactional
domain is indicated by projecting triangular regions. It’s clear that a “push” on
one side of the body-environment divide results in a proportionate “pull” on the
other, and vice versa. The “body” consists of sensory inputs and motor outputs,
and contains the nervous system, which is connected to the sensorimotor surface
through the same dynamical “push-pull” patterns that connect the body to the
environment. There is, then, a fluid complementarity between environment,
body, and nervous system, of which Chiel and Beer’s diagram provides an
instantaneous snapshot. To capture the properly dynamical nature of this
complementarity, the diagram would need to be animated. We would then see
the projecting triangular regions extend and contract in regular (though not
necessarily periodic) oscillatory patterns, and these motions would provide a view
of the continuous balancing of energies between agent and environment as the
play of physically constrained action unfolds over time.
These kinds of exchanges may be more or less stable in terms of the impact
of environmental dynamics on agent dynamics, and vice versa. And they may
demand more or less of the agent’s cognitive resources, depending on the
potential complexity of balancing the intentionality of the agent with the
environmental contingencies. What we see is a transfer function—a map—from
agent to environment and back again, that, from one interaction to the next, may
exhibit linear, nonlinear, or even random behavior. Beer has suggested that when
embodied agent and environment are coupled through interaction, they form a
nonautonomous dynamical system (Beer 1996, 1997). It’s a perspective that has
also been adopted by a handful of cognitive scientists as an explanatory
mechanism for the emergence of cognitive structures through interactional
dynamics (Hutchins 1995; Thelen 1994). Although it doesn’t form an explicit part
of Varela and Maturana’s original formulation, the dynamical systems approach
provides a potentially useful way of both understanding and schematizing
structural coupling. I will return to this point in my outline of implementational
models in Chapter 4.
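By way of a minimal illustration (offered here as a sketch rather than as Beer’s own formulation), the coupling can be written as two simple state equations integrated forward in time, neither of which is autonomous: the agent’s rate of change depends on a sensory map of the environment’s state, and the environment’s rate of change depends on a motor map of the agent’s state. All of the variables, coupling functions and coefficients below are arbitrary placeholders.

```python
import math

def simulate(steps=1000, dt=0.01):
    """Toy sketch of agent-environment structural coupling.

    The agent's dynamics are driven by a sensory map S of the environment's
    state, and the environment's dynamics are driven by a motor map M of the
    agent's state.  Taken alone, neither equation is self-contained.
    """
    agent, env = 0.1, 0.0                 # arbitrary initial states
    trajectory = []
    for _ in range(steps):
        s = math.tanh(env)                # S: environment -> agent's sensory surface
        m = math.tanh(agent)              # M: agent's motor surface -> environment
        d_agent = -agent + 2.0 * s        # agent's change depends on what it senses
        d_env = -env + 1.5 * m            # environment's change depends on the agent's action
        agent += dt * d_agent
        env += dt * d_env
        trajectory.append((agent, env))
    return trajectory
```

Coupled in this way, the long-term behavior of either subsystem can only be understood by tracking the pair together, which is one way of stating why structural coupling resists decomposition into two autonomous systems.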
There are two fundamental and seemingly contradictory aims in viewing
interactions between an embodied agent and its environment as a process of
structural coupling: 1. to emphasize the inseparability of agent and environment,
and 2. to locate the points at which agent and environment intersect, i.e. their
bounding surfaces. The danger with the analytic part of this formulation is that,
as soon as we’ve drawn the dividing line between agent and environment, it’s
rather easy to view them in isolation, and to understand their respective
behaviors as self-contained properties of autonomous systems. This lands us,
more or less, back within the computationalist model of rationally guided action.
It is therefore precisely at this point that the mechanics of the agent-environment
connection need to be described.
We will see a disconnect in schematizations of both the computationalist and
the enactive models of action. On one side, the agent; on the other, the world.
But what distinguishes the enactive model from the computationalist model is the
formation of a larger unity between agent and world through dynamical
processes of embodied interaction and adaptation. These processes are
characterized by crossings of the divide, by the “push and pull” between coupled
physical systems, and by a form of experience that, rather than being lived
through a world of abstract inner contemplation, is lived directly at the points
where the sensorimotor system coincides with the environment in which it is
embedded. Although we can delineate the boundary between agent and
environment in an abstract diagram of their interactional milieu, such a diagram
will not capture the experiential aspect of embodied interaction. The agent does
not feel herself to be separate from the world in which she is acting but, rather,
is intimately folded into its dynamics and processes. The “bringing forth of a
world,” that is, of an organismic continuity between agent and environment,
amounts to the moment at which the original severance, or disconnect, ceases to
factor in the agent’s experience. The body is not “as it in fact is, as a thing in
objective space,” but rather constitutes “a system of possible actions, a virtual
body with its phenomenal ‘place’ defined by its task and situation (Merleau-Ponty
[1945] 2004: 250).”
I would argue that, as a matter of definition, when the five criteria of
embodied activity (Chapter 1) are met, a structurally coupled system is inevitably
formed.7 To this extent (and in keeping with Varela, Thompson and Rosch’s
formulation), structural coupling implies enaction, and vice versa. Structural
coupling between performer and instrument will, therefore, be key to the model
of enactive musical performance that I am proposing, and an essential criterion
in design.
7 To be more precise, the first four criteria of embodied activity would form a
structurally coupled system, and the fifth (that embodiment is an emergent phenomenon)
would come for free. As Thelen and Smith point out (Thelen 1994), the emergence of
cognitive, perceptual and actional abilities constitutes the teleological dimension of
structural coupling.
3.4 Towards an Enactive Model of Interaction
The key theoretical components of the essay have now been presented. But
before turning to issues of the design and implementation of enactive digital
musical instruments, it may prove useful to outline the various models of
interaction that I’ve discussed to this point in the form of diagrams. The leap
from theory to implementation is almost always a shaky endeavor, and the
models that I present here may serve as a provisional and necessarily speculative
bridging of the gap between theory and praxis. To that end, the diagrams focus
specifically on human-computer interaction, but remain general, without specifying
hardware and software implementation details (i.e., the interface). It’s the dynamics
of the various models of interaction between human
and computer that form the key concern, with a view to distinguishing their
various implications for the development of human cognition and action. The
underlying rationale, then, is to arrive at a candidate model of enactive
interaction, with the intention of holding this model in view when shifting the
focus to implementation.
There is a basic model of human-computer interaction (figure 3.2) that can be
taken to hold for all subsequent models. The human performs actions at the
computer’s inputs, and these actions cause changes to the state of the computer’s
programs. In turn, the computer transmits output signals representing the state
of its programs, and these signals are perceived by the human.
[Figure 3.2 diagram: HUMAN (PERCEPTION, ACTION) coupled to COMPUTER (INPUT, PROGRAMS, OUTPUT) via the maps S and M.]
Figure 3.2. The basic model of the human-computer interaction loop. S
represents the map from the state of the computer’s output devices to the
human’s sensory inputs, and M represents the map from the human’s
motor activities to the state of the computer’s input devices. Together, the
input and output devices constitute the interface to the programs running
on the computer.
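The loop can also be expressed as a short illustrative sketch in code. None of the names below refer to real software; the point is simply that the basic model specifies the maps S and M and the computer’s programs, while leaving whatever connects perception to action on the human side entirely open.

```python
def run_loop(human, programs, M, S, steps=100):
    """Illustrative sketch of the basic interaction loop of figure 3.2.

    M maps the human's motor activity onto the computer's input devices;
    S maps the computer's output devices onto the human's sensory inputs;
    `programs` advances the state of the computer's programs; `human`
    stands in for whatever links perception to action -- the link the
    basic model leaves unspecified.
    """
    percept, state = None, None
    for _ in range(steps):
        action = human(percept)          # human side: perception -> action
        device_input = M(action)         # map M: motor activity -> input devices
        state, output = programs(state, device_input)
        percept = S(output)              # map S: output devices -> sensory input
    return state
```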
The basic model is, however, incomplete. The human perceives and acts, and
therefore demonstrates intentionality. But there is nothing to link perception to
action. That is, although a cognitive dimension is implied, the model does not
account for it. In fact, the usefulness of the model lies solely in specifying the
basic mechanics of human-computer interaction, and as these mechanics can be
assumed to be unchanging for all subsequent models,8 it can also be assumed
8 To say that the basic mechanics is unchanging is not to say that the interfaces will be
identical. The basic mechanics can be taken to mean the maps from output to perception,
and from action to input. Different interfaces will result in different map dynamics, and
these dynamics will in turn carry different sets of implications for cognition, perception and
action.
that the subsequent models will be distinguished solely by cognitive
considerations. For present purposes, this means the map between perception
and action.
To make the step from the basic model to the conventional model of human-
computer interaction, we need only insert human reasoning between perceiving
and acting (figure 3.3).
[Figure 3.3 diagram: HUMAN (PERCEPTION, REASONING, ACTION) coupled to COMPUTER (INPUT, PROGRAMS, OUTPUT) via the maps S and M.]
Figure 3.3. The basic model extended to include the model of human
activity in conventional HCI. Human actions follow after inner reasoning
about sensory inputs, resulting in a sequential chain of actions, and a
segmentation of the flow of time (see Chapter 2.4). This model is
paradigmatic of what I have termed “the computer-as-it-comes.”
We now have a schematization of the Cartesian subject in the midst of
interaction, and it’s interesting to note the upside-down symmetry on either side
of the human/computer divide.9 The conventional model presumes that the
human reasons about her interactions with the computer in an inner world of
mental abstraction. There is therefore an inevitable time delay between
perception and action, the duration of which is simply as long as it takes to
perform the necessary mental computations. Although they are not detailed in
figure 3.3, it can be assumed that the input and output devices of conventional
HCI serve to reinforce the computationalist ontology from which conventional HCI
derives. To this extent, input devices are ordinarily monomodal and geared to a
single focal point of motor activity (from one moment to the next, either the
mouse or the keyboard), and output devices are ordinarily visuocentric and
geared to a single focal display point (the cursor). When these factors combine in
the form of a device, we have what I have termed “the computer-as-it-comes.” It
can effectively be guaranteed that interactions with the computer-as-it-comes
will be disembodied, at least according to the minimal criteria I set down for
embodied activity in Chapter 1.
9 In the spirit of mechanistic philosophy, we could even relabel “Perception,”
“Reasoning,” and “Action,” respectively as “Input,” “Programs,” and “Output.”
In Chapter 2, I drew a distinction between functional and realizational
interfaces. The distinction rests on the manner in which the interface elicits
particular varieties of action and thought from the human user. While the
terminology places explicit emphasis on the interface and how it is constituted,
the immediate concern lies with the implications of the interface for the
emergence of cognitive, perceptual and actional patterns. In schematizing the
respective interactional paradigms of the functional and realizational interfaces,
then, I have added a further cognitive dimension to the human side of the
computer-as-it-comes model, while the computer side has remained unchanged.
In the diagram of the functional model of interaction (figure 3.4), the added
dimension is labelled “Knowledge.” This knowledge can be considered offline with
regard to activity. That is, it’s an abstract quantity that exists prior to interactions
with the computer, and while it directly informs the ways in which the human
subject perceives and reasons, knowledge is “accessed,” rather than
“constituted,” within the course of action. It can also be assumed that there are
no real-time constraints on the accessing of this knowledge, and that this aspect
reinforces the sense, in user experience, that the knowledge being galvanized is
offline.
[Figure 3.4 diagram: HUMAN (PERCEPTION, REASONING, ACTION, KNOWLEDGE) coupled to COMPUTER (INPUT, PROGRAMS, OUTPUT) via the maps S and M.]
Figure 3.4. The human-computer interaction loop with the functional
interface (see 2.5). The human’s knowledge is leveraged by the
abstractions that comprise the computer’s interface, and this knowledge is
galvanized to guide perception and reasoning, leading to appropriate
action. The functional interface is deterministic; i.e., the goal of the task-
at-hand is known in advance, and the interface is designed to lead to the
accomplishment of this goal while placing minimal cognitive demands on
the human.
I noted in Chapter 2 that functionalism is something of a standard in conventional
interaction design. Through leveraging existing user knowledge, and thereby
minimizing the cognitive load, the task domain and its end goals are made as
transparent as possible. While the approach has a great many advantages for
routine activities with computers, it is not advantageous to activities that are
dynamic or nondeterministic by nature.
In figure 3.5, “Knowledge” is relabelled “Realization,” and the links between
“Realization” and “Perception,” and “Realization” and “Reasoning,” are now
bidirectional.
[Figure 3.5 diagram: HUMAN (PERCEPTION, REASONING, ACTION, REALIZATION) coupled to COMPUTER (INPUT, PROGRAMS, OUTPUT) via the maps S and M.]
Figure 3.5. The human-computer interaction loop with the realizational
interface (see 2.5). The key difference between the realizational and the
functional interface lies in the cognitive demands they place on the human.
Whereas human knowledge can be considered static in functional
interactions, it is dynamic in realizational interactions. The realizational
interface is nondeterministic; i.e., it brings with it a continuing potential
for new encounters and uses, and human knowledge continues to expand
over a history of interactions. Because the term “knowledge” implies a
fixed state of knowing, it is substituted in the diagram by the more
dynamic and fluid “realization.”
The term “knowledge” implies a static corpus of known facts. It’s precisely this
corpus of “knowns” on which the functional interface draws. The realizational
interface, on the other hand, offers resistance to the user, deliberately prompting
her to new modes of thinking about the task domain. Hence the substitution of
the more dynamic and fluid term “realization.” In figure 3.5, a reasoning stage
still intervenes between the perceiving and acting stages. According to the
criteria of embodied activity, then, the model represents a disembodied mode of
interaction. Nonetheless, an important step has been taken towards the enactive
model. By introducing resistance to the interface—a resistance that requires the
human to fully engage in the activity—the shift is effected from a static and
deterministic model of activity to one that is dynamic and nondeterministic. While
realization is offline to the activity, it still requires that the human commit
continuous and significant cognitive resources to the task, and thus opens the
possibility for the on-going generation of new meanings and modes of thought.
I have defined embodied activity as a state of being that consists in a
merging of action and awareness. That is to say, there is a seamless continuity
between perceiving and acting, experienced as flow. In figure 3.6, the boundaries
between perception, reasoning and action are collapsed, and the continuity
between perceiving and acting is indicated by the label “Perceptually Guided
Action.”
[Figure 3.6 diagram: HUMAN (PERCEPTUALLY GUIDED ACTION) coupled to COMPUTER (INPUT, PROGRAMS, OUTPUT) via the maps S and M.]
Figure 3.6. Embodied Interaction. The perceiving/reasoning/acting
sequence has been collapsed into a fully integrated model of activity.
Perception and action constitute a unity, labelled here as “Perceptually
Guided Action.” This corresponds to the flow of embodied activity (see
1.2), and to Heidegger’s ready-to-hand (see 3.2); there is a merging of
action and awareness, and the sense of disconnect between human and
computer ceases to factor in experience.
This is the first of the schematizations in which the human is represented as a
unity, and it can be assumed that the experience of “oneness” involves the loss
of any sense of disconnect with the computer. I’ve argued that such a mode of
activity is precluded by the computer-as-it-comes, and that this has proven a
major stumbling block in arriving at designs for digital musical instruments that
allow for embodied modes of interaction. The model of activity corresponds to
Heidegger’s ready-to-hand, or in Hubert Dreyfus’ paraphrase, “absorbed coping,”
or, in the rubric that I’ve used throughout the essay, embodied action. As with
the standard model of human-computer interaction (figure 3.2), there is no
explicit focus on conscious mechanisms. Indeed, the distinguishing aspect of the
ready-to-hand is that it is an unconscious, unreflective mode of behavior.
In Chapter 2, I suggested that what distinguishes embodied action from
enaction is the realizational dimension. That is, while the sense of
embodiment may be optimal when cognitive challenges are placed upon the
human agent, such challenges are not prerequisite to embodiment. Cognitive
realization is, however, prerequisite to enaction. To make the step from
embodied action to enaction, then, “Realization” is connected to “Perceptually
Guided Action” through a bidirectional path (figure 3.7).
[Figure 3.7 diagram: HUMAN (PERCEPTUALLY GUIDED ACTION, REALIZATION) coupled to COMPUTER (INPUT, PROGRAMS, OUTPUT) via the maps S and M.]
Figure 3.7. Enaction. Human and computer are structurally coupled
systems (see 3.3). Enaction implies an embodied model of interaction with
a view to cognitive and actional realization. In the enactivist view,
cognition is an embodied phenomenon. It arises through physical
interactions, and in turn shapes the trajectory of future interactions.
There’s a symmetry between the enactive model and that of the realizational
interface (figure 3.5): both include a realizational dimension that is tied, through
reciprocal patterns of determination, to perception and action. And in both
instances, realization is tightly correlated to the resistance that the interface
offers to the human user, and to the cognitive challenges this resistance
presents. But where the realizational interface solicits a mode of activity that is
disembodied and offline, the enactive interface solicits time-constrained
improvised responses that are embodied and online. Another way to view this is
as the difference between, in Elizabeth Preston’s terminology, “representational
and non-representational intentionality (Preston 1988).” Where the realizational
interface is concerned with engineering a representational breakdown—i.e.
deliberately causing a reappraisal of the representations that comprise the
interface; an activity that necessarily involves reasoning, and is therefore
disembodied and offline—the enactive interface is concerned with soliciting new
responses without recourse to inner representations.10 That is, the interface is
10 There are continuing disagreements among cognitive scientists and philosophers of
mind as to whether "inner representations" play a part in direct experience. Although I
take no position in the debate, for the purposes of the present study I assume that inner
representations play no part in direct experience, as this makes it easier to distinguish
between direct and abstract experience. If we were to stick with the idea that humans are
storing the contents of their environment as inner representations at all times, then we
could potentially draw the distinction between abstract and direct experience in terms of
objective and deictic intentionality. Deictic representations were discussed in Chapter 2,
but I will reiterate here. According to Philip Agre, "a deictic ontology ... can be defined
only in indexical and functional terms, that is, in relations to an agent's spatial location,
social position, or current and typical goals or projects (Agre 1997: 242)." With deictic
intentionality, then, we do not relate to an object in terms of its objectness, but in terms
of the role it plays in our activities. And it is because the object is so directly folded into
the actional midst that we encounter it directly rather than abstractly.
encountered directly rather than abstractly, in real time and real space, and
human activity is embodied and online. In the enactive model, then, realization is
an incremental process of cognitive regularization and awareness, stemming from
forces that are directly registered through the body, and at the same time
determining the emergent contour of the body’s unfolding patterns and
trajectories.
The enactive model of interaction represents the ideal performative outcome
of the class of digital musical instruments that I am setting out to define and
describe in this study. Before turning to design, however, it’s important to note
that while the enactive model of interaction represents an idealized “way of
being” in the performative moment, it does not represent the sum total of the
performance practice. Rather, in keeping with Merleau-Ponty’s theory of “double
embodiment,” that performance practice, in addition to the enactive model of
interaction (figure 3.7), would also at various moments involve embodied action
(figure 3.6), and offline realization (figure 3.5). Each of these modalities would
constitute different ways of engaging the same instrument, and the human
performer would routinely cross the lines that distinguish one modality from the
next. There will be “breakdowns,” particularly in the learning stage, which shift
awareness to the “objectness” of the instrument. The instrument will become
present-at-hand; i.e., it will be encountered through a representational
intentionality. Additionally, in the midst of embodied activity, it cannot be
assumed that the instrument will provide endless novelty to the performer;
particularly as, over the course of practice, she becomes more finely adapted to
the instrumental dynamics. At such moments—again, to borrow terminology from
Heidegger—the instrument effectively disappears from use, and becomes ready-
to-hand. In everyday embodied practices, it’s not unusual for these experiential
modalities to be engaged simultaneously. For example, a violinist breaks a string
in the middle of performance, drawing the focus of her attention to the
objectness of the instrument. At the same time, however, she continues playing
on the remaining three strings. With the greater portion of available cognitive
resources allocated to the instrumental breakdown, it’s likely that the act of
playing proceeds without a great deal of reflective thought. We see then a
coincidence of the present-at-hand and the ready-to-hand, as the intentionality
of the performer is divided across different components of the same instrument.
That the same human is able to divide the instantaneous allocation of
cognitive resources into representational and nonrepresentational subcomponents
is nothing extraordinary for a practiced, multi-tasking, doubly embodied
performer. It is also something that happens as a matter of course in the
development of any form of embodied practice, and therefore need not factor in
design. In the particular case of what I have termed enactive digital musical
instruments, then, it can be assumed that if the instrumental implementation
engenders suitable conditions for the enactive model of interaction, the other
modalities—embodied action and offline realization—will invariably follow. The
practical implication for instrument design, then, is that the enactive model is the
only one that need be kept in view.
3.5 The Discontinuous Unfolding of Skill Acquisition
In Merleau-Ponty’s phenomenology, human intentionality is fundamentally
concerned with the body’s manner of relating to objects in the course of
purposive activity. In the broadest sense of the term, it encompasses both
representational and nonrepresentational intentional modes. In using the
umbrella term “intentionality,” then, we can condense the enactive, embodied
action, and offline realization models into a single integrated model, which I have
termed “enactive performance practice (figure 3.8).” The model encompasses the
interdependencies between perception, action and cognitive unfolding within the
circumscribed interactional domain of instrumental practice.
[Figure 3.8 diagram: HUMAN BODY coupled to INSTRUMENT via the maps I and R, with COGNITION unfolding continuously over time.]
Figure 3.8. Enactive performance practice. Human body and instrument
are unities, and cognitive abilities emerge over time through the
continuous and embodied circular interactions between them. As these
cognitive abilities develop, there is an incremental regularization of the
performative patterns of the body, and of the dynamics of the body-
instrument interactions. I represents the map from human intentionality to
the instrument, while R represents the map from the instrument’s
reactions back to the human.
While the enactive performance practice model is too general to be useful in
design, it does serve to encapsulate all the key facets of the interaction paradigm
I’ve set out to describe. The human acts purposefully through her body,
exemplifying an intentionality. Her bodily actions are transduced by the
instrument and lead to a reaction. The instrumental reactions are perceived by
the human, and these perceptions, as they are registered in the body, modulate
her intentionality, and thus her ongoing reactions and bodily dispositions. The
process could be schematized as a bidirectional exchange, but we get closer to
the flux of the performance experience if the interactions are viewed as circular
and continuous. The cognitive dimension is not independent of these interactions,
but rather is folded into them through realization. Over time, cognitive abilities
continue to develop, as the body continues to adapt to the dynamics of the
interactional domain. Although cognition and the body are indicated as distinct
entities in figure 3.8, this is solely for the purposes of clarity. It should be kept in
mind that cognition is an embodied phenomenon, realized at the connections
between the nervous system, the sensorimotor surfaces, and the environment.11
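For readers who prefer the dynamical systems vocabulary introduced in 3.3, the circular and continuous character of this loop might be notated, purely as an illustrative sketch, as a pair of coupled equations. Only the maps I and R come from the diagram; the intrinsic dynamics Φ and Ψ are assumed placeholders for whatever the instrument and the body actually do:

\[
\dot{x}_{\mathrm{instrument}} = \Phi\bigl(x_{\mathrm{instrument}},\, I(x_{\mathrm{body}})\bigr),
\qquad
\dot{x}_{\mathrm{body}} = \Psi\bigl(x_{\mathrm{body}},\, R(x_{\mathrm{instrument}})\bigr)
\]

Neither equation stands alone; on this reading, cognition is not a third term added to the pair, but a regularity that emerges in their joint unfolding over time.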
Enactive performance practice as I’ve outlined it here is consistent with
Merleau-Ponty’s notion of the intentional arc (see Chapter 2). The “arc” metaphor
is interesting, as it implies a continuity in the acquisition of perceptual, actional
and cognitive skills; a continuity that is also implied in the unbroken trajectory of
cognitive unfolding in figure 3.8. As long as enactive performance practice—and
also the intentional arc—can be said to encompass representational and
nonrepresentational modes, this doesn’t present a problem. At the same time,
however, the model does not accurately reflect the ways in which the modes of
bodily relation to an instrument are transformed over the course of cognitive
11 In this essay, the environment can be taken to comprise the instrument, and an
idealized physical space in which the instrument's outputs might be optimally perceived by
the human performer. In real practice, of course, the environment may include any
manner of physical spaces, humans, other animals, etc. While such features of the
environment will inevitably play a part in the emergence and formation of performer
intentionality, it's beyond the scope of this study to factor them into consideration.
unfolding; i.e. it does not account for the intrinsically discontinuous back-and-
forth between the present-at-hand and the ready-to-hand that characterizes the
acquisition of skill. This is an especially important point when considering the
acquisition of realizational skills, such as learning to play a musical instrument.
Before moving on to issues of implementation, then, it’s worth considering the
ways in which human bodily ways of being are transformed within the process of
acquiring a specific skill. I’ll do this by drawing out some correspondences
between two texts, Hubert Dreyfus’ “The Current Relevance of Merleau-Ponty’s
Phenomenology of Embodiment (Dreyfus 1996),” and David Sudnow’s Ways of
the Hand (Sudnow 2001).
Dreyfus sets out in his article to “lay out more fully than Merleau-Ponty does,
how our relation to the world is transformed as we acquire a skill (Dreyfus
1996:6).”12 He does this by dividing the temporal unfolding of skill acquisition
into five distinct stages—“Novice,” “Advanced beginner,” “Competence,”
“Proficient,” and “Expertise”—where each stage is characterized by specific bodily
ways of relating to the task environment in question. Dreyfus assumes “the case
of an adult acquiring a skill by instruction (Dreyfus 1996:6),” and illustrates his
argument with two examples: learning to drive a car, and learning to play chess.
In the discussion that follows, I will borrow from Dreyfus’ decomposition of the
intentional arc into five distinct stages, but will illustrate the argument with an
12 The numbering system in citations of Dreyfus' article refers to the paragraph number
of the online text.
example that is more immediately pertinent to the present study: learning to
improvise with a musical instrument. Sudnow’s Ways of the Hand—a detailed first
person “production account” of the gradual acquisition of skill as a jazz pianist—is
in this regard an ideal candidate.
Dreyfus’ “Novice” stage begins with the reduction of the task environment
into explicit representations of the elements of which the environment is
composed:
Normally, the instruction process begins with the instructor decomposing the task
environment into context-free features which the beginner can recognize without
benefit of experience in the task domain. The beginner is then given rules for
determining actions on the basis of these features, like a computer following a
program. (Dreyfus 1996:7)
That the features of the environment are “context-free” implies that the focus of
activity is directed towards connecting the body to the instrument—i.e.,
establishing a “grip”—in the proper place and with the proper alignments, but
without any explicit regard as to how these alignments will eventually fold into
the context of embodied, time-constrained performance. For Sudnow, the
features of the task environment were chords, and the proper alignments were
the voicing of those chords:
In early lessons with my new teacher the topic was chord construction, or voicing,
playing a chord’s tones in nicely distributed ways. (Sudnow 2001:12)
The proper “place” of the chords was determined by the specific configuration of
piano keys that the hand would need to engage. It’s interesting to note the
“substantial initial awkwardness” that Sudnow describes in the complex of
lookings and graspings that characterize this stage:
I would find a particular chord, groping to put each finger into a good spot,
arranging the individual fingers a bit to find a way for the hand to feel
comfortable, and, having gained a hold on the chord, getting a good grasp, I’d let
it go, then look back to the keyboard—only to find the visual and manual hold
hadn’t yet been well established. I had to take up the chord again in terms of its
constitution, find the individual notes again, build it up from the scratch of its
broken parts. (Sudnow 2001:12)
The mode of engagement here is clearly that of the present-at-hand. Each note
of the chord is mentally associated with an individual finger before the hand gains
a hold on the chord as a whole. The chord, then—the initial “context-free” feature
of the environment—is itself decomposed into individual features. And this
decomposition demands an on-going coordination between an abstract mental
image of the task at hand and the accomplishment of the task. As Sudnow notes,
“lots of searching and looking are first required (Sudnow 2001:12).”
In Dreyfus’ taxonomy, the “Advanced beginner” stage is characterized by the
emergence of a degree of contextual recognition:
As the novice gains experience actually coping with real situations, he begins to
note, or an instructor points out, perspicuous examples of meaningful additional
aspects of the situation. After seeing a sufficient number of examples, the student
learns to recognize them. Instructional maxims now can refer to these new
situational aspects, recognized on the basis of experience, as well as to the
objectively defined non-situational features recognizable by the novice. (Dreyfus
1996:10)
The “situational aspects” here point to an initial emergence of gestalts; i.e. of the
tendency to regard coordinated actions—such as the playing of a chord—not as
the combined motions of individual fingers, but as a single, integrated motion of
the hands:
As my hands began to form constellations, the scope of my looking
correspondingly grasped the chord as a whole, seeing not its note-for-noteness
but its configuration against the broader visual field of the terrain. (Sudnow
2001:13)
It’s important to note, however, that such gestalts remain limited to isolated and
non-time-pressured events. The context that the performer is beginning to
glimpse, then, remains offline. The perceptual recognition of places and
alignments is beginning to occur at a higher level of scale, but this recognition is
neither situated (in the sense that one place and alignment might lead to a next
place and alignment, or that it might be solicited by some other pressing
constraint in the environment, or both) nor timely (in the sense that the
transition from one place and alignment to a next must satisfy timing constraints
in the broader context of a performance). It is at the next stage of skill
acquisition that such factors enter the equation.
Dreyfus’ designation for the third stage of skill acquisition—“Competence”—is
potentially misleading. It would perhaps be more accurate to say that
competence emerges towards the end of the third stage, where the stage as a
whole is characterized by a gradually increasing capacity for dealing with the
online aspects of performance; i.e. for situated and timely musical utterances.
The beginning of the third stage is marked, however, by anything but a sense of
performative competence. Rather, the disparity between the level of skill
accomplished thus far and a newly gained understanding of the larger context of
performance—i.e., its online aspects—leads to a sense of frustration. This
frustration is born specifically of the body’s inability to adequately respond to
the seemingly overwhelming online demands of performance:
With more experience, the number of potentially relevant elements of a real-world
situation that the learner is able to recognize becomes overwhelming. At this
point, since a sense of what is important in any particular situation is missing,
performance becomes nerve-wracking and exhausting, and the student might
wonder how anybody ever masters the skill. (Dreyfus 1996:13)
Interestingly enough, Sudnow’s first public performance took place at precisely
this stage in his development. It’s worth quoting his account in full:
The music wasn’t mine. It was going on all around me. I was in the midst of a
music the way a lost newcomer finds himself suddenly in the midst of a Mexico
City traffic circle, with no humor in the situation, for I was up there trying to do
this jazz I’d practiced nearly all day, there were friends I’d invited to join me, and
the musicians I’d begun to know. I was on a bucking bronco of my own body’s
doings, situated in the midst of these surrounding affairs. Between the chord-
changing beat of my left hand at more or less regular intervals according to the
chart, the melodic movements of the right, and the rather more smoothly
managed and securely pulsing background of the bass player and drummer, there
obtained the most alienative relations. (Sudnow 2001:33)
The gap between motor intentionality and motor ability led to a music that “was
literally out of hand (Sudnow 2001:35).” It also led to Sudnow shying away from
further public performances for a period of several years.
Dreyfus notes that the performer normally responds to the newly discovered
enormity of the task at hand by adopting a “hierarchical perspective,” and by
deciding upon a route that “determines which elements of the situation are to be
treated as important and which ones can be ignored (Dreyfus 1996:14).” In
short, the task is again reduced to individual components. But unlike the concrete
components of activity that constitute the “context-free” features of the “Novice”
stage, the components of the “Competence” stage are rather more context-
bound:
The competent performer thus seeks new rules and reasoning procedures to
decide upon a plan or perspective. But these rules are not as easily come by as
the rules given beginners in texts and lectures. The problem is that there are a
vast number of different situations that the learner may encounter, many differing
from each other in subtle, nuanced, ways. There are, in fact, more situations than
can be named or precisely defined so no one can prepare for the learner a list of
what to do in each possible situation. Competent performers, therefore, have to
decide for themselves what plan to choose without being sure that it will be
appropriate in the particular situation. (Dreyfus 1996:15)
For Sudnow, the plan was to work towards a “melodic intentionality” by
extending in practice his acquired embodied knowledge of isolated chords to
patterned sequences of chords, as well as sequences comprised of the individual
notes that those chords contain. Not coincidentally, this plan was decided upon
without input from his teacher, or guidance from “texts and lectures”:
At first, and for some time, this was a largely conceptual process. I’d think: “major
triad on the second note of the scale, now again,” then “diminished on the third
and a repeat for the next,” doing hosts of calculating and guidance operations of
this sort in the course of play. (Sudnow 2001:43)
And in due course, gestalts began to emerge at the level of the sequence, rather
than appearing solely at the level of the event:
A small sequence of notes was played, then a next followed. As the abilities of my
hand developed, I found myself for the first time coming into position to begin to
do such melodic work with respect to these courses. (Sudnow 2001:43)
The emergence of these gestalts is more or less equivalent to what Sudnow
describes as “the emergence of a melodic intentionality”:
... an express aiming for sounds, was dependent in my experience upon the
acquisition of facilities that made it possible, and it wasn’t as though in my prior
work I had been trying and failing to make coherent note-to-note melodies.
Motivated so predominantly toward the rapid course, frustrated in my attempts to
reproduce recorded passages, I had left dormant whatever skills for melodic
construction I may have had. The simplest sorts of melody-making entailed a
note-to-note intentionality that had been extraordinarily deemphasized by virtue
of the isolated ways in which I’d been learning.
It’s precisely in this emerging capacity to form fully articulated phrases that the
performer achieves a degree of competence. Though the performer is not yet a native speaker of
the language, there is nonetheless a fledgling facility for forming coherent
sentences.
Dreyfus’ characterization of the “Proficient” stage is particularly interesting in
terms of the Heideggerean opposition between the present-at-hand and the
ready-to-hand:
Suppose that events are experienced with involvement as the learner practices his
skill, and that, as the result of both positive and negative experiences, responses
are either strengthened or inhibited. Should this happen, the performer’s theory of
the skill, as represented by rules and principles will gradually be replaced by
situational discriminations accompanied by associated responses. Proficiency
seems to develop if, and only if, experience is assimilated in this atheoretical way
and intuitive behavior replaces reasoned responses. (Dreyfus 1996:20)
These “situational discriminations” of “intuitive behavior” point explicitly to the
mode of “absorbed coping” that is definitive of the ready-to-hand. And it’s
precisely in the ready-to-hand that “experience is assimilated”; i.e., it is
embodied by the experiencing subject. With an increase in embodied skill, then,
there is also an increase in the ratio of ready-to-hand to present-at-hand modes
of engagement:
As the brain of the performer acquires the ability to discriminate between a variety
of situations entered into with concern and involvement, plans are intuitively
evoked and certain aspects stand out as important without the learner standing
back and choosing those plans or deciding to adopt that perspective. Action
becomes easier and less stressful as the learner simply sees what needs to be
achieved rather than deciding, by a calculative procedure, which of several
possible alternatives should be selected. There is less doubt that what one is
trying to accomplish is appropriate when the goal is simply obvious rather than the
winner of a complex competition. In fact, at the moment of involved intuitive
response there can be no doubt, since doubt comes only with detached evaluation
of performance. (Dreyfus 1996:21)
The “Proficient” stage is, however, still comprised of a generous quota of
moments characterized by a mode of “detached evaluation”; i.e., the present-at-
hand. And it’s interesting to note the way in which this can directly conflict with
“intuitive behavior”:
No sooner did I try to latch onto a piece of good-sounding jazz that would seem
just to come out in the midst of my improvisations, than it would be undermined,
as, when one first gets the knack of a complex skill like riding a bicycle or skiing,
the very first attempt to sustain an easeful management undercuts it. You struggle
to stay balanced, keep failing, then several revolutions of the pedals occur, the
bicycle seems to go off on its own, you try to keep it up, and it disintegrates. Yet
there’s no question but that the hang of it was glimpsed, the bicycle seemed to do
the riding by itself, and essence of the experience was tasted with a “this is it”
feeling, like a revelation. (Sudnow 2001:76)
What we see is the paradigmatic Heideggerean “breakdown”; the catalyst that
effects the shift from a ready-to-hand to a present-at-hand mode of perceiving
the task environment. The occurrence of such breakdowns is directly related to
the number and type of skills the performer has managed to assimilate in the
course of interactions with the environment up to the moment in question. Or,
more specifically, the occurrence of breakdowns is directly related to the number
and type of skills the performer has not managed to assimilate:
The proficient performer simply has not yet had enough experience with the wide
variety of possible responses to each of the situations he or she can now
discriminate to have rendered the best response automatic. For this reason, the
proficient performer, seeing the goal and the important features of the situation,
must still decide what to do. To decide, he falls back on detached, rule-based
determination of actions. (Dreyfus 1996:22)
What distinguishes the “Proficient” stage from the “Competent” stage is a
shift to a yet higher level of articulational scale. That is, from the level of the
individual phrase or sentence to the level of, perhaps, a discussion or argument.
What distinguishes the “Proficient” stage from the “Expertise” stage, however, is
the continuity of the discourse. A continuity that—in the case of proficiency—is
rendered discontinuous by the intrusion of breakdowns. Sudnow also uses a
linguistic analogy:
From a virtual hodgepodge of phonemes and approximate paralinguistics, a
sentence structure was slowly taking form, sayings now being attempted, themes
starting to achieve some cogent management. But at the same time, courses of
action were being sustained that faded and disintegrated into stammerings and
stutterings, connectives yet to become integrally part of the process. (Sudnow
2001:56)
It’s these “connectives”—“a way of making the best of things continuously
(Sudnow 2001:59)”—that gradually fall into place over the course of sustained
practice. With this falling into place, and with the embodiment of ever more
refined responses to the dynamical contingencies of the environment, the
occurrence of breakdowns—i.e. the solicitation of self-conscious thought, and the
catalyst of “stammerings and stutterings”—becomes increasingly infrequent.
I’ve already suggested that a capacity for continuous intuitive interactional
response to environmental dynamics is definitive of what Dreyfus describes as
the “Expertise” stage. But Dreyfus also points to a greater refinement in these
responses than is found in the responses typical of the
“Proficient” stage:
The expert not only knows what needs to be achieved, based on mature and
practiced situational discrimination, but also knows how to achieve the goal. A
more subtle and refined discrimination ability is what distinguishes the expert from
the proficient performer, with further discrimination among situations all seen as
similar with respect to plan or perspective distinguishing those situations requiring
one action from those demanding another. (Dreyfus 1996:25)
More specifically, he suggests that discriminating ability and a continuity of
response are necessarily linked criteria of expertise:
With enough experience with a variety of situations, all seen from the same
perspective but requiring different tactical decisions, the proficient performer
gradually decomposes this class of situations into subclasses, each of which share
the same decision, single action, or tactic. This allows the immediate intuitive
response to each situation which is characteristic of expertise. (Dreyfus 1996:25)
The lessons learned from breakdowns during the “Proficient” stage, then, have
enabled the expert performer to respond to the same conditions from which
those breakdowns emerged in a timely and unselfconscious manner. Actions are
perceptually guided, the performer is immersed in the activity, and the “I think” is
supplanted by an “I can”:
I’d see a stretch of melody suddenly appear, unlike others I’d seen, seemingly
because of something I was doing, though my fingers went to places to which I
didn’t feel I’d specifically taken them. Certain right notes played in certain right
ways appeared just to get done, in a little strip of play that’d go by before I got a
good look at it. (Sudnow 2001:76)
With the refinement of dispositional abilities, there also emerges a parallel
refinement of articulational fluency:
I could hear it. I could hear a bit of that language being well spoken, could
recognize that I’d done a saying in that language, in fact for the very first time, a
saying particularly said in all of its detail: its pitches, intensities, pacings,
durations, accentings—a saying said just so. (Sudnow 2001:78)
At this point in the discontinuous unfolding of skill acquisition, the performer
embodies perceptual, actional and cognitive capacities that, in suitable
performance circumstances, enable the experience of flow.
In light of the apparent discontinuities of skill acquisition, it may be worth
revising the diagram of figure 3.8, in which cognitive unfolding is indicated as
continuous over time. In figure 3.9, the temporal dimension is segmented into
discrete blocks corresponding to Dreyfus’ five stages of skill acquisition.
[Figure 3.9 diagram: HUMAN BODY coupled to INSTRUMENT via the maps I and R, with SKILL unfolding over time across the five stages: 1. Novice, 2. Advanced beginner, 3. Competence, 4. Proficient, 5. Expertise.]
Figure 3.9. A detailed view of enactive performance practice,
encompassing the discontinuous unfolding of skill acquisition. “Skill” is
indicative of cognitive, motor and perceptual skills. It is also indicative of
the developing capacity for coordination between all three. I represents
the map from human intentionality to the instrument, while R represents
the map from the instrument’s reactions back to the human.
“Skill” replaces “Cognition” in this diagram, where “skill” can be said to
encompass cognitive, motor and perceptual skills, as well as the capacity for
coordination among the three components in both reflective and unreflective
behavior. A more accurate model yet might indicate the changing nature of
human body/instrument relations over each of the five stages of skill acquisition,
but as it stands, the diagram of the continuous and circular human/instrument
interaction loop is sufficiently general to be applicable at each of the stages.
Sudnow’s account in Ways of the Hand is representative of what I have
termed an enactive performance practice. But there is nothing particularly
extraordinary about the way in which his skills were acquired. Given an able body
(and therefore an innate capacity for perception, action and cognition), an
intentionality (e.g. to become an improvising jazz pianist, to produce coherent
sequences of notes, etc.), and a sufficiently responsive instrument (e.g. a piano),
any human subject might follow an analogous course. In Sudnow’s case, these
three prerequisites to enactive performance practice came for free. But my
argument has been that in the case of performance with digital musical
instruments, something fundamental is missing; i.e. a sufficiently responsive
instrument. A sufficient responsiveness is synonymous with what I have referred
to as resistance. And it’s precisely the kind of resistance that an instrument
affords to the intentioned, embodied agent that will determine whether or not
that instrument has the kind of immanent potential that would lead to an
enactive performance practice. Kinds of instrumental resistance, then, will be a
major focus when the discussion turns to issues of implementation in Chapter 4.
3.6 Conclusion
I began this chapter with a discussion of the inevitable paradox in any description
of direct experience. The model of enactive performance practice—an attempt at
such a description—brings the discussion squarely back to this fundamental,
instinctive, and largely unreflective way in which humans, through the agency of
their bodies, relate to the world. This raises the question: if unreflective behavior
is so fundamental to human experience, why go to the trouble of detailing so
many of its particularities? Why not let that which will happen as a matter of
course, happen as a matter of course?
Both Heidegger and Merleau-Ponty viewed their work as opposed to the
mechanistic underpinnings of canonical Western philosophy. In their respective
analyses of mundane, everyday, unreflective activity, there is an agenda to
replace the Cartesian model of subjectivity with that of the embodied agent at
large in the world. I suggested, earlier in the chapter, that a reversal of the
Cartesian axiom constitutes the first concern of the phenomenological project.
The mechanistic and the phenomenological discourses, then, are fundamentally
at odds. And to the extent that technical discourse continues to hinge on the
discourse of mechanistic philosophy, it also continues to be resistant to
phenomenology. My concern, then, has been with outlining a model of human
experience and activity that serves as an alternative to the model routinely
adopted by technical designers, i.e. that of the perpetually disembodied Cartesian
subject. If it is in fact possible to design and build digital musical instruments
that allow for enactive processes to be realized, then we will have done nothing
other than arrive right back at the most fundamental form of human agency.
4 Implementation
4.1 Kinds of Resistance
There are two key assumptions that underlie the enactive model of interaction: 1.
that human activity and behavior have rich, structured dynamics, and 2. that the
kinds of resistance that objects offer to humans in the course of activity are key
to the on-going dynamical structuring of interactional patterns. In the previous
chapter, I was concerned with describing the interactional patterns of an enactive
performance practice with a view to the implications of those patterns for
cognition. Focus was directed at the dynamics of human activity and behavior. In
this chapter, focus is directed at the kinds of resistance that a candidate digital
musical instrument might offer to a human performer in the midst of
performative activity. The underlying concern, then, shifts from theory to
implementation.
I have suggested previously in the essay that conventional acoustic
instruments, because of the resistance they offer to the performer, serve as
useful examples of technical objects that embody the potential for enaction. But
in the huge diversity of mechanisms that we see across the range of acoustic
instruments, there is a proportionate diversity in kinds of resistance. The physical
feedback to the performer that arises in the encounter between bow and string,
for example, is of a different kind to that which comes of the projection of breath
into a length of tubing. We can assume, then, that in much the same way that
the contingencies of human embodiment play a determining role in the dynamical
emergence of performative patterns, so too do the contingencies of instrumental
embodiment. This makes the task of arriving at a universal template for the
design of enactive musical instruments a profoundly complex, if not obviously
impractical undertaking.
In the various models of interaction that I schematized in the previous
chapter, the maps from human motor function to computer input devices, and
from computer output devices to human sensory input, are non-specific in terms
of the particular sensorimotor mechanisms that are activated in the course of
interaction—the models are intended to be as general and universal as possible.
But as soon as we move from interaction diagrams to real world
implementations, a higher degree of specificity is required. If, for example, a
candidate model for an enactive digital musical instrument were to remain
general, there would need to be an account of the myriad ways in which human
energy might be transduced as signals at the computer inputs. In the context of
the present study, rather than attempting to compile a comprehensive catalogue
of implementational possibilities, I will focus on one particular real world
implementation: a digital musical instrument that also happens to represent my
first serious attempt at engaging the essay’s key theoretical issues in the form of
an actual device. This device, as with any musical instrument, offers unique kinds
of resistance to the performer. The final component of the study sets out, then,
to detail the instrument’s implementational specifics, with a view to the various
ways in which its indigenous and particular kinds of resistance may or may not
lend themselves to the development of an enactive performance practice.
Standard human-computer interaction models partition the computer into
three distinct layers: input devices, programs and output devices. This is the
model I employed in the interaction diagrams of Chapter 3, and I will stick with
that model here. It would seem likely, when the core concern is how the
candidate instrument is resistant to the human performer, that the greater
portion of attention would be directed towards input and output devices, i.e.,
hardware. It is at the level of hardware, after all, that the performer actually
physically engages the instrument. But as I pointed out in Chapter 1, digital
instruments constitute a special class of musical devices: their sonic behavior is
not immanent in their material embodiment, but rather, must be programmed.
So, while hardware certainly constitutes more than a passing concern, the
dynamical behavior and resistance of the instrument is to a large degree
encapsulated in its programs. In the pages that follow, I will, therefore, direct a
significant amount of attention to issues of software.
In persisting with the standard division between hardware and software, I
hope also to demonstrate the utility of keeping the two layers separate in the
design process. While I shall be discussing just one specific implementation, I
nonetheless hope to make it apparent that in maintaining a loose coupling
between hardware and software components, the potential for reusing those
components is increased. This is particularly true of software components, which
may at any time in the future need to be integrated into different
implementational contexts, such as a new hardware framework.1 In this way,
any one particular software framework brings with it a certain modest degree of
generality. And to the extent that the framework continues to evolve across
distinct implementations, we may also see the beginnings of—if not a universal
approach to the design of enactive digital instruments—one that is at least
suitably general and robust.
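By way of a deliberately simplified sketch of what such a loose coupling might look like in code (the interfaces and names below are hypothetical, and stand in for whatever sensor-acquisition and synthesis components an implementation actually uses), the mapping and synthesis layers can be written against abstract interfaces, so that a new hardware framework can be substituted without rewriting them.

```python
from typing import Dict, Protocol

class ControlSurface(Protocol):
    """Hypothetical hardware-facing interface: any concrete input hardware
    (a sensor board, a MIDI controller, etc.) need only provide named,
    normalized control values."""
    def read(self) -> Dict[str, float]: ...

class Synth(Protocol):
    """Hypothetical sound-producing component."""
    def set_parameters(self, params: Dict[str, float]) -> None: ...

def mapping_step(surface: ControlSurface, synth: Synth) -> None:
    """One pass of the mapping layer.  Because it depends only on the two
    abstract interfaces above, the same mapping and synthesis code can be
    reused when the hardware layer changes."""
    synth.set_parameters(surface.read())
```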
4.2 Mr. Feely: Hardware
Overview
A device that goes under the name of Mr. Feely represents my first attempt at
the implementation of an enactive digital musical instrument (figure 4.1).
1 For an interesting counter example to this approach, where hardware and software
may in a certain variety of cases be inextricable, see Cook (2004).
Figure 4.1. Mr. Feely.
Mr. Feely’s computational nucleus resides on a miniature x86 compatible
motherboard, running the Linux 2.6 kernel, with patches applied for low latency
audio throughput and for granting scheduling priority to real-time audio threads.
Eight channel audio A/D and D/A hardware, MIDI A/D and D/A boards, and power
conversion modules are located in the same enclosure as the motherboard. One
of the design goals was to create a silent instrument with no moving parts inside
the enclosure. For that reason, the operating system resides on flash memory,
and a specific motherboard/chipset combination was chosen because of its
capacity for fanless operation.
Integration and Instrumentality
Sukandar Kartadinata has used the term “integrated electronic instruments” to
denote a class of devices characterized by an encompassing approach to their
material realization (Kartadinata 2003). “Encompassing” is used here in its most
literal sense: all of the components of which the instrument is comprised—the
input devices, the output devices, and the internal circuitry—are encompassed
within a single physical entity. Kartadinata notes that total integration is not
ubiquitous among conventional acoustic instruments—e.g., the bow is a distinct
physical entity from the body of the violin—but “total integration” is not really the
point of an integrated approach. Rather, emphasis is placed on the coherence of
the instrument; that is, how the material embodiment affords a performative
encounter with a unity. This is in sharp contrast to the sprawl of individual
devices and cables that characterizes “the often lab-like stage setups built around
general purpose computers (Kartadinata 2003:180).”
Integration and coherence of the instrumental embodiment were important
factors in the design of Mr. Feely. From the outset, I had in mind that it was of
critical importance that the instrument should have an instrumentality. This is
suggestive of two different interpretations, both of which figured in my approach
to design, and both of which factor in the perceived coherence of the instrument
to the performer: 1. that the instrument in its material embodiment should be
indicative of a specific purpose, and 2. that the instrument should have the feel
of a musical instrument. It may appear redundant to suggest that an instrument
should be instrumental, but it seemed to me a useful way of distinguishing the
project from those in which the instrument comprises a general purpose, off-the-
shelf computer (with or without an attendant array of peripheral input devices).
Figure 4.2 shows Mr. Feely in the playing position. Because of the
instrument’s weight, it is secured on a stand, but designed to rest in the lap of
the performer.
Figure 4.2. Mr. Feely in the playing position.
The playing position ensures that there is constant physical contact between
performer and instrument. This aspect of the design is tied in—in the most literal
sense—with the aim that the instrument should feel like a musical instrument. In
the act of playing, the contact with the instrumental body is intensified by hand
actions at the control surface, as weight is transferred from the upper body to the
thighs. The sense of the instrument’s physically “being there” is, then,
proportional to the amplitude of the human’s motor energy output. But there is
another aspect to this “being there,” and this is tied in with the way in which the
instrument is indicative of its use. The control surface is situated at the
performer’s centre of gravity, and it is angled (with respect to the performer) in
such a way that it presents itself optimally to the hands, and occupies the focal
ground in the field of vision. It’s not just that the instrument is “there,” but—to
paraphrase Michael Hamman—that it is “so very there” that the opportunity for
action, for physically engaging the controls, makes itself more than readily
apparent.
Control Surface
Unlike the computer-as-it-comes—a general purpose device—Mr. Feely is a
special purpose device. This means that the instrument is intended to be nothing
but a musical instrument, and that it therefore need not accommodate the
multiple representational paradigms required of a multiplicity of possible usages.
An important aspect, then, of the instrument’s instrumentality, is that the
interface is devoid of representational abstractions. In keeping with Rodney
Brooks’ dictum that “the world is its own best model (Brooks 1991),” I avoided
any graphical representations of the sound or its generating mechanisms at the
interface, giving preference to the performer’s perceptions of the sound itself,
and the cross-coupling of these perceptions with the tactile and visual
engagement of the instrument and its input devices. The way in which Mr. Feely’s
interface is different to that of the computer-as-it-comes, then, is equivalent to
the “considerable difference between using the real world as a metaphor for
interaction and using it as a medium for interaction (Dourish 2001:101).”
Three classes of input device are used on Mr. Feely’s control surface: knobs,
buttons and joysticks. The control surface is partitioned into distinct regions
(figure 4.3), which are distinguished by the points in the audio synthesis system
to which they are linked. I will detail the specific functional behaviors and
mapping strategies used to connect the input devices to the audio system in 4.3.
It is, however, worth noting the control surface’s basic partitioning scheme in this
section. Although this unavoidably touches on software issues, the functional
layout of the panel is a hardware concern.
Figure 4.3. Mr. Feely: Control surface partitioning scheme.
Of the eight distinct regions that comprise the control surface, four would
ordinarily be utilized only between periods of performative activity: those labelled
“Display & Patch Control,” “Mute Buttons,” “Global Volume,” and “Power On/Off”
in figure 4.3. The Display and Patch Control section is described under Visual
Display below; the functions of the other three sections are self-explanatory. The
four remaining control surface regions—labelled “Channel Section,” “Global
Section,” “Joysticks,” and “Variants” in figure 4.3—indicate the areas in which
activity is focused during performance.
The Channel Section is partitioned into five discrete channels of three knobs
and one button each; these are respectively mapped to five discrete audio
synthesis networks in the software system. The Global Section is divided into two
subsections, which respectively comprise nine knobs, and three knobs combined
with three buttons. These controllers are mapped to a global audio processing
network, and in certain cases to points in the five discrete channels. Signals from
each of the five discrete synthesis channels are passed as inputs to this
processing network. The Joystick Section is comprised of two x-y joysticks, one
of which springs back to its centre position when not in use. These joysticks are
considered freely assignable to any and multiple input points in the discrete
synthesis channels or the global processing network. The Variants Section is
comprised of six backlit buttons. When one of these buttons is toggled on, all
other buttons will be in their off state. These buttons are used to switch between
pre-stored variants in the synthesis network. These variants may differ by
synthesis parameter settings, by mapping functions, or by synthesis network
topologies.
With all these individual input devices and multiple mapping systems, it would
seem that the performer has rather a lot to remember during performance. And if
the performer is required to store such data in conscious memory, then the
instrument is not, in itself, properly or sufficiently indicative of its use. This is not,
however, how things work in practice. Firstly, by partitioning the control surface
into functional regions, the user quickly adapts to the relationship between a
cluster of controls and clusterings of associated behavioral patterns at the
instrument’s output. The physical layout of the control surface, then, reinforces
the relationship between specific functional regions and specific functional
behaviors to both the visual and tactile senses. Secondly, by employing a static
functional structure across different patches—that is, across varying
implementations of the underlying audio synthesis networks—the patterning of
the instrument’s behavior remains relatively constant. This means that motor
patterns do not need to be relearned from scratch from one patch to the next; in
fact they should be optimally adaptable, from a base set of functional
correspondences, across even radically divergent implementations of the sound
generating subsystem.
The performer, then, is not required to store a catalogue of controller
functions and mappings in conscious memory, but rather learns through
performing. The layout of the control panel is designed to facilitate this learning
process. The emphasis is placed on motor memory as opposed to the conscious
storing of data, and the underlying software system is designed in such a way
that motor memory should be transferable and adaptable across varying audio
subsystem implementations. The control surface is still, as a whole, sufficiently
complex and multifaceted as to offer resistance to learning. It was my aim that
the degree of resistance should be neither so minimal that the interface would
quickly become transparent to motor memory and activity, nor so great that, even
after a significant amount of practice, it would remain beyond grasp.
Visual Display
In chapter 2, I discussed the cost to the nonvisual senses of the visuocentric
approach to interaction as typified by the computer-as-it-comes. This is
something that I tried to avoid in the design of Mr. Feely, not only with a view to
minimizing the cognitive demands of visual attention, but with a view to
rendering the interface as free of abstraction as possible. It proved useful,
however, to integrate a character display with the control surface, which is used
to navigate a patch bank between performances, and to monitor data in the case
of “breakdowns” (e.g. program exceptions, memory errors, CPU overload, etc.).
The display is not intended to be used during performance, except as a
notification mechanism in the case of such a breakdown. It does not, therefore,
make any demands on the performer’s attention, and to the extent that vision is
required for the performance task, it may be directed to the guidance of motor
activities.
Audio Display
An important aspect of the “feel” of many conventional acoustic instruments is
the haptic feedback to the performer from the instrument’s vibrating body as it
radiates sonic energy. Unlike conventional acoustic instruments, electronic
instruments require the use of amplifiers and loudspeakers in order to propagate
sound in space. Except in the case that the amplifier/loudspeaker system is built
into the instrumental body, electronic instruments are lacking in the haptic
vibrational feedback that is characteristic of their acoustic counterparts.
This issue was taken into consideration in the design of Mr. Feely, but
unfortunately, in deciding upon an amplifier/loudspeaker system, it was
outweighed by other constraints: 1. that the amplifier be powerful enough for the
instrument to be used without further amplification (e.g. through a P.A. system),
and 2. that the loudspeaker should have a wide radiation pattern. This limited the
options among available technologies, and resulted in the choice of a combined
amplifier/loudspeaker system that, because of its size and weight, could not
practically be integrated with the body of the instrument. Nonetheless, by careful
positioning of the amplifier/loudspeaker in performance, it’s possible to go a
certain way towards the “feel” of a conventional instrument. By placing the
amplifier/loudspeaker on the floor, as close as is practical to the body of the
instrument, the radiation of vibrational energy can be felt through the feet and,
to a lesser extent, the torso. The effect varies with the character of the sound, its
frequency and loudness, the type of floor surface, and the type and number of
reflective and absorbtive material in proximity to the loudspeaker. This speaker
placement has one other advantage: the location of the point source of the
sound—which, in the case of the great majority of acoustic instruments, is the
instrument’s body—is as close as is practical to the body of the instrument. The
perceptual localisation of the origin of the sound is an important indicator of the
instrument's phenomenal presence, for the performer, for fellow performers, and
for the audience alike.
Summary
It would be premature to evaluate the ways in which Mr. Feely offers resistance
to the performer without having paid due attention to software. Nonetheless, it
may be useful to recap on the key aspects of the hardware implementation, and
to point to some implications for embodiment, and for the emergence of an
enactive performance practice.
Firstly, the instrument is integrated and instrumental. This means that the
performer engages an instrument that has a functional coherence to its material
embodiment as well as a tangible physical presence in performance. These
factors contribute to the potential for an encounter with the instrument that is
engaging (one of the five criteria of embodied activity from Chapter 1). Secondly,
the instrumental interface affords distributed motor activities without the burden
of representational abstractions. The interface is, then, motocentric rather than
visuocentric, and encompasses multiple distributed points of interaction. This
stands in contrast to the visuocentric, representation-hungry, singular (as
opposed to distributed), and sequential (as opposed to parallel) mode of
interaction that is idiosyncratic to the computer-as-it-comes. At the same time,
then, that the hardware interface to Mr. Feely avoids the interface model of the
computer-as-it-comes, it also avoids the associated costs of that model for
interaction. Whereas the computer-as-it-comes would situate the user’s attention
in a world of metaphorical abstraction and would provide no guarantee of
meeting timing constraints (see 2.4), Mr. Feely situates the user’s attention
directly within the activity, encourages the parallel distribution of the activity
across distinct sensorimotor modalities (touch and proprioception, hearing,
vision), and—because of the distributed and multiply parallel nature of the
performative mode—offers a reasonable chance that the real-time constraints of
musical performance might be met. These factors again correspond to certain of
the five criteria of embodiment; specifically, that embodied activity be situated,
multimodal, and timely.
When the focus is shifted from the instantaneous aspects of embodied activity
to embodiment as an emergent phenomenon, we touch on issues of adaptation
and cognition. Such issues are tied in with the instrument’s behavior; i.e., with
the resistance that it offers to the performer, and the unique dynamical
patterning of thought and activity that comes of that resistance. As a piece of
hardware, Mr. Feely affords embodied modes of interaction. But to get from
interaction to realization—i.e., to the emergence of an enactive performance
practice—the instrument will be required to offer resistance to the performer
through the medium of sound. This brings the discussion around to the
implementation of the instrument’s sonic behavior in software.
4.3 Mr. Feely: Software
Overview
Mr. Feely’s software system is written in the SuperCollider programming
language.2 The language was chosen for three main reasons: 1. it is mature and
offers a rich set of built-in features, 2. it is easily extensible with user-defined
modules, primitives, and plug-ins, and 3. it is object-oriented. As the main focus
of my work has been directed at the creation of a system that would allow for
dynamical behaviors, much of the task of programming has involved the
incremental development of a framework—an integrated library of extensions to
the language—that augments the base audio synthesis architecture with modules
that allow for complex dynamical mappings between system entities. The
implementational possibilities of these extensions to the language will comprise
the main focus of this and the next section. First, however, it will be useful to
describe the base architecture on which the framework is built.
SuperCollider Server Architecture
The SuperCollider audio synthesis engine passes signals between nodes on a
server, where those nodes represent instances of user-defined synthesis and
processing functions. A sample signal flow diagram would look familiar to
anybody who has worked with modular synthesis systems (figure 4.4).
2 http://www.audiosynth.com.
Figure 4.4. SuperCollider synthesis server: Signal flow.
A node on the synthesis server may contain parameter slots. For example, a
node that represents an oscillator function may contain slots for frequency, phase
and amplitude parameters. The values of a parameter slot may be set by sending
messages to the node to which the slot belongs, or by mapping the parameter
slot to the output of a bus (figure 4.5).
Figure 4.5. Writing values to a node’s parameter slots by 1. sending a
message, and 2. mapping the slot to the output of a bus.
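To make the two routes concrete, here is a minimal SuperCollider sketch (illustrative only, not Mr. Feely's actual code) in which a node with a freq parameter slot is first set by a message and then mapped to a control bus:

    // Minimal sketch: two ways of writing to a node's parameter slot.
    // Assumes a booted server (s = Server.default; s.boot;).
    (
    SynthDef(\osc, { |out = 0, freq = 440, amp = 0.1|
        Out.ar(out, SinOsc.ar(freq, 0, amp));
    }).add;
    )
    x = Synth(\osc);        // a node on the synthesis server
    x.set(\freq, 330);      // 1. set the slot by sending a message to the node
    b = Bus.control(s, 1);  // a control-rate bus
    b.set(220);
    x.map(\freq, b);        // 2. map the slot to the output of the bus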
A bus is a virtual placeholder for a signal. It’s possible, for example, to tap an
output signal from any node in the synthesis network and route it to a bus, from
which the signal could be rerouted as an audio signal input to any other node, or
mapped to a parameter slot belonging to any other node (figure 4.6).
Figure 4.6. Signal routing between parallel synthesis networks using
busses. Bus 1 taps an output signal from a node in the first channel and
routes it to the audio input of a node in the second channel. Bus 2 taps an
output signal from a node in the second channel and maps it to a
parameter slot of a node in the first channel.
SuperCollider’s bussing architecture allows for the flexible routing of signals
within the synthesis network. This flexibility is exploited and extended in the
extensions to the language that form the basis of Mr. Feely’s mapping
framework.
Mapping Framework
The mapping framework that I have developed for Mr. Feely is primarily
concerned with providing a flexible and intuitive mechanism for routing signals
between components of the audio synthesis network, and for defining functional
mappings between them. A functional mapping can be taken to mean the
transfer function from the output of one component to the input of another. That
is, a function that is applied to the signal such that the signal’s characteristics are
transformed between output at the source component and input at the receiver
component. The mapping framework consists of a hierarchical library of such
functions encapsulated within discrete software objects. The behavior of the
instrument as a whole is in large part determined by these functions and their
various mappings and routings within the audio synthesis network.
As I noted in the previous section, any signal within the audio synthesis
network may be routed to a bus, and rerouted from that bus to any other point in
the network. In Mr. Feely’s mapping framework, the functional transformation of
the signal takes place between the bus and the signal’s destination. The objects
that perform these transformations comprise the mapping layer. The mapping
layer allows for the flexibility to route the signal at a single bus to multiple
destinations with multiple functional mappings (figure 4.7). This is an example of
a “one-to-many (Wanderley 2001)” mapping model.
Figure 4.7. The signal at a bus is split into three signals. These signals are
routed to three different parameter slots, effecting a one-to-many
mapping. Each signal is subject to a functional transformation (those
transformations denoted here as x, y and z) between the bus and their
respective parameter slot destinations. The software objects that perform
these transformations comprise the mapping layer.
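As a hedged sketch of how such a one-to-many mapping might be realized on the SuperCollider server (the names below are illustrative, not the framework's actual objects), each mapping-layer function can be implemented as a small control-rate synth that reads the source bus, applies its transfer function, and writes the result to a destination bus, which is then mapped to a parameter slot:

    (
    SynthDef(\mapLinear, { |in, out, mul = 1, add = 0|
        // read the source bus, apply a linear transfer function,
        // and write the result to a destination control bus
        Out.kr(out, (In.kr(in, 1) * mul) + add);
    }).add;
    )
    ~src = Bus.control(s, 1);
    ~dests = { Bus.control(s, 1) } ! 3;
    // one source signal, three differently transformed copies (x, y and z)
    ~x = Synth(\mapLinear, [\in, ~src, \out, ~dests[0], \mul, 2]);
    ~y = Synth(\mapLinear, [\in, ~src, \out, ~dests[1], \mul, 0.5, \add, 100]);
    ~z = Synth(\mapLinear, [\in, ~src, \out, ~dests[2], \mul, -1, \add, 1]);
    // each destination bus would in turn be mapped to a different parameter slot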
The mapping framework also allows for the “cross-coupling (Hunt, Wanderley,
and Paradis 2003)” of bus signals, or “many-to-one (Wanderley 2001)” mappings
(figure 4.8).
Figure 4.8. The signals at two busses are subject to functional
transformations (x and y). The transformed signals are summed, or
“cross-coupled,” resulting in a mapping from multiple signal sources to a
single parameter slot.
Additionally, the mapping framework allows for what I have termed “function-
parameter” mappings, where the output of one functional mapping may be
mapped into a parameter slot in another (figure 4.9).
Figure 4.9. The signals at two busses are subject to functional
transformations (x and y). The output of function x is mapped into a
parameter slot in function y. The output of function y is mapped to a
parameter slot in an audio synthesis network component.
For example, function x in figure 4.9 might scale the output of the signal at BUS
1 into the range [1,10]. Function y might multiply the output value of the signal
at BUS 2 by the value of an argument, where that argument is set at a
parameter slot. When the output of x is mapped into the parameter slot that
corresponds to the multiplicand argument of y, the signal at BUS 2 is multiplied
by the scaled signal at BUS 1. The output of the dependent function y is then
mapped to a parameter slot in an audio synthesis network component. This is a
simple example, but it makes clear the kinds of complex interdependencies
between system components that “function-parameter” mappings allow.
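A hedged SuperCollider sketch of this particular example (illustrative names only; the two functions are collapsed here into a single control-rate synth, with the output of x supplying the multiplicand argument of y):

    (
    SynthDef(\funcParam, { |bus1, bus2, out|
        var x = In.kr(bus1, 1).linlin(0, 1, 1, 10);  // function x: scale BUS 1 into [1, 10]
        var y = In.kr(bus2, 1) * x;                  // function y: BUS 2 multiplied by x's output
        Out.kr(out, y);  // routed on to a parameter slot in the audio synthesis network
    }).add;
    )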
Mr. Feely’s hardware controls are connected to the audio synthesis network
through busses (figure 4.10).
Figure 4.10. The map from hardware to software. Analog signals are read
by an analog-to-digital converter (ADC) and written to a bus in the audio
synthesis network. The signal at the bus may be treated as though it were
any other signal.
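The control path is not detailed here beyond the MIDI A/D boards mentioned in 4.2. Assuming, purely for illustration, that a knob arrives as a MIDI continuous controller, its values could be written to a control bus on the language side as follows (the controller number and scaling are hypothetical):

    MIDIClient.init;
    MIDIIn.connectAll;
    ~knobBus = Bus.control(s, 1);
    MIDIdef.cc(\knob1, { |val|
        // scale the 7-bit controller value to [0, 1] and write it to the bus
        ~knobBus.set(val / 127);
    }, ccNum: 1);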
While all busses in the audio synthesis system are instances of a single class of
bus, and therefore have identical implementations, they are nonetheless
classified as having either local or global scope. All busses that are placeholders
for signals routed from audio signals have global scope, and can be routed to any
point in the synthesis network. Busses that are placeholders for signal arriving
from Mr. Feely’s hardware controls, however, are accorded either local or global
scope, depending on the particular input device to which they are connected. In
this scheme, the scope of a bus corresponds to the function of the input device as
defined by the partitioning of Mr. Feely’s control surface into functional regions.
Channel Section controllers, for example, are connected to busses that have local
scope within each of the five discrete audio synthesis network channels, while
Global Section controllers are connected to busses that have global scope (figure
4.11).
Figure 4.11. Local and global scope of busses. Busses L1.1-3 and L2.1-3
are connected to Channel Section controllers on Mr. Feely’s control panel.
Their scope is local; i.e., they may only be routed to the corresponding
audio synthesis network channels, 1 and 2. The output of these audio
synthesis channels is summed and sent to a global processing network.
Global busses G1 and G2 are connected to Global Section controllers on
Mr. Feely’s control panel. The scope of these busses is global; i.e., they
may be routed to the global processing network, or to any of the discrete
audio synthesis channels.
Busses have a special status in the mapping framework. They are
placeholders for signals that originate both outside and inside the audio synthesis
network, and therefore represent the points at which human action and internal
mechanism coincide. It was a deliberate design choice to accord busses this dual
role, as a transparency to the source of signals within the system effectively blurs
the implementational boundary between human and instrumental behaviors. That
is to say, signals are treated as equivalent whether their origins are external or
internal to the system, and this equivalency of signals implies that all signal flow
networks are formed at the same level of structure. The “push-and-pull” of
dynamical forces that is key to the instrument’s resistance, then, is encapsulated
in the structure and behavior of a single integrated signal flow network.
To this point, the simple mapping schemes I have illustrated have not
demonstrated models of dynamical behavior. The only difference, for example,
between the mapping scheme of figure 4.11 and that of a linear summing mixer
is that the bussing architecture in the figure shows the possibility of a flexible
routing of control signals to individual parameter slots in the various mixer
channels. The dynamical behavior of the system as a whole would, nonetheless,
appear to be relatively flat. Consider a system, however, where the outputs from
two discrete audio synthesis networks are routed to global busses, and then back
to parameter slots within the discrete networks (figure 4.12).
Figure 4.12. Discrete audio synthesis networks are coupled to form an
interacting composite network. Global busses A1 and A2 serve as
placeholders for the output signals of channels 1 and 2. These signals are
transformed by the functions x and y, and the continuous outputs of those
functions are routed to parameter slots in the discrete channels. The
output of channel 1, after undergoing functional transformation, is used to
regulate the internal behavior of channel 2, and vice versa.
In this example, the output of channel 1 is routed back to a parameter slot in
channel 2, and vice versa. The output signals from the two channels, then, rather
than being summed (as in figure 4.11), could be used to regulate one another’s
behavior. The structure of the network—i.e., its topology—creates a coupling
between the two discrete audio synthesis networks; where they had previously
formed uncoupled autonomous systems, they now form coupled nonautonomous
systems. The way in which the bussed signals act as regulatory mechanisms in
the respective synthesis networks is defined by the mapping functions, indicated
in figure 4.12 as x and y. These functions might encapsulate any number of
behaviors. They might, for example, map the audio signal unaltered into the
parameter slot, scale the audio signal to an effective range, track the signal’s
frequency or amplitude characteristics, scale it to an effective range and map the
resulting signal to the slot, and so on. Any of these choices would create the
possibility for complex behavioral dependencies between the two synthesis
networks, and at the same time, the possibility for nonlinear dynamical behaviors
in the composite (coupled) system.
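A hedged sketch of this kind of coupling (illustrative, and far simpler than the actual channel implementations) might use an amplitude follower as the mapping function, so that each channel's output level is regulated by the other channel's signal:

    (
    SynthDef(\coupledChan, { |out = 0, ownBus, otherBus, freq = 200|
        // read the other channel's output (from the previous block)
        var other = InFeedback.ar(otherBus, 1);
        // mapping function: a louder other channel lowers this channel's level
        var gain = 1 - Amplitude.kr(other, 0.05, 0.5).clip(0, 1);
        var sig = SinOsc.ar(freq) * gain * 0.2;
        Out.ar(ownBus, sig);  // write own output to a global bus
        Out.ar(out, sig);
    }).add;
    )
    ~a1 = Bus.audio(s, 1);
    ~a2 = Bus.audio(s, 1);
    ~chan1 = Synth(\coupledChan, [\ownBus, ~a1, \otherBus, ~a2, \freq, 200]);
    ~chan2 = Synth(\coupledChan, [\ownBus, ~a2, \otherBus, ~a1, \freq, 300]);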
Summary
From the perspective of either of the discrete networks in figure 4.12, internal
behavior is nonautonomous; i.e., behavioral patterns are determined in part by
signals that originate outside the network. From the perspective of a human
observer, however, the composite network (comprised of the two interacting
subnetworks) could be said to be autonomous, as it operates, and exhibits
behavior, without human intervention. This presents an interesting design
problem: we want the instrument to have rich, structured dynamics, but at the
same time, we want those dynamics to emerge in the coupling of the instrument
to a human performer. So, although we could engineer a system that exhibits
dynamical behavior without human involvement, the kind of system that is more
compelling with a view to enactive performance practice would be one that,
rather than exhibiting autonomous dynamical behavior, embodies the potential
for dynamical behavior when coupled to a human performer. This does not rule
out the kind of model encapsulated in figure 4.12. In fact, this model forms the
basis of the first usage example I will outline in the next section. It does,
however, call for calibration of the system—a “tuning” of the system’s dynamical
responsiveness—when human action enters the equation.
In summary, then, the mapping framework allows for the creation of complex
interdependencies between system components. And these interdependencies
are key to the “push-and-pull” dynamics that define the instrument’s kinds of
resistance. But the question remains as to how one might go about calibrating
the system in such a way that it requires human action; i.e., such that when
there is a “push-and-pull” of physical forces at the hardware layer, the
instrument responds and resists with proportionately rich and varied sonic
behavior. I’ll take up this issue by outlining two specific usage examples.
4.4 Mr. Feely: Usage Examples
Overview
In this section I outline two examples of Mr. Feely in use. At the present writing,
the first model is in an early stage of development, while the second is relatively
mature. I have chosen these specific examples because of their differences. Or,
more specifically, because their differences illustrate the ways in which diverse
implementations might highlight distinct facets of a single basic concern: enactive
performance practice. The two usage examples are interesting, then, because
they point to different kinds of resistance, to different modes of embodied
activity, and to different realizational potentialities.
Example 1: Pushing the envelope
Figure 4.13 illustrates an extension of the interacting composite network of figure
4.12. As in figure 4.12, the output of channel 1 is mapped via a global bus to a
parameter slot in channel 2, and vice versa, in such a way that the two discrete
audio synthesis networks regulate one another’s behavior in a manner
determined by the output of the functions x and y. The example in figure 4.13
departs from that of figure 4.12, however, through the addition of two local
busses, L1.1 and L2.1.
Figure 4.13. Functional covariance. Local busses L1.1 and L2.1 are
placeholders for signals from Mr. Feely’s Channel Section. These signals
are mapped to parameter slots of mapping functions internal to the
composite audio synthesis network. This is an instance of “function-
parameter” mapping, where the output of function a serves as a
continuous input, or argument, to function x, and the output of function b
serves as a continuous input to function y.
The local busses L1.1 and L2.1 provide the effective point of access to the
system for human action. Rather than being mapped to parameter slots in the
nodes that comprise the synthesis network, these busses are mapped to
parameter slots of mapping functions that are internal to the system; i.e., they
represent “function-parameter” mappings. The way in which the output signals of
the coupled channels regulate one another’s behavior, then, is largely determined
by the functional mapping from the local busses to the parameter slots of the
mapping functions attached to the global busses, and is covariant with human action.
This network of mappings forms the basis of a performance scenario I’ve
developed for Mr. Feely that goes under the working title “pushing the envelope.”
The mappings illustrated in figure 4.13 represent just a partial view of the entire
system, which utilizes five discrete audio synthesis networks and assigns three
local busses to each network, corresponding to the five channels of three knobs
that comprise Mr. Feely’s Channel Section. The two busses per channel that are
not shown in figure 4.13 are mapped to various parameter nodes in the
respective discrete audio synthesis networks. These mappings vary across
different implementations of the basic system, but in all instances map into
continuous ranges as suitable to the synthesis parameter in question. The
functional mappings from the local busses L1.1 and L2.1—the busses that are
shown in figure 4.13—are key to the dynamical responsiveness of this particular
network. It’s their role that I will focus on here.
In the “pushing the envelope” model, the functions x and y (figure 4.13)
represent composite functions: amplitude followers (on the signals at A2 and A1
respectively) modulated by the output of a logistic mapping function:
x_{n+1} = \mu x_n (1 - x_n)
The outputs of x and y are connected as level controls at the output stage of
channels 1 and 2 respectively, effecting a coupling between the two channels.
The logistic mapping function is interesting because the trajectory of its orbit
varies with different values of the variable µ. It represents a simple nonlinear
system, the response of which becomes increasingly chaotic when the value of µ
is greater than 3, and is entirely unstable when µ is greater than 3.87 (assuming
values of x in the range [-1, 1]). The mapping functions x and y, then, already
embody the potential for complex dynamical behavior, where the dynamical
contour of the modulated signals derived from A2 and A1 may be more or less
chaotic or “flat” depending on the assignment of a constant value to µ.
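To give a feel for that range, the following language-side sketch (illustrative only) iterates the logistic map at a few values of µ; at lower values the orbit settles towards a fixed point or a short cycle, and it becomes increasingly irregular as µ grows:

    (
    [2.9, 3.2, 3.5, 3.8].do { |mu|
        var x = 0.5, orbit = Array.new;
        // iterate x[n+1] = mu * x[n] * (1 - x[n]) and collect the orbit
        30.do { x = mu * x * (1 - x); orbit = orbit.add(x.round(0.001)) };
        ("mu = " ++ mu ++ ": " ++ orbit.keep(-6)).postln;
    };
    )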
The functions a and b in figure 4.13 represent the slope (rate of change) of
the signals at busses L1.1 and L2.1 respectively. The amplitude of each function's
output will vary proportionately with the rate of performer activity at the
hardware controls—i.e. the corresponding knobs in Mr. Feely’s Channel Section—
that are connected to the bus. This means, essentially, that the more “active” the
activity, the greater the amplitude of the resulting signal.
The parameter slots in the mapping functions x and y (figure 4.13) represent
the variable µ in the logistic mapping function. This makes for a potentially very
interesting mapping. As the outputs of a and b are effectively plugged into µ, the
dynamical contour of the outputs of x and y is directly proportional to the rate
of performer activity. The effective ranges of a and b are scaled to a dynamically
rich range in µ (between 2.9 and 3.87; a range that encompasses the
discontinuous transition from flat to chaotic dynamics through successive period
doublings), which results in the system as a whole having response
characteristics that vary dynamically with the “push-and-pull” of human motor
actions. For example, an increase in the rate of left-right knob “twiddling” with
respect to time (figure 4.14) will result in a proportionate increase in the “degree
of chaos” in the outputs of functions x and y.
Figure 4.14. Left-right knob manipulation with respect to time.
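A hedged sketch of this part of the mapping (the actual functions a and b are more involved): a slope follower tracks the rate of change of the knob signal, and its magnitude is scaled into the dynamically rich region of µ described above. The input range assumed here is illustrative.

    (
    SynthDef(\activityToMu, { |knobBus, muBus|
        var knob = In.kr(knobBus, 1);
        // magnitude of the knob's rate of change: faster "twiddling" gives a larger value
        var rate = Slope.kr(knob).abs;
        // scale an assumed activity range of [0, 4] into mu's rich region [2.9, 3.87]
        Out.kr(muBus, rate.clip(0, 4).linlin(0, 4, 2.9, 3.87));
    }).add;
    )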
In practice, the “pushing the envelope” model has certain interesting
implications for performance. Firstly, because of the way the system is
calibrated—specifically the “tuning” of the logistic map variable µ in relation to
the rate of change of motor activity—it requires a performer; i.e., without
performer action, the response of the system is dynamically flat. Secondly, the
system requires considerable physical effort on the part of the performer to elicit
dynamically rich responses from the software system. To that extent, the system
doesn’t just require the performer, it requires a considerable investment of
performative energy. Thirdly, the behavior of the system as a whole is far from
transparent at first use, and in fact demands significant experimentation before
certain consistent patterns and responses begin to reveal themselves. The
complexity of the system’s dynamical responsiveness is effectively guaranteed by
the interdependencies of the five discrete audio synthesis networks, as
encapsulated in the functional mappings from outputs in one channel to
parameter nodes in another. The key implication of these interdependencies is
that performative actions directed toward a single channel of controls will have
consequences beyond the scope of the discrete audio synthesis network to which
those controls are connected. That is to say, although the performer may place
the focus of activity at any one moment within a specific channel—and the
human anatomical constraint of two-handedness tends to determine this kind of
pattern in performance—the effects of that activity will nonetheless be felt
throughout the composite network comprised of all five channels.
In my experience thus far with this system, I’ve found that it’s not possible to
get an overall conceptual grasp on its range of behavior, and particularly on the
way that dynamical changes propagate through the composite network.
Nonetheless, certain recurrent patterns of motor activity have begun to emerge,
and these patterns are yielding varieties of sonic responsiveness that, even as
they become more closely aligned with certain expectations, continue to produce
new and often surprising dynamical contours.
Example 2: Surfing the fractal wave (at the end of history)
In certain respects, there are parallels in the dynamics of the “pushing the
envelope” network to the dynamics of many conventional acoustic instruments.
When there is no input of human energy, for example, the instrument’s response
is “flat.” And when human energy is transmitted to the system, the system’s
dynamical responsiveness is proportionate to the amplitude of that energy. There
is, then, a particular way in which the model requires the performer: it requires a
“pushing”—a directed expenditure of kinetic energy—to actualize the dynamic
potential that is immanent to the network.
The model I outline in this section—“surfing the fractal wave (at the end of
history)”3—embodies an altogether different kind of resistance and affords an
altogether different variety of motor activity. Where performance with
conventional acoustic instruments ordinarily requires a “pushing” of kinetic
energy into the instrumental mechanism in order to set things in motion, in the
“surfing the fractal wave” model, things are already in motion in the instrumental
mechanism. The mode of performance, then, is more concerned with giving
dynamical shape and contour to these motions; an “absorbed coping” that is
about the timely navigation of energy flows in the environment, rather than the
directed transmission of energy flows that originate in the body. Hence the
distinction between “surfing” and “pushing” analogies.
Patterns of motor activity in “surfing the fractal wave” are designed around
the asymmetry of “handedness” (Guiard 1987); i.e., dominant and non-dominant
3 The name is borrowed from the title of a 1997 Terence McKenna lecture
(http://www.abrupt.org/LOGOS/tm970423.html). My appropriation, however, has very
little to do with McKenna's original intention.
hands are afforded independent sub-tasks, but they cooperate in the
accomplishment of the larger task that those sub-tasks comprise. Kabbash,
Buxton and Sellen describe three characteristic ways in which the two hands are
asymmetrically dependent in select everyday tasks:
1. The left hand sets the frame of reference for action of the right. For example, in
hammering a nail, the left hand holds the nail while the right does the hammering.
2. The sequence of motion is left then right. For example, the left hand grips the
paper, then the right starts to write with the pen.
3. The granularity of action of the left hand is coarser than that of the right. For
example the left hand brings the painter’s palette in and out of range, while the
right hand holds the brush and does the fine strokes onto the canvas.
(Kabbash, Buxton, and Sellen 1994:418)
Each of these examples could be viewed as aspects of a single embodied
tendency; a tendency that is self-reinforcing across a wide range of activities and
over repeated performances. Kabbash et al. advocate the design of human-
computer interfaces that exploit the habitual ways in which humans tend to use
their hands in skillful activity. The “surfing the fractal wave” model heads in this
direction.
Figure 4.15 represents a partial view of the “surfing the fractal wave” network
model.
Figure 4.15. “Surfing the fractal wave” network model. The x and y
outputs of a joystick with global scope (JSX, JSY) are mapped to
parameter slots of a chaotic sequencer function (SEQ). The sequencer
sends a stream of timed triggers to parameters in each of five discrete
audio synthesis networks (for clarity, only two are shown). Local busses
(C1.1-3 and C2.1-3) read signals from the knobs in Mr. Feely’s Channel
Section. These controls “filter” the results of the mapping from the
sequencer stream to each of the discrete audio synthesis networks.
Joystick manipulations are always performed by the left hand. Knob
manipulations are in most instances performed by the right hand. Some
feedback networks, mapping functions and audio synthesis network
schemata have been omitted for clarity.
The diagram divides the network space into left hand and right hand regions. In
performance, the pads of the left hand fingers tend to “ride” the joystick, where
certain gestural patterns emerge in response to the dynamical properties of the
“function-parameter” mappings of the global busses JSX and JSY (placeholders
for continuous signals from the x and y axes of the joystick, respectively) into the
output of a chaotic sequencer (SEQ).4 The sequencer is calibrated in such a way
that its output is more or less stable when the values of the mapping functions a
and b are close to the centre of their effective ranges. In practice this means that
when the joystick is in its centre position (the resting position for a “spring-back”
style joystick), the sequencer clock outputs a steady stream of pulses, at a
medium tempo, with a regular and stable amplitude pattern (figure 4.16).
Figure 4.16. Sequencer pulse stream when the joystick is in centre
(“resting”) position.
4 The "chaotic" sequencer function is not technically chaotic (in mathematical terms).
The designation can be taken to be qualitative.
The mapping functions a and b determine, however, that deviations in the x and
y axes of the joystick result in more complex behaviors in the pulse stream. The
parameter slot to which a is mapped represents a multiplication argument for the
sequencer’s clock frequency and base amplitude. An increase in the signal at
JSX, then—corresponding to a left-to-right movement across the joystick’s x
axis—results in an increase in the pulse stream’s frequency and amplitude (figure
4.17).
Figure 4.17. Sequencer pulse stream when there is a left-to-right
movement across the joystick’s x axis.
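As a hedged sketch of this portion of the mapping (the real sequencer is considerably more elaborate), the joystick's x bus can act as a multiplication argument on both the clock frequency and the base amplitude of a simple pulse stream:

    (
    SynthDef(\pulseStream, { |jsxBus, out = 0|
        var jsx = In.kr(jsxBus, 1);          // joystick x axis, assumed in [0, 1]
        var mul = jsx.linlin(0, 1, 0.5, 4);  // multiplication argument (function a)
        var trig = Impulse.ar(4 * mul);      // base clock of four pulses per second
        Out.ar(out, trig * (0.2 * mul));     // amplitude grows with the same argument
    }).add;
    )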
The parameter slot to which the mapping function b is connected represents a
chaotic variable in the sequencer function. In short, this single variable
determines two aspects of the sequencer’s behavior: 1. the degree of pulse
“nestedness,” and 2. the probability that successive values read from an internal
finite state machine are mapped to the amplitude of the pulse stream. An
increase in the value of both of these parameters (corresponding to a bottom-to-
top movement in the joystick’s y axis) results in an increase in the system’s
entropy, where pulse “nestedness” implies a greater likelihood of frequency
multiplication from one pulse to the next (and therefore a greater likelihood of
extra pulses being “nested” into the pulse stream), and where the irregularly
patterned output of the internal finite state machine incrementally encroaches on
the otherwise linear behavior of the amplitude mapping in the mapping function a
(corresponding to the left-to-right movement across the joystick’s x axis). Figure
4.18 adds a bottom-to-top movement in the joystick’s y axis to the left-to-right
movement in the x axis illustrated in figure 4.17. The output of the pulse stream
shows the trajectory towards a higher “degree of chaos” over time.
Figure 4.18. Sequencer pulse stream when there is a left-to-right
movement across the joystick’s x axis, and a bottom-to-top movement
across the y axis. The increase in the signal at JSY results in a greater
likelihood of “nestedness” in the pulse stream, and a greater likelihood of
irregularities in amplitude patterns.
The perceptual guiding of left-hand actions in “surfing the fractal wave” is
more integrated than figure 4.18 would suggest. While the joystick operates
across two degrees of freedom—the x and y axes—the performer does not break
the activity down into separate movements in two dimensions (as figure 4.18
would indicate). Rather, the performer guides the left-hand through singular
trajectories across a two-dimensional space. And it’s in these motions that a
“feel” develops for the sequencer’s stable and chaotic regions, the transitions
between them, and for the shift from greater-to-lesser and lesser-to-greater
degrees of event density with respect to time. But these motor patterns
constitute only one part of the coordinated left hand/right hand movements that
amount to “surfing the fractal wave.” And while it’s useful to break the activity
down into left and right hand sub-tasks, there can be no complete picture without
considering how these sub-tasks coordinate and cooperate.
The output of the chaotic sequencer is mapped to parameters in each of the
five discrete audio synthesis networks. While each of these networks
encapsulates different dynamical responses, there are strong symmetries
between their behaviors, and between the kinds of responses that right hand
actions might elicit from each of the networks. Each of the five synthesis
networks implements a resonator function, where the pulses that are mapped
into each network serve as exciters. These resonators embody different
resonance models (with different dynamical responses), but there are certain
perceptual constants from one network to the next. Figure 4.19 shows the
mapping from local busses to two of the five discrete audio synthesis channels.
Figure 4.19. Perceptual symmetries in the functional mapping from
busses to the audio networks across distinct channels. Percepts (“Gate,”
“Width,” “Resonance”) are assigned to corresponding busses across each
channel. The symmetry holds at the level of hardware, where rows of
knobs in Mr. Feely’s Channel Section correspond to rows of busses in the
diagram.
High level “percepts” are symmetrical across each of the five channels, where
each of those percepts corresponds to the same bus number assignment in each
channel. That is, “Gate” corresponds to busses C1-5.1, “Width” corresponds to
busses C1-5.2, and “Resonance” corresponds to busses C1-5.3. This has the
effect of similar classes of response being elicited from corresponding knobs in
each of the five channels of Mr. Feely's Channel Section.
Of course, these “percepts” require a symmetry in terms of the effect of
functional mappings into each of the discrete audio synthesis networks if their
particular perceptual qualities are to be discerned and distinguished. The “Gate”
mechanism is functionally identical across all five channels: turning the
corresponding knob from left to right has the effect of allowing a greater number
of pulses to pass through a gated input to each resonator. It acts, then, as an
event filter on the pulse stream, where no pulses are passed to the resonator
system when the gate’s value is zero, all pulses are passed when the gate’s value
is one, and each pulse in the stream has a 0.5 probability of passing when the
gate’s value is 0.5.
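A hedged sketch of the Gate mechanism (illustrative only): each incoming pulse is passed on to the resonator with a probability equal to the value read from the corresponding Channel Section bus.

    (
    SynthDef(\gateFilter, { |pulseBus, gateBus, out|
        var trig = In.ar(pulseBus, 1);
        var prob = In.kr(gateBus, 1).clip(0, 1);  // knob value: 0 = block all, 1 = pass all
        Out.ar(out, CoinGate.ar(prob, trig));     // pass each pulse with probability prob
    }).add;
    )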
The implementation of the “Width” mechanism varies slightly from one
channel to the next, but its effect is symmetrical: turning the corresponding knob
from left to right has the effect of “loosening the elasticity” of each resonator; i.e.
a tighter “elasticity” (implemented as a shorter impulse response in the delay
lines in the resonator’s filterbank) will result in shorter output events, whereas
these events will take on longer durations (correlating to the perception of having
a greater temporal width) as the resonator’s “elasticity” is slackened.
The “Resonance” mechanism is the most varied in terms of implementation
across the five channels. It is tied in specifically to parameter nodes in the
resonator that change the resonator’s dynamical responsiveness; i.e., the
resonant frequencies, their bandwidths, and the ways in which the filters that
comprise the resonator’s internal filterbank interact. Across all five channels,
turning the “Resonance” knob from left to right tends to shift the dynamical
response of the resonator increasingly towards distortion, self-oscillation and
nonlinear behavior.
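The following is a hedged sketch of how Width and Resonance might act on a toy single-filter resonator (the actual resonance models differ from channel to channel and are considerably richer): Width lengthens the ring time, while Resonance drives the output towards saturation and quasi-self-oscillating behavior.

    (
    SynthDef(\toyResonator, { |pulseBus, widthBus, resBus, out = 0|
        var exc = In.ar(pulseBus, 1);
        var width = In.kr(widthBus, 1).linlin(0, 1, 0.05, 2.0);  // ring time in seconds
        var res = In.kr(resBus, 1).clip(0, 1);
        var sig = Ringz.ar(exc, 300, width) * 0.2;
        // push the resonator towards distortion as Resonance increases
        sig = XFade2.ar(sig, (sig * (1 + (res * 8))).tanh, (res * 2) - 1);
        Out.ar(out, sig);
    }).add;
    )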
In the breakdown of right hand and left hand tasks in “surfing the fractal
wave,” there is a correspondence to each of the three characteristic behaviors of
bimanual asymmetric action that Kabbash et al. point out. It’s worth addressing
each point in turn:
1. The left hand sets the frame of reference for action of the right.
In the “surfing the fractal wave” model, left hand movements give contour to the
dynamical unfolding of the pulse stream, while the right hand acts as an event
filter on the stream, and a modifier of the dynamical properties of the events that
emerge from pulses hitting the resonator functions. The pulse stream, as it
unfolds, is the frame of reference for the “picking” and “shaping” of discrete
events that characterizes right hand actions.
2. The sequence of motion is left then right.
This follows from the first point: the right hand modifies the event stream only
after the left hand has given the stream its dynamical contour. But unlike
Kabbash et al.’s corresponding example (“the left hand grips the paper, then the
right starts to write with the pen”), the respective actions form a continuous
interplay of complementary motions—as opposed to a sequence of isolated
events—and the transference from left-handed to right-handed motions takes
place at a much finer granularity of temporal scale.
3. The granularity of action of the left hand is coarser than that of the right.
In “surfing the fractal wave” the left hand is designated to control the joystick.
These joystick manipulations do not require the hand to reposition itself across
discrete points on the control surface, and they do not require grasping, turning,
or other finger motions that are performed at a fine granularity of scale. I’ve
found that in playing with the model, my left hand will often span the distance
from the joystick to the top row of knobs in Mr. Feely’s Channel Section, leaving
the little finger to move the joystick through the two dimensional plane while the
thumb and pointer finger turn the knobs. But even this action is of a coarser
granularity than the actions designated to the right hand; actions that involve a
constant “hopping” between the fifteen knobs that comprise the Channel Section,
and finely detailed turnings and twiddlings of those knobs.
It’s interesting to note that in the act of playing, left hand activities do not
seem to require any conscious attention, while the right hand activities demand
on-going and focused attention. That the dominant hand should be at the centre
of attention in the midst of bimanual action is not a point that Kabbash et al.
discuss, but it seems that my experience of this phenomenon with “surfing the
fractal wave” might also apply to other activities, such as Kabbash et al.’s
corresponding example: “the left hand brings the painter’s palette in and out of
range, while the right hand holds the brush and does the fine strokes onto the
canvas.”
The two key aspects to the model of activity in “surfing the fractal wave” are
the “surfing” aspect, and the engineering of the interface around habitual
embodied patterns of “handedness.” It’s these aspects—or, more specifically, the
entangling of these aspects in the midst of performance—that give the model its
idiosyncratic kind of resistance. In contrast to the “pushing the envelope” model,
in which events are initiated when the performer transmits kinetic energy to the
instrumental mechanism, the “surfing the fractal wave” model is built around a
persistent stream of events. And these events can go by very fast. Motor
patterns, then, emerge not only in the interdependencies between the two hands,
but in the coordination of the hands with respect to timing constraints.
It’s been interesting to note that, over the period of time that I’ve worked
with this model, and as my hands have become both better coordinated and
more individually dexterous, I’ve had a better capacity to deal with the system’s
unfolding in a timely manner, and this in turn has led to a higher level of detail
and nuance in both the shaping of individual sounds at the event level, and in the
elaboration of larger scale events, such as phrases and gestures. This seems to
me indicative of the coevolution of sensory, motor and cognitive competencies
that is definitive of enaction.
Summary
The conventional metaphors of computer science tend to regard computation as
an inherently sequential process. That is, as a function from input to output
comprised of a series of discrete and causally related steps, where the desired
outcome of the function is known in advance of its execution. This is at odds with
the enactive model of interaction, where activity takes place across a network of
interacting components, and where the behavior of those components, and
therefore of the network as a whole, is adaptive and emergent with respect to
the ongoing push-and-pull of interactional dynamics. An enactive digital musical
instrument, then, will depend on a fundamentally different view of computation
to that of conventional computer science. Rather than falling back on the
“computation-as-calculation” model, computation would be viewed as a process
in which “the pieces of the model are persistent entities coupled together by their
ongoing interactive behavior (Stein 1999:483).”
This model of “computation-as-interaction” underlies the design of Mr. Feely’s
software system. The system allows for human action to be folded into the
dynamical processes of interacting network components, and to that extent it
also allows for a structural coupling of performer and instrument. Structural
coupling is not, however, a given property of the system; while the software
system provides the required technical infrastructure, the kinds of resistance that
the instrument affords to the human remains a matter of how the infrastructure
is utilized; i.e., a matter of design. And the “right” kinds of resistances—at least
with a view to structural coupling, realization and enaction—will be those that are
neither so transparent to human action that they demand little thought or effort,
nor so ungraspable that they forever remain beyond motor and cognitive
capability.
I suggested in chapter 2 that while there is much to be learned from the
physical modeling of conventional acoustic instruments, the focus of my work is
directed more towards the development of instrumental behaviors that are
indigenous to computing media. The examples I’ve outlined in this section point,
however, to a kind of physical model, in that they embody networks of dynamical
dependencies in which human action is resisted by forces that are immanent to
the software network. But in contrast to physical models of conventional
instruments, the virtual physics of these systems is speculative; i.e., the models
are not based on data from real world measurements, or on differential equations
that describe well known physical systems. Rather, they are evolved interactively
through experimentation with various mapping and calibration schemes. So,
while the components of the audio synthesis network certainly continue to play a
critical role in the instrument’s behavior, the focus of development is shifted to
the mapping framework. I’m suggesting that it’s through this shift that we see
the potential arise for what I have called an “indigenous” computer music.
In the approach I’ve taken, design choices as to “kinds of resistance”—i.e.,
classes of behavior—are effectively decoupled from audio synthesis
implementations, leaving the designer free to experiment with any manner of
sound-producing and processing components. I’ve found that the “right” kinds of
resistances, however, continue to be those that are resonant with phenomenal
experience and past practices of embodiment. Essentially, this means that the
simulated physics of resistance will be—in some way or other—functionally
related to physical descriptions of real world behavior. The design of these
systems, then, takes a middle course between normative and speculative modes
of interactivity; between that which is familiar and that which is other to everyday
phenomenal experience. If the balance between these two poles is apposite,
then the kinds of resistance that the systems afford will be sufficiently rich in
dynamical potential that, over a sustained period of time, the performer will
continue to realize new practices, and new ways of encountering the instrument.
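The decoupling described above can be pictured, again very schematically, as a resistance layer that exposes only a small generic state vector, with separate mapping functions translating that vector into the parameters of whatever synthesis components happen to be in use. The synthesis parameters named below are placeholders for the purposes of illustration, not components of Mr. Feely's actual network.

class Resistance:
    """A stand-in for any dynamical resistance model; all it has to do is
    expose a small, generic state vector."""

    def __init__(self):
        self.pos, self.vel = 0.0, 0.0

    def step(self, target, k=4.0, c=1.2, dt=0.01):
        force = k * (target - self.pos) - c * self.vel
        self.vel += force * dt
        self.pos += self.vel * dt

    def state(self):
        return {"pos": self.pos, "vel": self.vel}

# Two interchangeable mappings onto two placeholder synthesis engines.
def map_to_granular(s):
    return {"grain_rate": 5.0 + 40.0 * abs(s["vel"]),
            "grain_pitch": 1.0 + 0.5 * s["pos"]}

def map_to_filter(s):
    return {"cutoff_hz": 200.0 + 4000.0 * abs(s["pos"]),
            "resonance": min(0.95, 0.2 + abs(s["vel"]))}

# The mapping stage, not the synthesis component, carries the design of the
# instrument's "feel"; either mapping can be swapped out at any time.
model = Resistance()
for _ in range(100):
    model.step(target=1.0)
    granular_params = map_to_granular(model.state())
    filter_params = map_to_filter(model.state())

Swapping one mapping for another, or one synthesis engine for another, leaves the resistance untouched; this is the sense in which classes of behavior are decoupled from audio synthesis implementations.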
4.5 Prospects
The two usage examples I’ve outlined in this chapter demonstrate just a small
number of possible approaches to engineering the kinds of resistance that digital
musical instruments might store in potentia. In my work with Mr. Feely, as both
designer and performer, it seems I’m still just scratching at the surface of these
matters, and that there are a great many implementational possibilities yet to be
uncovered. It also seems that at a certain point, these “uncoverings” will
necessarily require the development of patterns, in both design and performance,
that are of a higher order than those I’ve outlined to this point.
For design, this will likely be a matter of evolving a body of general principles
through which design knowledge can be accumulated incrementally. At first
glance, this may appear to contradict my observation at
the beginning of this chapter that the task of arriving at a universal template for
the design of enactive instruments may be ultimately impracticable. But the issue
I’m raising here is more directly concerned with general principles that operate at
a higher level of abstraction than purely implementational concerns. The focus,
rather, would lie with the way in which models might be generated
from a consistent but open-ended application of principles that emerge from the
interaction between philosophical and technical problematics. This would be a
kind of meta-design. Rather than persistently hopping back and forth between
philosophical and technical discourses, there would exist an evolving metric for
balancing the constraints of one against the other in an integrated framework. At
this point in my work, I can’t say for certain how one would go about putting
such a framework together. But problems such as these are not without
precedent in the history of design,5 and it seems to me a potentially very
productive avenue of investigation.
The development of higher order patterns in performance is also a matter of
balancing opposing constraints. As I’ve been careful to make clear, the two usage
examples I’ve outlined in this chapter embody very different kinds of resistance,
and therefore afford very different varieties of human action. It’s interesting to
consider, though, how these models might be interleaved in the context of the
same performance. This kind of multitasking is part and parcel of expert
musicianship. And while the two usage examples I outlined in the previous
section might involve a certain degree of multitasking in and of themselves, there
is a higher order of multitasking that could potentially encompass both models
simultaneously. At the same time that this may eventually lead to more complex
and diverse sonic utterances, it may also lead to a heightened sense of flow—of
performative embodiment.
In considering the merging of the two models into a single integrated model,
it would seem that they are in fact so different in playing technique as to be
incompatible. Again, the issue comes back to design. Multitasking must
necessarily involve some degree of compatibility between the actional patterns
that comprise the sub-tasks. In designing for multitasking, then, it may prove
useful to have in store some metric of actional distance between the kinds of
motor activities that different models afford. The balancing of these constraints
may prove to be difficult. But again, such approaches are not without precedent
in design.6
5 For example, see Alexander ([1964] 1997).
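One naive form such a metric of actional distance might take, offered here only as an assumption of my own and not as the approach proposed by Wild, Johnson and Johnson (2004), is to characterize each interaction model by a small vector of hand-estimated motor-activity features and to take a weighted distance between those vectors. The features, weights and values below are invented for the sake of illustration.

import math

# Hand-estimated motor-activity features, each in the range [0, 1].
FEATURES = ("continuous_control", "bimanual_symmetry",
            "gesture_rate", "attentional_demand")

# Invented characterizations of the two usage examples; not measurements.
surfing_the_fractal_wave = {"continuous_control": 0.9, "bimanual_symmetry": 0.3,
                            "gesture_rate": 0.7, "attentional_demand": 0.8}
pushing_the_envelope = {"continuous_control": 0.4, "bimanual_symmetry": 0.8,
                        "gesture_rate": 0.3, "attentional_demand": 0.5}

# Weights reflect how strongly each feature is assumed to interfere with
# multitasking; the values here are arbitrary.
WEIGHTS = {"continuous_control": 1.0, "bimanual_symmetry": 0.5,
           "gesture_rate": 1.0, "attentional_demand": 1.5}

def actional_distance(a, b):
    # Weighted Euclidean distance over the feature vectors.
    return math.sqrt(sum(WEIGHTS[f] * (a[f] - b[f]) ** 2 for f in FEATURES))

print(actional_distance(surfing_the_fractal_wave, pushing_the_envelope))

A small distance would suggest that two models could be interleaved without excessive interference between their motor demands; a large distance would flag the kind of incompatibility described above.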
With or without these higher order design methods, the products of design
will invariably afford opportunities for action that were at no point factored into
the design process. This has certainly been the case with conventional acoustic
instruments—and is perhaps definitive of so-called “extended” techniques—and
there’s no reason to assume that the situation should be any different for digital
musical instruments. It’s been interesting for me to note that, the more I play
with the “surfing the fractal wave” model, the more I’m able to isolate certain
quirks and glitches in the system.7 These kinds of discoveries constitute an
important aspect of the learning process; not just because they can be
assimilated into the accumulating motor and sonic vocabulary, but because in
certain cases they can lead to entirely new avenues of investigation—avenues
that would have remained closed had the system been insulated from random
environmental inputs in the first instance. There is a stochastic element in
enactive process, and this element is accounted for in the contingencies of
environmental dynamics. The glitch, then, is simply folded into the enactive
model of interaction. Its appearance or suppression in performance becomes a
matter for human intentionality. Either choice will lead to the appropriate
refinement of actional dispositions.
6 For example, see Wild, Johnson and Johnson (2004).
7 It's also interesting to note that, at least to this point, the "pushing the envelope"
model has yielded no such interesting anomalies.
5 Groundlessness
Whatever comes into being dependent on another
Is not identical to that thing.
Nor is it different from it.
Therefore it is neither nonexistent in time nor permanent.
— Nagarjuna, Mūlamadhyamakakārikā XVIII:10
The important thing is to understand life, each living individuality, not as a form,
or a development of form, but as a complex relation between differential
velocities, between deceleration and acceleration of particles.
— Gilles Deleuze, Spinoza: Practical Philosophy
The main thing is that you forget yourself.
— Barbara McClintock
The structure of our language typically leads us to characterizations of interaction
that focus on one side or the other of the interactional loop. “Humans use
technologies,” “technologies determine humans,” and so on. These are, of
course, the unavoidable products of a subject/object syntax, and my writing in
this essay has not been immune to the lopsided characterizations of interaction
that such products embody. But despite the inevitable linguistic constraints, I’ve
sought to describe the inherent circularity of the continuous interactional
unfolding that is definitive of enactive process; a process that is not concerned
with subjects and objects, but with relations, linkages, heterogeneity, and the
dynamic momentum of the emergent system that arises in the relations and
linkages between heterogeneous elements.
One of the more radical outcomes of Varela, Thompson and Rosch’s outline of
an enactive cognitive science is the model of subjectivity that necessarily follows
from enactive process. It’s precisely because enactive process concerns “the
processual transformation of the past into the future through the intermediary of
transitional forms that in themselves have no permanent substance (Varela,
Thompson, and Rosch 1991:116),” that enactive theory necessarily implies a
“groundless” or “selfless” self—i.e., a self with “no permanent substance;” a
“subjectless subjectivity (Deleuze and Guattari 1987).” This is the non-self that
appears in the experience of flow—in an unselfconscious, active and embodied
participation in the dynamical unfolding of real time and space—and it’s the same
non-self that vanishes the moment that attention is turned inward, and
perception is geared towards abstract contemplation of the objectness of things
in the world.
In this essay, I have not dealt with the epistemological or ontological
implications of an enactive approach to design in any significant manner. But to
my mind (however that may now be defined), it’s precisely these implications
that are most critical when thinking about design, or when implementing
implementations. If an implementation might afford the potential to undermine
essentialist ways of being—i.e., if the performative way of being that it brings
about is concerned with the unfolding of relations rather than the ordering of
things—then I would say that the implementation in question has utility, and that
the epistemological and ontological qualities that it embodies necessarily imply an
ethics.
At various points throughout the essay, I’ve invoked Heidegger’s use of the
term “equipment.” In Heidegger’s terminology, an equipment is a tool that
presents itself to human perception and intentionality as something-in-order-to.
That is, it affords a particular utility. It would be easy enough to arrive at the
conclusion that, in designing a digital musical instrument, we are designing
something-in-order-to-perform-music. While the statement is obviously true, it is
not, I think, a conclusion. An enactive approach to digital musical instrument
design would necessarily account for the realizational potential of the instrument;
i.e., a potential which would lead to an incremental unfolding of relationality, and
which at the same time would serve as the measure of the instrument’s
resistance. The concern for design, then, is directed towards designing an
encounter. Or, it’s directed towards designing something-in-order-to-not-be-
some-thing. In this respect, computers have a significant potential.
Bibliography
Agre, P. 1995. “Computation and embodied agency.” Informatica 19 (4):527-535.
———. 1996. “Computational research on interaction and agency.” In P. Agre and S.
Rosenschein, ed. Computational theories of interaction and agency. Cambridge, MA: MIT
Press, pp.1-52.
———. 1997. Computation and human experience. Cambridge, UK: Cambridge University
Press.
———. 2002. “The practical logic of computer work.” In M. Scheutz, ed. Computationalism: New
directions. Cambridge, MA: MIT Press, pp.130-142.
Alexander, C. [1964] 1997. Notes on the synthesis of form. Cambridge, MA: Harvard
University Press.
Anderson, C. 2005. “Dynamic networks of sonic interactions: An interview with Agostino Di
Scipio.” Computer Music Journal 29 (3):11-28.
Arbib, M., and J.-S. Liaw. 1996. “Sensorimotor transformations in the worlds of frogs and
robots.” In P. Agre and S. Rosenschein, ed. Computational theories of interaction and
agency. Cambridge, MA: MIT Press, pp.53-79.
Ashby, W. R. [1952] 1960. Design for a brain: The origin of adaptive behaviour. New
York: John Wiley & Sons.
———. [1956] 1965. An introduction to cybernetics. London: William Clowes and Sons.
Bahn, C., T. Hahn, and D. Trueman. 2001. “Physicality and feedback: A focus on the body
in the performance of electronic music.” In Proceedings of the 2001 International
Computer Music Conference. San Francisco, CA: International Computer Music
Association, pp.44-51.
Bailey, D. 1992. Improvisation: Its nature and practice in music. New York: Da Capo
Press.
Barbaras, R. 1999. “The movement of the living as the originary foundation of perceptual
intentionality.” In B. Pachoud, J. Petitot, J.-M. Roy and F. Varela, ed. Naturalizing
phenomenology: Issues in contemporary phenomenology and cognitive science. Stanford,
CA: Stanford University Press, pp.525-538.
Bateson, G. 1980. Mind and nature: A necessary unity. Toronto; New York: Bantam
Books.
Beer, R. D. 1990. Intelligence as adaptive behavior: an experiment in computational
neuroethology. San Diego, CA: Academic Press Professional, Inc.
———. 1996. “A dynamical systems perspective on agent-environment interaction.” In P.
Agre and S. Rosenschein, ed. Computational theories of interaction and agency.
Cambridge, MA: MIT Press, pp.173-216.
———. 1997. “The dynamics of adaptive behavior: A research program.” Robotics and
Autonomous Systems 20:257-289.
———. 2000. “Dynamical approaches to cognitive science.” Trends in Cognitive Sciences 4
(3):91-99.
———. 2004. “Autopoiesis and cognition in the game of life.” Artificial Life 10:309-326.
Blum, T. 1979. “Herbert Brün: Project sawdust.” Computer Music Journal 3 (1):6-7.
Bongers, B. 2000. “Physical interfaces in the electronic arts: Interaction theory and
interfacing techniques for real-time performance.” In M. Wanderley and M. Battier, ed.
Trends in gestural control of music. Paris: IRCAM, pp.41-70.
Borgo, D. 2005. Sync or swarm: Improvising music in a complex age. New York:
Continuum.
Bourdieu, P. 1977. Outline of a theory of practice. R. Nice, trans. Cambridge, U.K.:
Cambridge University Press.
———. 1990. The logic of practice. Cambridge, U.K.: Polity.
———. 1991. Language and symbolic power. G. Raymond and M. Adamson, trans.
Cambridge, MA: Harvard University Press.
———. 1993. The field of cultural production: Essays on art and literature. New York:
Columbia University Press.
Brooks, R. 1991. “Intelligence without representation.” Artificial Intelligence 47:139-159.
———. 1991. “New approaches to robotics.” Science 253 (5025):1227-1232.
———. 1992. “Artificial life and real robots.” In F. Varela and P. Bourgine, ed. Toward a
practice of autonomous systems: Proceedings of the first European conference on artificial
life. Cambridge, MA: MIT Press, pp.3-10.
Brün, H. 1969. “Infraudibles.” In H. Von Foerster and J. Beauchamp, ed. Music by
computer. New York: John Wiley and Sons.
Bruner, J. 1987. Actual minds, possible worlds. Cambridge, MA: Harvard University Press.
———. 1991. Acts of meaning. Cambridge, MA: Harvard University Press.
Burzik, A. 2003. “Go with the flow.” The Strad:714-718.
Buxton, W. 1986. “There’s more to interaction than meets the eye: Some issues in manual
input.” In D. Norman and S. W. Draper, ed. User centered system design: New
perspectives on human-computer interaction. Hillsdale, NJ: Lawrence Erlbaum Associates,
pp.319-337.
Capra, F. 1996. The web of life: A new scientific understanding of living systems. New
York: Anchor Books.
Casati, R. 1999. “Formal structures in the phenomenology of motion.” In B. Pachoud, J.
Petitot, J.-M. Roy and F. Varela, ed. Naturalizing phenomenology: Issues in contemporary
phenomenology and cognitive science. Stanford, CA: Stanford University Press, pp.372-
384.
Cascone, K. 2000. “The aesthetics of failure: ‘Post-digital’ tendencies in contemporary
computer music.” Computer Music Journal 24 (4):12-18.
———. 2002. “Laptop music - counterfeiting aura in the age of infinite reproduction.”
Parachute (107):56.
———. 2003. “Grain, sequence, system: Three levels of reception in the performance of
laptop music.” Contemporary Music Review 22 (4):101-104.
———. 2003. “Introduction.” Contemporary Music Review 22 (4):1-2.
Casserley, L. 2001. “Plus ça change: Journeys, instruments and networks, 1966-2000.”
Leonardo Music Journal 11:43-49.
Chabot, X. 1994. “To listen and to see: Making and using electronic instruments.”
Leonardo Music Journal 3:11-16.
Chadabe, J. 1997. Electric sound: The past and promise of electronic music. Upper Saddle
River, NJ: Prentice Hall.
———. 2002. “The limitations of mapping as a structural descriptive in electronic
instruments.” In E. Brazil, ed. Proceedings of the 2002 New Interfaces for Musical
Expression Conference. pp.197-201.
Chiel, H. J., and R. D. Beer. 1997. “The brain has a body: adaptive behavior emerges from
interactions of nervous system, body and environment.” Trends in Neurosciences 20
(12):553-557.
Choi, I. 1995. “A manifold interface for a high dimensional control space.” In Proceedings
of the 1995 International Computer Music Conference. San Francisco, CA: International
Computer Music Association, pp.385-392.
———. 2003. “A component model of gestural primitive throughput.” In Proceedings of
the 2003 New Interfaces for Musical Expression Conference. pp.201-204.
Church, A. 1932. “A set of postulates for the foundation of logic.” Annals of Mathematics,
second series 33:346-366.
———. 1936. “An unsolvable problem of elementary number theory.” American Journal of
Mathematics 58:345-363.
Clancey, W. 1997. Situated cognition: On human knowledge and computer
representations. Cambridge, U.K.: Cambridge University Press.
Clark, A. 1995. “Moving minds: Situating content in the service of real-time success.”
Philosophical Perspectives 9:89-104.
———. 1997. Being there: Putting brain, body and world together again. Cambridge, MA:
MIT Press.
———. 2003. Natural-born cyborgs: Minds, technologies, and the future of human
intelligence. Oxford: Oxford University Press.
Clark, A., and D. Chalmers. 1998. “The extended mind.” Analysis 58 (1):7-19.
Clarke, E. F. 1993. “Generativity, mimesis and the human body in music performance.”
Contemporary Music Review 9:207-220.
Collins, N. 2003. “Generative music and laptop performance.” Contemporary Music Review
22 (4):67-79.
Cook, P. 2001. “Principles for designing computer music controllers.” In Proceedings of
the 2001 New Interfaces for Musical Expression Conference.
Cook, P. R. 2004. “Remutualizing the musical instrument: Co-design of synthesis
algorithms and controllers.” Journal of New Music Research 33 (3):315-320.
Csikszentmihalyi, M. 1991. Flow: The psychology of optimal experience. New York: Harper
Perennial.
Cull, J. 2000. “The circularity of living systems: The movement and direction of behavior.”
Journal of Applied Systems Studies 1 (1):51-65.
Damasio, A. R. 1994. Descartes’ error: Emotion, reason and the human brain. New York:
Putnam.
De Certeau, M. 1984. The practice of everyday life. Berkeley, CA: University of California
Press.
Dedieu, E., and E. Mazer. 1992. “An approach to sensorimotor relevance.” In F. Varela and
P. Bourgine, ed. Toward a practice of autonomous systems: Proceedings of the first
European conference on artificial life. Cambridge, MA: MIT Press, pp.88-95.
Deleuze, G. 1988. Spinoza: Practical philosophy. R. Hurley, trans. San Francisco: City
Lights.
———. 1990. Expressionism in philosophy: Spinoza. M. Joughin, trans. New York: Zone
Books.
———. 1991. Bergsonism. H. Tomlinson and B. Habberjam, trans. New York: Zone Books.
———. [1968] 1994. Difference and repetition. P. Patton, trans. New York: Columbia
University Press.
Deleuze, G., and F. Guattari. 1983. Anti-Oedipus: Capitalism and schizophrenia.
Minneapolis: University of Minnesota Press.
———. 1987. A thousand plateaus: Capitalism and schizophrenia. B. Massumi, trans.
Minneapolis: University of Minnesota Press.
Di Scipio, A. 1994. “Formal processes of timbre composition: Challenging the dualistic
paradigm of computer music.” In Proceedings of the 1994 International Computer Music
Conference. San Francisco, CA: International Computer Music Association, pp.202-208.
———. 1997. “Interpreting music technology: From Heidegger to subversive
rationalization.” Sonus 18 (1):63-80.
———. 2000. “Ecological modeling of textural sound events by iterated nonlinear
functions.” In Proceedings of the 2000 Colloquium on Musical Informatics. pp.33-36.
Dietrich, E. 1990. “Computationalism.” Social Epistemology 4:135-154.
Dourish, P. 1999. Embodied interaction: Exploring the foundations of a new approach to
HCI. http://www.ics.uci.edu/~jpd/publications/misc/embodied.pdf.
———. 2001. Where the action is: Foundations of embodied interaction. Cambridge, MA:
MIT Press.
Dreyfus, H. 1991. Being-in-the-world: A commentary on Heidegger’s ‘Being and Time,
Division I’. Cambridge, MA: MIT Press.
———. 1992. What computers still can’t do: A critique of artificial reason. Cambridge, MA:
MIT Press.
———. 1993. “Heidegger’s critique of the Husserl/Searle account of intentionality.” Social
Research 60 (1):17-38.
———. 1996. The current relevance of Merleau-Ponty’s phenomenology of embodiment.
Electronic Journal of Analytic Philosophy (4),
http://ejap.louisiana.edu/EJAP/1996.spring/dreyfus.1996.spring.html.
Emmerson, S. 2000. “’Losing touch?’: The human performer and electronics.” In S.
Emmerson, ed. Music, Electronic Media and Culture. Aldershot: Ashgate Publishing,
pp.194-216.
Evens, A. 2005. Sound ideas: music, machines, and experience. Minneapolis: University
of Minnesota Press.
Feenberg, A. 1991. Critical theory of technology. Oxford: Oxford University Press.
———. 1999. Questioning technology. London: Routledge.
———. 2000. “From essentialism to constructivism: Philosophy of technology at the
crossroads.” In E. Higgs, A. Light and D. Strong, ed. Technology and the good life?
Chicago: University of Chicago Press, pp.294-315.
———. 2002. Transforming technology. Oxford: Oxford University Press.
Fishkin, K., A. Gujar, B. Harrison, T. Moran, and R. Want. 2000. “Embodied user interfaces
for really direct manipulation.” Communications of the ACM 43 (9):75-80.
Fitzmaurice, G., and W. Buxton. 1997. “An empirical evaluation of graspable user
interfaces: towards specialized, space-multiplexed input.” In Proceedings of the 1997
SIGCHI Conference on Human Factors in Computing Systems. pp.43-50.
Fodor, J. A. 1983. The modularity of mind. Cambridge, MA: MIT Press.
Gallagher, S. 2003. How the body shapes the mind. Oxford: Oxford University Press.
Garnett, G. E. 2001. “The aesthetics of interactive computer music.” Computer Music
Journal 25 (1):21-33.
Gibson, J. J. 1977. “The theory of affordances.” In R. E. Shaw and J. Bransford, ed.
Perceiving, acting, and knowing: Toward an ecological psychology. Hillsdale, NJ: Lawrence
Erlbaum Associates.
———. 1979. The ecological approach to visual perception. New York: Houghton-Mifflin.
Gillespie, B. 1999. “Haptics.” In P. Cook, ed. Music, cognition and computerized sound.
Cambridge, MA: MIT Press, pp.247-260.
———. 1999. “Haptics in manipulation.” In P. Cook, ed. Music, cognition and computerized
sound. Cambridge, MA: MIT Press, pp.261-276.
Giunti, M. 1997. Computation, dynamics and cognition. Oxford: Oxford University Press.
Goudeseune, C. 2001. Composing with parameters for synthetic instruments, University of
Illinois at Urbana-Champaign, Urbana-Champaign, IL.
Greenfield, A. 2006. Everyware: The dawning age of ubiquitous computing. Berkeley, CA:
New Riders Press.
Guiard, Y. 1987. “Asymmetric division of labor in human skilled bimanual action: The
kinematic chain as a model.” Journal of Motor Behavior 19 (4):486-517.
Gunther, E., G. Davenport, and S. O’Modhrain. 2002. “Cutaneous grooves: Composing for
the sense of touch.” In E. Brazil, ed. Proceedings of the 2002 New Interfaces for Musical
Expression Conference. pp.37-43.
Hamman, M. 1997. “Interaction as composition: Toward the paralogical in computer
music.” Sonus 18 (1):26-44.
———. 1999. “From symbol to semiotic: Representation, signification, and the composition
of music interaction.” Journal of New Music Research 28 (2):90-104.
———. 1999. “Structure as performance: Cognitive musicology and the objectification of
procedure.” In J. Tabor, ed. Otto Laske: Navigating new musical horizons. Westport:
Greenwood Press.
———. 2000. “Priming computer-assisted music composition through design of
human/computer interaction.” In N. Mastorakis, ed. Mathematics and computers in
modern science: Acoustics and music, biology and chemistry, business and economics.
Athens: World Scientific Engineering Society.
———. 2000. “From technical to technological: Interpreting technology through
composition.” In Proceedings of the 2000 Colloquium on Musical Informatics.
———. 2002. The technical as aesthetic: Technology, composition, interpretation.
http://www.shout.net/~mhamman/papers/montpelier_2000.pdf.
———. 2002. “From technical to technological: The imperative of technology in
experimental music composition.” Perspectives of New Music 40 (1):92-120.
Haraway, D. 1991. “A cyborg manifesto: Science, technology, and socialist-feminism in
the late twentieth century.” In Simians, cyborgs, and women: The reinvention of nature.
New York: Routledge, pp.149-181.
Haugeland, J. 1985. Artificial intelligence: The very idea. Cambridge, MA: MIT Press.
———. 2002. “Authentic intentionality.” In M. Scheutz, ed. Computationalism: new
directions. Cambridge, MA: MIT Press, pp.159-174.
Hayles, N. K. 1999. How we became posthuman: Virtual bodies in cybernetics, literature
and informatics. Chicago: University of Chicago Press.
Heidegger, M. 1988. Basic problems of phenomenology. A. Hofstadter, trans.
Bloomington: Indiana University Press.
———. [1927] 1962. Being and time. J. Macquarrie and E. Robinson, trans. London: SCM
Press.
———. [1949] 1977. “The question concerning technology.” In The question concerning
technology and other essays. New York: Harper and Row, pp.3-35.
Hendriks-Jansen, H. 1996. Catching ourselves in the act: Situated activity, interactive
emergence, evolution, and human thought. Cambridge, MA: MIT Press.
Hinckley, K., R. Pausch, D. Proffitt, J. Patten, and N. Kassell. 1997. “Cooperative bimanual
action.” In Proceedings of the 1997 SIGCHI Conference on Human Factors in Computing
Systems. pp.27-34.
Holland, J. 1992. Adaptation in natural and artificial systems: An introductory analysis
with applications to biology, control, and artificial intelligence. Cambridge, MA: MIT Press.
———. 1995. Hidden order: How adaptation builds complexity. Cambridge, MA: Perseus
Book.
———. 1998. Emergence: From chaos to order. Cambridge, MA: Perseus Books.
Honing, H. 2003. Some comments on the relation between music and motion. Music
Theory Online 9 (1), http://smt.ucsb.edu/mto/issues/mto.03.9.1/mto.03.9.1.honing.html.
Horkheimer, M., and T. W. Adorno. 1972. Dialectic of enlightenment. J. Cumming, trans.
New York: Continuum.
Horswill, I. 1996. “Analysis of adaptation and environment.” In P. Agre and S.
Rosenschein, ed. Computational theories of interaction and agency. Cambridge, MA: MIT
Press, pp.367-396.
Hunt, A., M. Wanderley, and R. Kirk. 2000. “Towards a model for instrumental mapping in
expert musical interaction.” In Proceedings of the 2000 International Computer Music
Conference. San Francisco, CA: International Computer Music Association, pp.209-212.
Hunt, A., M. Wanderley, and M. Paradis. 2003. “The importance of parameter mapping in
electronic instrument design.” Journal of New Music Research 32 (4):429-440.
Hutchins, E. 1995. Cognition in the wild. Cambridge, MA: MIT Press.
Iazzetta, F. 1996. “Formalization of computer music interaction through a semiotic
approach.” Journal of New Music Research 25 (3):212-230.
———. 2000. “Meaning in musical gesture.” In M. Wanderley and M. Battier, ed. Trends in
gestural control of music. Paris: IRCAM.
Ihde, D. 1983. Existential technics. Albany, NY: State University of New York Press.
———. 1990. Technology and the lifeworld: From garden to earth. Bloomington, IN:
Indiana University Press.
———. 1991. Instrumental realism: The interface between philosophy of science and
philosophy of technology. Bloomington, IN: Indiana University Press.
———. 1993. Philosophy of technology: An introduction. New York: Paragon House.
———. 2002. Bodies in technology. Minneapolis: University of Minnesota Press.
Jackendoff, R. 1987. Consciousness and the computational mind. Cambridge, MA: MIT
Press.
Jacob, R., L. Sibert, D. Mcfarlane, and M. Mullen. 1994. “Integrality and separability of
input devices.” ACM Transactions on Computer-Human Interaction (1):3-26.
Jaeger, T. 2003. “The (anti-)laptop aesthetic.” Contemporary Music Review 22 (4).
Johnson, M. 1987. The body in the mind: The bodily basis of meaning, imagination, and
reason. Chicago: University of Chicago Press.
Jordà, S. 2002. “Improvising with computers: A personal survey (1989-2001).” Journal of
New Music Research 31 (1):1-10.
———. 2002. “FMOL: Toward user-friendly, sophisticated new musical instruments.”
Computer Music Journal 26 (3):23-39.
———. 2003. “Interactive music systems for everyone: Exploring visual feedback as a way
for creating more intuitive, efficient and learnable instruments.” In Proceedings of the
2003 Stockholm Music Acoustics Conference.
Jordan, S. 2003. “The embodiment of intentionality.” In W. Tschacher, ed. Dynamical
systems approach to cognition: concepts and empirical paradigms based on self-
organization, embodiment, and coordination dynamics. Cambridge, MA: MIT Press,
pp.201-228.
Kabbash, P., B. Buxton, and A. Sellen. 1994. “Two handed input in a compound task.” In
Proceedings of the 1994 SIGCHI Conference on Human Factors in Computing Systems.
pp.417-423.
Karmiloff-Smith, A. 1992. Beyond modularity: A developmental perspective on cognitive
science. Cambridge, MA: MIT Press.
Kartadinata, S. 2003. “The gluiph: A nucleus for integrated instruments.” In Proceedings
of the 2003 New Interfaces for Musical Expression Conference. pp.180-183.
Kauffman, S. 1993. The origins of order: Self-organization and selection in evolution.
Oxford: Oxford University Press.
Krefeld, V. 1990. “The Hand in the web: An interview with Michel Waisvisz.” Computer
Music Journal 7 (7):43-55.
Lakoff, G., and M. Johnson. 1980. Metaphors we live by. Chicago: University of Chicago
Press.
———. 1999. Philosophy in the flesh: The embodied mind and its challenge to western
thought. New York: Basic Books.
Lakoff, G., and R. Núñez. 2000. Where mathematics comes from: How the embodied
mind brings mathematics into being. New York: Basic Books.
Lansky, P. 1990. “A view from the bus: When machines make music.” Perspectives of
New Music 28 (2):102-110.
Laske, O. E. 1991. “Toward an epistemology of composition.” Interface (20):235-269.
Latour, B. 1993. We have never been modern. Hemel Hempstead: Harvester Wheatsheaf.
Leppert, R. 1993. The sight of sound: Music, representation, and the history of the body.
Berkeley, CA: University of California Press.
Lidov, D. 1987. “Mind and body in music.” Semiotica 1 (3):69-97.
Loren, L., E. Dietrich, C. Morrison, and J. Beskin. 1998. “What it means to be ‘situated’.”
Cybernetics and Systems 29:751-777.
Lyons, D. M., and A. J. Hendriks. 1996. “Exploiting patterns of interaction to achieve
reactive behavior.” In P. Agre and S. Rosenschein, ed. Computational theories of interaction
and agency. Cambridge, MA: MIT Press, pp.483-514.
Maes, P. 1992. “Learning behavior networks from experience.” In F. Varela and P.
Bourgine, ed. Toward a practice of autonomous systems: Proceedings of the first European
conference on artificial life. Cambridge, MA: MIT Press, pp.48-57.
Maturana, H. R., and F. J. Varela. 1980. Autopoiesis and cognition: The realization of the
living. Dordrecht, Holland: D. Reidel Publishing Company.
———. 1987. The tree of knowledge: The biological roots of human understanding.
Boston: New Science Library.
Merleau-Ponty, M. 1968. The visible and the invisible. A. Lingis, trans. Evanston, IL:
Northwestern University Press.
———. [1945] 2004. The phenomenology of perception. C. Smith, trans. London:
Routledge.
Minsky, M. 1986. The society of mind. New York: Simon and Schuster.
Mulder, A. 1999. Radical user interfaces for real-time musical control, University of York.
Mumma, G. 1967. “Creative aspects of live electronic music technology.” In Proceedings
of the 33rd National Convention of the American Audio Engineering Society. New York:
American Audio Engineering Society.
———. 1974. “Notes on cybersonics: Artificial intelligence in live musical performance.” In.
London: Guildhall Music and Drama Annual.
———. 1974. “Live electronic music.” In J. Appleton and R. C. Perera, ed. The
Development and Practice of Electronic Music. Englewood Cliffs, NJ: Prentice Hall, pp.286-
335.
Nardi, B. 1996. Context and consciousness: Activity theory and human-computer
interaction. Cambridge, MA: MIT Press.
Nardi, B., and V. O’Day. 1999. Information ecologies: Using technology with heart.
Cambridge, MA: MIT Press.
Ng, K. 2002. “Interactive gesture music performance interface.” Paper read at Proceedings
of the 2002 New Interfaces for Musical Expression Conference.
Noë, A. 2004. Action in perception. Cambridge, MA: MIT Press.
Norman, D. 1986. “Cognitive engineering.” In D. Norman and S. Draper, ed. User centered
system design: New perspectives on human-computer interaction. Hillsdale, NJ: Lawrence
Erlbaum Associates.
———. 1999. “Affordances, conventions and design.” Interactions 6 (3):38-43.
———. 1999. The invisible computer: Why good products can fail, the personal computer
is so complex, and information appliances are the solution. Cambridge, MA: MIT Press.
———. 2002. The design of everyday things. New York: Basic Books.
Norman, D., and S. Draper, eds. 1986. User centered system design: New perspectives on
human-computer interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.
Norman, D., J. D. Holland, and E. L. Hutchins. 1986. “Direct manipulation interfaces.” In
D. Norman and S. Draper, ed. User centered system design: New perspectives on
human-computer interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.
Ostertag, B. 2002. “Human bodies, computer music.” Leonardo Music Journal 12:11-14.
Pacherie, E. 1999. “’Leibhaftigkeit’ and representational theories of perception.” In B.
Pachoud, J. Petitot, J.-M. Roy and F. Varela, ed. Naturalizing phenomenology: Issues in
contemporary phenomenology and cognitive science. Stanford, CA: Stanford University
Press, pp.148-160.
Pachoud, B. 1999. “The teleological dimension of perceptual and motor intentionality.” In
B. Pachoud, J. Petitot, J.-M. Roy and F. Varela, ed. Naturalizing phenomenology: Issues in
contemporary phenomenology and cognitive science. Stanford, CA: Stanford University
Press, pp.196-219.
Pachoud, B., J. Petitot, J.-M. Roy, and F. Varela. 1999. “Beyond the gap: An introduction
to naturalizing phenomenology.” In B. Pachoud, J. Petitot, J.-M. Roy and F. Varela, ed.
Naturalizing phenomenology: Issues in contemporary phenomenology and cognitive
science. Stanford, CA: Stanford University Press, pp.1-80.
Paradiso, J. A. 1997. “Electronic music: New ways to play.” IEEE Spectrum 34 (12):18-30.
———. 2003. “Current trends in electronic music interfaces.” Journal of New Music
Research 32 (4):345-349.
Pask, G. 1962. An approach to cybernetics. New York, NY: Harper.
Pask, G., and S. Curran. 1982. Micro man: Computers and the evolution of consciousness.
New York: Macmillan.
Perkis, T. 1996. “Bringing digital music to life.” Computer Music Journal 20 (2):28-32.
Pfeifer, R., and P. Verschure. 1992. “Distributed adaptive control: A paradigm for
designing autonomous agents.” In F. Varela and P. Bourgine, ed. Toward a practice of
autonomous systems: Proceedings of the first European conference on artificial life.
Cambridge, MA: MIT Press, pp.21-30.
Pinker, S. 1997. How the mind works. New York: W. W. Norton.
Prem, E. 1997. “Epistemic autonomy in models of living systems.” In P. Husbands and I.
Harvey, ed. Fourth European conference on artificial life. pp.2-9.
Preston, E. F. 1988. Representational and non-representational intentionality: Husserl,
Heidegger, and artificial intelligence. Ph.D., Philosophy, Boston University, Boston.
———. 1993. “Heidegger and artificial intelligence.” Philosophy and Phenomenological
Research 53 (1):43-69.
Prévost, E. 1995. No sound is innocent. Matching Tye, Essex: Copula.
Reddell, T. 2003. “Laptopia: The spatial poetics of networked laptop performance.”
Contemporary Music Review 22 (4):11-22.
Riethmüller, A. 1994. “The matter of music is sound and body-motion.” In H. U.
Gumbrecht and K. L. Pfeiffer, ed. Materialities of communication. Stanford, CA: Stanford
University Press, pp.148-156.
Roads, C. 1985. “Improvisation with George Lewis.” In. Los Altos, CA: William Kaufmann
Inc.
———. 1989. “Active music representations.” In Proceedings of the 1989 International
Computer Music Conference. San Francisco, CA: International Computer Music
Association, pp.257-259.
Rosenschein, S., and L. P. Kaelbling. 1996. “A situated view of representation and
control.” In P. Agre and S. Rosenschein, ed. Computational theories of interaction and
agency. Cambridge, MA: MIT Press, pp.541-596.
Rowe, R. 1993. Interactive music systems: Machine listening and composing. Cambridge,
MA: MIT Press.
———. 2001. Machine musicianship. Cambridge, MA: MIT Press.
Roy, J.-M. 1999. “Saving intentional phenomena: Intentionality, representation and
symbol.” In B. Pachoud, J. Petitot, J.-M. Roy and F. Varela, ed. Naturalizing
phenomenology: Issues in contemporary phenomenology and cognitive science. Stanford,
CA: Stanford University Press, pp.111-147.
Ryan, J. 1991. “Some remarks on musical instrument design at STEIM.” Contemporary
Music Review 6 (1):3-17.
———. 1992. “Effort and expression.” In A. Strange, ed. Proceedings of the 1992
International Computer Music Conference. San Francisco, CA: International Computer
Music Association, pp.414-416.
Sapir, S. 2002. “Gestural control of digital audio environments.” Journal of New Music
Research 31 (2):119-129.
Scheutz, M. 2002. “Computationalism - The next generation.” In Computationalism: New
directions. Cambridge, MA: MIT Press, pp.1-21.
Schloss, W. A. 2003. “Using contemporary technology in live performance: The dilemma of
the performer.” Journal of New Music Research 32 (3):239-242.
Shannon, C. 1949. The mathematical theory of communication. Urbana, IL: University of
Illinois Press.
Sheets-Johnstone, M. 2000. The primacy of movement. Philadelphia: John Benjamins.
Shove, P. 1995. “Musical motion and performance: Theoretical and empirical
perspectives.” In J. Rink, ed. Cambridge, UK: Cambridge University Press, pp.55-83.
Simon, H. A. 2001. The sciences of the artificial. Cambridge, MA: MIT Press.
Small, C. 1998. Musicking: The meanings of performing and listening. Hanover, NH:
Wesleyan University Press.
Smith, B. C. 1996. On the origin of objects. Cambridge, MA: MIT Press.
———. 2002. “The foundations of computing.” In M. Scheutz, ed. Computationalism: New
directions. Cambridge, MA: MIT Press, pp.23-58.
Smith, D. W. 1999. “Intentionality naturalized?” In B. Pachoud, J. Petitot, J.-M. Roy and F.
Varela, ed. Naturalizing phenomenology: Issues in contemporary phenomenology and
cognitive science. Stanford, CA: Stanford University Press, pp.83-110.
Smyth, T., and J. Smith. 2002. “Creating sustained tones with the cicada’s rapid
sequential buckling mechanism.” Paper read at Proceedings of the 2002 New Interfaces for
Musical Expression Conference.
Spiegel, L. 1992. “An alternative to a standard taxonomy for electronics and computer
instruments.” Computer Music Journal 16 (3).
Stein, L. A. 1998. “What we’ve swept under the rug: Radically rethinking CS1.” Computer
Science Education 8 (2):118-129.
———. 1999. “Challenging the computational metaphor: Implications for how we think.”
Cybernetics and Systems 30 (6):473-507.
Stuart, C. 2003. “The object of performance: Aural performativity in contemporary laptop
music.” Contemporary Music Review 22 (4):59-65.
Suchman, L. 1987. Plans and situated actions: The problem of human-machine
communication. Cambridge, U.K.: Cambridge University Press.
Sudnow, D. 2001. Ways of the hand: A rewritten account. Cambridge, MA: MIT Press.
Tanaka, A., and R. B. Knapp. 2002. “Multimodal interaction in music using the
electromyogram and relative position sensing.” In E. Brazil, ed. Proceedings of the 2002
New Interfaces for Musical Expression Conference. pp.43-48.
Temprado, J. J. 2003. “Cognition in action: The interplay of attention and bimanual
coordination dynamics.” In W. Tschacher, ed. Dynamical systems approach to cognition:
Concepts and empirical paradigms based on self-organization, embodiment, and
coordination dynamics. Cambridge, MA: MIT Press, pp.93-132.
Thelen, E. 1994. A dynamic systems approach to the development of cognition and action.
Cambridge, MA: MIT Press.
———. 1995. “Time scale dynamics and the development of an embodied cognition.” In R.
Port and T. Van Gelder, ed. Mind as motion: Explorations in the dynamics of cognition.
Cambridge, MA: MIT Press, pp.69-99.
———. 2003. “Grounded in the world: Developmental origins of the embodied mind.” In
W. Tschacher, ed. Dynamical systems approach to cognition: Concepts and empirical
paradigms based on self-organization, embodiment and coordination dynamics.
Singapore: World Scientific Publishing Company, pp.17-44.
Thompson, E., A. Noë, and L. Pessoa. 1999. “Perceptual completion: A case study in
phenomenology and cognitive science.” In B. Pachoud, J. Petitot, J.-M. Roy and F. Varela,
ed. Naturalizing phenomenology: Issues in contemporary phenomenology and cognitive
science. Stanford, CA: Stanford University Press, pp.161-195.
Todd, N. P. M. 1992. “The dynamics of dynamics: A model of musical expression.” Journal
of the Acoustical Society of America 91 (6):3540-3550.
———. 1999. “Motion in music: A neurobiological perspective.” Music Perception 17
(1):115-126.
Todes, S. 2001. Body and world. Cambridge, MA: MIT Press.
Trueman, D. 1999. Reinventing the violin. Ph.D., Music, Princeton University, Princeton,
NJ.
———. 2000. “BoSSA: The deconstructed violin reconstructed.” Journal of New Music
Research 31 (2):119-129.
Trueman, D., C. Bahn, and P. Cook. 2000. “Alternative voices for electronic sound:
Spherical speakers and sensor-speaker arrays (SenSAs).” In Proceedings of the 2000
International Computer Music Conference. San Francisco, CA: International Computer
Music Association, pp.248-251.
Turing, A. M. 1936. “On computable numbers, with an application to the
Entscheidungsproblem.” Proceedings of the London Mathematical Society, Series 2
42:230-265.
———. 1950. “Computing machinery and intelligence.” Mind LIX (236):433-460.
Turkle, S. 1984. The second self: Computers and the human spirit. New York: Simon and
Schuster.
Turner, T. 2003. “The resonance of the cubicle: Laptop performance in post-digital
musics.” Contemporary Music Review 22 (4):81-92.
Ullmer, B., and H. Ishii. 2001. “Emerging frameworks for tangible user interfaces.” In J. M.
Carroll, ed. Human-computer interaction in the new millennium. Addison-Wesley, pp.579-
601.
Ungvary, T., and R. Vertegaal. 2000. “Designing musical cyberinstruments with body and
soul in mind.” Journal of New Music Research 19 (3):245-255.
Van Gelder, T. 1998. “The dynamical hypothesis in cognitive science.” Behavioral and
Brain Sciences 21:615-628.
Van Nort, D., M. Wanderley, and P. Depalle. 2004. “On the choice of mappings based on
geometric properties.” In Proceedings of the 2004 New Interfaces for Musical Expression
Conference. pp.87-91.
Varela, F. 1979. Principles of biological autonomy. Amsterdam: Elsevier (North Holland).
———. 1980. “Describing the logic of the living: The adequacy and limitations of the idea
of autopoiesis.” In M. Zeleny, ed. Autopoiesis: A theory of living organization. New York:
North Holland, pp.36-48.
———. 1992. “Making it concrete: Before, during and after breakdowns.” In J. Ogilvy, ed.
Revisioning Philosophy. Albany, NY: State University of New York Press, pp.97-109.
———. 1999. “The specious present: A neurophenomenology of time consciousness.” In B.
Pachoud, J. Petitot, J.-M. Roy and F. Varela, ed. Naturalizing phenomenology: Issues in
contemporary phenomenology and cognitive science. Stanford, CA: Stanford University
Press, pp.266-314.
Varela, F., and E. Thompson. 2001. “Radical embodiment: Neural dynamics and
consciousness.” Trends in Cognitive Sciences 5 (10):418-425.
Varela, F., E. Thompson, and E. Rosch. 1991. The embodied mind: Cognitive science and
human experience. Cambridge, MA: MIT Press.
Verplank, W. 2001. “A course on controllers.” In Proceedings of the 2001 New Interfaces
for Musical Expression Conference.
Von Neumann, J. 1958. The computer and the brain. New Haven, CT: Yale University
Press.
Wanderley, M. 2001. Performer-instrument interaction: applications to gestural control of
sound synthesis, University Paris 6, Paris.
Webb, B., and T. Smithers. 1992. “The connection between AI and biology in the study of
behavior.” In F. Varela and P. Bourgine, ed. Toward a practice of autonomous systems:
Proceedings of the first European conference on artificial life. Cambridge, MA: MIT Press,
pp.421-428.
Wegner, P. 1997. “Why interaction is more important than algorithms.” Communications
of the ACM 40 (5):80-91.
Weinberg, G. 2005. “Interconnected musical networks: Toward a theoretical framework.”
Computer Music Journal 29 (2):23-39.
Weiser, M. 1988. Ubiquitous computing #1 and #2. Palo Alto, CA: Xerox PARC.
———. 1991. “The computer for the twenty-first century.” Scientific American 265 (3):94-
104.
———. 1994. “The world is not a desktop.” Interactions 1 (1):7-8.
Weiser, M., and S. Brown. 1996. The coming age of calm technology. Palo Alto, CA: Xerox
PARC.
Wessel, D., and M. Wright. 2001. “Problems and prospects for intimate musical control of
computers.” In Proceedings of the 2001 New Interfaces for Musical Expression
Conference.
Whitelaw, M. 2003. “Sound particles and microsonic materialism.” Contemporary Music
Review 22 (4):93-100.
Wiener, N. 1961. Cybernetics; Or, control and communication in the animal and the
machine. Cambridge, MA: MIT Press.
Wild, P., P. Johnson, and H. Johnson. 2004. “Towards a composite modelling approach for
multitasking.” In Proceedings of the 3rd International Workshop on Task Models and
Diagrams for User Interface Design. Prague: ACM International Conference Proceeding
Series, pp.17-24.
Wilden, A. 1977. System and structure: Essays in communication and exchange. London:
Tavistock Publications.
Winkler, T. 1995. “Making motion musical: Gesture mapping strategies for interactive
computer music.” In Proceedings of the 1994 International Computer Music Conference.
San Francisco, CA: International Computer Music Association, pp.261-264.
Winograd, T., and F. Flores. 1986. Understanding computers and cognition: A new
foundation for design. Norwood, NJ: Ablex Publishing.
Zumthor, P. 1994. “Body and performance.” In H. U. Gumbrecht and K. L. Pfeiffer, ed.
Materialities of communication. Stanford, CA: Stanford University Press, pp.218-226.