A computational model of rubato (Todd).pdf


    Publisher Routledge

    Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

    Contemporary Music Review

    Publication details, including instructions for authors and subscription information:

    http://www.informaworld.com/smpp/title~content=t713455393

    A computational model of rubato

    Neil Todd (a)

    (a) Department of Psychology, University of Exeter, Exeter, UK

    To cite this Article: Todd, Neil, 'A computational model of rubato', Contemporary Music Review, 3: 1, 69-88

    To link to this Article: DOI: 10.1080/07494468900640061

    URL: http://dx.doi.org/10.1080/07494468900640061


    Contemporary Music Review

    1989, Vol. 3, pp. 69-88

    Photocopying permitted by license only

    © 1989 Harwood Academic Publishers GmbH

    Printed in the United Kingdom

    A computational model of rubato

    Neil Todd

    Department of Psychology, University of Exeter, Exeter, UK

    Presented is a model of rubato, implemented in Lisp, in which expression is viewed as the mapping of musical structure into the variables of expression. The basic idea is that the performer uses "phrase final lengthening" as a device to reflect some internal representation of the phrase structure. The representation is based on Lerdahl and Jackendoff's time-span reduction. The basic heuristic in the model is recursive, involving look-ahead and planning at a number of levels. The planned phrasings are superposed beat by beat and the output from the program is a list of durations which could easily be adapted to be sent to a synthesiser given a suitable system.

    KEYWORDS: computational modelling, music cognition, musical performance, rubato, mental representation, mental process.

    Introduction

    One of the most ubiquitous expressive devices in musical performance is rubato. Most notably it is used in music of the romantic era, but is also evident in a variety of other styles. Research on music performance (Seashore, 1938; Shaffer, 1981; Clarke, 1984; Todd, 1985; Bengtsson & Gabrielsson, 1980; Sundberg & Verillo, 1980) involving the precise measurement of duration has shown that there are a number of basic observations which can be made. The first is that skilled performers can show a remarkable degree of reproducibility from one performance to the next (Shaffer, 1984; Gabrielsson, 1987). This precision in timing shows that the performance must involve the use of generative procedures and a precise internal representation of underlying expressive form. A second observation is the use of slowing to mark a phrase boundary (Todd, 1985), which has been shown to apply recursively at a number of levels (Shaffer & Todd, 1987).

    In Todd (1985) a model of rubato was established which generated a duration structure from a structural description of a piece of music. The idea of the model was that the performer uses "phrase final lengthening" to signal a boundary, with the degree of slowing determined by the importance of the boundary. The input to the model was the time-span reduction of Lerdahl and Jackendoff's theory (1983). Whilst the model gave a reasonable description of the data from actual performances of some pieces there were, however, a number of objections to the model as it stood. This has led to the formulation of a new model. In this paper I will describe the new model and the reasoning which led to its formulation.

    The reduction hypothesis and knowledge representation

    The first problem with the Todd (1985) model stems from the fact that it inherits the "reduction hypothesis" of Lerdahl & Jackendoff's theory. That is, the listener, and therefore the performer, sees each event in a single coherent structure. This hypothesis places too high a demand on working memory to be psychologically plausible. In terms of the model it means that when computing a boundary strength, every event in the time-span reduction is taken into account, irrespective of how close, or how far apart, the events are in time. This leads to the prediction of more degrees of boundary strength, and therefore degrees of relative slowing, than can be discerned from the data. On the other hand, it is both psychologically plausible and musically necessary that the performer should have some kind of global overview of the piece as well as being able to look ahead to some degree in order to plan a phrase.

    A solution to this problem, which is the first premise of the updated model, is to suppose that the internal representation, rather than being a single, simply connected tree, is composed of a set (or forest) of trees organised on a number of hierarchic levels, with each subset of trees at one level being bound by a tree at a higher level. This accords with Anderson's ACT* theory of cognition (1983). In the theory knowledge comes in chunks or cognitive units which can be such things as propositions, spatial images or temporal relations. A cognitive unit encodes a set of no more than about five elements. Larger structures are created by the hierarchical embedding of cognitive units. Of particular interest to us here are cognitive units encoding temporal information which Anderson refers to as "temporal strings". The notion of temporal strings accords well with the idea of musical groups.
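    The representation just described can be pictured as a small data structure. The following is only a sketch, written in Python rather than the Lisp of the model itself. The class name `Unit`, the constant `MAX_ELEMENTS`, the beat labels and the toy piece are all hypothetical, and treating "about five" as a hard limit is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Union

MAX_ELEMENTS = 5  # assumption: "about five" treated as a hard cap for illustration


@dataclass
class Unit:
    """A cognitive unit (temporal string): a short ordered group of elements."""
    label: str
    elements: List[Union["Unit", str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject units that exceed the assumed capacity limit.
        if len(self.elements) > MAX_ELEMENTS:
            raise ValueError(f"{self.label}: a unit holds at most {MAX_ELEMENTS} elements")

    def depth(self) -> int:
        """Number of hierarchic levels at or below this unit."""
        nested = [e.depth() for e in self.elements if isinstance(e, Unit)]
        return 1 + max(nested, default=0)

    def leaves(self) -> List[str]:
        """Surface events in temporal order."""
        out: List[str] = []
        for e in self.elements:
            out.extend(e.leaves() if isinstance(e, Unit) else [e])
        return out


# Hypothetical toy piece: two four-beat groups bound by one higher-level unit.
phrase_a = Unit("phrase-a", ["b1", "b2", "b3", "b4"])
phrase_b = Unit("phrase-b", ["b5", "b6", "b7", "b8"])
piece = Unit("piece", [phrase_a, phrase_b])
```

    The point of the sketch is that no unit ever relates more than a handful of elements directly; larger spans arise only through embedding.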

    A model of performance constructed on this basis predicts a duration structure determined by the superposition of a number of hierarchic timing components, from a global component, spanning the whole piece, to a local component spanning a few beats, with each component corresponding to a structural level. This overcomes the objections discussed above because for any event at one level the number of other events directly connected is limited. At the same time it allows for look-ahead and gives the performer a global overview.
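    As a rough illustration of superposition, the sketch below builds one timing component per level and adds them beat by beat. It is not the model's own code: the text so far does not specify the shape of the lengthening curve or how components combine, so the quadratic curve, the additive combination, the `(first beat, last beat, weight)` span format and all numeric values are assumptions.

```python
from typing import List, Tuple

Span = Tuple[int, int, float]  # (first beat, last beat inclusive, weight): hypothetical format


def component(n_beats: int, spans: List[Span]) -> List[float]:
    """Timing component for one level: lengthening grows toward each span's end."""
    out = [0.0] * n_beats
    for start, end, weight in spans:
        length = end - start + 1
        for i in range(start, end + 1):
            position = (i - start + 1) / length  # reaches 1.0 on the final beat
            out[i] = weight * position ** 2      # assumed quadratic curve
    return out


def superpose(base: float, components: List[List[float]]) -> List[float]:
    """Add all level components beat by beat on top of a nominal beat duration."""
    return [base * (1.0 + sum(values)) for values in zip(*components)]


global_level = component(8, [(0, 7, 0.30)])               # one span over the whole piece
local_level = component(8, [(0, 3, 0.10), (4, 7, 0.10)])  # two four-beat phrases
durations = superpose(0.5, [global_level, local_level])
```

    With these made-up weights the longest beat is the last one, where the global and local boundaries coincide, and the beat ending the first local phrase is longer than the beat that follows it.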

    The process of performance

    A second objection to the Todd (1985) model is that it is "off line". In other words it does not describe the process of performance. Whilst it is reasonable to suppose that the performer can hold the whole structure in long-term memory (indeed a musician's ability to memorise is quite remarkable), it seems implausible that the performer could access the whole structure at any one time. In the early model the computations were done for each component and then added together. In an actual performance the computations are done as each phrase is accessed in turn and the components superposed note by note.
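    The contrast with an "off line" computation can be sketched with a generator that plans each phrase only when it is entered and emits durations one note at a time. Again this is an illustrative Python sketch, not the Lisp program: the nested-list encoding of phrases, the function name `perform`, the quadratic curve and the parameter values are hypothetical, and holding an enclosing level's contribution constant across a sub-phrase is a simplification.

```python
from typing import Iterator, List, Tuple, Union

Phrase = List[Union["Phrase", str]]  # nested lists; strings are notes (hypothetical encoding)


def perform(
    phrase: Phrase,
    planned: Tuple[float, ...] = (),
    base: float = 0.5,
    weight: float = 0.1,
) -> Iterator[Tuple[str, float]]:
    """Yield (note, duration) pairs lazily, planning each phrase as it is accessed."""
    n = len(phrase)
    for i, item in enumerate(phrase):
        # Lengthening planned at this level for position i (assumed quadratic curve).
        here = weight * ((i + 1) / n) ** 2
        if isinstance(item, list):
            # Enter the sub-phrase, carrying only the enclosing levels' contributions.
            yield from perform(item, planned + (here,), base, weight)
        else:
            # Superpose all active levels on this one note.
            yield item, base * (1.0 + sum(planned) + here)


piece = [["c", "d", "e"], ["f", "g", "a"]]
timed = list(perform(piece))
```

    At any moment the generator holds only the chain of phrases enclosing the current note, which mirrors the claim that the performer need not access the whole structure at once.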

    The obvious answer, and this is the second premise of the new model, is that in order to describe the process of performance the model needs to be formulated in computational terms and implemented in a suitable high-level language such as Lisp. In particular, what is important here is the idea that a process should be cast in terms of an "effective procedure" (Longuet-Higgins, 1978, 1981; Johnson-Laird, 1983), thus enabling the theory to be precise and testable.

    The indeterminism of individual performances

    Whilst such a theory does make predictions, given a certain input, the goal of the theory is not the prediction of individual performances as such, but the principled explanation of performance data. This is so in psychology in general, and music psychology in particular, because if the theory were completely deterministic it would negate the creative aspect of performance. Johnson-Laird (1983) has expressed this indeterminism of individual performances in the language of computer science:

    If human beings are at least as complicated as Turing machines and their individual processes of thought differ as a result of their genes and experience, then their behaviour is most unlikely to become wholly predictable, because there is no effective procedure that can predict the behaviour of an arbitrary Turing machine. There is thus little danger of creating a psychology capable of modelling an individual's thoughts - an eventuality likely to destroy the spontaneity and significance of life. But there are no a priori reasons for supposing that it is impossible to develop scientific theories of general psychological abilities.

    [Johnson-Laird, 1983; p. 12]

    The computational theory of an expression system

    These two issues discussed above, of representation and process, are central to any information-processing type approach to cognition and cognitive modelling. Our main task, therefore, in the construction of such a model is to make explicit, in the form of an algorithm, the process of performance and its input. However, as David Marr (1982) has said, such a system can be viewed from three levels of explanation:

    Downl

    oad

    ed

    By:

    [Ingenta

    Content

    Di

    strib

    uti

    on

    Psy

    Press

    Ti

    tl

    es]

    At:08

    :065

    Decemb

    er2009

    A

    computational model

    of rubato 71

    words it does not describe the process of performance. Whilst it is

    reasonable to

    suppose that

    the performer can hold the whole structure in

    long-term memory, indeed a musicians's ability to memorise is quite

    remarkable, it seems implausible

    that

    the

    performer could access

    the

    whole structure at

    anyone

    time. In the early model the computations

    were

    done

    for each

    component and then

    added together. In

    an

    actual

    performance the computat ions are

    done as each phrase is accessed

    in turn

    and

    the

    components superposed

    note by note.

    The obvious answer, and this is the second premise of the new model,

    is that in order to describe the process of performance the model

    needs

    to

    be formulated in computational terms

    and implemented

    in a suitable

    high-level language

    such

    as Lisp. In particular, what is

    important

    here is

    the idea that a process should be cast in terms of an effective procedure

    (Longuet-Higgins, 1978, 1981; Johnson-Laird, 1983),

    thus

    enabling the

    theory to be precise and testable.

    he indeterminism

    o

    individual performances

    Whilst

    such

    a theory does make predictions, given a certain input, the

    goal of the theory is not the prediction of individual performances as

    such, but the principled explanation of performance data. This is so in

    psychology in general, and music psychology in particular, because

    i

    the

    theory

    were

    completely deterministic it would negate the creative aspect

    of performance. Johnson-Laird

    1983)

    has expressed this indeterminism

    of individual performances in the language of

    computer

    science:

    If human beings

    are

    at least

    as complicated

    as Turing machines and their

    individual processes of thought

    differ

    as a result of

    their

    genes

    and experience

    then their behaviour

    is

    most unlikely

    to become

    wholly predictable because

    there

    is no effective procedure that

    can

    predict the

    behaviour

    of an arbitrary

    Turing machine. There is thus little danger ofcreating a psychology

    capable

    of

    modelling an

    individual's thoughts - an eventuality

    likely to destroy

    the

    spontaneity

    and significance

    of

    life.

    But

    there

    are no

    a priori reasons for

    supposing that

    it

    is impossible to develop scientific theories of general

    psychological abilities.

    Uohnson-Laird, 1983; p. 12]

    he computational theory

    o

    an expression system

    These two issues discussed above, of representation and process, are

    central to

    any

    information-processing type

    approach

    to cognition

    and

    cognitive modelling. Our main task, therefore, in the construction of such

    a model is to make explicit, in the form of

    an

    algorithm, the process of

    performance and its input. However, as David Marr 1982) has said such

    a system can be viewed from three levels of explanation:

    A

    computational model

    of rubato 71

    words it does not describe the process of performance. Whilst it is

    reasonable to

    suppose that

    the performer can hold the whole structure in

    long-term memory, indeed a musicians's ability to memorise is quite

    remarkable, it seems implausible

    that

    the

    performer could access

    the

    whole structure at

    anyone

    time. In the early model the computations

    were

    done

    for each

    component and then

    added together. In

    an

    actual

    performance the computat ions are

    done as each phrase is accessed

    in turn

    and

    the

    components superposed

    note by note.

    The obvious answer, and this is the second premise of the new model,

    is that in order to describe the process of performance the model

    needs

    to

    be formulated in computational terms

    and implemented

    in a suitable

    high-level language

    such

    as Lisp. In particular, what is

    important

    here is

    the idea that a process should be cast in terms of an effective procedure

    (Longuet-Higgins, 1978, 1981; Johnson-Laird, 1983),

    thus

    enabling the

    theory to be precise and testable.

    he indeterminism

    o

    individual performances

    Whilst such a theory does make predictions, given a certain input, the
    goal of the theory is not the prediction of individual performances as
    such, but the principled explanation of performance data. This is so in
    psychology in general, and music psychology in particular, because if
    the theory were completely deterministic it would negate the creative
    aspect of performance. Johnson-Laird (1983) has expressed this
    indeterminism of individual performances in the language of computer
    science:

    If human beings are at least as complicated as Turing machines and their
    individual processes of thought differ as a result of their genes and
    experience then their behaviour is most unlikely to become wholly
    predictable because there is no effective procedure that can predict the
    behaviour of an arbitrary Turing machine. There is thus little danger of
    creating a psychology capable of modelling an individual's thoughts - an
    eventuality likely to destroy the spontaneity and significance of life.
    But there are no a priori reasons for supposing that it is impossible to
    develop scientific theories of general psychological abilities.
    [Johnson-Laird, 1983; p. 12]

    The computational theory of an expression system

    These two issues discussed above, of representation and process, are
    central to any information-processing type approach to cognition and
    cognitive modelling. Our main task, therefore, in the construction of
    such a model is to make explicit, in the form of an algorithm, the
    process of performance and its input. However, as David Marr (1982) has
    said, such a system can be viewed from three levels of explanation:


    At one extreme the top level is the abstract computational theory of the
    device in which the performance of the device is characterized as a
    mapping from one kind of information to another .... In the centre is the
    choice of representation for the input and output and the algorithm to be
    used to transform one into the other. And at the other extreme are the
    details of how the algorithm and the representation are realized
    physically.
    [Marr, 1982; p. 24]

    At the level of computational theory then, it is useful to express the
    various processes of music performance in symbolic terms. Let $N$ stand
    for the music notation or score, $P$ for performance, and $\mathcal{I}$
    for the internal representation. Thus we can think of the process of
    performance as a mapping:

    $$\pi : \mathcal{I} \to P \qquad (1.a)$$

    where the mapping is carried out by a performance procedure or function
    $\pi$. In the same way the process of sight-reading can be thought of as
    a double mapping:

    $$N \to \mathcal{I} \to P \qquad (1.b)$$

    Conversely, we can think of the process of perception as the mapping:

    $$\lambda : P \to \mathcal{I} \qquad (2.a)$$

    where the mapping is carried out by a listening procedure or function
    $\lambda$. Again in the same manner the process of notation can be
    thought of as:

    $$P \to \mathcal{I} \to N \qquad (2.b)$$

    So, at the algorithmic level then, our task is twofold: (a) to find a
    suitable representation for $\mathcal{I}$; and (b) to make explicit an
    algorithm for performing the mapping $\mathcal{I} \to P$.
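The mappings above can be read as function signatures. The following sketch, written in Python rather than the Lisp used for the model, only illustrates how the double mapping (1.b) arises by composing two single mappings. The type names `Score`, `Representation` and `Performance`, and the toy procedures `read` and `perform`, are placeholders assumed for illustration; they are not part of the model.

```python
from typing import Callable, List

# Hypothetical stand-ins for the three kinds of information.
Score = List[int]            # N: here, one metrical strength per beat
Representation = List[list]  # I: here, beats grouped into phrases
Performance = List[float]    # P: here, one duration per beat


def compose(first: Callable, second: Callable) -> Callable:
    """Return the double mapping x -> second(first(x))."""
    return lambda x: second(first(x))


def read(score: Score) -> Representation:
    """Toy N -> I: cut the score into groups of four beats."""
    return [score[i:i + 4] for i in range(0, len(score), 4)]


def perform(rep: Representation) -> Performance:
    """Toy I -> P: lengthen the last beat of every group."""
    return [1.2 if i == len(g) - 1 else 1.0
            for g in rep for i, _ in enumerate(g)]


# (1.b): sight-reading is the double mapping N -> I -> P.
sight_read = compose(read, perform)

durations = sight_read([3, 1, 2, 1, 3, 1, 2, 1])
```

The notation mapping (2.b) would be built the same way from a listening procedure and a transcription procedure, neither of which is sketched here.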

    Methodology

    The methodology adopted in order to implement the twofold task
    outlined above is threefold:

    (a) analysis, which involves finding a value for $\mathcal{I}$, either
        from the score or the data;
    (b) synthesis, which involves taking the value for $\mathcal{I}$ and
        using it as an input to a performance algorithm which generates an
        output in the form of a graph or list of numbers; and
    (c) evaluation, which involves the comparison of data with algorithm
        output.

    This method is, of course, similar to the analysis-by-synthesis method
    of Sundberg and his co-workers (Fryden & Sundberg, 1984) but perhaps
    more closely related to the method of Risset and Wessel in their work on
    timbre (Risset & Wessel, 1982). The differences with the Sundberg


    method are that the starting point here is actual performances, rather
    than performer intuitions, and the evaluation process involves the direct
    comparison of data and model, rather than the subjective rating of
    generated output.

    Analysis: score → representation vs. data → representation

    We need to find a value for the internal representation $\mathcal{I}$. A
    distinction is made here between three possible representations. First,
    the analyst's representation $\mathcal{I}_A$, which is determined
    directly from the score; second, the performer's representation
    $\mathcal{I}_P$, which is also determined from the score but which is
    unobservable; and third, the data-determined representation
    $\mathcal{I}_D$. So, we can represent the computational theory at this
    stage thus:

    $$N \to \mathcal{I}_P \to P_P \to \mathcal{I}_D, \qquad N \to \mathcal{I}_A \qquad (3)$$

    To find a value for $\mathcal{I}_A$ involves taking the score of the
    piece of music under investigation and producing an analysis of the
    grouping or phrase structure. At the moment the most useful analytic
    method is that developed by Lerdahl and Jackendoff (1983), despite its
    deficiencies (Slawson Peel, 1985; Clarke, 1986; Baker, in press). After
    the analysis is complete the grouping is converted to a Lisp
    representation which becomes the input to an algorithm for generating a
    duration structure (see Figure 1).

    (setq tsr '((A) (B) (A)))
    (setq A '((a) (a)))
    (setq B '((b) (b)))
    (setq a '(3 1 2 1))
    (setq b '(3 1 2 1))

    Figure 1 A Lisp representation of Lerdahl and Jackendoff's bracket
    notation for grouping. At the top level there are two groups A and B
    arranged symmetrically in the order ABA. Group A contains the sub-group
    a repeated and group B contains the sub-group b repeated. The integers
    represent the metrical strength of a beat.
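To make the shape of this structure concrete, here is the same grouping written as nested Python lists, with three small helpers for walking it. This is a sketch under one assumption: the Lisp version stores symbols that are looked up by name, whereas here the sub-lists are nested directly. The helper names `depth`, `surface_groups` and `beats` are hypothetical.

```python
# Sub-groups: one integer per beat, giving its metrical strength.
a = [3, 1, 2, 1]
b = [3, 1, 2, 1]

# Groups contain sub-groups; the top level orders the groups as A B A.
A = [a, a]
B = [b, b]
tsr = [A, B, A]


def depth(group):
    """Number of grouping levels, counting the surface groups as one."""
    return 1 + depth(group[0]) if isinstance(group[0], list) else 1


def surface_groups(group):
    """Yield the lowest-level groups (lists of beats) in playing order."""
    if isinstance(group[0], list):
        for sub in group:
            yield from surface_groups(sub)
    else:
        yield group


def beats(group):
    """Flatten the grouping into the sequence of beat strengths."""
    return [strength for g in surface_groups(group) for strength in g]
```

With the values of Figure 1 the structure has three levels, six surface groups and twenty-four beats.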

    A value for $\mathcal{I}_D$ is determined by analysing the data from
    actual performances. The basic idea is that a slowing indicates a group
    boundary. This can be done systematically using an algorithm listen
    which takes the data as input and returns a Lisp representation of the
    grouping tsr (Todd, in press).
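The idea that a slowing marks a group boundary can be illustrated with a single-level sketch in Python. This is not the published listen algorithm, which returns a hierarchical grouping; it only finds local peaks in a list of beat durations and cuts the beats into groups at those peaks. The function names, the `threshold` parameter and the example data are assumptions made for illustration.

```python
def boundaries(durations, threshold=0.0):
    """Indices of beats that end a group: local peaks of duration.

    A beat counts as a peak when it is longer than both neighbours by more
    than `threshold`; the final beat always closes the last group.
    """
    ends = []
    for i in range(1, len(durations) - 1):
        left, here, right = durations[i - 1], durations[i], durations[i + 1]
        if here - left > threshold and here - right > threshold:
            ends.append(i)
    ends.append(len(durations) - 1)
    return ends


def group_beats(durations, threshold=0.0):
    """Split beat indices into groups that each end on a boundary."""
    groups, start = [], 0
    for end in boundaries(durations, threshold):
        groups.append(list(range(start, end + 1)))
        start = end + 1
    return groups


# Invented beat durations in seconds, with slowing at beats 3 and 7.
data = [0.50, 0.48, 0.52, 0.70, 0.51, 0.49, 0.55, 0.80]
```

A hierarchical version would need some way of ranking boundaries by the amount of slowing, which this sketch does not attempt.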


    Synthesis: representation → performance

    Having obtained tsr we need to make explicit the procedure $\pi$ for
    mapping representation into the performance. We can represent the
    computational theory at this stage thus:

    $$\mathcal{I}_P \to P_P \to \mathcal{I}_D \to P_D, \qquad \mathcal{I}_A \to P_A \qquad (4)$$

    such that each representation $\mathcal{I}_i$ has its corresponding
    performance $P_i$.

    The performance is modelled using an algorithm play (see Appendix 1).
    The basic heuristic of the algorithm is to look ahead and plan the
    phrasing of a group at a given level, then move down to the next
    sub-group, look ahead and plan, and so on recursively. The planned
    phrasings are superposed onto an output plan (see output, Appendix 1)
    which continuously evolves as the performance unfolds. When a
    surface-group is reached the first element of the output plan is printed
    and discarded, and so on. When the surface-group is completed the program
    backtracks to the next level and so on until all the surface groups are
    played. The output from the program is a list of durations, which could
    easily be adapted to be sent to a synthesiser given a suitable system
    (see Figure 2).
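The look-ahead, superpose and emit cycle described above can be sketched in a few lines of Python. This is an illustration of the heuristic, not the Lisp program of Appendix 1: the grouping is a nested list, the phrasing curve `phrase_shape` is a simplified stand-in for the model's parabolic function, and the default `tempo` and `amplitude` values are invented.

```python
def n_beats(group):
    """Total number of beats under a group."""
    if isinstance(group[0], list):
        return sum(n_beats(sub) for sub in group)
    return len(group)


def phrase_shape(length, amplitude):
    """Hypothetical phrasing curve: a parabola that is zero mid-phrase
    and reaches `amplitude` on the final beat of the phrase."""
    return [amplitude * (2.0 * t / length - 1.0) ** 2
            for t in range(1, length + 1)]


def play(group, plan, out, tempo=0.5, amplitude=0.2):
    """Plan this group, then descend; emit beats at the surface level."""
    length = n_beats(group)
    # Look ahead: make sure the plan covers the whole group, then
    # superpose this level's phrasing onto what is already planned.
    plan.extend([0.0] * (length - len(plan)))
    for i, extra in enumerate(phrase_shape(length, amplitude)):
        plan[i] += extra
    if isinstance(group[0], list):
        for sub in group:
            play(sub, plan, out, tempo, amplitude)
    else:
        for _ in group:
            # Emit and discard the first element of the output plan.
            out.append(tempo + plan.pop(0))
    # Returning from the call is the backtrack to the level above.


a = [3, 1, 2, 1]
tsr = [[a, a], [a, a], [a, a]]
durations = []
play(tsr, [], durations)
```

Because index 0 of the plan always refers to the next beat to be played, each level can add its contribution without knowing how many beats have already been emitted. In this sketch the longest beat is the last one, where the phrasings of all three levels peak together.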

    The precise durations within a phrase are determined by a parabolic
    function PB embedded within the performance procedure. This function
    has the following form:

    $$PB(t, a_i) = a_1 + \frac{a_2}{(1 - a_6)^2} \left\{ \frac{t}{a_3} + \frac{a_4 - 1}{a_5} - a_6 \right\}^2, \qquad t = 1, 2, \ldots, T \qquad (5)$$

    where $t$ is metrical time and $a_i$ is a vector of parameters such that:

    $a_1$ = tempo,
    $a_2$ = amplitude,
    $a_3$ = length of phrase,
    $a_4$ = boundary strength,
    $a_5$ = upper limit of boundary strength,
    $a_6$ = offset of parabola minimum.

    $(1 - a_6)^{-2}$ is a normalisation factor such that if the boundary
    strength $a_4 = 1$ and $t = a_3$ (i.e. at the end of the phrase) then
    $a_2$ represents the true amplitude (Todd, 1985). As for the values of
    the parameters, $a_1$ and $a_2$ are input at the start of the algorithm
    play (see functions start and set_up_vars, Appendix 1); $a_3$ and $a_4$
    are computed for each group as the program runs (see functions plan and
    rubato, Appendix 1); and $a_5$ and $a_6$ are set within the program with
    $a_6 = 0.52$. In Todd (1985) $a_5 = 11$ but in the new model $a_5 = 3$
    because the number of possible boundary strengths is reduced (see
    function set_up_vars, Appendix 1).
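As a check on the normalisation, the function can be written out in Python. The sketch assumes the reading of equation (5) in which the terms $t/a_3$ and $(a_4 - 1)/a_5$ are added before the offset $a_6$ is subtracted, which is the reading under which $a_2$ is the true amplitude when $a_4 = 1$ and $t = a_3$. The function name `pb`, the keyword names and the example tempo and amplitude are assumptions; only $a_5 = 3$ and $a_6 = 0.52$ come from the text.

```python
def pb(t, tempo, amplitude, length, strength, strength_max=3, offset=0.52):
    """Duration of beat t (counted from 1) within a phrase, after (5).

    tempo        a1  base duration
    amplitude    a2  extra duration at the end of a strength-1 phrase
    length       a3  number of beats in the phrase
    strength     a4  boundary strength of the phrase ending
    strength_max a5  upper limit of boundary strength
    offset       a6  position of the parabola minimum
    """
    norm = amplitude / (1.0 - offset) ** 2
    position = t / length + (strength - 1) / strength_max - offset
    return tempo + norm * position ** 2


# Invented example: an 8-beat phrase ending on a boundary of strength 1.
phrase = [pb(t, tempo=0.5, amplitude=0.2, length=8, strength=1)
          for t in range(1, 9)]
```

For this example the last beat comes out at tempo plus amplitude, and the shortest beat is the fourth, just before the middle of the phrase. Raising `strength` shifts the curve so that the end of the phrase slows by more.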



    Evaluation: $P_A \approx P_P \approx P_D$?

    Having generated $P_A$ or $P_D$ we need to compare them with an actual
    performance $P_P$. In Todd (1985) the data and model (i.e. $P_P$ and
    $P_A$) were compared visually using the criteria: (a) the position of
    peaks or points of slowing; (b) the relative heights of the peaks.
    Whilst this method is useful it is unsatisfactory for a number of
    reasons. First, the comparison of relative heights is only qualitative.
    So, obviously a more systematic and quantitative test is required.
    Hierarchical clustering (Johnson, 1967) is such a method which has been
    successfully applied in analysing speech (Grosjean and Gee, 1983) and I
    am currently working on ways of applying this to music performance.
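One simple way of putting numbers on the two visual criteria is sketched below in Python. It is not the hierarchical clustering method mentioned above; it is an assumed, much cruder alternative in which criterion (a) becomes the share of peaks in the data that the model reproduces, and criterion (b) is approximated by the correlation between the two duration profiles. The function names and both example profiles are invented.

```python
from math import sqrt


def peaks(durations):
    """Positions of local maxima, i.e. points of slowing."""
    return [i for i in range(1, len(durations) - 1)
            if durations[i - 1] < durations[i] > durations[i + 1]]


def peak_agreement(data, model):
    """Fraction of peaks in the data that the model also has (0 to 1)."""
    in_data, in_model = set(peaks(data)), set(peaks(model))
    return len(in_data & in_model) / len(in_data) if in_data else 1.0


def correlation(data, model):
    """Pearson correlation of two duration profiles of equal length."""
    n = len(data)
    mean_d, mean_m = sum(data) / n, sum(model) / n
    cov = sum((d - mean_d) * (m - mean_m) for d, m in zip(data, model))
    var_d = sum((d - mean_d) ** 2 for d in data)
    var_m = sum((m - mean_m) ** 2 for m in model)
    return cov / sqrt(var_d * var_m)


# Invented duration profiles, in seconds per beat.
performed = [0.50, 0.48, 0.60, 0.52, 0.49, 0.51, 0.75, 0.70]
modelled = [0.50, 0.49, 0.58, 0.50, 0.48, 0.52, 0.72, 0.71]
```

A high correlation with poor peak agreement, or the reverse, would point to different kinds of mismatch between data and model, which is why the two measures are kept separate here.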

    A second problem lies in the indeterminism of individual performances
    as discussed above. Whilst it is often possible to observe considerable
    across-performer agreement (Shaffer & Todd, 1987) there are also many
    differences. Also, there is no reason why the analyst's interpretation
    $\mathcal{I}_A$ should be the same as the performer's $\mathcal{I}_P$
    since there is no such thing as a single correct grouping. It is for
    these reasons that the input used is the representation derived from the
    data $\mathcal{I}_D$ which is obtained via the algorithm listen. This
    procedure is certainly not intended to give licence to adjust the theory
    post hoc to fit each set of data; on the contrary, the same performance
    mapping play is used in each case. Remember the goal of a theory of
    performance is the principled explanation of performance data. So,
    whilst we cannot predict a performance with any certainty, what we can
    say for each performance is that if the following three assumptions
    produce a good match between $P_P$ and $P_D$ then the assumptions
    constitute a reasonable explanation:

    (a) the performer has used slowing to indicate a grouping boundary;
    (b) the performer's grouping analysis corresponds to tsr;
    (c) the performer's mapping procedure corresponds to play.

    The model then is really an analytical theory of performance rather than
    a prescriptive theory of performance. However, there are no reasons, if
    enough data is amassed, why probability weightings could not be assigned
    to various performances as a function of style and instrument.

    Examples

    Presented in Figures 3 and 4 are two examples of data from actual
    performances compared against the model. The first example is taken
    from a performance of the Adagio from the Haydn Sonata in B-flat which
    was also used in Todd (1985) so that comparison with the old model is
    possible. The second example is taken from two performances of the
    Chopin Prelude in F# minor (Shaffer and Todd, 1987). The data were
    obtained using the method of Shaffer (1981).

    Downl

    oad

    ed

    By:

    [Ingenta

    Content

    Di

    strib

    uti

    on

    Psy

    Press

    Ti

    tl

    es]

    At:08

    :065

    Decemb

    er2009

    76 Neil Todd

    Evaluation PA

    e p e

    P

    D

    ?

    Having generated $P_A$ or $P_D$, we need to compare them with an actual performance $P_P$. In Todd (1985) the data and model (i.e. $P_P$ and $P_A$) were compared visually using two criteria: (a) the position of peaks or points of slowing; (b) the relative heights of the peaks. Whilst this method is useful, it is unsatisfactory for a number of reasons. First, the comparison of relative heights is only qualitative, so a more systematic and quantitative test is required. Hierarchical clustering (Johnson, 1967) is such a method; it has been successfully applied in analysing speech (Grosjean and Gee, 1983), and I am currently working on ways of applying it to music performance.
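As an illustration of what such a quantitative test might look like, the sketch below builds a nested grouping from boundary strengths by always merging across the weakest remaining boundary. It is a simplified, adjacency-constrained variant of agglomerative clustering, not Johnson's scheme itself and not the method the author reports developing; the function name `cluster_by_boundaries` and the use of slowing amounts as boundary strengths are assumptions made for the example.

```python
from typing import List, Union

# A tree is either a leaf (the index of a unit) or a pair of sub-trees.
Tree = Union[int, tuple]


def cluster_by_boundaries(boundaries: List[float]) -> Tree:
    """Adjacency-constrained agglomerative clustering (illustrative only).

    boundaries[i] is the strength of the boundary between unit i and
    unit i + 1, for instance the amount of slowing observed there.
    The weakest remaining boundary is merged first, so the strongest
    boundaries end up nearest the root of the tree.
    """
    nodes: List[Tree] = list(range(len(boundaries) + 1))
    gaps = list(boundaries)
    while gaps:
        # Index of the weakest boundary; the leftmost one wins on ties.
        i = min(range(len(gaps)), key=gaps.__getitem__)
        # Merge the two neighbouring groups separated by that boundary.
        nodes[i:i + 2] = [(nodes[i], nodes[i + 1])]
        del gaps[i]
    return nodes[0]
```

For example, `cluster_by_boundaries([120, 400, 90])` gives `((0, 1), (2, 3))`. Two duration profiles can then be compared by checking whether they yield the same tree, which depends only on the rank order of the boundary strengths rather than on a visual judgement of peak heights.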

    A second problem lies in the indeterminism of individual performances, as discussed above. Whilst it is often possible to observe considerable across-performer agreement (Shaffer and Todd, 1987), there are also many differences. Also, there is no reason why the analyst's interpretation $\mathit{tsr}_A$ should be the same as the performer's $\mathit{tsr}_P$, since there is no such thing as a single correct grouping. It is for these reasons that the input used is the representation derived from the data, $\mathit{tsr}_D$, which is obtained via the algorithm listen.

    This procedure is certainly not intended to give licence to adjust the theory post hoc to fit each set of data; on the contrary, the same performance mapping play is used in each case. Remember that the goal of a theory of performance is the principled explanation of performance data. So, whilst we cannot predict a performance with any certainty, what we can say for each performance is that if the following three assumptions produce a good match between $P_P$ and $P_D$, then the assumptions constitute a reasonable explanation:

    (a) the performer has used slowing to indicate a grouping boundary;
    (b) the performer's grouping analysis corresponds to tsr;
    (c) the performer's mapping procedure corresponds to play.
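The evaluation logic described above can be sketched as a small pipeline: derive a representation from the performance data with listen, regenerate a performance from it with the same play mapping every time, and score the match. Everything concrete here is an assumption made for illustration: the text does not specify a match measure (Pearson correlation is used only as a stand-in), and `toy_listen` and `toy_play` are hypothetical placeholders, not the paper's algorithms.

```python
from math import sqrt
from typing import Callable, List, Sequence

Profile = Sequence[float]  # one duration value per score position


def pearson(x: Profile, y: Profile) -> float:
    """Pearson correlation of two equal-length duration profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)


def evaluate(performance: Profile,
             listen: Callable[[Profile], object],
             play: Callable[[object], Profile]) -> float:
    """Score how well play(listen(performance)) matches the performance."""
    representation = listen(performance)  # representation derived from the data
    generated = play(representation)      # performance regenerated by the model
    return pearson(performance, generated)


# Hypothetical stand-ins, NOT the algorithms defined in the paper.
def toy_listen(performance: Profile) -> List[bool]:
    """Mark a boundary wherever the duration is a strict local maximum."""
    p = list(performance)
    return [0 < i < len(p) - 1 and p[i - 1] < p[i] > p[i + 1]
            for i in range(len(p))]


def toy_play(boundaries: List[bool]) -> List[float]:
    """Lengthen every boundary position by a fixed amount."""
    return [1.5 if b else 1.0 for b in boundaries]
```

Because `play` is passed in once and reused unchanged for every data set, a high score supports the three assumptions jointly rather than reflecting a mapping tuned to each performance.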

    The model, then, is really an analytical theory of performance rather than a prescriptive theory of performance. However, there is no reason, if enough data are amassed, why probability weightings could not be assigned to various performances as a function of style and instrument.

    Examples

    Presented in Figures 3 and 4 are two examples of data from actual performances compared against the model. The first example is taken from a performance of the Adagio from the Haydn Sonata in B-flat, which was also used in Todd (1985), so that comparison with the old model is possible. The second example is taken from two performances of the Chopin Prelude in F# Minor (Shaffer and Todd, 1987). The data were obtained using the method of Shaffer (1981).



    [Figure: Haydn Sonata, B-flat Major. Duration profile; the plot itself is not recoverable from the extraction. Surviving labels: a1, a2, 3200ms, 300ms; axis ticks include 3000, 4000, 5000.]