Cilfe 2003 Fin Short

  • 8/14/2019 Cilfe 2003 Fin Short

    Automatic Identification of Organizational Structure in Writing using Machine Learning

    Laurence Anthony and George V. Lashkia
    Dept. of Computer Science, Faculty of Engineering
    Okayama Univ. of Science, 1-1 Ridai-cho, Okayama
    anthony@ice.ous.ac.jp
    lash[email protected].jp
    http://antpc1.ice.ous.ac.jp

  • Presentation Outline

    • Background
    • Research Aim
    • System Design (Overview)
    • Application to Research Abstracts
    • Results (Accuracy)
    • Results (Effectiveness in the Classroom)
    • Software Demonstration
    • Conclusions

  • Background

    Importance of Text Structure
    • Swales (1981, 1990), Carrell (1982), Hinds (1982, 1983), Hoey (1994), Winter (1994)

    Studies on Text Structure
    • TITLES - Dudley-Evans (1994), Anthony (2001)
    • ABSTRACTS - Ayers (1993), Posteguillo (1996)
    • INTRODUCTIONS - Swales (1990), Anthony (1999)
    • DISCUSSIONS - Hopkins & Dudley-Evans (1988)
    • PATENTS - Bazerman (1994)
    • GRANT PROPOSALS - Connor & Mauranen (1999)
    • LEGAL WRITING - Bhatia (1993)

  • Background

    Problems with Analyzing Text Structure
    • We need a large corpus of text data
      (The text data must ACCURATELY represent what we hope to study)
    • We need a lot of research time
      (We must analyze a lot of texts)
    • We need good validation and reliability tests
      (Because evaluating structure can be very subjective)

    Most Text Structure Studies are Small Scale

  • Background

    Henry et al. (2001)     40 Application Letters
    Tarone et al. (2000)    2 Physics Research Articles
    Connor et al. (1999)    34 Grant Proposals
    Williams (1999)         5 Medical Research Articles
    Anthony (1999)          12 Computer Science Research Article Introductions

  • Research Aim

    Develop a Computer System to Process Texts and Analyze Text Structure Automatically

    A Machine Learning System for text structure
    • Easy to process a large corpus of text data
    • Fast
    • The analytic process would be clearly defined
    • Easy to test the reliability and validity

  • System Design (Overview)

    Machine Learning: Unsupervised? Supervised Learning?

    In Supervised Learning,
    • Give the system a structural model (set of classes)
    • Give the system examples of the model
    • Tell the system what features in the examples are important
    • Define a relation between the classes and the features
    • Classify new text examples by comparing their features with those in each class

  • System Design (Overview)

    Problems
    • We need a good model of structure
      But there are many models of structure in the literature
    • We need a set of labeled examples
      But many systems work well with only a few labeled examples
    • We need a good set of features
      But language contains a LOT of noise words!
      (e.g. a, the, of, in, at, but?, though?, ...)
      Building a list of features by hand is infeasible
    • We need a good relation between the classes and the features

  • Application of System to Research Abstracts

    Give the system a structure model:
    Modified CARS Model (Swales, 1990; Anthony, 1999)

    Move 1  Establishing a Territory
      1.1   Claiming centrality
      1.2   Making topic generalizations
      1.3   Reviewing items of previous research

    Move 2  Establishing a niche
      2.1A  Counter-claiming
      2.1B  Indicating a gap
      2.1C  Question raising
      2.1D  Continuing a tradition

    Move 3  Occupying the niche
      3.1A  Outlining purpose
      3.1B  Announcing present research
      3.2   Announcing principal findings
      3.3   Evaluation of research
      3.4   Indicating RA structure

  • Application of System to Research Abstracts

    Give the system examples of the model
    • 100 Abstracts (IEEE Trans. on PDS) divided into 692 labeled Step Units (only examples from 6 classes)
    • 554 Step Units (80%) used for training the system
    • 138 Step Units (20%) used for testing the system

    Tell the system what features to look at
    • All word clusters (chunks) up to 5 words long
    • Position of step unit in abstract (i.e. 1st line, 2nd line, ...)

    (Reduce Noise in Features)
    • Automatically rank words by importance using: raw frequency, Information Gain
    • Use only high ranked words
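    The 80/20 split and the "use only high ranked words" step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the random shuffle, the function names, and the placeholder data are all assumptions, and only raw-frequency ranking is shown.

```python
import random
from collections import Counter

def split_units(units, train_fraction=0.8, seed=0):
    """Shuffle labeled step units and split them into training and test sets."""
    shuffled = list(units)
    random.Random(seed).shuffle(shuffled)  # random split is an assumption
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

def top_features_by_frequency(train_units, k):
    """Rank features by raw frequency in the training set and keep the top k."""
    counts = Counter()
    for features, _label in train_units:
        counts.update(features)
    return [feature for feature, _ in counts.most_common(k)]

# Placeholder data: 692 step units, each a (feature list, label) pair.
units = [(["in", "this", "paper"], "3.1B")] * 692
train, test = split_units(units)
print(len(train), len(test))  # 554 138
```

    With 692 units, rounding 80% gives the 554 training and 138 test units quoted on the slide.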

  • Application of System to Research Abstracts

    In this paper, we propose a new system.

    1 word chunks
      in / this / paper / we / propose / a / new / system

    2 word chunks
      in this / this paper / paper we / we propose / propose a / a new / new system

    3 word chunks
      in this paper / this paper we / paper we propose / we propose a / propose a new / a new system
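    The chunking shown above can be sketched in a few lines. The tokenization (lowercasing, dropping punctuation) and the function name `word_chunks` are assumptions made for illustration; joining words with an underscore follows feature names such as this_paper that appear elsewhere in the slides.

```python
import re

def word_chunks(sentence, max_len=5):
    """Return all contiguous word clusters of 1..max_len words, joined with '_'."""
    words = re.findall(r"[a-z0-9]+", sentence.lower())  # assumed tokenization
    chunks = []
    for n in range(1, max_len + 1):
        for start in range(len(words) - n + 1):
            chunks.append("_".join(words[start:start + n]))
    return chunks

chunks = word_chunks("In this paper, we propose a new system.", max_len=3)
print(chunks[:8])    # the eight 1-word chunks
print(chunks[8:15])  # the seven 2-word chunks
print(chunks[15:])   # the six 3-word chunks
```

    Calling `word_chunks(sentence)` with the default `max_len=5` would give the "up to 5 words long" clusters the system uses as features.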

  • Information Gain (IG)

    \[ \mathrm{Entropy}(D) = \sum_{j=1}^{c} -p_j \log_2 p_j \]

    where p_j is the proportion of data (D) in a class j from the set of classes C.

    \[ \mathrm{Gain}(D, w) = \mathrm{Entropy}(D) - \sum_{v \in \mathrm{Values}(w)} \frac{|D_v|}{|D|} \, \mathrm{Entropy}(D_v) \]

    where Values(w) is the set of all possible values for word w, and D_v is the subset of D for which word w has a value v.
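    The two formulas can be checked on a small example. The sketch below assumes that a word has only two values, present or absent in a step unit; the toy data and function names are hypothetical and are not taken from the authors' corpus.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy(D) = sum over classes of -p_j * log2(p_j)."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(units, word):
    """Gain(D, w), assuming Values(w) = {present, absent}."""
    labels = [label for _, label in units]
    gain = entropy(labels)
    for present in (True, False):
        subset = [label for features, label in units if (word in features) == present]
        if subset:  # skip empty D_v
            gain -= len(subset) / len(units) * entropy(subset)
    return gain

# Hypothetical step units: (set of features, step label).
toy = [
    ({"however", "is"}, "2.1B"),
    ({"however", "the"}, "2.1B"),
    ({"the", "is"}, "3.1B"),
    ({"the", "paper"}, "3.1B"),
]
print(information_gain(toy, "however"))  # 1.0
print(information_gain(toy, "the"))
```

    In this toy set "however" separates the two classes perfectly and so scores higher than the frequent word "the", which is the effect the ranking is meant to capture.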

  • Information Gain (IG)

    Rank   Raw Frequency   Information Gain (IG)
    1      the             however
    2      a               2_however
    3      to              difficult_to
    4      in              is_often
    5      of              transmitting
    6      is              often
    7      and             not
    8      1               difficult
    9      2               task_migration
    10     3               Process

  • Application of System to Research Abstracts

    Define a relation between features and classes
    • Use the probability of each class and the probability of features (clusters) being in each class (NAIVE BAYES Classifier)

    Class 1 (Claiming Centrality)
    Class 2 (Making topic generalizations)
    Class 3 (Indicating a gap)
    Class 4 (Outlining purpose)
    Class 5 (Announcing principal findings)
    Class 6 (Evaluation of research)

    Class 1:  Class 1 Prob.  Feat. 1 prob.  Feat. 2 prob.  Feat. 3 prob.
    Class 2:  Class 2 Prob.  Feat. 1 prob.  Feat. 2 prob.  Feat. 3 prob.
    Class 3:  Class 3 Prob.  Feat. 1 prob.  Feat. 2 prob.  Feat. 3 prob.
    Class 4:  Class 4 Prob.  Feat. 1 prob.  Feat. 2 prob.  Feat. 3 prob.
    Class 5:  Class 5 Prob.  Feat. 1 prob.  Feat. 2 prob.  Feat. 3 prob.
    Class 6:  Class 6 Prob.  Feat. 1 prob.  Feat. 2 prob.  Feat. 3 prob.
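    A table of class and feature probabilities like the one above can be estimated from labeled step units roughly as follows. This is a sketch of a generic Naive Bayes classifier, not the authors' code: the add-one smoothing, the use of natural logarithms, the function names, and the toy training data are all assumptions.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(units, vocabulary):
    """Estimate log P(class) and log P(feature | class) with add-one smoothing."""
    class_counts = Counter(label for _, label in units)
    feature_counts = defaultdict(Counter)
    for features, label in units:
        feature_counts[label].update(f for f in features if f in vocabulary)
    log_prior = {c: math.log(n / len(units)) for c, n in class_counts.items()}
    log_likelihood = {}
    for c in class_counts:
        total = sum(feature_counts[c].values()) + len(vocabulary)
        log_likelihood[c] = {
            f: math.log((feature_counts[c][f] + 1) / total) for f in vocabulary
        }
    return log_prior, log_likelihood

def classify(features, log_prior, log_likelihood):
    """Return the class with the highest summed log probability, plus all scores."""
    scores = {
        c: log_prior[c]
        + sum(log_likelihood[c][f] for f in features if f in log_likelihood[c])
        for c in log_prior
    }
    return max(scores, key=scores.get), scores

# Hypothetical training data: (features, step label).
training = [
    (["this_paper", "paper", "is"], "3.1B"),
    (["this_paper", "we", "propose"], "3.1B"),
    (["however", "is", "difficult"], "2.1B"),
]
vocab = {f for features, _ in training for f in features}
prior, likelihood = train_naive_bayes(training, vocab)
label, scores = classify(["this_paper", "is"], prior, likelihood)
print(label)  # 3.1B
```

    Summing log probabilities rather than multiplying raw probabilities avoids numerical underflow when a step unit contains many features.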

  • Application of System to Research Abstracts

    Classify the structure of new text examples
    • Choose the most probable class containing the features in each step unit.

    2 this paper is an effort in the same direction
    (Step 3.1B - Announcing Present Research)

    Features Contained in Training Data
      paper (c3), this_paper (c4), is (c14), this (c18), the (c39), 2 (c103), is_an (c364), in (c571)

    Step 1.1 Prob.  = -2.9498 + -7.0449 + -7.0449 + -4.3368 + ... + -4.4058 = -48.7690
    Step 1.2 Prob.  = -1.8398 + -7.4899 + -7.4899 + -3.8523 + ... + -3.8790 = -45.5972
    Step 2.1B Prob. = -3.1391 + -6.9157 + -6.9157 + -4.3507 + ... + -4.2076 = -47.0826
    Step 3.1B Prob. = -1.3335 + -4.1566 + -4.2436 + -4.8497 + ... + -3.9169 = -39.0836
    Step 3.2 Prob.  = -1.8398 + -6.3677 + -6.3677 + -3.6936 + ... + -3.7837 = -40.8448
    Step 3.3 Prob.  = -1.5809 + -6.6178 + -6.6178 + -3.7846 + ... + -4.0528 = -43.2638

    Most Probable Step = h_step3.1B = -39.0836
    (Decision is Step 3.1B Announcing Present Research)
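    The final decision is simply the step with the largest (least negative) total. The sketch below uses the six totals quoted on the slide; the dictionary layout and variable names are illustrative assumptions.

```python
# Summed log probabilities for the example step unit, as quoted on the slide.
scores = {
    "Step 1.1": -48.7690,
    "Step 1.2": -45.5972,
    "Step 2.1B": -47.0826,
    "Step 3.1B": -39.0836,
    "Step 3.2": -40.8448,
    "Step 3.3": -43.2638,
}

# The most probable step has the highest total.
best = max(scores, key=scores.get)
ranking = sorted(scores, key=scores.get, reverse=True)

print(best, scores[best])  # Step 3.1B -39.0836
print(ranking)
```

    Sorting the totals also shows how close the runner-up is: here Step 3.2 comes second, followed by Step 3.3.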

  • Results (Classification Accuracy)

    Classification Accuracy (Overall)
    • 554 Step Units used for training the system (80% of entire data)
    • 138 Step Units used for testing the system (20% of entire data)

    No. of Features   Accuracy (Raw Frequency)   Accuracy (Information Gain)
    2208 (all)        56%                        -
    1000              51%                        70%
    700               56%                        70%
    500               59%                        69%
    300               59%                        69%
    100               54%                        -

    Note: Random guessing has an accuracy of 16.66% (NOT 50%!)
    Choosing the most common class = 26%

  • Results (Classification Accuracy)

    Classification Accuracy (Each Step Unit)
    • Number of features = 700
    • Ranked by Information Gain measure
    • Accuracy (overall) = 70%

    Class       Step 1.1   Step 1.2   Step 2.1b  Step 3.1b  Step 3.2   Step 3.3
    Step 1.1    2 (43%)    4          0          0          1          0
    Step 1.2    0          17 (77%)   0          0          4          1
    Step 2.1b   0          2          1 (17%)    0          2          1
    Step 3.1b   0          0          0          34 (92%)   3          0
    Step 3.2    0          2          0          2          25 (66%)   9
    Step 3.3    0          1          0          2          8          17 (61%)

    Note: Classifications correspond with CARS Model moves
    (Accuracy = 88% when using second opinion)
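    The overall figure can be recomputed from the table. The sketch assumes that rows are the true step and columns the step chosen by the system, which the slide does not state, and it uses the counts exactly as they appear in this copy of the table. The Step 1.1 row as it appears here gives 2 of 7 rather than the printed 43%, so that cell may not have survived extraction intact.

```python
labels = ["1.1", "1.2", "2.1b", "3.1b", "3.2", "3.3"]

# Counts copied from the table; rows assumed true step, columns assumed predicted step.
confusion = [
    [2, 4, 0, 0, 1, 0],
    [0, 17, 0, 0, 4, 1],
    [0, 2, 1, 0, 2, 1],
    [0, 0, 0, 34, 3, 0],
    [0, 2, 0, 2, 25, 9],
    [0, 1, 0, 2, 8, 17],
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(labels)))
overall = correct / total

# Per-class accuracy: diagonal cell divided by the row total.
per_class = {
    label: row[i] / sum(row)
    for i, (label, row) in enumerate(zip(labels, confusion))
}

print(total, correct, round(100 * overall))  # 138 96 70
```

    The row totals add up to the 138 test units, and 96 correct decisions out of 138 rounds to the 70% overall accuracy reported.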

  • Results (In the classroom)

    A Windows Interface
    • To enable researchers, teachers and students to use the system it needs to be easily accessible via a Windows interface
    • A Windows system has been built using the programming language PERL 5.6 and PERL/Tk

  • Results (In the classroom)

    Materials Selection by Non-Native Teacher

    Selection of 7 texts from 10 text corpus   By hand    Using System
    Time to complete tasks                     100 min.   28 min. (1 min. for analysis plus time to check results)
    Errors                                     2/7        1/7

    Comments
    • The decisions are fast.
    • It is simple and easy to complete the task.
    • I rely too much on the software and stop feeling like doing the analysis myself.

  • Results (In the classroom)

    Text Analysis by Non-Native Student

    Selection of 4 texts from 10 text corpus   By hand    Using System
    Time to complete tasks                     38 min.    15 min. (1 min. for analysis plus time to check results)
    Errors                                     2/4        0/4

    Comments
    • It's very fast.
    • The structure is now very clear.
    • The system has clearly analyzed the structure, what you should do is correct only the part that is strange. So the work is little.

  • Conclusions

    • A computer system was developed to analyze text structure
    • Learning method: Supervised Learning
    • Accuracy 70% (88% when using second opinion)
    • System errors corresponded with CARS Model moves
    • Effective in the classroom for use by teachers and students
    • Runs in Windows environment