  • logoP7

    MPRI

    Theory and practice of XML processing programming languages

    Giuseppe Castagna

    CNRS, Université Paris 7 - Denis Diderot

    MPRI Lectures on Theory of Subtyping

    G. Castagna: Theory and practice of XML processing languages 1/110

  • Outline of the lecture

    1 XML Programming in CDuce
      - XML Regular Expression Types and Patterns
      - XML Programming in CDuce
      - Tools on top of CDuce

    2 Theoretical Foundations
      - Semantic subtyping
      - Subtyping algorithms
      - CDuce functional core

    3 Polymorphic Subtyping
      - Current status
      - Semantic solution
      - Subtyping algorithm

    4 Polymorphic Language
      - Motivating example
      - Formal setting
      - Explicit substitutions
      - Inference System
      - Efficient implementation

  • logoP7

    1. Introduction 2. Regexp types/patterns 3. XML Programming in CDuce 3. Properties 4. Toolkit MPRI

    PART 1: XML PROGRAMMING IN CDUCE

    Part 1: XML Programming in CDuce G. Castagna: Theory and practice of XML processing languages 3/110

  • Programming with XML

    Level 0: textual representation of XML documents
      - AWK, sed, Perl

    Level 1: abstract view provided by a parser
      - SAX, DOM, ...

    Level 2: untyped XML-specific languages
      - XSLT, XPath

    Level 3: XML types taken seriously (aka: related work)
      - XDuce, Xtatic
      - XQuery
      - Cω (Microsoft)
      - ...

  • Presentation of CDuce

    Features:
      - Oriented to XML processing
      - Type centric
      - General-purpose features
      - Very efficient

    Intended use:
      - Small adapters between different XML applications
      - Larger applications that use XML
      - Web development
      - Web services

    Status:
      - Public release available (0.5.3) in all major Linux distributions.

    Integration with standards:
      - Internally: Unicode, XML, Namespaces, XML Schema
      - Externally: DTD, WSDL

    Some tools: graphical queries, code embedding (à la PHP)

    Used both for teaching and in production code.

  • Types, Types, Types!!!

    Types are pervasive in CDuce:

    Static validation
      - E.g.: does the transformation produce valid XHTML?

    Type-driven programming semantics
      - At the basis of the definition of patterns
      - Dynamic dispatch
      - Overloaded functions

    Type-driven compilation
      - Optimizations made possible by static types
      - Avoids unnecessary and redundant tests at runtime
      - Allows a more declarative style

  • Regular Expression Types and Patterns for XML

  • Types & patterns: the functional languages perspective

    Types are sets of values

    Values are decomposed by patterns

    Patterns are roughly values with capture variables

    Instead of

        let x = fst(e) in
        let y = snd(e) in (y,x)

    with patterns one can write

        let (x,y) = e in (y,x)

    which is syntactic sugar for

        match e with (x,y) -> (y,x)

    match is more interesting than let, since it can test several |-separated patterns.

  • Example: tail-recursive version of length for lists:

        type List = (Any,List) | nil

        fun length (x:(List,Int)) : Int =
          match x with
          | (nil , n) -> n
          | ((_,t), n) -> length(t,n+1)

    So patterns are values with capture variables, wildcards, constants.

    But if we:

    1 use for types the same constructors as for values (e.g. (s,t) instead of s × t);

    2 use values to denote singleton types (e.g. nil in the list type);

    3 consider the wildcard as a synonym of Any

    Key idea behind regular patterns:

    Patterns are types with capture variables

    Define types: patterns come for free.
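One way to make "patterns are types with capture variables" concrete is to encode a pattern as a function that either rejects a value or returns the bindings of its capture variables. The following is a minimal Python sketch, not CDuce's implementation: the combinators `any_`, `const`, `var` and `pair` are hypothetical names, lists are encoded as nested pairs ending in the string `"nil"`, and the tail call is written as a loop because Python does not eliminate tail calls.

```python
def any_():
    # Wildcard: behaves like the type Any and captures nothing.
    return lambda v: {}


def const(c):
    # A constant used as a singleton type: accepts only the value c.
    return lambda v: {} if v == c else None


def var(name):
    # Capture variable: accepts any value and binds it to name.
    return lambda v: {name: v}


def pair(p, q):
    # Product pattern, built with the same shape as pair values.
    def run(v):
        if not (isinstance(v, tuple) and len(v) == 2):
            return None
        left, right = p(v[0]), q(v[1])
        if left is None or right is None:
            return None
        return {**left, **right}
    return run


def length(x):
    # x plays the role of a (List, Int) value: (list, accumulator).
    base = pair(const("nil"), var("n"))             # (nil, n)
    step = pair(pair(any_(), var("t")), var("n"))   # ((_, t), n)
    while True:
        env = base(x)
        if env is not None:
            return env["n"]
        env = step(x)
        if env is None:
            raise TypeError("expected a (List, Int) value")
        x = (env["t"], env["n"] + 1)                # length(t, n+1)
```

For instance, `length(((1, (2, (3, "nil"))), 0))` walks the three cells and returns the accumulator. Dropping the `var` nodes from `base` and `step` leaves plain membership tests, which is the sense in which each pattern carries a type.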

  • logoP7

    1. Introduction 2. Regexp types/patterns 3. XML Programming in CDuce 3. Properties 4. Toolkit MPRI

    Which types should we start from?

    Patterns are tightly connected to boolean type constructors,that is unions (|), intersections (&) and differences (\\\):

    Boolean operators are needed to type pattern matching:match e with p1 -> e1 | p2 -> e2

    - To infer the type t1 of e1 we need t& *** p1+++ (where e : t);- To infer the type t2 of e2 we need (t \ *** p1+++)& *** p2+++;- The type of the match is t1|t2 .

    Boolean type constructors are useful for programming:

    map catalogue withx :: (Car & (Guaranteed|(Any\Used)) -> x

    Select in catalogue all cars that if used then are guaranteed.

    t = {v | v value of type t} and ***p+++ = {v | v matches pattern p}

    Part 1: XML Programming in CDuce G. Castagna: Theory and practice of XML processing languages 10/110

  • logoP7

    1. Introduction 2. Regexp types/patterns 3. XML Programming in CDuce 3. Properties 4. Toolkit MPRI

    Which types should we start from?

    Patterns are tightly connected to boolean type constructors,that is unions (|), intersections (&) and differences (\\\):

    Boolean operators are needed to type pattern matching:match e with p1 -> e1 | p2 -> e2

    - To infer the type t1 of e1 we need t& *** p1+++ (where e : t);- To infer the type t2 of e2 we need (t \ *** p1+++)& *** p2+++;- The type of the match is t1|t2 .

    Boolean type constructors are useful for programming:

    map catalogue withx :: (Car & (Guaranteed|(Any\Used)) -> x

    Select in catalogue all cars that if used then are guaranteed.

    t = {v | v value of type t} and ***p+++ = {v | v matches pattern p}

    Part 1: XML Programming in CDuce G. Castagna: Theory and practice of XML processing languages 10/110

  • logoP7

    1. Introduction 2. Regexp types/patterns 3. XML Programming in CDuce 3. Properties 4. Toolkit MPRI

    Which types should we start from?

    Patterns are tightly connected to boolean type constructors,that is unions (|), intersections (&) and differences (\\\):

    Boolean operators are needed to type pattern matching:match e with p1 -> e1 | p2 -> e2

    - To infer the type t1 of e1 we need t& *** p1+++ (where e : t);- To infer the type t2 of e2 we need (t \ *** p1+++)& *** p2+++;- The type of the match is t1|t2 .

    Boolean type constructors are useful for programming:

    map catalogue withx :: (Car & (Guaranteed|(Any\Used)) -> x

    Select in catalogue all cars that if used then are guaranteed.

    t = {v | v value of type t} and ***p+++ = {v | v matches pattern p}

    Part 1: XML Programming in CDuce G. Castagna: Theory and practice of XML processing languages 10/110

  • Which types should we start from?

    Patterns are tightly connected to boolean type constructors, that is, unions (|), intersections (&) and differences (\).

    Boolean operators are needed to type pattern matching:

        match e with p1 -> e1 | p2 -> e2

    - To infer the type t1 of e1 we need t & ⟅p1⟆ (where e : t);
    - To infer the type t2 of e2 we need (t \ ⟅p1⟆) & ⟅p2⟆;
    - The type of the match is t1 | t2.

    Here a type t is read as the set {v | v is a value of type t}, and ⟅p⟆ = {v | v matches pattern p}.

    Boolean type constructors are useful for programming:

        map catalogue with
          x :: (Car & (Guaranteed | (Any \ Used))) -> x

    This selects from the catalogue all cars that, if used, are guaranteed.

    Roadmap to extend it to XML:

    1. Define types for XML documents,
    2. Add boolean type constructors,
    3. Define patterns as types with capture variables.
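    The set-theoretic reading of the typing rules above can be sketched in Python. This is only an illustration under strong assumptions: types and accepted-pattern sets are modelled as small finite sets of values, and the names `branch_input_types` and `selected`, as well as the dictionary fields used for cars, are hypothetical and do not come from CDuce.

```python
# Toy model: a type is a finite set of values, and a pattern p is
# represented by the set of values it accepts. This is not CDuce's
# subtyping algorithm, only the set operations the slide refers to.

def branch_input_types(t, accepted):
    """For `match e with p1 -> e1 | ... | pn -> en` with e : t, return
    the set of values that can reach each branch, plus the values that
    no pattern accepts."""
    remaining = set(t)
    inputs = []
    for acc in accepted:
        inputs.append(remaining & acc)  # t & [p1], then (t \ [p1]) & [p2], ...
        remaining -= acc                # values taken by earlier patterns
    return inputs, remaining            # non-empty remainder: not exhaustive


t = set(range(10))                            # e : t
p1 = {v for v in range(100) if v % 2 == 0}    # values accepted by p1
p2 = {v for v in range(100) if v < 5}         # values accepted by p2
inputs, unmatched = branch_input_types(t, [p1, p2])


def selected(item):
    """Car & (Guaranteed | (Any \\ Used)): a car that is guaranteed
    or is not used. The field names are assumptions."""
    return item["car"] and (item["guaranteed"] or not item["used"])


catalogue = [
    {"car": True, "used": True, "guaranteed": True},
    {"car": True, "used": True, "guaranteed": False},
    {"car": True, "used": False, "guaranteed": False},
    {"car": False, "used": False, "guaranteed": True},
]
kept = [item for item in catalogue if selected(item)]
```

    In this model the second branch only ever sees values that the first pattern rejected, which is why the difference operator is needed to type it precisely.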

  • XML syntax

    A sample document and its type:

        type Bib = <bib>[Book Book]

        <bib>[
          <book year="...">[
            <title>['Object-Oriented Programming']
            <author>[
              <last>['Castagna']
              <first>['Giuseppe']
            ]
            <price>['56']
            'Bikhauser'
          ]
          <book year="...">[
            <title>['Regexp Types for XML']
            <editor>[
              <last>['Hosoya']
              <first>['Haruo']
            ]
            'UoT'
          ]
        ]

    Replacing each piece of text by its type, with String = [PCDATA] = [Char*], gives the shape of the two books:

        <bib>[
          <book year=String>[
            <title>[PCDATA]
            <author>[
              <last>[PCDATA]
              <first>[PCDATA]
            ]
            <price>[PCDATA]
            PCDATA
          ]
          <book year=String>[
            <title>[PCDATA]
            <editor>[
              <last>[PCDATA]
              <first>[PCDATA]
            ]
            PCDATA
          ]
        ]

  • XML syntax

    The type Bib = <bib>[Book Book] generalizes to any number of books:

        type Bib    = <bib>[Book*]               (Kleene star)
        type Book   = <book year=String>[        (attribute types)
                        Title                    (nested elements)
                        (Author+ | Editor+)      (unions)
                        Price?                   (optional elements)
                        PCDATA]                  (mixed content)

        type Author = <author>[Last First]
        type Editor = <editor>[Last First]
        type Title  = <title>[PCDATA]
        type Last   = <last>[PCDATA]
        type First  = <first>[PCDATA]
        type Price  = <price>[PCDATA]

    In addition to these constructions, the type language has singletons, intersections, differences, Empty, and Any.
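    The content model of Book is a regular expression over element kinds, so membership can be illustrated with an ordinary regex. The sketch below is a Python approximation, not CDuce: each child is encoded as one letter, and the names `CODES`, `BOOK_CONTENT` and `is_book_content` are hypothetical. It assumes that PCDATA, being Char*, may be empty, so the trailing text is optional.

```python
import re

# One letter per kind of child, so that a regex over strings can stand
# in for the regular expression type
#     Title (Author+ | Editor+) Price? PCDATA
CODES = {"title": "T", "author": "A", "editor": "E", "price": "P", "text": "X"}

# PCDATA is Char*, hence possibly empty: the text child is optional here.
BOOK_CONTENT = re.compile(r"T(A+|E+)P?X?")


def is_book_content(children):
    """children: list of child kinds, e.g. ["title", "author", "text"]."""
    # Unknown kinds map to "?" which never matches the content model.
    word = "".join(CODES.get(kind, "?") for kind in children)
    return BOOK_CONTENT.fullmatch(word) is not None
```

    For instance, `is_book_content(["title", "author", "editor", "text"])` is false, because the union (Author+ | Editor+) does not allow authors and editors to be mixed.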

  • Patterns

    Patterns = Types + Capture variables

    TYPES

        type Bib = <bib>[Book*]
        type Book = <book year=String>[Title Author+ Publisher]
        type Publisher = String

    PATTERNS

        match bibs with
          <bib>[x::Book*] -> x

    The pattern binds x to the sequence of all books in the bibliography. The match returns the content of bibs.

        match bibs with
          <bib>[(x::<book year="...">_ | y::_)*] -> x@y

    The pattern binds x to the sequence of all this year's books, and y to all the other books. The match returns the concatenation (i.e., @) of the two captured sequences.

        match bibs with
          <bib>[(x::<book year="1990">[ _* Publisher\"ACM" ] | _)*] -> x

    The pattern binds x to the sequence of books published in 1990 by publishers other than ACM and discards all the others. The match returns all the captured books.

    Exact type inference:

    E.g.: if we match the pattern [(x::Int | _)*] against an expression of type [Int* String Int], the type deduced for x is [Int+].
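    To see why [Int+] rather than [Int*] is the exact type in the example above, the capture can be simulated on a few sample values. This is a minimal Python sketch, not CDuce's inference algorithm. The helper `capture_ints` is a hypothetical model of the pattern [(x::Int | _)*], and the samples are hand-built values of type [Int* String Int].

```python
def capture_ints(seq):
    """Model of [(x::Int | _)*]: x collects every Int element of the
    sequence, in order, and everything else is discarded."""
    return [v for v in seq if isinstance(v, int) and not isinstance(v, bool)]


# Sample values of type [Int* String Int]:
# n integers, then one string, then one integer.
samples = [list(range(n)) + ["s", 42] for n in range(4)]
captured = [capture_ints(s) for s in samples]
```

    Every sample ends with an Int, so the captured sequence is never empty, which is what the type [Int+] records.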

  • XML-programming in CDuce

  • Functions: basic usage

        type Program = <program>[ Day* ]
        type Day = <day>[ Invited? Talk+ ]
        type Invited = <invited>[ Title Author+ ]
        type Talk = <talk>[ Title Author+ ]

    Extract subsequences (union polymorphism):

        fun (Invited|Talk -> [Author+])
          <_>[ Title x::Author* ] -> x

    Extract subsequences of non-consecutive elements:

        fun ([(Invited|Talk|Event)*] -> ([Invited*], [Talk*]))
          [ (i::Invited | t::Talk | _)* ] -> (i,t)

    Perl-like string processing (String = [Char*]):

        fun parse_email (String -> (String,String))
          | [ local::_* '@' domain::_* ] -> (local,domain)
          | _ -> raise "Invalid email address"
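    For readers more familiar with mainstream languages, the last two functions can be approximated in Python. This is a sketch under assumptions, not a translation: elements are modelled as hypothetical (tag, payload) pairs, and `parse_email` splits at the last '@', which assumes a greedy reading of `local::_*` that the slide does not state.

```python
def split_events(events):
    """Model of [ (i::Invited | t::Talk | _)* ] -> (i, t).
    Each event is a (tag, payload) pair; other tags are discarded."""
    invited = [e for e in events if e[0] == "invited"]
    talks = [e for e in events if e[0] == "talk"]
    return invited, talks


def parse_email(s):
    """Model of [ local::_* '@' domain::_* ] -> (local, domain).
    Assumption: the split happens at the last '@'."""
    local, at, domain = s.rpartition("@")
    if not at:  # no '@' at all: the second branch of the slide applies
        raise ValueError("Invalid email address")
    return local, domain


events = [("talk", "T1"), ("break", "coffee"), ("invited", "I1"), ("talk", "T2")]
invited, talks = split_events(events)
```

    Unlike the Python version, the CDuce function carries its result types in its interface: the two captured sequences are statically known to have types [Invited*] and [Talk*].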


  • Functions: advanced usage

        type Program = <program>[ Day* ]
        type Day = <day>[ Invited? Talk+ ]
        type Invited = <invited>[ Title Author+ ]
        type Talk = <talk>[ Title Author+ ]

    Functions can be higher-order and overloaded:

        let patch_program
          (p :[Program], f :(Invited -> Invited) & (Talk -> Talk)) :[Program]
          = xtransform p with (Invited | Talk) & x -> [ (f x) ]

    Higher-order functions, overloading and subtyping provide name and code sharing:

        let first_author ([Program] -> [Program];
                          Invited -> Invited;
                          Talk -> Talk)
          | [ Program ] & p -> patch_program (p,first_author)
          | <invited>[ t a _* ] -> <invited>[ t a ]
          | <talk>[ t a _* ] -> <talk>[ t a ]

    Even more compact: replace the last two branches with:

        <(k)>[ t a _* ] -> <(k)>[ t a ]
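    The interplay of the two functions can be imitated in Python, with the caveat that Python checks none of the types involved. In this sketch elements are hypothetical (tag, children) pairs and text nodes are plain strings. `patch_program` is only a rough analogue of xtransform, and the single `first_author` body plays the role of the compact branch that is generic in the tag.

```python
def first_author(elem):
    """Keep the title and the first author of an invited talk or a talk,
    whatever the tag is (analogue of the compact, tag-generic branch)."""
    tag, children = elem
    title, first, *_ = children      # pattern [ t a _* ]
    return (tag, [title, first])


def patch_program(elem, f):
    """Rough analogue of xtransform: apply f to every invited/talk
    element, recurse into other elements, leave text untouched."""
    if isinstance(elem, str):
        return elem
    tag, children = elem
    if tag in ("invited", "talk"):
        return f(elem)
    return (tag, [patch_program(c, f) for c in children])


program = ("program", [
    ("day", [
        ("invited", [("title", ["T1"]), ("author", ["A"]), ("author", ["B"])]),
        ("talk", [("title", ["T2"]), ("author", ["C"])]),
    ]),
])
patched = patch_program(program, first_author)
```

    In CDuce the argument f must have the intersection type (Invited -> Invited) & (Talk -> Talk), so the type checker guarantees that the patched value is still a Program. The Python version relies on convention for the same property.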

  • logoP7

    1. Introduction 2. Regexp types/patterns 3. XML Programming in CDuce 3. Properties 4. Toolkit MPRI

    Functions: advanced usage

    type Program = [ Day* ]type Day = [ Invited? Talk+ ]type Invited = [ Title Author+ ]type Talk = [ Title Author+ ]

    Functions can be higher-order and overloaded

    let patch program(p :[Program], f :(Invited -> Invited) &&& (Talk -> Talk)):[Program]

    = xtransform p with (Invited | Talk) & x -> [ (f x) ]

    Higher-order, overloading, subtyping provide name/code sharing...

    let first author ([Program] -> [Program];Invited -> Invited;Talk -> Talk)

    | [ Program ] & p -> patch program (p,first author)| [ t a * ] -> [ t a ]| [ t a * ] -> [ t a ]

    Even more compact: replace the last two branches with:

    [ t a * ] -> [ t a ]

    Part 1: XML Programming in CDuce G. Castagna: Theory and practice of XML processing languages 15/110

  • logoP7

    1. Introduction 2. Regexp types/patterns 3. XML Programming in CDuce 3. Properties 4. Toolkit MPRI

    Functions: advanced usage

    type Program = [ Day* ]type Day = [ Invited? Talk+ ]type Invited = [ Title Author+ ]type Talk = [ Title Author+ ]

    Functions can be higher-order and overloaded

    let patch program(p :[Program], f :(Invited -> Invited) &&& (Talk -> Talk)):[Program]

    = xtransform p with (Invited | Talk) & x -> [ (f x) ]

    Higher-order, overloading, subtyping provide name/code sharing...

    let first author ([Program] -> [Program];Invited -> Invited;Talk -> Talk)

    | [ Program ] & p -> patch program (p,first author)| [ t a * ] -> [ t a ]| [ t a * ] -> [ t a ]

    Even more compact: replace the last two branches with:

    [ t a * ] -> [ t a ]

    Part 1: XML Programming in CDuce G. Castagna: Theory and practice of XML processing languages 15/110


  • logoP7

    1. Introduction 2. Regexp types/patterns 3. XML Programming in CDuce 3. Properties 4. Toolkit MPRI

    . . . it is all syntactic sugar!

    Types

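    $t ::= \mathtt{Int} \mid v \mid (t, t) \mid t \to t \mid t \vee t \mid t \wedge t \mid \neg t \mid \mathtt{Any}$

    Patterns

    $p ::= t \mid x \mid (p, p) \mid p \vee p \mid p \wedge p$

    Example:

    type Book = <book>[Title (Author+|Editor+) Price?]

    encoded as

    $\mathit{Book} = (\mathtt{book}, (\mathit{Title}, X \vee Y))$
    $X = (\mathit{Author}, X \vee (\mathit{Price}, \mathtt{nil}) \vee \mathtt{nil})$
    $Y = (\mathit{Editor}, Y \vee (\mathit{Price}, \mathtt{nil}) \vee \mathtt{nil})$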
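The encoding can be checked mechanically. The sketch below is a Python model, not CDuce: it assumes that each element is abbreviated to its type name (a string such as "Author"), that sequences are right-nested pairs ending in "nil", and that the recursive equations for X and Y become recursive predicates. All function names are hypothetical. It then compares the encoded type with the regular expression `Title (Author+|Editor+) Price?` on all short sequences.

```python
import re
from itertools import product

NIL = "nil"
CODE = {"Title": "T", "Author": "A", "Editor": "E", "Price": "P"}


def encode(items):
    """Encode a Python list as right-nested pairs terminated by NIL."""
    value = NIL
    for item in reversed(items):
        value = (item, value)
    return value


def is_price_tail(v):
    # (Price, nil) or nil
    return v == NIL or v == ("Price", NIL)


def is_x(v):
    # X = (Author, X or (Price, nil) or nil)
    return (isinstance(v, tuple) and v[0] == "Author"
            and (is_x(v[1]) or is_price_tail(v[1])))


def is_y(v):
    # Y = (Editor, Y or (Price, nil) or nil)
    return (isinstance(v, tuple) and v[0] == "Editor"
            and (is_y(v[1]) or is_price_tail(v[1])))


def is_book(v):
    # Book = (book, (Title, X or Y))
    if not (isinstance(v, tuple) and v[0] == "book"):
        return False
    content = v[1]
    return (isinstance(content, tuple) and content[0] == "Title"
            and (is_x(content[1]) or is_y(content[1])))


def matches_regexp(items):
    """Reference check against the regular expression type itself."""
    word = "".join(CODE[i] for i in items)
    return re.fullmatch(r"T(A+|E+)P?", word) is not None


def agree_up_to(max_len):
    """True if the encoding and the regexp agree on every sequence up to max_len."""
    for n in range(max_len + 1):
        for seq in product(CODE, repeat=n):
            if is_book(("book", encode(list(seq)))) != matches_regexp(list(seq)):
                return False
    return True


ok = agree_up_to(5)
```

The two unions in the equation for X correspond to the three ways an author list can continue: another author, an optional price, or the end of the sequence.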

    Part 1: XML Programming in CDuce G. Castagna: Theory and practice of XML processing languages 16/110


  • logoP7

    1. Introduction 2. Regexp types/patterns 3. XML Programming in CDuce 3. Properties 4. Toolkit MPRI

    Some reasons to consider regular expression types and patterns

    Part 1: XML Programming in CDuce G. Castagna: Theory and practice of XML processing languages 17/110


  • logoP7

    1. Introduction 2. Regexp types/patterns 3. XML Programming in CDuce 3. Properties 4. Toolkit MPRI

    Some good reasons to consider regexp patterns/types

    Theoretical reason: very compact (≠ simple)

    Nine practical reasons:

    1. Classic usage
    2. Informative error messages
    3. Error mining
    4. Efficient execution
    5. Compact programs
    6. Logical optimisation of pattern-based queries
    7. Pattern matches as building blocks for iterators
    8. Type/pattern-based data pruning for memory usage optimisation
    9. Type-based query optimisation

    Part 1: XML Programming in CDuce G. Castagna: Theory and practice of XML processing languages 18/110


  • logoP7

    1. Introduction 2. Regexp types/patterns 3. XML Programming in CDuce 3. Properties 4. Toolkit MPRI

    2. Informative error messages

    List of books of a given year, stripped of the Editors and Price:

    fun onlyAuthors (year:Int, books:[Book*]) : [Book*] =
      select <book year=y>(t@a) from
        <book year=y>[(t::Title | a::Author | _)+] in books
      where int_of(y) = year

    Returns the following error message:

    Error at chars 81-83:
      select <book year=y>(t@a) from
    This expression should have type:
      [ Title (Editor+|Author+) Price? ]
    but its inferred type is:
      [ Title Author+ | Title ]
    which is not a subtype, as shown by the sample:
      [ <title>[ ] ]

    Part 1: XML Programming in CDuce G. Castagna: Theory and practice of XML processing languages 19/110

    In case of error, return a sample value in the difference of the inferred type and the expected one.

    type Book = <book year=String>[Title (Author+|Editor+) Price?]
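The idea of reporting a sample value in the difference of two types can be illustrated with ordinary regular expressions. The following Python sketch is a brute-force stand-in and makes no claim about how CDuce computes its samples: element types are abbreviated to single letters (an assumption made for this example), and the function `witness` simply enumerates words by increasing length until it finds one accepted by the inferred type and rejected by the expected one.

```python
import re
from itertools import product

# Hypothetical one-letter abbreviations for the element types of the slide.
LETTER = {"T": "Title", "A": "Author", "E": "Editor", "P": "Price"}


def witness(inferred, expected, alphabet="TAEP", max_len=6):
    """Return a shortest sequence in L(inferred) minus L(expected), or None
    if no such sequence exists up to length max_len."""
    inferred_re = re.compile(inferred)
    expected_re = re.compile(expected)
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            if inferred_re.fullmatch(word) and not expected_re.fullmatch(word):
                return [LETTER[c] for c in word]
    return None


# Inferred content type [ Title Author+ | Title ] against the expected
# content type [ Title (Editor+|Author+) Price? ].
sample = witness(r"TA+|T", r"T(E+|A+)P?")

# A content type that always contains at least one author has no witness.
no_sample = witness(r"TA+", r"T(E+|A+)P?")
```

Here `sample` is the one-element sequence containing only a title, which corresponds to the sample shown in the error message: a book whose content is a title without any author or editor.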


  • logoP7

    1. Introduction 2. Regexp types/patterns 3. XML Programming in CDuce 3. Properties 4. Toolkit MPRI

    2. Informative error messages

    List of books of a given year, stripped of the Editors and Price, with the pattern changed to require a title followed by at least one author. The expression t@a then has type [ Title Author+ ], which is a subtype of the expected [ Title (Editor+|Author+) Price? ]:

    fun onlyAuthors (year:Int, books:[Book*]) : [Book*] =
      select <book year=y>(t@a) from
        <book year=y>[ t::Title a::Author+ _* ] in books
      where int_of(y) = year

    Part 1: XML Programming in CDuce G. Castagna: Theory and practice of XML processing languages 19/110


  • logoP7

    1. Introduction 2. Regexp types/patterns 3. XML Programming in CDuce 3. Properties 4. Toolkit MPRI

    5. Efficient execution

    Idea: if types tell you that something cannot happen, don't test it.

    type A = <a>[A*]
    type B = <b>[B*]

    fun check(x : A|B) = match x with A -> 1 | B -> 0

    Since x has type A|B, inspecting the root tag is enough:

    fun check(x : A|B) = match x with <a>_ -> 1 | _ -> 0

    No backtracking.

    Whole parts of the matched data are not checked.
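The saving can be made visible by counting how many nodes each version inspects. The Python sketch below is only a model of the idea: `Elem`, `full_tree`, and the explicit `counter` argument are assumptions introduced for the example, and the "compiled" version is correct only under the static assumption that the argument has type A|B.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Elem:
    """Hypothetical XML element with a tag and element children only."""
    tag: str
    children: List["Elem"] = field(default_factory=list)


def has_type_a(v, counter):
    """Full membership test for type A = <a>[A*]; counter[0] counts visited nodes."""
    counter[0] += 1
    return v.tag == "a" and all(has_type_a(c, counter) for c in v.children)


def check_naive(x, counter):
    """match x with A -> 1 | B -> 0, testing the whole value against A."""
    return 1 if has_type_a(x, counter) else 0


def check_compiled(x, counter):
    """match x with <a>_ -> 1 | _ -> 0: valid because x is known to be in A|B,
    so the root tag alone decides which branch applies."""
    counter[0] += 1
    return 1 if x.tag == "a" else 0


def full_tree(tag, depth):
    """Binary tree of the given depth whose nodes all carry the same tag."""
    if depth == 0:
        return Elem(tag)
    return Elem(tag, [full_tree(tag, depth - 1), full_tree(tag, depth - 1)])


value = full_tree("a", 3)          # a value of type A with 15 nodes
naive_count, compiled_count = [0], [0]
naive_result = check_naive(value, naive_count)
compiled_result = check_compiled(value, compiled_count)
```

Both functions return the same result on every value of type A|B, but the naive version walks the whole tree for a value of type A, while the compiled version looks at a single node.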

    Part 1: XML Programming in CDuce G. Castagna: Theory and practice of XML processing languages 20/110

    Use static type information to perform an optimal set of