Digital.Humanities@Oxford Summer School 2012

edited by James Cummings and Sebastian Rahtz

July 2012

tei.oucs.ox.ac.uk/Talks/2012-07-dhoxss/booklet.pdf · 2012-06-21


Contents

1 Overall Timetable

2 Introduction

3 Full Programme
3.1 Monday 2 July 2012
3.2 Tuesday 3 July 2012
3.3 Wednesday 4 July 2012
3.4 Thursday 5 July 2012
3.5 Friday 6 July 2012

4 Workshop Abstracts
4.1 An Introduction to XML and the Text Encoding Initiative
4.2 Working with TEI Texts (Advanced)
4.3 An Introduction to Digital Humanities Tools and Approaches
4.4 A Humanities Web of Data: Publishing, Linking, Querying and Visualisation on the Semantic Web

5 Workshop: An Introduction to XML and the Text Encoding Initiative
5.1 Timetable
5.2 Exercise 1: Create an XML Document
5.3 Exercise 2: Create a TEI Document
5.4 Exercise 3: Improving a teiHeader
5.5 Exercise 4: Marking Up Names
5.6 Exercise 5: Creating a Manuscript Description
5.7 Exercise 6: Transcribing with the TEI
5.8 Exercise 7: Encoding Spoken Text
5.9 Exercise 8: Linguistic Markup
5.10 Exercise 9: Customise the TEI with Roma
5.11 Exercise 10: OxGarage and the TEI Community
5.12 TEI reference material: summary of elements
5.13 Wilfred Owen: Letter To Leslie Gunston
5.14 Wilfred Owen: Preface MS
5.15 Stuart Lee interviews Ian Hislop (fragment)

6 Workshop: Working with TEI Texts
6.1 Timetable
6.2 Data samples
6.3 Getting better quality TEI XML
6.4 XSLT transformations for genetic editions
6.5 Grouping Exercises
6.6 Using XQuery
6.7 Using TEI stylesheet family
6.8 TEI reference material: XSL stylesheets
6.9 Quick reference cards for XSLT, XQuery, XPath, Regular Expressions, and Schematron

7 Workshop: An Introduction to Digital Humanities Tools and Approaches
7.1 Corpus Linguistics and Text Analysis

8 Workshop: A Humanities Web of Data: Publishing, Linking, Querying and Visualisation on the Semantic Web


1 Overall Timetable


2 Introduction

The Digital.Humanities@Oxford Summer School (DHOXSS) 2012 takes place from 2nd to 6th July at the University of Oxford. DHOXSS delegates will be introduced to a range of topics suitable for researchers, project managers, research assistants, and students who are interested in the creation, management, or publication of digital data in the humanities.

Delegates will follow one of our five-day workshops on:

• An Introduction to XML and the Text Encoding Initiative

• Working with TEI Texts (Advanced)

• An Introduction to Digital Humanities Tools and Approaches

• A Humanities Web of Data: Publishing, Linking, Querying and Visualisation on the Semantic Web

Each day will also contain plenary guest lectures by experts in their fields, plus sessions on a wide variety of Digital Humanities topics. There will be morning surgery sessions to discuss projects and possibilities with tutors. The summer school is a collaboration for Digital.Humanities@Oxford between Oxford University Computing Services (OUCS) and the Oxford e-Research Centre (OeRC), with the assistance of the Humanities Division, the Bodleian Libraries, the Oxford Internet Institute, and e-Research South. The DHOXSS is organized by James Cummings and Sebastian Rahtz at OUCS and Erin Snyder at OeRC.

The Summer School will be located at Merton College, OUCS, and the OeRC, all situated in the centre of Oxford.

3 Full Programme

3.1 Monday 2 July 2012

3.1.1 09:30 - 10:00: Registration

Registration will take place in the foyer of the TS Eliot Lecture Theatre at Merton College from 09:30 - 10:00 on Monday morning. Registration may be available at other times by prior arrangement.

3.1.2 10:00 - 11:00: Plenary Lecture

Plenary Lecture: Crowdsourcing in the Humanities Chris Lintott (Zooniverse)

3.1.3 11:00 - 11:30: Tea Break

Tea Break will take place in the foyer of the TS Eliot Lecture Theatre at Merton College.

3.1.4 11:30 - 12:30: Workshops – Introductory Lectures

• An Introduction to XML and the Text Encoding Initiative – David Harvey Room, Merton College

• Working with TEI Texts (Advanced) – Ian Taylor Room, Merton College

• An Introduction to Digital Humanities Tools and Approaches: "Corpus and Text Analysis for Research in the Humanities", Martin Wynne (OUCS and OeRC) – TS Eliot Lecture Theatre, Merton College

• A Humanities Web of Data: Publishing, Linking, Querying and Visualisation on the Semantic Web – Sir Howard Stringer Room, Merton College

3.1.5 12:30 - 13:30: Lunch

Lunch will be in the foyer of the TS Eliot Lecture Theatre.

3.1.6 13:30 - 14:00: Travel Time to OUCS

The computer-based practical aspects of the workshops will take place in the Thames Suite of the Oxford University Computing Services, 13 Banbury Road, Oxford, OX2 6NN. Leave adequate time to walk there from Merton College.

3.1.7 14:00 - 16:00: Workshops – Practical

• An Introduction to XML and the Text Encoding Initiative – Evenlode Room, OUCS

• Working with TEI Texts (Advanced) – Cherwell Room, OUCS

• An Introduction to Digital Humanities Tools and Approaches: "Dealing with the Data Deluge: Corpus Linguistics for Text-Based Research", Martin Wynne (OUCS and OeRC) – Isis Room, OUCS

• A Humanities Web of Data: Publishing, Linking, Querying and Visualisation on the Semantic Web – Windrush Room, OUCS

3.1.8 16:00 - 16:30: Tea Break

The Tea Break and Parallel Sessions will be held at the Oxford e-Research Centre, 7 Keble Road, Oxford, OX1 3QG. Tea Break will be in the OeRC Atrium.

3.1.9 16:30 - 17:30: Parallel Sessions

You have a free choice on the day of which session to attend:

• Parallel Session 1: Oxford adventures in crowdsourcing: models for engaging communities and enhancing digital collections Kate Lindsay (OUCS) and David Tomkins (Bodleian) – OeRC Lecture Theatre B

• Parallel Session 2: Creating Digital Data Resources: Issues to consider David Robey (OeRC) – OeRC Conference Room

3.1.10 19:00 - : Drinks Reception

A free reception with drinks and nibbles will take place from 19:00 on the Sundial Lawn at Merton College (or in case of rain, the TS Eliot Foyer).

3.2 Tuesday 3 July 2012

3.2.1 09:30 - 10:00: Surgery A (Optional)

Surgery A is Focus Group on Sustainability and EEBO-TCP by Judith Siefring (Bodleian) – Sir Howard Stringer Room, Merton College.

3.2.2 10:00 - 11:00: Plenary Lecture

Plenary Lecture: Humanities Research Data – Rate me! Wolfram Horstmann (Bodleian) – TS Eliot Lecture Theatre, Merton College.

3.2.3 11:00 - 11:30: Tea Break

Tea Break will take place in the foyer of the TS Eliot Lecture Theatre at Merton College.

3.2.4 11:30 - 12:30: Workshops – Introductory Lectures

• An Introduction to XML and the Text Encoding Initiative – David Harvey Room, Merton College

• Working with TEI Texts (Advanced) – Ian Taylor Room, Merton College

• An Introduction to Digital Humanities Tools and Approaches: "The Dangers and Delights of Data Mining", Glenn Roe (OeRC) – TS Eliot Lecture Theatre, Merton College

• A Humanities Web of Data: Publishing, Linking, Querying and Visualisation on the Semantic Web – Sir Howard Stringer Room, Merton College

3.2.5 12:30 - 13:30: Lunch

Lunch will be in the foyer of the TS Eliot Lecture Theatre.

3.2.6 13:30 - 14:00: Travel Time to OUCS

The computer-based practical aspects of the workshops will take place in the Thames Suite of the Oxford University Computing Services, 13 Banbury Road, Oxford, OX2 6NN. Leave adequate time to walk there from Merton College.

3.2.7 14:00 - 16:00: Workshops – Practical

• An Introduction to XML and the Text Encoding Initiative – Evenlode Room, OUCS

• Working with TEI Texts (Advanced) – Cherwell Room, OUCS

• An Introduction to Digital Humanities Tools and Approaches: "A Practical Introduction to Text Mining", Glenn Roe (OeRC) – Isis Room, OUCS

• A Humanities Web of Data: Publishing, Linking, Querying and Visualisation on the Semantic Web – Windrush Room, OUCS

3.2.8 16:00 - 16:30: Tea Break

The Tea Break and Parallel Sessions will be held at the Oxford e-Research Centre, 7 Keble Road, Oxford, OX1 3QG. Tea Break will be in the OeRC Atrium.

3.2.9 16:30 - 17:30: Parallel Sessions

• Parallel Session 3: The other 99%: two approaches to project modelling Pip Willcox (Bodleian) – OeRC Conference Room

• Parallel Session 4: Encoding Music Text and Text with Music Raffaele Viglianti (King’s College London) – OeRC Lecture Theatre B

3.3 Wednesday 4 July 2012

3.3.1 09:30 - 10:00: Surgery B (Optional)

Surgery B is Text Encoding Project Advice by James Cummings (OUCS) – Sir Howard Stringer Room, Merton College.

3.3.2 10:00 - 11:00: Plenary Lecture

Plenary Lecture: Social Machines Dave DeRoure (OeRC) – TS Eliot Lecture Theatre, Merton College.

3.3.3 11:00 - 11:30: Tea Break

Tea Break will take place in the foyer of the TS Eliot Lecture Theatre at Merton College.

3.3.4 11:30 - 12:30: Workshops – Introductory Lectures

• An Introduction to XML and the Text Encoding Initiative – David Harvey Room, Merton College

• Working with TEI Texts (Advanced) – Ian Taylor Room, Merton College

• An Introduction to Digital Humanities Tools and Approaches: "Introduction to Markup", Lou Burnard (Adonis TGE) – TS Eliot Lecture Theatre, Merton College

• A Humanities Web of Data: Publishing, Linking, Querying and Visualisation on the Semantic Web – Sir Howard Stringer Room, Merton College

3.3.5 12:30 - 13:30: Lunch

Lunch will be a buffet in Merton College Hall.

3.3.6 13:30 - 14:00: Travel Time to OUCS

The computer-based practical aspects of the workshops will take place in the Thames Suite of the Oxford University Computing Services, 13 Banbury Road, Oxford, OX2 6NN. Leave adequate time to walk there from Merton College.

3.3.7 14:00 - 16:00: Workshops – Practical

• An Introduction to XML and the Text Encoding Initiative – Evenlode Room, OUCS

• Working with TEI Texts (Advanced) – Cherwell Room, OUCS

• An Introduction to Digital Humanities Tools and Approaches: "TEI a la Carte", Lou Burnard (Adonis TGE) – Isis Room, OUCS

• A Humanities Web of Data: Publishing, Linking, Querying and Visualisation on the Semantic Web – Windrush Room, OUCS

3.3.8 16:00 - 16:30: Tea Break

The Tea Break and Parallel Sessions will be held at the Oxford e-Research Centre, 7 Keble Road, Oxford, OX1 3QG. Tea Break will be in the OeRC Atrium.

3.3.9 16:30 - 17:30: Parallel Sessions

• Parallel Session 5: Copyright and Open Licensing Rowan Wilson (OUCS) – OeRC Lecture Theatre B

• Parallel Session 6: Silos and Street-Literature: Digitising and Linking Cheap Print Collections and Traditions Giles Bergel (Merton College and English Faculty) – OeRC Conference Room

3.3.10 19:00 - : Banquet

A table-service banquet will take place in Merton College Hall for those who selected this additional option when registering and paying.

3.4 Thursday 5 July 2012

3.4.1 09:30 - 10:00: Surgery C (Optional)

Surgery C is Web Project and Data Modelling by James Cummings (OUCS), Alexander Dutton (OUCS), Monica Messaggi-Kaya (Bodleian), Pip Willcox (Bodleian) – Sir Howard Stringer Room, Merton College.

3.4.2 10:00 - 11:00: Plenary Lecture

Plenary Lecture: Linked Data in the Humanities: An Open-and-Shut Case? Elton Barker (Open University) and Leif Isaksen (University of Southampton) – TS Eliot Lecture Theatre, Merton College.

3.4.3 11:00 - 11:30: Tea Break

Tea Break will take place in the foyer of the TS Eliot Lecture Theatre at Merton College.

3.4.4 11:30 - 12:30: Workshops – Introductory Lectures

• An Introduction to XML and the Text Encoding Initiative – David Harvey Room, Merton College

• Working with TEI Texts (Advanced) – Ian Taylor Room, Merton College

• An Introduction to Digital Humanities Tools and Approaches: "Working with Digital Images", Segolene Tarte (OeRC) – TS Eliot Lecture Theatre, Merton College

• A Humanities Web of Data: Publishing, Linking, Querying and Visualisation on the Semantic Web – Sir Howard Stringer Room, Merton College

3.4.5 12:30 - 13:30: Lunch

Lunch will be a buffet in Merton College Hall.

3.4.6 13:30 - 14:00: Travel Time to OUCS

The computer-based practical aspects of the workshops will take place in the Thames Suite of the Oxford University Computing Services, 13 Banbury Road, Oxford, OX2 6NN. Leave adequate time to walk there from Merton College.

3.4.7 14:00 - 16:00: Workshops – Practical

• An Introduction to XML and the Text Encoding Initiative – Evenlode Room, OUCS

• Working with TEI Texts (Advanced) – Cherwell Room, OUCS

• An Introduction to Digital Humanities Tools and Approaches: "Exploring and Extracting Information from Images", Segolene Tarte (OeRC) – Isis Room, OUCS

• A Humanities Web of Data: Publishing, Linking, Querying and Visualisation on the Semantic Web – Windrush Room, OUCS

3.4.8 16:00 - 16:30: Tea Break

The Tea Break and Parallel Sessions today, for a change, will be held at OUCS.

3.4.9 16:30 - 17:30: Parallel Sessions

In OUCS for a change:

• Parallel Session 7: Impact as a process: Understanding and enhancing the reach of digital resources Eric Meyer (OII) and Kathryn Eccles (OII) – Evenlode Room, OUCS

• Parallel Session 8: Discoverability, Accessibility, and Machine-Readability Joseph Talbot (OUCS) – Isis Room, OUCS

3.4.10 19:00 - : Drinks Reception

A free reception with drinks and nibbles (included in registration charge) will take place from 19:00 at the Oxford University Museum of Natural History.

3.5 Friday 6 July 2012

3.5.1 09:30 - 10:00: Surgery D (Optional)

Surgery D is Making funding proposals for digital projects by Martin Wynne (OUCS and OeRC) – Sir Howard Stringer Room, Merton College.

3.5.2 10:00 - 11:00: Plenary Lecture

Plenary Lecture: Making the Digital Human: Anxieties, Possibilities, and Challenges Andrew Prescott (King’s College London) – TS Eliot Lecture Theatre, Merton College.

3.5.3 11:00 - 11:30: Tea Break

Tea Break will take place in the foyer of the TS Eliot Lecture Theatre at Merton College.

3.5.4 11:30 - 12:30: Workshops – Introductory Lectures

• An Introduction to XML and the Text Encoding Initiative – David Harvey Room, Merton College

• Working with TEI Texts (Advanced) – Ian Taylor Room, Merton College

• An Introduction to Digital Humanities Tools and Approaches: "Don’t Waste Space: How GIS can Aid Digital Humanities Research", Chris Green (Archaeology) – TS Eliot Lecture Theatre, Merton College

• A Humanities Web of Data: Publishing, Linking, Querying and Visualisation on the Semantic Web – Sir Howard Stringer Room, Merton College

3.5.5 12:30 - 13:30: Lunch

Lunch will be a buffet in Merton College Hall.

3.5.6 13:30 - 14:00: Travel Time to OUCS

The computer-based practical aspects of the workshops will take place in the Thames Suite of the Oxford University Computing Services, 13 Banbury Road, Oxford, OX2 6NN. Leave adequate time to walk there from Merton College.

3.5.7 14:00 - 16:00: Workshops – Practical

• An Introduction to XML and the Text Encoding Initiative – Evenlode Room, OUCS

• Working with TEI Texts (Advanced) – Cherwell Room, OUCS

• An Introduction to Digital Humanities Tools and Approaches: "Spatial Awareness: A Brief Introduction to ArcGIS", Chris Green (Archaeology) – Isis Room, OUCS

• A Humanities Web of Data: Publishing, Linking, Querying and Visualisation on the Semantic Web – Windrush Room, OUCS

3.5.8 16:00 - 16:30: Tea Break

The Tea Break and Parallel Sessions will be held at the Oxford e-Research Centre, 7 Keble Road, Oxford, OX1 3QG. Tea Break will be in the OeRC Atrium.

3.5.9 16:30 - 17:30: Parallel Sessions

• Parallel Session 9: Digital Library Technologies and Best Practice Neil Jefferies (Bodleian) and Christine Madsen (Bodleian) – OeRC Conference Room

• Parallel Session 10: Panel: Running Digital Humanities Summer Schools James Cummings (OUCS), Sebastian Rahtz (OUCS), Ray Siemens (University of Victoria), Erin Snyder (OeRC), John Pybus (OeRC) – OeRC Lecture Theatre B

4 Workshop Abstracts

4.1 An Introduction to XML and the Text Encoding Initiative

This introductory workshop will balance lectures with hands-on practical sessions to introduce the recommendations of the Text Encoding Initiative (TEI) for encoding of digital text. The workshop combines in-depth coverage of the latest version of the TEI P5 Guidelines for the encoding of digital text with practical exercises to reinforce the topics covered. It provides an introduction to mark-up, explanations of various aspects of the TEI Guidelines and approaches to publishing TEI texts. Major aspects surveyed will include: basic TEI elements, metadata, names of people and places, manuscript transcription and description, linguistic analysis, and customisation of the TEI. Numerous practical exercises give you hands-on experience of a wide range of TEI editing, customisation, and publication.

Tutors: James Cummings, Renée Baalen, Ylva Berglund-Prytz

4.2 Working with TEI Texts (Advanced)

This advanced workshop will teach how to do something practical with your TEI XML texts beyond simply converting them to HTML and putting them on the web. A mixture of talks and practical exercises will take participants through:

• Advanced validation and integrity checking using TEI ODD, Schematron and XSLT

• Transforming your TEI XML to formats other than HTML (Word, ePub, LaTeX etc)

• Extracting data from TEI texts for further analysis (eg names and places)

• Processing some more complex TEI documents (eg genetic encoding and timelines)

• Storing TEI documents in an XML database and querying them

Requirements: You must already have a good basic knowledge of XML and TEI, and some familiarity with programming/scripting ideas. Most of the work will be based on XSLT and XPath.

Tutors: Sebastian Rahtz, Raffaele Viglianti

4.3 An Introduction to Digital Humanities Tools and Approaches

This workshop will introduce key research areas in the digital humanities, including language tools, text mining, image analysis, and use of geo-spatial data. The lecture sessions will emphasize the research potential of each area, discuss the theoretical implications of modelling data through these methods, and provide guidance about how these techniques are most usefully adapted to humanities research. The workshops will focus on actively addressing research questions, providing datasets and guidance on how to begin to conduct research with these tools. The course is conceived as a wide-ranging introduction to some of the most exciting areas in digital humanities research, and will enable its participants to quickly become familiar with the possibilities and processes of conducting research in these areas.

• Monday – Martin Wynne:
  Lecture: Corpus and Text Analysis for Research in the Humanities
  Workshop: Dealing with the Data Deluge: Corpus Linguistics for Text-Based Research

• Tuesday – Glenn Roe:
  Lecture: The Dangers and Delights of Data Mining
  Workshop: A Practical Introduction to Text Mining

• Wednesday – Lou Burnard:
  Lecture: Introduction to Markup
  Workshop: TEI a la Carte

• Thursday – Segolene Tarte:
  Lecture: Working with Digital Images
  Workshop: Exploring and Extracting Information from Images

• Friday – Chris Green:
  Lecture: Don’t Waste Space: How GIS can Aid Digital Humanities Research
  Workshop: Spatial Awareness: A Brief Introduction to ArcGIS

Tutors: Erin Snyder, Christopher Green, Glenn Roe, Segolene Tarte, Martin Wynne

4.4 A Humanities Web of Data: Publishing, Linking, Querying and Visualisation on the Semantic Web

This workshop will introduce the Semantic Web and show how to publish your data so that it is available as Linked Open Data within the web of data. Topics covered will include: the RDF format; modelling your data and publishing to the web; querying RDF data using SPARQL; choosing and designing vocabularies and ontologies, and more.

Tutors: John Pybus, Alexander Dutton, Kevin Page


5 Workshop: An Introduction to XML and the Text Encoding Initiative

5.1 Timetable

Time | Monday | Tuesday | Wednesday | Thursday | Friday
Morning (1hr) | XML and TEI [JC] | TEI Metadata [JC] | MS Description [JC] | Spoken Texts [YB] | Customising the TEI [RB]
Practical 1 | Create an XML Document | Improving a TEI Header | Adding a MS Description | Transcribing Speech | Customise the TEI with Roma
Talk (1hr) | TEI Core Module [JC] | Names, People, Places [RB] | Transcription, Facsimile and Genetic Editing [JC] | Linguistic Analysis and Tools [YB] | Talk: Transforming the TEI [JC]
Practical 2 | Create a TEI Document | Marking up Names and People | Transcription Exercise | Linguistic Analysis | OxGarage and the TEI Community

5.2 Exercise 1: Create an XML Document

5.2.1 Learning Outcomes

When you successfully complete this exercise you should be able to:

• mark up an XML declaration

• insert a text file into an XML editor

• mark up basic features of a poem

• create a well-formed XML document

5.2.2 Summary

This exercise will walk you through creating an XML document in the oXygen editor and introduce a variety of ways to mark this document up. You will first start a new document, then insert some unmarked-up text into the editor, and then mark up the stanzas or line-groups (lg) and lines (l). You will learn to check whether your document is well-formed.

5.2.3 Starting A New XML File

Let’s start a new XML file by following these steps:

• Load up the oXygen XML Editor, if it isn’t already loaded, by using the Windows Start Menu or double-clicking the icon on the desktop.

• Once the editor has fully loaded, from the ’File’ menu select ’New’ and under ’New Document’ select ’XML Document’. This should open up a blank document with an XML Declaration added.

• An XML Declaration looks like:

<?xml version="1.0" encoding="UTF-8"?>

The XML declaration tells anything processing your XML file, including the editor, that this is an XML file and, through the @version attribute, which version of XML you are using. The @encoding attribute conveys which character encoding the program may expect. XML version 1.0 is a W3C Recommendation, first published in 1998, whose current (fifth) edition dates from 2008. UTF-8 (Universal Character Set Transformation Format - 8 bit) is an encoding of Unicode, which contains most characters from all human writing systems. The XML declaration needs no closing tag, as it takes the form of a special processing-instruction that starts and ends with an angle-bracket and a question mark.
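The role of @encoding can also be seen outside the editor. The following is only a minimal sketch, assuming Python and its standard xml.etree.ElementTree module (neither is part of this exercise, which uses oXygen): the same text is stored as two different byte sequences, and in each case the declaration tells the parser how to read them.

```python
import xml.etree.ElementTree as ET

# The same document serialised in two encodings. Each declaration names
# the encoding actually used for the bytes that follow it.
text = "<div>caf\u00e9</div>"
utf8_doc = b'<?xml version="1.0" encoding="UTF-8"?>' + text.encode("utf-8")
latin1_doc = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + text.encode("iso-8859-1")

# The parser reads the declaration first and decodes the rest accordingly,
# so both byte sequences yield the same characters.
utf8_root = ET.fromstring(utf8_doc)
latin1_root = ET.fromstring(latin1_doc)
print(utf8_root.text == latin1_root.text)
```

If the declared encoding did not match the bytes, the accented character could be misread or rejected, which is why the editor writes the declaration for you.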

5.2.4 Creating a Division

Let’s create a division of a text using the <div> element. This is a generic division or section element.

• On the line below the XML declaration type: <div>.

• Notice what happens when you type the final ’>’. oXygen is trying to help you and inserts the closing </div> tag. This is because it knows the rules of XML, and knows that if you type an opening <div> you are required to have a closing </div> sooner or later.

• We haven’t said what type of division this is, so let’s categorise it as ’verse’ by adding a @type attribute. Move the cursor back until you are just after the letter ’v’ in the opening tag. Press space, and then type: type=" and notice what happens when you type the quotation mark. oXygen is again trying to help you by putting in the closing quotation mark, because it knows that attribute values must always be quoted.

• In between the quotation marks type ’verse’ to categorise our division as being verse.

• Move back until you are directly in between the opening <div> and closing </div>. Press ’enter’ a couple of times to give yourself some space inside the element.
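The two rules oXygen is enforcing here (every opening tag needs a closing tag, and attribute values must be quoted) can be checked with any XML parser. As a rough illustration only, assuming Python's standard xml.etree.ElementTree module and a helper named is_well_formed that is invented for this sketch:

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

good = '<div type="verse"></div>'
unquoted = '<div type=verse></div>'   # attribute value not quoted
unclosed = '<div type="verse">'       # closing </div> missing

root = ET.fromstring(good)
print(root.tag, root.get("type"))
print(is_well_formed(unquoted), is_well_formed(unclosed))
```

Only the first string parses; the other two break exactly the rules that the editor's automatic completion is protecting you from.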


5.2.5 Inserting Some Text

We are going to use the Wilfred Owen poem Strange Meeting as an example for this exercise. But it would waste a lot of time if we asked you to type the whole poem in, so we’ve done that for you.

• Make sure your cursor is in between the opening <div> and the closing </div>, then go to the ’Document’ menu, select ’File’ and from there ’Insert File’. Note: This is from the ’Document’ menu on the menu bar, not the ’File’ one.

• Select ’strange-meeting.txt’ as the file to insert.

• The start of your document should look like:

<?xml version="1.0" encoding="UTF-8"?>
<div type="verse">
STRANGE MEETING

It seemed that out of battle I escaped
Down some profound dull tunnel, long since scooped
Through granites which titanic wars had groined.

[...a lot more text...]
</div>

5.2.6 Encoding the Heading (using ’Surround with Tags’)

The text ’STRANGE MEETING’ at the top of the poem is obviously a heading. The TEI <head> element should be used to mark this. To do so:

• Highlight the text ’STRANGE MEETING’ with the mouse.

• Either press control-e as a shortcut key, or right-click and under ’Refactoring’ select ’Surround with Tags’. A box should pop up; type head into it. Notice how oXygen helps you again by putting the opening tag before what you had highlighted and the closing tag afterwards.

5.2.7 Marking Stanzas (using both ’Surround with Tags’ and ’Split Element’)Let’s mark the stanzas that appear doing the following steps:

• Highlight the first stanza, from "It seemed" to "had groined".

• Using control-e, or the menus, as you did above, mark this stanza as an <lg> element.

• Add a @type attribute with a value of ’stanza’ to the <lg> element so it looks like: <lgtype="stanza">.

• The start of your document should now look like:

<?xml version="1.0" encoding="UTF-8"?><div type="verse"><head>STRANGE MEETING</head><lg type="stanza"> It seemed that out of battle I escaped

Down some profound dull tunnel, long since scoopedThrough granites which titanic wars had groined.

</lg>

[...a lot more text...]

</div>

17

Page 18: Digital.Humanities@Oxford Summer School 2012tei.oucs.ox.ac.uk/Talks/2012-07-dhoxss/booklet.pdf · 2012-06-21 · Introduction 2 Introduction The Digital.Humanities@Oxford Summer School

Workshop: An Introduction to XML and the Text Encoding Initiative

• But if we have lots of stanzas, marking each one of them seems a lot of work, but there is a(possibly) easier way.

• Highlight the entire rest of the poem, from "Yet also there" to "Let us sleep now....’", and thensurround all of it in an <lg> element (by pressing control-e)

• Of course it is silly to have the entire rest of the poem marked as a single line-group, but go andadd a type="stanza" attribute to the opening tag.

• If you move the cursor to just before the start of each stanza, e.g. just before where it says "With athousand pains", and press alt-shift-d (or select Refactoring -> Split Element from the right-clickmenu), oXygen should split the <lg> element, ending it here and starting it just before wherethere cursor is located.

• Do this for other stanzas that are not marked yet.
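As a rough sketch of what ’Split Element’ does here: a single <lg> wrapping the rest of the poem is cut in two at the cursor. The elided lines ([...]) are placeholders, the exact whitespace oXygen leaves behind may differ, and we are assuming the @type attribute is carried over to the new element, which is why it was added before splitting.

```xml
<!-- Before: one <lg> wraps everything after the first stanza -->
<lg type="stanza">
  Yet also there [...]
  With a thousand pains [...]
  [...]
</lg>

<!-- After pressing alt-shift-d with the cursor just before "With a thousand pains" -->
<lg type="stanza">
  Yet also there [...]
</lg>
<lg type="stanza">
  With a thousand pains [...]
  [...]
</lg>
```

Repeating the split at the start of each remaining stanza leaves one <lg> per stanza.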

5.2.8 Marking Lines

We’ve marked all the stanzas but we’ve not marked the lines.

• Highlight the first line in the first stanza, press control-e to surround with a tag, and type ’l’ as the element name. (<l> is the line element, meaning a line of metrical verse).

• It might be a bit painful to mark up each and every line this way. You could try using the split-element technique above, but there is another shortcut to try as well. Highlight the second line and press control-/ and notice that oXygen has wrapped the line in an <l> element. The reason for this is that control-/ is the ’surround with the last element I surrounded something with’ shortcut key.

• Using this technique, quickly mark all the remaining lines.
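For orientation, the first stanza might look roughly like this once every line is wrapped (a sketch only; your indentation will differ until you format the file):

```xml
<lg type="stanza">
  <l>It seemed that out of battle I escaped</l>
  <l>Down some profound dull tunnel, long since scooped</l>
  <l>Through granites which titanic wars had groined.</l>
</lg>
```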

5.2.9 Format and Indent

Our poem is marked up, but some of the markup might be a bit messy.

• Make sure that your file is ’well-formed’. You’ll be able to tell it is well-formed because oXygen will have a happy green square in the upper right-hand corner. If it is red, you had better find the problem (marked by a red bar on the right-hand side) and correct the mistake!

• Now let’s format and indent our file. This tidies up some of the whitespace and indents elements based on their place in the hierarchy. Either select the ’Format and Indent’ icon from the toolbar (it looks like some indented lines), or go to the menus: ’Document’ -> ’Source’ -> ’Format and Indent’.

• Formatting and indenting your markup is not necessary (it could all be on one big long line), but it makes it much easier for other people to read.

5.2.10 Saving Your Work

Let’s save our work:

• Is your work well-formed? Do you have a happy green square or an angry red one?

• From the ’File’ menu select ’Save’ or click on the Save icon (looks like an old-style 3.5" disk).

• Save the file using the name ’exercise01.xml’ or another name of your choice.

5.2.11 Self-Assessment

Check if you understand some of the core principles of this exercise by answering the following questions to yourself:

• How do you start a new XML document in oXygen?

• What is an XML declaration?

• What is a well-formed document?

• How do I ’Surround with tag’ and repeat that action quickly?

• Why might using the ’Split element’ approach be useful?

• What is the function of each element and attribute in your current file?

• What is the advantage of formatting and indenting your markup?

5.2.12 Next?

Your XML file may be well-formed but it is not yet valid, because it has not been validated against a particular schema (such as those which are customisations of the TEI). Next we will have a short introduction to the structure of TEI documents and some of the most frequently used elements. If you finish early you may wish to browse through the TEI Guidelines online at http://www.tei-c.org/release/doc/tei-p5-doc/en/html/index-toc.html. In particular you might want to look at the Elements appendix of reference pages for individual elements. Consider looking up all the elements you’ve used in this file to see how they are defined.
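The difference between well-formed and valid can be sketched as follows. The first fragment is deliberately broken and is shown only as an illustration, using elements from this exercise. The second is well-formed, but nothing in it says which elements are allowed where; that is what a schema adds.

```xml
<!-- NOT well-formed: <head> is opened inside <div> but closed after it,
     so the tags overlap instead of nesting -->
<div type="verse"><head>STRANGE MEETING</div></head>

<!-- Well-formed: every element opened inside <div> is also closed inside it -->
<div type="verse">
  <head>STRANGE MEETING</head>
</div>
```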

5.3 Exercise 2: Create a TEI Document

5.3.1 Learning Outcomes

When you successfully complete this exercise you should be able to:

• discern the elements and attributes needed for a minimum valid TEI XML file

• associate a TEI XML file with a schema

• use the TEI namespace

• create a minimum TEI header and text body

• check for both validity and well-formedness

5.3.2 Summary

This exercise will walk you through creating a TEI XML file and inserting the work you did previously into it. You’ll learn about the required aspects of the <teiHeader> and the basic structure of a TEI file.

5.3.3 Start a New XML File

Follow the same steps you did for the first exercise to start a new blank XML file. Although we could start a file with a TEI P5 template, for this particular exercise that would be cheating!

• Load up the oXygen XML Editor if it isn’t already loaded by using the Windows Start Menu, or double-clicking the icon on the desktop.

• Once the editor has fully loaded, from the ’File’ menu select ’New’ and under ’New Document’ select ’XML Document’. This should open up a blank file with an XML Declaration added.

• An XML Declaration looks like:

<?xml version="1.0" encoding="UTF-8"?>

5.3.4 Inserting a <TEI> Element

All TEI files start with either a <TEI> element or a <teiCorpus> element. In most cases you’ll want a <TEI> element. These elements have a special pseudo-attribute called ’xmlns’ that indicates the namespace a set of elements belongs to. This is inherited by any elements inside it (unless overridden). This is how we can be sure we’re talking about, say, a <title> element from the TEI rather than one from any other schema.

• Add a <TEI> element and then add it to the TEI namespace (http://www.tei-c.org/ns/1.0). Maybe add a few blank lines between the starting and closing tags. Your file should look like:

<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">

</TEI>

• Notice what happens in oXygen and how it helps you input this. Also notice that your file may now have an angry red square rather than a happy green one! Is your file well-formed? (Yes, it is!) Why is it red then?

• If it is red it is because your version of oXygen is prepackaged with all sorts of TEI goodness, and in this case it recognises that files starting with <TEI> in the TEI namespace are to be associated automatically with a TEI schema that it has stored. It is complaining that you do not have a <teiHeader> in your file because all valid TEI files must have this.
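The namespace inheritance described at the start of this section can be sketched as below. This fragment is an illustration of how ’xmlns’ is scoped, not a valid TEI document: it is well-formed, but the schema would reject it. The embedded <svg> element is an assumption chosen purely because it also has a <title> element in a different namespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- No xmlns of its own, so this <title> inherits the TEI namespace -->
  <title>A TEI title</title>
  <!-- xmlns is overridden here, so <svg> and the <title> inside it
       belong to the SVG namespace rather than the TEI one -->
  <svg xmlns="http://www.w3.org/2000/svg">
    <title>An SVG title</title>
  </svg>
</TEI>
```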

5.3.5 Adding a <teiHeader>

Inside the <TEI> element we need to add a <teiHeader> element.

• Put the cursor between the starting and closing <TEI> tags and type in a <teiHeader> element. Notice that oXygen provides the closing </teiHeader> tag. If the correct option is set in oXygen, it understands the TEI schema and knows that certain content is required inside a <teiHeader>. It can automatically provide that markup. If not, you’ll have to type it in. The resulting file should look like:

<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title> </title>
      </titleStmt>
      <publicationStmt/>
      <sourceDesc/>
    </fileDesc>
  </teiHeader>
</TEI>

• Notice that your file still has an angry red square rather than a happy green square. This is because more content is still needed, even though you’ve added some markup. First, add a title of something like "My ’Strange Meeting’ document" by adding this text between the starting and closing <title> tags. Other elements are allowed here in <titleStmt>, such as <author> (Wilfred Owen) and <editor> (Jon Stallworthy), which you could add but which aren’t really required for this exercise. You could use the more general <respStmt> (with a <name> element with your name and a <resp> element with something like ’TEI P5 Encoding’ in it) to record your own work if you wish, but as with the other embellishments this isn’t necessary for this exercise.

• Then add a paragraph <p> inside the <publicationStmt> with some text to record what this file is for, perhaps something like "An exercise for learning TEI."

• Inside <sourceDesc> we should add a <p> with some text like: "The primary resource of this file is Strange Meeting from Jon Stallworthy’s edition, available on the First World War Poetry Digital Archive." To make this even better, we might surround the title ’Strange Meeting’ with a <ref> element with a @target attribute with a value of ’http://www.oucs.ox.ac.uk/ww1lit/collections/item/3350’ because that is the URL from which we got this text.

• Your <teiHeader> should now look something like:

<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>My ’Strange Meeting’ document</title>
    </titleStmt>
    <publicationStmt>
      <p>An exercise for learning TEI.</p>
    </publicationStmt>
    <sourceDesc>
      <p> The primary resource of this file is <ref
          target="http://www.oucs.ox.ac.uk/ww1lit/collections/item/3350">Strange
          Meeting</ref> from Jon Stallworthy’s edition, available on the First
        World War Poetry Digital Archive. </p>
    </sourceDesc>
  </fileDesc>
</teiHeader>

• Notice that even though this is a complete <teiHeader> with all the required aspects, our file as a whole isn’t valid.

5.3.6 Add a <text>

All TEI files, in addition to a <teiHeader> with <fileDesc> containing a <titleStmt>, <publicationStmt>, and <sourceDesc>, need to follow the header with at least one of: <sourceDoc>, <facsimile>, or <text>. In our case we’re going to add a <text> element. To do this:

• Add a couple of blank lines after the closing </teiHeader>.

• Insert a <text> element and inside that a <body> element. (The <text> element requires a <body> element because if you don’t have a text body, what are you encoding?)

• The <text> section of the file should look something like:

<text>
  <body>
    <!-- We will put our poem here -->
  </body>
</text>

5.3.7 Adding Our Poem

This is a good start but we need to put something inside the body. Luckily, we have already encoded a poem in the previous exercise, so we can use that!

• With the cursor in between the opening and closing <body> tags go to the ’Document’ menu on the menu bar, and select ’File’, and ’Insert File’. Select the file you saved earlier if you finished the first exercise. If you didn’t then in the spoilers directory there is a file called ’ex01.xml’ which has the completed first exercise.

• But wait, as soon as you’ve added this we get a bit of a problem! oXygen will complain that we’ve got an XML declaration in the middle of our file. Delete this redundant XML declaration!

• Your document should now be valid and have a happy green square in the upper right-hand corner! If it isn’t, try to solve the problem by looking at the error message that is provided.
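Putting the pieces of this exercise together, the whole file might look roughly like the sketch below. The <sourceDesc> paragraph is shortened and only the first stanza is shown; the comment standing in for the remaining stanzas is a placeholder. Whether your editor reports the file as valid will depend on the schema it has associated with the document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>My ’Strange Meeting’ document</title>
      </titleStmt>
      <publicationStmt>
        <p>An exercise for learning TEI.</p>
      </publicationStmt>
      <sourceDesc>
        <p>The primary resource of this file is Strange Meeting from Jon
          Stallworthy’s edition.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <!-- The inserted poem; its own XML declaration has been deleted -->
      <div type="verse">
        <head>STRANGE MEETING</head>
        <lg type="stanza">
          <l>It seemed that out of battle I escaped</l>
          <l>Down some profound dull tunnel, long since scooped</l>
          <l>Through granites which titanic wars had groined.</l>
        </lg>
        <!-- [...the remaining stanzas...] -->
      </div>
    </body>
  </text>
</TEI>
```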

5.3.8 Saving Your Work

Let’s save our work:

• Have you formatted and indented your work automatically?

• Is your work well-formed? Do you have a happy green square or an angry red one?

• From the ’File’ menu select ’Save’ or click on the Save icon (looks like an old-style 3.5" disk).

• Save the file using the name ’exercise02.xml’ or another name of your choice.

5.3.9 Self-Assessment

Check if you understand some of the core principles of this exercise by answering the following questions:

• Which elements and attributes do you need for a minimum valid TEI XML document?

• What three parts of the <teiHeader> are required in all TEI-conformant documents?

• Where are these elements and attributes allowed?

• What is the function of each element and attribute you’ve used?

• Why do you think these elements and attributes are required in TEI XML?

5.3.10 Next and More Reading

This exercise and the previous one should have given you some experience editing XML and making a valid TEI file. Next time we’ll get a more in-depth introduction to various other TEI modules and learn more about the <teiHeader>.

• If you finish early you may wish to browse through the TEI Guidelines online at http://www.tei-c.org/release/doc/tei-p5-doc/en/html/index-toc.html.

• In particular you might want to look at the Elements appendix of reference pages for individual elements. Consider looking up all the elements you’ve used in this file to see how they are defined.

• What other elements are allowed inside the <text> element? What would you use them for?

• What other parts of the <teiHeader> are there? What are they for?

• You may wish to read the chapters on Default Text Structure http://www.tei-c.org/release/doc/tei-p5-doc/en/html/DS.html or Elements Available in All TEI Documents http://www.tei-c.org/release/doc/tei-p5-doc/en/html/CO.html.

5.4 Exercise 3: Improving a <teiHeader>

5.4.1 Learning Outcomes

When you successfully complete this exercise you should be able to:

• read through and analyse encoding in an existing TEI file.

• improve the structure and metadata of a <teiHeader>.

• understand the components of a <fileDesc> including:

– <titleStmt> for title and intellectual responsibility.

– <publicationStmt> for information about the publication and distribution of the electronic item.

– <sourceDesc> to record metadata about the source document.

• use the <encodingDesc> to record the markup used in the file.

• use the <profileDesc> to record non-bibliographic aspects of the file.

• record major changes to the file in the <revisionDesc>.

5.4.2 Summary

This exercise gives you a chance to read through a TEI XML file you have not encoded and understand its markup and structure. It walks you through improvements to various aspects of the <teiHeader> and how to record additional metadata about the electronic file and its sources.

5.4.3 Starting Up

In this case we’re starting with a sample file that we have created for you. Load up the file called ’letter-to-LG.xml’ in the oXygen XML editor. Check that the file is well-formed and valid. Note the line near the top of the file:

<?xml-model href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng"
  schematypens="http://relaxng.org/ns/structure/1.0"?>

This is what tells the oXygen editor that it should be validating this file with the tei_all schema.

5.4.4 Reading through the file

This file contains a letter from Wilfred Owen to Leslie Gunston. It talks about a forthcoming address to the Field Club, and contains a partial draft of ’The Wrestlers’. It was written in July 1917 from Craiglockhart War Hospital, Edinburgh, Scotland. Images of this letter are available in your DHOXSS booklet as well as in the materials we’ve provided.

• Note the very minimal <teiHeader>.

• Look at the structure of the document as three divisions and make sure you understand these divisions.

• Note the use of the <dateline> element.

• See how the encoder has recorded line-breaks in the prose.

• What other elements has the encoder included? Make sure you understand the meaning of them. If you are unsure of the meaning of them, look them up on the TEI-C website at: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/REF-ELEMENTS.html

5.4.5 Improving the <titleStmt>

As you can see the <teiHeader> is lacking a lot of information. Let’s improve it!

• Inside the <fileDesc> the <titleStmt> contains only a <title>. What else can <titleStmt> contain? (Hint: typing ’<’ here will provoke oXygen into providing a dropdown list of possibilities.)

• Underneath the <title> add an <author> element. The content of this should be ’Wilfred Owen’.

• Below this add an <editor> element with the content of ’Renée van Baalen’. (She transcribed the letter for our teaching purposes.) How does one type in ’é’ in oXygen? Hint: the ’Edit’ menu contains an ’Insert from Character Map’ entry.

• After this add a <principal> element to record the person primarily responsible for the project. In this case, use your own name.

• Below this add a <meeting> element with the content of ’Digital Humanities at Oxford Summer School 2012’.

• After that add a <respStmt> with a <resp> inside it saying ’Improved encoding’ and a <name> with your name.

• Your <titleStmt> should now look something like:

<titleStmt>
  <title>Letter to Leslie Gunston</title>
  <author>Wilfred Owen</author>
  <editor>Renée van Baalen</editor>
  <principal>[your name here]</principal>
  <meeting>Digital Humanities at Oxford Summer School 2012</meeting>
  <respStmt>
    <resp>Improved encoding</resp>
    <name>[Your name here]</name>
  </respStmt>
</titleStmt>

If you do not understand what any of these elements are for, make sure to look them up on the TEI-C website at the URL given above.

5.4.6 Improving the <publicationStmt>

The <publicationStmt> is also fairly limited. It could contain a lot of structured information, but just has a paragraph of prose. Let’s replace it!

• Delete the entire paragraph including the starting and ending <p> tags.

• Inside <publicationStmt> add a <publisher> element. In this case, ’TEI @ Oxford’ is the publisher.

• Below the <publisher> add a <distributor> containing ’Digital Humanities at Oxford Summer School 2012’.

• After this add an <authority> element, to detail under whose authority it is published. In this case let’s say it is under your authority, so add your name.

• Next, add a <pubPlace> element, which itself contains an <address> element with an <orgName> (Oxford University Computing Services), a <street> address (13 Banbury Road), a <settlement> (Oxford), a <postCode> (OX2 6NN), and a <country> (United Kingdom).

• After the <pubPlace> element but still inside the <publicationStmt> add a <date> element with content of ’3 July 2012’. The <date> element can have a @when attribute to take a standardised YYYY-MM-DD form of the date; add this as well.

• Add an ID number after this using <idno>. This should be something like a catalogue number, or a URL at which this document will reside. In this case, make up what you think a sensible ID number would be for your edition of this letter.

• Next add an <availability> statement with a <p> containing a description of the licence you would want to distribute this under. We recommend you choose a Creative Commons licence using http://creativecommons.org/choose/. For bonus points you can include a link (using <ref> with a @target attribute) to the licence you chose.

• Your <publicationStmt> should now look something like:

<publicationStmt>
  <publisher>TEI @ Oxford</publisher>
  <distributor>Digital Humanities at Oxford Summer School 2012</distributor>
  <authority>[Your name here]</authority>
  <pubPlace>
    <address>
      <orgName>Oxford University Computing Services</orgName>
      <street>13 Banbury Road</street>
      <settlement>Oxford</settlement>
      <postCode>OX2 6NN</postCode>
      <country>United Kingdom</country>
    </address>
  </pubPlace>
  <date when="2012-07-03">3 July 2012</date>
  <idno>[Insert an ID number here]</idno>
  <availability>
    <p>Licensed with a <ref
        target="http://creativecommons.org/licenses/by/3.0/">Creative
        Commons Attribution</ref> licence.</p>
  </availability>
</publicationStmt>

5.4.7 Improving the <sourceDesc>

Our <sourceDesc> is also fairly limited.

• Delete the entire paragraph that is currently in the <sourceDesc> and replace it with a <biblStruct>.

• The <biblStruct> should have an <analytic> with a <title> (Letter to Leslie Gunston), and <author> (Wilfred Owen).

• The <biblStruct> should also have a <monogr> for the collection containing:

– <title> (The Wilfred Owen Collection).

– A <ref> (First World War Poetry Digital Archive) containing a @target attribute pointingto ’http://www.oucs.ox.ac.uk/ww1lit/collections/document/5243’.

– An <imprint> element containing a <publisher> (The First World War Poetry Digital Archive), a <pubPlace> (Oxford), and a <biblScope> (Two pages) with a @type attribute of ’pp’, and a @n attribute of ’2’.

– Outside the <monogr> but inside the <biblStruct> add a <relatedItem> with a <bibl> containing ’The source of this digital resource is a copy from the Harry Ransom Centre.’ You could also wrap ’Harry Ransom Centre’ in a <distributor> element. This is an example of a much less structured bibliographic citation inside a structured one.

– Your <sourceDesc> should now look something like:

<sourceDesc>
  <biblStruct>
    <analytic>
      <title>Letter to Leslie Gunston</title>
      <author>Wilfred Owen</author>
    </analytic>
    <monogr>
      <title>The Wilfred Owen Collection</title>
      <ref
        target="http://www.oucs.ox.ac.uk/ww1lit/collections/document/5243/4769">First World
        War Poetry Digital Archive</ref>
      <imprint>
        <publisher>The First World War Poetry Digital Archive</publisher>
        <pubPlace>Oxford</pubPlace>
        <biblScope type="pp" n="2">Two pages</biblScope>
      </imprint>
    </monogr>
    <relatedItem>
      <bibl>The source of this digital resource is a copy from the
        <distributor>Harry Ransom Centre</distributor>.</bibl>
    </relatedItem>
  </biblStruct>
</sourceDesc>

5.4.8 Other components of the <fileDesc>

There are other elements that could appear in your <fileDesc>.

• Immediately after the closing </titleStmt> tag you could add an <editionStmt> with an <edition> containing a descriptive phrase such as ’First Edition’ for the current edition of the electronic file.

• Immediately after the closing </editionStmt> you could add an <extent> element with some measure of the size of the text (e.g. ’260 words’).

• Immediately after the closing </publicationStmt> you could add a <notesStmt> with one or more <note> elements inside it. One could contain something saying ’Transcribed for DHOXSS TEI Workshop’.
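A sketch of where these optional elements sit, following the order that the TEI content model for <fileDesc> expects: <editionStmt> and <extent> come between <titleStmt> and <publicationStmt>, and <notesStmt> comes between <publicationStmt> and <sourceDesc>. The comments are placeholders for the content you added earlier in this exercise.

```xml
<fileDesc>
  <titleStmt>
    <!-- [...title, author, editor and so on...] -->
  </titleStmt>
  <editionStmt>
    <edition>First Edition</edition>
  </editionStmt>
  <extent>260 words</extent>
  <publicationStmt>
    <!-- [...publisher, distributor and so on...] -->
  </publicationStmt>
  <notesStmt>
    <note>Transcribed for DHOXSS TEI Workshop</note>
  </notesStmt>
  <sourceDesc>
    <!-- [...the biblStruct...] -->
  </sourceDesc>
</fileDesc>
```

If oXygen marks one of these elements in red, the most likely cause is that it has been placed out of this order.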

5.4.9 Adding an <encodingDesc>

An <encodingDesc> element will give us a place to document the encoding practices in the document.

• After the closing </fileDesc> we should add an <encodingDesc> element.

• Inside the <encodingDesc> add a <projectDesc> with a <p> inside it saying something like ’The TEI@Oxford project created teaching materials for DHOXSS’.

• Next inside the <encodingDesc> add an <editorialDecl> with a <correction> inside that with a paragraph saying something like ’Apparent errors have been marked as <sic> but correct readings not provided’. Mark up <sic> as an element by using <gi> (generic identifier).

• Also inside the <editorialDecl> add a <hyphenation> with a paragraph saying something like ’Hyphens have been transcribed as they appear’.

• Look at the other options available to you inside <editorialDecl> and <encodingDesc>.

• Your <encodingDesc> should look something like:

<encodingDesc>
  <projectDesc>
    <p>The TEI@Oxford project created teaching materials for DHOXSS.</p>
  </projectDesc>
  <editorialDecl>
    <correction>
      <p>Apparent errors have been marked as <gi>sic</gi> but correct readings
        not provided.</p>
    </correction>
    <hyphenation>
      <p>Hyphens have been transcribed as they appear.</p>
    </hyphenation>
  </editorialDecl>
</encodingDesc>

5.4.10 Adding a <profileDesc>

A <profileDesc> is a place to store various non-bibliographic information concerning the text.

• After the closing </encodingDesc> add a <profileDesc>.

• Inside this add a <creation> with a <placeName> (Craiglockhart) and a <date> (July 1917) perhaps with a @when attribute (’1917-07’).

• In the <profileDesc> next add a <handNotes> with a <handNote> inside it saying something like ’Written in Wilfred Owen’s hand’.

• Next, add a <langUsage> inside the <profileDesc> with a <language> inside (’English’) with an @ident attribute with a value of ’en’ for the English language code.

• Next add a <textClass> with a <classCode> with content of ’826’ and a @scheme attribute of "http://www.oclc.org/dewey/resources/summaries/default.htm". This is the Dewey classification code for ’English Letters’.

• Your <profileDesc> should now look something like:

<profileDesc>
  <creation>
    <placeName>Craiglockhart</placeName>
    <date when="1917-07">July 1917</date>
  </creation>
  <handNotes>
    <handNote>Written in Wilfred Owen’s hand</handNote>
  </handNotes>
  <langUsage>
    <language ident="en">English</language>
  </langUsage>
  <textClass>
    <classCode
      scheme="http://www.oclc.org/dewey/resources/summaries/default.htm">826</classCode>
  </textClass>
</profileDesc>

5.4.11 Adding a <revisionDesc>

A <revisionDesc> gives you a way to record major stages in revision to a document.

• After the closing </profileDesc> add a <revisionDesc> element.

• Add two <change> elements. On the first one add a @when attribute with today’s date. Inside the <change> add a <persName> containing your name, followed by the text ’improved the header’.

• In the second <change> add a @when attribute of ’2012-02’, with a <persName> of ’Renée van Baalen’ saying that she ’transcribed the Letter to Leslie Gunston document’. You may also wish to mark ’Letter to Leslie Gunston’ as a <title>.

• It is standard practice for the most recent <change> to be first.

• Your <revisionDesc> should now look something like:

<revisionDesc>
  <change when="2012-07-03">
    <persName>[Your name here]</persName> improved the header.</change>
  <change when="2012-02">
    <persName>Renée van Baalen</persName> transcribed the <title>Letter to
      Leslie Gunston</title> document. </change>
</revisionDesc>

5.4.12 Saving Your Work

Let’s save our work:

• Is your work well-formed? Do you have a happy green square or an angry red one?

• Have you formatted and indented your work automatically?

• From the ’File’ menu select ’Save’ or click on the Save icon (looks like an old-style 3.5" disk).

• Or if you prefer use the ’File’ then ’Save As’ menu item to save the file using the name ’exercise03.xml’ or another name of your choice.

5.4.13 Self-Assessment

Check if you understand some of the core principles of this exercise by answering the following questions:

• What kinds of metadata can you store in a <titleStmt>?

• What is a <publicationStmt> used for? What can it contain?

• How do you provide details of the source for the file?

• What is the difference between <bibl> and <biblStruct>?

• What is an <encodingDesc> for?

• In what order should <change> elements be listed in a <revisionDesc>?

5.4.14 Next and More Reading

Next we’ll learn to relate information in the body of the text to aspects of the header. There is lots of information we could have put in our header which we didn’t.

• If you haven’t already, look up the main elements in the <teiHeader> on the TEI-C website and see what they are allowed to contain.

• You could also have a look at the TEI Guidelines chapter on the Header at http://www.tei-c.org/release/doc/tei-p5-doc/en/html/HD.html.

5.5 Exercise 4: Marking Up Names

5.5.1 Learning Outcomes

When you successfully complete this exercise you should be able to:

• encode personal, place, and organizational names

• store metadata concerning people, places, or organizations in the <teiHeader>

• link names in the document text to metadata stored in the header or another file

5.5.2 Summary

This exercise will give you practical experience in marking up names of people, places, and organizations. You’ll learn how to store richly structured metadata about these in the header, and how to link to them from the document.

5.5.3 Starting Up

Load up the completed file from the previous exercise. If you did not complete the exercise you can cheat by loading up ’spoilers/ex03.xml’ and saving it under a new name.

5.5.4 Marking Up Names

In addition to the general purpose <name> element which can take a @type attribute for classification, there are three types of names specifically catered for in the TEI. These are: organizational names (<orgName>), personal names (<persName>), and place names (<placeName>). Occasionally you might want to mark something like ’she’ which is not strictly a name but references an understood named entity. To do this we use a reference string or <rs> element.

• In the first <salute> mark up ’L.’ as a <persName>.

• In the first paragraph encode ’Field Club’ as an <orgName>, and ’Berlitz, Edin.’ as an <orgName> with a <placeName> inside it (’Edin.’).

• In the second paragraph mark up Antaeus, Heracles, Mother Earth, and ’old Herk.’ as <persName> elements.

• In the verse encode ’Earth’ as a <persName> (because it is used anthropomorphically here).

• In the final division mark up ’Locke’s’ and ’Swinburne’ as <persName> elements.

• Inside the <signed> element mark up ’WEO’ as a <persName>.

• There are more names we could mark up, such as the use of the names Leslie Gunston and Wilfred Owen throughout the header, but that is optional.
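The tagging steps above can be sketched as follows. The surrounding text of the letter is elided with [...], so only the name markup is shown, and the final <rs> line uses the generic ’she’ example from the introduction to this section rather than anything taken from the letter.

```xml
<salute>[...] <persName>L.</persName> [...]</salute>

<p>[...] <orgName>Field Club</orgName> [...]
  <!-- A place name nested inside an organizational name -->
  <orgName>Berlitz, <placeName>Edin.</placeName></orgName> [...]</p>

<p>[...] <persName>Antaeus</persName> [...] <persName>Heracles</persName> [...]
  <persName>Mother Earth</persName> [...] <persName>old Herk.</persName> [...]</p>

<signed>[...] <persName>WEO</persName></signed>

<!-- Not strictly a name, but it refers to an understood named entity -->
<rs>she</rs>
```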

5.5.5 Making People

The names we find in documents are merely instances of names, they are not people, places, or organizations. Often we want to store canonical metadata about these many instances in our <teiHeader> and so we use the <person>, <place>, <org> elements as containers for this metadata. We contain these in a <listPerson>, <listPlace>, or <listOrg> commonly (but not always) stored inside the <sourceDesc> of the header.

• Just before the closing </sourceDesc> add a <listPerson> element.

• Add a <person> element with an @xml:id attribute, and at least a <persName> inside as follows:

@xml:id   <persName>                   Other Info
LG        Leslie Gunston
herc      Heracles
earth     Mother Earth
ant       Antaeus
WL        William John Locke           Birth: Cunningsbury St. George, 20th March 1863
                                       Death: Paris, 15th May 1930
AS        Algernon Charles Swinburne   Birth: London, 5th April 1837
                                       Death: London, 10th April 1909
WO        Wilfred Edward Salter Owen   Birth: Oswestry, 18th March 1893
                                       Death: Ors, 4th November 1918

For bonus points, perhaps mark <forename>s and <surname> of real people inside the <persName> and also add a <birth> and <death> element for those with this information. These can have a @when attribute with a YYYY-MM-DD format of the date, and can also themselves contain <placeName> elements.

• Your <listPerson> might look something like:

<listPerson>
  <person xml:id="LG">
    <persName>
      <forename>Leslie</forename>
      <surname>Gunston</surname>
    </persName>
  </person>
  <person xml:id="herc">
    <persName>Heracles</persName>
  </person>
  <person xml:id="ant">
    <persName>Antaeus</persName>
  </person>
  <person xml:id="earth">
    <persName>Mother Earth</persName>
  </person>
  <person xml:id="WL">
    <persName>
      <forename>William</forename>
      <forename>John</forename>
      <surname>Locke</surname>
    </persName>
    <birth when="1863-03-20">
      <placeName ref="#Cun">Cunningsbury St. George</placeName>, 20th March
      1863</birth>
    <death when="1930-05-15">
      <placeName ref="#Par">Paris</placeName>, 15th May 1930</death>
  </person>
  <person xml:id="AS">
    <persName>
      <forename>Algernon</forename>
      <forename>Charles</forename>
      <surname>Swinburne</surname>
    </persName>
    <birth when="1837-04-05">
      <placeName ref="#Lon">London</placeName>, 5th April 1837</birth>
    <death when="1909-04-10">
      <placeName ref="#Lon">London</placeName>, 10th April 1909</death>
  </person>
  <person xml:id="WO">
    <persName>
      <forename>Wilfred</forename>
      <forename>Edward</forename>
      <forename>Salter</forename>
      <surname>Owen</surname>
    </persName>
    <birth when="1893-03-18">
      <placeName ref="#Osw">Oswestry</placeName>, 18th March 1893</birth>
    <death when="1918-11-04">
      <placeName ref="#Ors">Ors</placeName>, 4th November 1918</death>
  </person>
</listPerson>

5.5.6 Building Places

We also refer to some places in our file, so let’s document those as well!

• After the closing </listPerson> create a <listPlace>.

• Add a <place> inside, with an @xml:id of ’edinburgh’, <placeName> of ’Edinburgh’, a <region> of ’Scotland’, and a <country> of ’United Kingdom’.

• Inside this <place> add a nested <place> element with an @xml:id of ’craiglockhart’.

• Inside this <place> add a <placeName> of ’Craiglockhart War Hospital’ and as a sibling to this a <settlement> of ’Edinburgh’.

• Then add a <location> with a <geo> inside which contains the coordinates 55.91812, -3.24019.

• Your <listPlace> might look something like this:

<listPlace>
  <place xml:id="edinburgh">
    <placeName>Edinburgh</placeName>
    <region>Scotland</region>
    <country>United Kingdom</country>
    <place xml:id="craiglockhart">
      <placeName>Craiglockhart War Hospital</placeName>
      <settlement>Edinburgh</settlement>
      <location>
        <geo>55.91812, -3.24019</geo>
      </location>
    </place>
  </place>
</listPlace>

• By nesting the hospital’s location inside the place for Edinburgh, we record that the one place is inside the other through the XML hierarchy. The nested <settlement> of ’Edinburgh’ is technically redundant.

5.5.7 Creating Organizations

The principle is basically the same for creating an <org> inside a <listOrg>:

• After the closing </listPlace> create a <listOrg> with an <org> inside it with an @xml:id of ’Berlitz’.

• Give this <org> an <orgName> of ’Berlitz’.

• Inside this <org> add a <place> element with a <location> containing an <address> with a <street> of ’14 Frederick Street’, a <postCode> of ’EH2 2HB’, a <settlement> of ’Edinburgh’, and a <country> of ’United Kingdom’.

• Your <listOrg> might look something like:

<listOrg>
  <org xml:id="Berlitz">
    <orgName>Berlitz</orgName>
    <place>
      <location>
        <address>
          <street>14 Frederick Street</street>
          <postCode>EH2 2HB</postCode>
          <settlement>Edinburgh</settlement>
          <country>United Kingdom</country>
        </address>
      </location>
    </place>
  </org>
</listOrg>

• Also inside this <listOrg> add an <org> for the ’Field Club’ with an <orgName> and a <note> recording that this is now known as the Edinburgh Natural History Society, and has a website at "http://www.edinburghnaturalhistorysociety.org.uk/".
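The last step has no sample answer, so here is one possible sketch. The @xml:id value ’FieldClub’ and the use of a <ptr> for the website are assumptions made for illustration; writing the address as plain text inside the <note> would work just as well.

```xml
<!-- Goes inside the existing <listOrg>, after the <org> for Berlitz -->
<org xml:id="FieldClub"> <!-- the id value is a hypothetical choice -->
  <orgName>Field Club</orgName>
  <note>Now known as the Edinburgh Natural History Society, which has a website at
    <ptr target="http://www.edinburghnaturalhistorysociety.org.uk/"/>.</note>
</org>
```

Whatever @xml:id you choose here is the value that any <orgName> in the letter will later point to with @ref.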

5.5.8 Linking Names and Metadata

Having marked all these names, and created stored metadata about them, it seems a shame not to link the names to this metadata. So let’s do that!

• Go to the <persName> you put in the first <salute> around ’L.’. Put the cursor immediately after the final ’e’ in the opening <persName> tag and press space. You should get a drop-down list of attributes; select ’ref’. When you do so you should get a drop-down list of @xml:id values present in the entire document. Scroll down and select ’#LG’.

• This <salute> now should look like:

<salute>Dear <persName ref="#LG">L.</persName></salute>

• The value of @ref is a URI, which includes URLs, and in this case a ’fragmentary URL’. It starts with a ’#’ to let us know it is in the same document. You could also have stored the <listPerson> in a separate document, in which case we would put something like ’people.xml#LG’, or stored this online somewhere, as in ’http://www.example.com/people.xml#LG’. While it is best if this points to a TEI <person> element, it can in fact point to anything which documents the name, such as a Wikipedia article. (One reason it is better for this to point to a <person> element is that inside that you could indeed point to more than one external source of information.)

• For each <persName>, <placeName>, and <orgName> (for which you’ve created a <person>, <place> or <org> element) go through and add a @ref attribute pointing to the correct @xml:id.

• The benefit of doing all this work is that, for each instance of a name, a standardised form of it and other metadata are now available when processing the file into other outputs (e.g. to help in searching, or in displaying this information).
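As a rough illustration of the three kinds of @ref value described in this section, the fragments below show a same-document pointer, a pointer into a separate local file, and a pointer to an online copy. The file name ’people.xml’ and the example.com address are placeholders taken from the explanation above, not real resources, and the surrounding text of your letter will differ.

```xml
<!-- Same document: points at the <place> and <org> created earlier in this exercise -->
<placeName ref="#craiglockhart">Craiglockhart War Hospital</placeName>
<orgName ref="#Berlitz">Berlitz</orgName>

<!-- Separate local file (placeholder file name) -->
<persName ref="people.xml#LG">L.</persName>

<!-- Online copy (placeholder address) -->
<persName ref="http://www.example.com/people.xml#LG">L.</persName>
```

In each case the part after the ’#’ must match an @xml:id in the target document exactly, including upper and lower case.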

5.5.9 Referencing Strings

As explained earlier the <rs> element can be used to mark things which aren’t strictly names in themselves but are understood to reference named entities. For example ’I’ and ’you’ in this file refer to Wilfred Owen and Leslie Gunston respectively.

• Depending on how much time you have left, mark as many of the instances of ’I’ and ’you’ as <rs> pointing to the appropriate <person> element in each case.
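A minimal sketch of what this might look like follows. The sentence is invented for illustration and is not taken from the letter; the @type attribute is optional and is added here only to show that you can say what kind of entity is being referenced.

```xml
<!-- Hypothetical sentence: the wording in your letter will differ -->
<p><rs type="person" ref="#WO">I</rs> hope <rs type="person" ref="#LG">you</rs> are well.</p>
```

The @ref values work exactly as they do on <persName>, so ’#WO’ and ’#LG’ resolve to the <person> elements created in the header.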

5.5.10 Saving Your Work

Let’s save our work:

• Is your work well-formed? Do you have a happy green square or an angry red one?

• Have you formatted and indented your work automatically?

• From the ’File’ menu select ’Save’ or click on the Save icon (looks like an old-style 3.5" disk).

• Or if you prefer use the ’File’ then ’Save As’ menu item to save the file using the name ’exercise04.xml’ or another name of your choice.

5.5.11 Self-Assessment

Check if you understand some of the core principles of this exercise by answering the following questions:

• Which elements are used to mark personal, place, and organizational names?

• How do you store metadata in the header about the entities these names refer to?

• What values does the @ref attribute allow? How can this be used to point to external files or URLs?

• How do you mark up strings of text which reference named entities, but aren’t themselves names?

5.5.12 Next and More Reading

Next we’ll be investigating more about the physical document itself. However, before that if you have time you may wish to:

• Look up the reference pages for each of the new elements you’ve used.

• Read some of the chapter on Names, Dates, People, and Places: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ND.html.


5.6 Exercise 5: Creating a Manuscript Description

5.6.1 Learning Outcomes

When you successfully complete this exercise you should be able to:

• Modify a basic manuscript description to provide more structure

• Understand the general categories of manuscript description

• Have more experience editing a complex <teiHeader>

5.6.2 Summary

In this exercise you will add a manuscript description to the file you finished in the previous exercise. You’ll modify an existing <msDesc> element with a basic structure to categorise manuscript description information into a more detailed structure.

5.6.3 Starting Up

Load up the completed file from the previous exercise. If you did not complete the exercise you can cheat by loading up ’spoilers/ex04.xml’ and saving it under a new name (perhaps ’exercise04.xml’).

5.6.4 Inserting a basic <msDesc>

The information for our manuscript description will basically be taken from the document description at http://www.oucs.ox.ac.uk/ww1lit/collections/document/5243. But let’s pretend that we already have a basic manuscript description. There is no requirement with TEI <msDesc> to divide it into all the possible categories of information; all it requires is at least an <msIdentifier>, and other information could be stored in a few accompanying paragraphs. This is useful for the retrospective conversion of catalogues in other legacy formats to TEI XML.

• Move the cursor to immediately after the closing </listOrg> tag. At this point either cut and paste or insert (with ’Document’ -> ’File’ -> ’Insert File’) the file ’msDesc.xml’.

• As you’ll notice, this contains a very basic <msDesc> with a minimal <msIdentifier>.
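For orientation, a basic description of this kind has roughly the shape sketched below. This is not the actual content of ’msDesc.xml’, which is not reproduced in this booklet; the paragraph text is placeholder wording, and the <idno> value is borrowed from later in this exercise.

```xml
<msDesc>
  <msIdentifier>
    <!-- the only required part of the description -->
    <idno type="folio">ff504</idno>
  </msIdentifier>
  <!-- everything else is left as unstructured prose (placeholder text) -->
  <p>Where the manuscript is held and how it is identified.</p>
  <p>What the manuscript contains.</p>
  <p>What the manuscript is physically like.</p>
</msDesc>
```

The rest of the exercise replaces such paragraphs one at a time with the more structured elements that <msDesc> provides.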

5.6.5 Filling out a <msIdentifier>

Let’s expand the <msIdentifier>. As you have a lot more experience editing XML files in oXygen now, the steps will sometimes be given in less detail.

• Notice that the first paragraph mostly contains information that tells us where the manuscript is, in other words it identifies it and so this text could go in a <msIdentifier>.

• Take the information in this paragraph and expand the <msIdentifier> until it looks something like this:

<msIdentifier>
  <country>United States of America</country>
  <region>Texas</region>
  <settlement>Austin</settlement>
  <institution>The University of Texas at Austin</institution>
  <repository>Harry Ransom Centre</repository>
  <collection>Wilfred Owen Collected Letters</collection>
  <idno type="folio">ff504</idno>
  <altIdentifier>
    <idno>Letter no. 535 Ed. ’Wilfred Owen Collected Letters’</idno>
  </altIdentifier>
  <msName>Letter to Leslie Gunston</msName>
</msIdentifier>

• Note how elements are prescribed to appear in a particular order (from the most general level to the most specific). Notice that most elements cannot be repeated (some, like <collection> and <altIdentifier>, can be).

• When you’ve finished creating the <msIdentifier> delete the remains of the first <p> from the basic manuscript description.

5.6.6 Providing some <msContents>

The second paragraph contains information that will be useful in compiling an <msContents>. This acts as a place to store structured information concerning the intellectual contents of a manuscript. It gives a place for a summary of the contents of the manuscript and multiple <msItem> elements form something like a table of contents of works in the document.

• Rename the second paragraph element as <msContents> (your document will now not be valid)

• Highlight the text inside from the start to the end of "Collected Letters’.", press control-e to ’surround with element’ and wrap this in a <summary>. This acts as a summary for the intellectual content

• Highlight the remaining text and surround it with a <msItem> element.

• Delete the ’Authored by’ and surround ’Wilfred Owen (1893-1918).’ with an <author> element.

• Surround ’English.’ with a <textLang> element.

• Add an @mainLang with a value of ’en’ (the ISO language code for ’English’)

• Add a @ref to the <author> and point to your <person> for Wilfred Owen.

• As this <msItem> is recording information for this particular item we also want to give it a <title>. Create an empty <title> element and cut and paste "Letter To Leslie Gunston / The Wrestlers." into it.

• Your <msContents> should now look something like:

<msContents>
  <summary>"Letter To Leslie Gunston / The Wrestlers". Talks about forthcoming address to the ’Field Club’. Includes a partial draft of ’The Wrestlers’. This is letter no. 535 in Ed. ’Wilfred Owen Collected Letters’.</summary>
  <msItem>
    <author>Wilfred Owen (1893-1918).</author>
    <textLang mainLang="en">English.</textLang>
  </msItem>
</msContents>
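The sample above does not show two of the steps listed, namely the @ref on <author> and the <title>. An <msItem> with both applied might look like the sketch below. It assumes that the <person> for Wilfred Owen has the @xml:id ’WO’ used earlier, and it places the <title> after the <author>, which is one possible position rather than a required one.

```xml
<msItem>
  <!-- assumes xml:id="WO" on the <person> for Wilfred Owen -->
  <author ref="#WO">Wilfred Owen (1893-1918).</author>
  <title>Letter To Leslie Gunston / The Wrestlers.</title>
  <textLang mainLang="en">English.</textLang>
</msItem>
```

If your file fails validation after this step, check that the value of @ref matches the @xml:id in your <listPerson> exactly.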

5.6.7 Giving a <physDesc>

The next paragraph has a lot of information about the physical aspects of the manuscript. Let’s turn it into a <physDesc>

• Rename the <p> to be a <physDesc>

• Immediately inside this create an <objectDesc> with a <supportDesc> inside that.

• Inside that <supportDesc> add a <support>, and inside this put the text "A single folio of paper in the collection as ff504 recto and verso"

• You could wrap the element <material> around the word ’paper’, but also you could add a @material attribute to <supportDesc> with a value of ’paper’.

• You could also categorise the object’s form by adding a @form attribute on <objectDesc> with a value of ’folio’.

• After the closing </supportDesc> tag add a <layoutDesc> with a <layout> to record information about the physical layout. In this case "Written full width as a single column, with approximately 20 lines per page"

• To the <layout> element add a @columns attribute of ’1’, and a @writtenLines of ’20’.

• After the closing </objectDesc> add a <handDesc> with a @hands attribute with a value of ’1’.

• Inside the <handDesc> add a <handNote> with the remaining text "Written in Wilfred Owen’s hand in pen.". You might want to mark Wilfred Owen as a <persName> with a @ref pointing back to the <person> for Wilfred Owen.

• Your <physDesc> now might look something like:

<physDesc>
  <objectDesc form="folio">
    <supportDesc material="paper">
      <support>A single folio of <material>paper</material> in the collection as ff504 recto and verso</support>
    </supportDesc>
    <layoutDesc>
      <layout columns="1" writtenLines="20">Written full width as a single column, with approximately 20 lines per page</layout>
    </layoutDesc>
  </objectDesc>
  <handDesc hands="1">
    <handNote>Written in <persName ref="#WO">Wilfred Owen’s</persName> hand in pen.</handNote>
  </handDesc>
</physDesc>

5.6.8 Detailing a <history>

The <history> element gives a place to detail the <origin>, <provenance>, and <acquisition> of the manuscript if available. In this case we have some minimal information about the origin of the manuscript

• Rename the second-last paragraph to a <history> element.

• Select all the text of "This letter was written by Wilfred Owen in July 1917 at Craiglockhart War Hospital." and surround it with an <origin> element.

• Inside this mark ’July 1917’ as an <origDate> element. This is like the <date> element, but is specific to recording the origin date of the manuscript being described. Provide a @when attribute of ’1917-07’.

• Similarly mark the ’Craiglockhart War Hospital’ as an <origPlace> with a @ref of ’#craiglockhart’ to point to the <place> you made earlier. You could also surround the text with an <orgName> if you want to indicate that this is an organizational name. As before you could mark Wilfred Owen’s name.

• Your <history> element should look something like:

<history>
  <origin>This letter was written by <persName ref="#WO">Wilfred Owen</persName> in
    <origDate when="1917-07">July 1917</origDate> at <origPlace ref="#craiglockhart">
      <orgName>Craiglockhart War Hospital</orgName>
    </origPlace>
  </origin>
</history>

5.6.9 Noting <additional> Information

At the end of your <msDesc> you can include an <additional> element which stores other information such as <adminInfo> (for recording administrative events of the object), <listBibl> (for listing bibliographic citations about the object), and <surrogates> (for listing additional representations of the object).

• Change the final paragraph to an <additional> element with a <surrogates> inside that containing all the text.

• Modify the URL given to be a <ptr> with a @target attribute.

• Your <additional> element should look something like:

<additional>
  <surrogates>A digital image is available from the First World War Poetry Digital Archive at
    <ptr target="http://www.oucs.ox.ac.uk/ww1lit/collections/document/5243/4770"/>.
  </surrogates>
</additional>

5.6.10 Saving Your Work

Let’s save our work:

• Is your work well-formed? Do you have a happy green square or an angry red one?

• Have you formatted and indented your work automatically?

• From the ’File’ menu select ’Save’ or click on the Save icon (looks like an old-style 3.5" disk).

• Or if you prefer use the ’File’ then ’Save As’ menu item to save the file using the name ’exercise05.xml’ or another name of your choice.

5.6.11 Self-Assessment

Check if you understand some of the core principles of this exercise by answering the following questions:

• What is the only required aspect of a TEI manuscript description?

• How does one record the separate works of intellectual content present in the manuscript?

• Where does one describe the support which forms the object, or its layout?

• How does one record the origin, provenance, and acquisition of the object?

• Where might you record


5.6.12 Next and More Reading

Next we’ll be looking at more encoding one can add to manuscripts, particularly for transcriptions. However, before that if you have time you may wish to:

• Look up the reference pages for each of the new elements you’ve used.

• Read some of the chapter on Manuscript Description: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/MS.html.


5.7 Exercise 6: Transcribing with the TEI

5.7.1 Learning Outcomes

When you successfully complete this exercise you should be able to:

• Undertake conversion of a transcription of a document

• Use <add>, <del> and <subst> effectively

• Use <choice> with <abbr> and <expan>

• Use <unclear> to note difficult to read passages

5.7.2 Summary

In this exercise you will take a transcription, already partly in TEI but with a bespoke transcription notation, and convert it fully to TEI P5 XML. You’ll learn how to use <add>, <del>, <subst>, <choice>, <abbr>, <expan>, and <unclear> in both simple and nested manners to indicate some complex transcriptional phenomena.

5.7.3 Starting Up

In this case we’re starting with a sample file that we have created for you. Load up the file called ’preface.xml’ in the oXygen XML editor. Check that the file is well-formed and valid. Note that:

• There is a minimal header with an <msDesc>.

• There is a commented-out plain text edition of this preface, edited by J. Stallworthy. (This is just to give you a reading copy of a clean text).

• There is a transcription with a bespoke notation of "(deleted: ’some text’)" by a transcriber to show what was deleted, added, and other aspects. In some cases additions and deletions are nested together as a single act with an extra set of parentheses. In other cases deletions are made inside additions.

• There is an image of the manuscript page in the ’preface-ms.jpg’ file. Have a look at this, the commented out edited version, and the transcription of the manuscript.

5.7.4 Using @rend

Note that the <head> inside the division has a content of:

<head>(underlined:’Preface.’)</head>

• Remove the "(underlined:’" and "’)" so that you are left with just the text.

• Add a @rend attribute with a value of ’underline’.

• You should have a <head> that looks something like:

<head rend="underline">Preface.</head>

5.7.5 Second Paragraph

Let’s change some of this transcription notation to real markup in the second paragraph (the first has no transcription notation).

• Change (added below: ’about glory, honour,’) to be


<add place="below">about glory, honour,</add>

• The next is an example of a substitution, a deletion and addition provided as a single act, followed by a deletion. Change ((deleted:’battles, and glory of battles’) (added above:’deeds or lands’)) (deleted ’or land,’) to be

<subst>
  <del rend="stroked">battles, and glory of battles</del>
  <add place="above">deeds or lands</add>
</subst>
<del rend="stroked">or land,</del>

• Replace ((deleted:’or’) (added below:’or anything about’)) with

<subst>
  <del rend="stroked">or</del>
  <add place="below">or anything about</add>
</subst>

• The transcriber has marked a passage that was unclear in transcription as (unclear, scribbled: ’majesty’). Replace this with

<unclear reason="scribbled">majesty</unclear>

• The transcriber has recorded an abbreviation with its expansion as domin(ion). This is meant to mean that ’domin’ is what it is on the page, but that it should be expanded to ’dominion’. Encode this as:

<choice>
  <abbr>domin</abbr>
  <expan>domin<ex>ion</ex></expan>
</choice>

The <ex> element gives the supplied letters in the expanded form.

• Replace (deleted:’whatever’) with

<del rend="stroked">whatever</del>

• This concludes your second paragraph, and has given you some experience in marking up <add>, <del>, and wrapping those in <subst>, also using <unclear> and <choice> with <abbr>, <expan> and <ex>. Your paragraph should look something like:

<p> Nor is it <add place="below">about glory, honour,</add> about
  <subst>
    <del rend="stroked">battles, and glory of battles</del>
    <add place="above">deeds or lands</add>
  </subst>
  <del rend="stroked">or land,</del>
  <subst>
    <del rend="stroked">or</del>
    <add place="below">or anything about</add>
  </subst> any might, <unclear reason="scribbled">majesty</unclear>,
  <choice>
    <abbr>domin</abbr>
    <expan>domin<ex>ion</ex></expan>
  </choice> or power <lb/>
  <del rend="stroked">whatever</del> except War. </p>

5.7.6 Third Paragraph

The third paragraph introduces nesting additions/deletions inside additions for when someone has added something, and then changed their mind by adding more, and/or deleting the addition.

• Change (added left: ’Above all’) (deleted: ’Its’) ((added above:’I am’) (deleted:’This book’)) to be:

<add place="left">Above all </add>
<del rend="stroked">Its </del>
<subst>
  <add place="above">I am </add>
  <del rend="stroked">This book</del>
</subst>

• Modify (added above:’(deleted: ’(unclear, unsure:’center’)’)’) which contains an unclear bit of text that has been deleted, inside an addition, to be something like:

<add place="above">
  <del rend="stroked">
    <unclear>center</unclear>
  </del>
</add>

• Modify ((added left:’My’) (deleted:"It’s")) (added above:’(deleted:’The’)’) which has a substitution with an addition and a deletion as well as an addition which is then deleted.

<subst>
  <add place="left">My </add>
  <del rend="stroked">It’s </del>
</subst>
<add place="above">
  <del rend="stroked">The </del>
</add>

• Finish up the last couple of deletions in this paragraph and it should look something like:

<p>
  <add place="left">Above all </add>
  <del rend="stroked">Its </del>
  <subst>
    <add place="above">I am </add>
    <del rend="stroked">This book</del>
  </subst> is <add place="above">
    <del rend="stroked">
      <unclear>center</unclear>
    </del>
  </add> not concerned with Poetry. <lb/>
  <subst>
    <add place="left">My </add>
    <del rend="stroked">It’s </del>
  </subst>
  <add place="above">
    <del rend="stroked">The </del>
  </add> subject <del rend="stroked">of</del> is War, and the pity of <del rend="stroked">it</del> War. <lb/> The Poetry is in the pity.
</p>

5.7.7 Fourth Paragraph

The fourth paragraph has more of the same, but also a deletion with a nested addition with both a substitution and standalone deletion.

• Change (deleted:’I have no hesistation in’) (deleted:’making public’) (deleted:’publishing such’) ((deleted:’My’) (added above with caret:’Yet these’)) into:

<del rend="stroked">I have no hesistation in </del>
<del rend="stroked">making public</del><lb/>
<del rend="stroked">publishing such</del><lb/>
<subst>
  <del rend="stroked">My </del>
  <add place="above" rend="caret">Yet these</add>
</subst>

• Change (added above:’to this (added above:’(deleted:’past’)’) generation’) (deleted: ’not further consolation’) into:

<add place="above">to this <add place="above">
    <del rend="stroked">past </del>
  </add> generation </add>
<del rend="stroked">not further consolation</del>

• Change (deleted:’The’) to (deleted:’this’ (added above:’((deleted:’a’) (added above:’this’)) (deleted:’bereaved’)’) generation’) which is a deletion (of ’this generation’) with a nested addition with a substitution and a deletion to:

<del rend="stroked">The</del> to <del rend="stroked">this <add place="above">
    <subst>
      <del rend="stroked">a</del>
      <add place="above">this</add>
    </subst>
    <del rend="stroked">bereaved</del>
  </add> generation</del>.

• (That is quite complicated in terms of additions and substitutions!) Change the next couple of deletions into markup, and then (deleted:’used proper names.’) (added above:’(All a poet can do today is (added above:’(deleted:’to’)’) warn (deleted:’children’)’) into:

<del rend="stroked">used proper names.</del>
<add place="above">All a poet can do today is <add place="above">
    <del rend="stroked">to</del>
  </add> warn <del rend="stroked">children</del>
</add>

• Mark the last deletion ’War’, and then this paragraph should now look something like:

<p>
  <del rend="stroked">I have no hesistation in </del>
  <del rend="stroked">making public</del><lb/>
  <del rend="stroked">publishing such</del><lb/>
  <subst>
    <del rend="stroked">My </del>
    <add place="above" rend="caret">Yet these</add>
  </subst> elegies are <add place="above">to this <add place="above">
      <del rend="stroked">past </del>
    </add> generation </add>
  <del rend="stroked">not further consolation</del><lb/> in no sense consolatory <lb/>
  <del rend="stroked">The</del> to <del rend="stroked">this <add place="above">
      <subst>
        <del rend="stroked">a</del>
        <add place="above">this</add>
      </subst>
      <del rend="stroked">bereaved</del>
    </add> generation</del>. They may be to the <lb/> next. <del rend="stroked">If I thought the letter of this</del><lb/>
  <del rend="stroked">book would last, I now might have </del><lb/>
  <del rend="stroked">used proper names.</del>
  <add place="above">All a poet can do today is <add place="above">
      <del rend="stroked">to</del>
    </add> warn <del rend="stroked">children</del>
  </add><lb/> That is why the true <del rend="stroked">War</del> Poets must be truthful. <lb/>
</p>


5.7.8 Final Paragraph

The final paragraph doesn’t really introduce anything new that you don’t know already, so finish it off quickly! It should end up looking something like:

<p>[If I thought the letter of this book would last, I <lb/>
  <del rend="stroked">wo </del> might have used proper names; but if the spirit of <lb/> it <add place="above">survives</add> - survives Prussia -
  <subst>
    <del rend="stroked">I </del>
    <add>my ambition</add>
  </subst> and those mames will <lb/>
  <del rend="stroked">be</del>
  <del rend="stroked">content</del> have achieved <del rend="stroked">themselves</del><lb/>
  <del>ourselves</del> fresher fields than Flanders, <lb/> for he, not of war, would he <lb/> sing</p>

5.7.9 Transcription Comparison

Compare your transcription to the image in ’preface-ms.jpg’.

• Is there anything you haven’t noted that you think is important?

• Are there any mistakes in the transcription that you should correct?

• Are there things you might have marked up differently?

5.7.10 Saving Your Work

Let’s save our work:

• Is your work well-formed? Do you have a happy green square or an angry red one?

• Have you formatted and indented your work automatically?

• From the ’File’ menu select ’Save’ or click on the Save icon (looks like an old-style 3.5" disk).

• Or if you prefer use the ’File’ then ’Save As’ menu item to save the file using the name ’exercise06.xml’ or another name of your choice.

5.7.11 Self-Assessment

Check if you understand some of the core principles of this exercise by answering the following questions:

• If you want to indicate an abbreviation and expansion (or correction and error) are linked, what element do you wrap them in?

• If you want to indicate an addition and deletion are one editorial act, what do you surround them with?

• How do you show that an addition is subsequently deleted?

5.7.12 Next and More Reading

Next we’ll be moving on to spoken texts and linguistic corpora. However, before that if you have time you may wish to:

• Look up the reference pages for each of the new elements you’ve used.


• What we haven’t covered in this exercise is the genetic encoding, using <sourceDoc> and linking transcriptions to the <facsimile> element. Read some of the chapter on Representation of Primary Sources: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/PH.html.

• At the bottom of that chapter you can find a list of elements added by the ’transcr’ module. It is interesting to note how many of the elements we used appear in the ’core’ module at: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/CO.html.


5.8 Exercise 7: Encoding Spoken Text

5.8.1 Learning Outcomes

When you successfully complete this exercise you should be able to:

• Encode a transcription of spoken text

• Use a <recordingStmt> in the header to document an audio source

• Record participants in a linguistic interaction

• Mark utterances, pauses, and other incidents

• Understand basic use of a <timeline> element

• Use other markup in a spoken text transcription

5.8.2 Summary

This exercise starts with a TEI template and quickly turns a transcription of an audio interview into a full valid TEI P5 file. The interviewer (Stuart Lee) is interviewing Ian Hislop, who had recently done a television programme ’Not Forgotten’ about the impact on British society of the First World War. We will mark up the utterances, pauses, and other aspects of a fragment of this interview.

5.8.3 Starting Up

With oXygen loaded, start a new TEI P5 document by going to ’File’ -> ’New’ -> ’Framework Templates’ -> ’TEI P5’ -> ’All’, and modify the headers as below.

5.8.4 Creating a Better Header

The default template header isn’t very good, so let’s make it better.

• Modify the default <fileDesc> element to have a better <title> and <publicationStmt>. This document will be a transcribed fragment of an interview with Ian Hislop for teaching purposes. Your <fileDesc> should look something like:

<fileDesc>
  <titleStmt>
    <title>Fragment of interview with Ian Hislop for teaching purposes</title>
  </titleStmt>
  <publicationStmt>
    <p>Used for a teaching exercise at DHOXSS TEI Workshop</p>
  </publicationStmt>
</fileDesc>

• Use the <recordingStmt> element inside <sourceDesc> to describe the source of the material as a recording (audio, in this case). This recording is 27 minutes and 9 seconds long, and was made by OUCS on the 7th September 2007. So our <sourceDesc> looks something like:

<sourceDesc>
  <recordingStmt>
    <recording type="audio" dur="PT27M09S">
      <respStmt>
        <resp>Recording by</resp>
        <orgName ref="#OUCS">Oxford University Computing Services</orgName>
      </respStmt>
      <date when="2007-09-07">7th September, 2007</date>
    </recording>
  </recordingStmt>
</sourceDesc>


• After the end of the <fileDesc> element but before the closing </teiHeader> tag, insert a <profileDesc> element, and within that a <particDesc>. This is where we’re going to record the participants in the interview. Inside here create a <listPerson> with two <person> elements. The first <person> is the interviewer, Stuart Lee, and should have an @xml:id attribute value of ‘SL’. Create a <persName> inside the <person> for him. Also inside the <person>, alongside the <persName>, create a <note> with a <ref> inside. The @target attribute’s value should be http://users.ox.ac.uk/~stuart/Site/About_Me.html with the <ref> content being something like ’Stuart Lee’s home page’.

• In the second <person> element, add an @xml:id attribute with a value of ‘IH’. The person being interviewed is Ian Hislop, a well known UK comedian and editor of the satirical Private Eye magazine. Give him a <persName> too, and create a <note> for him with a <ref> that points to his Wikipedia page: http://en.wikipedia.org/wiki/Ian_Hislop. Your <profileDesc> should look something like:

<profileDesc>
  <particDesc>
    <listPerson>
      <person xml:id="SL">
        <persName>Stuart Lee</persName>
        <note>
          <ref target="http://users.ox.ac.uk/~stuart/Site/About_Me.html">Stuart Lee’s home page</ref>
        </note>
      </person>
      <person xml:id="IH">
        <persName>Ian Hislop</persName>
        <note>
          <ref target="http://en.wikipedia.org/wiki/Ian_Hislop">Ian Hislop’s entry in Wikipedia</ref>
        </note>
      </person>
    </listPerson>
  </particDesc>
</profileDesc>

• Your document should be well-formed and valid, and have a happy green square!
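The @dur value on <recording> in this header, PT27M09S, follows the ISO 8601 duration pattern: PT, then the minutes followed by M, then the seconds followed by S. As a minimal sketch only, assuming Python is to hand, such a value could be built from minutes and seconds as below. The helper name iso_duration is hypothetical and is not part of the workshop materials.

```python
def iso_duration(minutes: int, seconds: float) -> str:
    """Format minutes and seconds as an ISO 8601 duration such as PT27M09S."""
    if minutes < 0 or not 0 <= seconds < 60:
        raise ValueError("expected minutes >= 0 and 0 <= seconds < 60")
    if float(seconds).is_integer():
        # Whole seconds are zero-padded to two digits, as in the header above.
        secs = f"{int(seconds):02d}"
    else:
        # Fractional seconds are written without trailing zeros.
        secs = f"{seconds:g}"
    return f"PT{minutes}M{secs}S"


# The recording described in <recordingStmt>: 27 minutes and 9 seconds.
print(iso_duration(27, 9))
```

Building the string in one place makes it harder to mistype the letters P, T, M and S by hand.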

5.8.5 Adding the Transcription and Utterances

• Inside the <body> element, delete any paragraph that is there, and insert the ’hislop.txt’ file by going to the ’Document’ -> ’File’ -> ’Insert File’ menu. (Otherwise you could copy and paste it from Notepad or similar.)

• oXygen will complain that you have a bunch of text just inside <body>. Let’s solve that first by adding some structure.

• Replace the [gap for sampling purposes] at the start and end of the text with a <gap reason="sampling"/>.

• Around each line-break separated utterance (including the speaker name) wrap a <u> element. (Highlight and press ’control-e’; or you could put it all in one <u> element and split it with ’alt-shift-D’ in front of each one.)

• Your first utterance should look something like:


<u>Lee (24.27-24.36):
So em d-em having read [clicking sound: 0.17s] the Wipers Times
now and and your [pause: 0.62s] view that
th th the thirties was that that
prisonment you say</u>

• Your document should be well-formed and valid now, with a happy green square. Sort out any problems before progressing further.

5.8.6 Making Better Utterances

While our markup might be well-formed and valid, it is a long way from the truth.

• The first person to speak didn’t really say ‘Lee (24.27-24.36):’.

• These are artifacts left by the transcriber to give a time stamp and indicate which person was speaking.

• Go through and comment out each of these lines by highlighting it and pressing ’control-shift-comma’ (or selecting ‘Toggle Comment’ from the right-click menu).

• Let’s use this information now to add some extra metadata to each of the utterances.

• For each utterance add a @who attribute with a value of # followed by the corresponding @xml:id value of the person you are pointing to.

• So your first utterance should look something like:

<u who="#SL">
  <!-- Lee (24.27-24.36): -->
  So em d-em having read [clicking sound: 0.17s] the Wipers Times
  now and and your [pause: 0.62s] view that
  th th the thirties was that that
  prisonment you say
</u>

• Repeat this for all the utterances using ’#IH’ and ’#SL’ where appropriate.

• Note: we are not going to use the timestamps in this exercise, but do not delete them. Having XML comments in your file doesn’t cost you anything, but they can be deleted by other applications processing the files. So they are a good place to temporarily store information (such as where you are having a problem with some encoding, or how far you have got through a file).
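A schema can check that @who is present, but not necessarily that each ’#SL’ or ’#IH’ points at a <person> that exists. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module, of how that could be checked. The sample document is a cut-down skeleton (not a valid TEI file), and the function name unresolved_speakers and the deliberately wrong pointer ’#XX’ are made up for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Cut-down skeleton for illustration only; a real file needs a full header.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <profileDesc>
      <particDesc>
        <listPerson>
          <person xml:id="SL"><persName>Stuart Lee</persName></person>
          <person xml:id="IH"><persName>Ian Hislop</persName></person>
        </listPerson>
      </particDesc>
    </profileDesc>
  </teiHeader>
  <text>
    <body>
      <u who="#SL">So em d-em having read</u>
      <u who="#IH">Yeah.</u>
      <u who="#XX">hmm</u>
    </body>
  </text>
</TEI>"""


def unresolved_speakers(tei_xml: str) -> list[str]:
    """Return the @who values on <u> that match no <person> xml:id."""
    root = ET.fromstring(tei_xml)
    known = {p.get(XML_ID) for p in root.iter(f"{{{TEI_NS}}}person")}
    missing = []
    for u in root.iter(f"{{{TEI_NS}}}u"):
        who = u.get("who", "")
        # A pointer is '#' plus an xml:id; anything else counts as unresolved.
        if not who.startswith("#") or who[1:] not in known:
            missing.append(who)
    return missing


print(unresolved_speakers(SAMPLE))
```

An empty list would mean every utterance points at a declared participant.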

5.8.7 Incidents and Pauses (and Regular Expressions)

Our audio transcriber has rigorously recorded the number of seconds of a clicking sound that they heard and the pauses that speakers made in speaking. We want to turn these non-spoken notes into markup. If you want to listen to the audio clip it is provided as ’hislop.mp3’. You probably don’t have time to listen to the whole interview but it is provided as ’ian_hislop.mp3’. (The transcriber may have mis-heard some words here and there.)

• In the first utterance replace “[clicking sound: 0.17s]” with:


<incident dur="PT0.17S">
  <desc>clicking sound</desc>
</incident>

and “[pause: 0.62s]” with

<pause dur="PT0.62S"/>

• We want to do this throughout the document, but doing it by hand is hard work. Regular expressions might make it a bit quicker (if you cannot get them to work, you can always do it manually). Press ’control-f’ (or ’Find’ -> ’Find/Replace’ in the menus) to bring up the search and replace dialog window.

• Make sure the ‘Regular expression’ option is ticked. We are going to use regular expressions (sometimes called ’regex’ or ’regexes’) to make the search and replacing more powerful.

• In this case put \[clicking sound: ([0-9.]*)s\] into the ‘Text to find:’ box. This means that we’re looking for a literal square bracket, followed by the text ‘clicking sound: ’, then any combination of digits and ‘.’, followed by an s and a closing bracket. We have to escape the brackets because they are used as part of the regular expression language.

• In the Replace with: section put:

<incident dur="PT\1S">
  <desc>clicking sound</desc>
</incident>

which has us insert the incident element and adds the string of text that we found in parentheses in the search (represented by \1) to the duration value we’re adding to the replacement.

• Click ‘Find’ a couple of times to make sure this is finding the recorded clicking sounds. If so, click ‘Replace All’. Check that this has done what you want.

• Do the same with the pauses that have been recorded by using \[pause: ([0-9.]*)s\] as the text to search for and

<pause dur="PT\1S"/>

as the text to replace it with.

• When regular expressions are used carefully, they can make text replacement and markup a quicker job. What do you think are the dangers of using regular expressions?
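The same two search-and-replace steps can be tried outside oXygen. This is a minimal sketch, assuming Python’s standard re module, which happens to use the same \1 notation for the captured group. The function name mark_up_notes is made up for illustration, and the replacement puts each element on a single line.

```python
import re

# The two patterns from the exercise, unchanged.
INCIDENT = re.compile(r"\[clicking sound: ([0-9.]*)s\]")
PAUSE = re.compile(r"\[pause: ([0-9.]*)s\]")


def mark_up_notes(text: str) -> str:
    """Replace the transcriber's bracketed notes with <incident> and <pause>."""
    text = INCIDENT.sub(
        r'<incident dur="PT\1S"><desc>clicking sound</desc></incident>', text
    )
    # \1 is whatever the parentheses captured, e.g. 0.62
    return PAUSE.sub(r'<pause dur="PT\1S"/>', text)


line = "having read [clicking sound: 0.17s] the Wipers Times now and and your [pause: 0.62s] view"
print(mark_up_notes(line))

# Because * also matches nothing at all, a malformed note still matches:
print(mark_up_notes("[pause: s]"))
```

The second call shows one thing to watch for: the pattern accepts an empty number, so a note with no duration becomes dur="PTS", which is not a usable duration.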

5.8.8 Encoding a Timeline

Having a timeline is optional, but gives you a way to relate one point in a spoken text to another. In our case it is a bit artificial so you can decide whether to encode it or not.

• Inside the <body> before the first <gap/> element put the following <timeline> construction. We could put this in the header, or a number of places, but this is as good a place as any.


<timeline unit="min" xml:id="recordingStart">
  <when xml:id="fragmentStart" interval="24.27"/>
  <when xml:id="TS2" interval="24.36"/>
  <when xml:id="TS3" interval="24.42"/>
  <when xml:id="TS4" interval="25.13"/>
  <when xml:id="TS5" interval="26.35"/>
  <when xml:id="TS6" interval="26.36"/>
  <when xml:id="TS7" interval="27.09"/>
</timeline>

• What this does is set up a timeline, with units in minutes and then has a series of <when> elements with @xml:id attributes that we can then point at from the body of our transcription. The fragment we are using is 24.27 minutes through the interview. The next speaker says something at 24.36, etc. These correspond to the times in the comments inside your <u> elements.

• This means that we can add a @start and @end (where desirable) to the <u> elements to point to these <when> elements.

• Our first utterance opening tag now looks like:

<u who="#SL" start="#fragmentStart" end="#TS2"> So em d-em having read...</u>

• For our second we’ve only indicated the start:

<u who="#IH" start="#TS2">
  <!--Hislop (24.36):-->
  Yeah.
</u>

• The third:

<u who="#SL" start="#TS2" end="#TS3">
  <!--Lee (24.36-24.42)-->
  ...
</u>

• Fourth:

<u who="#IH" start="#TS3" end="#TS4">
  <!--Hislop (24.42-25.13):-->
  <pause dur="PT0.76S"/> Not really I mean
  ...
</u>

• Fifth:


<u who="#SL" start="#TS4">
  <!--Lee (25.13):-->
  Yes.
</u>

• Sixth:

<u who="#IH" start="#TS4" end="#TS5">
  <!--Hislop (25.13-26.35)-->
  um <pause dur="PT0.50S"/> which I saw again and...
</u>

• Seventh:

<u who="#SL" start="#TS5">
  <!--Lee (26.35)-->
  hmm
</u>

• Eighth:

<u who="#IH" start="#TS6" end="#TS7">
  <!--Hislop (26.36-27.09)-->
  I think it’s difficult to read the history of the century...
</u>

• If you have done all that your document should be well-formed and valid with a happy green square. If it isn’t, find the problem!
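The <when> values reuse the transcriber’s stamps as they stand. The header gives the recording as 27 minutes and 9 seconds long and the last stamp is 27.09, so the stamps look like minutes.seconds and not decimal minutes. As a minimal sketch under that assumption (the helper name stamp_to_seconds is made up), the stamps could be converted to seconds in Python to check their order and the length of the fragment:

```python
def stamp_to_seconds(stamp: str) -> int:
    """Convert a stamp such as '24.27' (assumed minutes.seconds) to seconds."""
    minutes, seconds = stamp.split(".")
    if len(seconds) != 2 or int(seconds) >= 60:
        raise ValueError(f"not a minutes.seconds stamp: {stamp!r}")
    return int(minutes) * 60 + int(seconds)


# The @interval values from the <timeline> in this exercise, in order.
STAMPS = ["24.27", "24.36", "24.42", "25.13", "26.35", "26.36", "27.09"]

seconds = [stamp_to_seconds(s) for s in STAMPS]

# A timeline should only ever move forwards.
assert seconds == sorted(seconds)

fragment_length = seconds[-1] - seconds[0]
print(fragment_length, "seconds from fragmentStart to TS7")
```

If the stamps were really decimal minutes the arithmetic would differ, so this is only one possible reading of the transcriber’s notation.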

5.8.9 Other Things to Encode

There are some other things we could encode, like <title>s, <note>s, and <persName>s.

• There are three titles mentioned: ’Wipers Times’, ’The Gassed’, and ’Voices’. Mark them up as titles, removing any quotation marks that indicated they were titles if they exist.

• There are two notes recorded by the transcriber. One has ’John Singer Sargent, 1918’ in it (remove the square brackets before marking it as a <note>). The other is ’syllables -damental while laughing’; mark this as a note as well. This last one could also have been marked using the <shift> element, but let’s leave it as a note for simplicity.

• There are a number of personal names; mark these as <persName>.


5.8.10 Saving Your Work

Let’s save our work:

• Is your work well-formed? Do you have a happy green square or an angry red one?

• Have you formatted and indented your work automatically?

• From the ’File’ menu select ’Save’ or click on the Save icon (looks like an old-style 3.5" disk).

• Or if you prefer use the ’File’ then ’Save As’ menu item to save the file using the name ’exercise07.xml’ or another name of your choice.

5.8.11 Self-Assessment

Check if you understand some of the core principles of this exercise by answering the following questions:

• What element in the <teiHeader> is used to document the details of a recording that acts as a source?

• How do you mark utterances in spoken texts?

• What useful attributes can this element have?

• How do you indicate pauses or other incidents?

• What does a <timeline> look like?

• How do you mark titles that someone has said?

5.8.12 Next and More Reading

Next we’ll be looking at some linguistic markup. However, before that, if you have time you may wish to:

• Look up the reference pages for each of the new elements you’ve used.

• You might want to read the chapter on ’Transcriptions of Speech’ at http://www.tei-c.org/release/doc/tei-p5-doc/en/html/TS.html.


5.9 Exercise 8: Linguistic Markup

5.9.1 Learning Outcomes

When you successfully complete this exercise you should be able to:

• Understand how to use a <taxonomy> in the header for hierarchical classifications

• Know how to mark up words, and their parts-of-speech

• Associate XSLT stylesheets with an XML file

5.9.2 Summary

We’re going to mark up the parts of speech for individual words in the transcription of spoken text we encoded in the previous exercise. To do this we’ll first put in a <taxonomy> element of standard linguistic parts of speech to refer to. We’ll then tag individual words, and use an attribute to refer back to this taxonomy. We’ll then realise how hard this is to do manually, and we’ll find a way to cheat! Finally we’ll transform our XML file, not only into a standard web page displaying the transcribed text, but also into a page grouping together the words we’ve marked up.

5.9.3 Starting Up

In oXygen, load up the file you created in the previous exercise. If you didn’t finish that exercise, you can cheat by loading up ’spoilers/ex07.xml’.

5.9.4 Inserting a Taxonomy

In order to have something to refer back to we’re going to insert a <taxonomy> element into our file.

• Immediately after the closing </profileDesc> tag, add an <encodingDesc> with a <classDecl> inside it.

• Making sure the cursor is in between the starting and ending <classDecl> tags, insert the ’taxonomy.xml’ file (’Document’ -> ’File’ -> ’Insert File’).

• This should add a large taxonomy of linguistic categories, each with their own @xml:id and description in a <catDesc> element.

• Your <encodingDesc> should look something like:

<encodingDesc>
  <classDecl>
    <taxonomy>
      <category xml:id="adje">
        <catDesc>adjectives</catDesc>
        <category xml:id="AJ0">
          <catDesc>adjective (unmarked) (e.g. GOOD, OLD)</catDesc>
        </category>
        <category xml:id="AJC">
          <catDesc>comparative adjective (e.g. BETTER, OLDER)</catDesc>
        </category>
        <category xml:id="AJS">
          <catDesc>superlative adjective (e.g. BEST, OLDEST)</catDesc>
        </category>
        <category xml:id="AT0">
          <catDesc>article (e.g. THE, A, AN)</catDesc>
        </category>
      </category>
      ...
    </taxonomy>
  </classDecl>
</encodingDesc>
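Because <category> elements nest, each @xml:id sits somewhere in a small tree. The sketch below, assuming Python’s standard xml.etree.ElementTree module, walks a two-level excerpt of the taxonomy and records where each identifier sits. The excerpt leaves out the TEI namespace for brevity, and the function name category_paths is made up for illustration.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# A short excerpt of the taxonomy shown above, without the TEI namespace.
TAXONOMY = """<taxonomy>
  <category xml:id="adje">
    <catDesc>adjectives</catDesc>
    <category xml:id="AJ0"><catDesc>adjective (unmarked) (e.g. GOOD, OLD)</catDesc></category>
    <category xml:id="AJC"><catDesc>comparative adjective (e.g. BETTER, OLDER)</catDesc></category>
  </category>
</taxonomy>"""


def category_paths(taxonomy_xml: str) -> dict[str, str]:
    """Map each category xml:id to a slash-separated path of its ancestors' ids."""
    paths: dict[str, str] = {}

    def walk(element: ET.Element, prefix: str) -> None:
        # findall() only returns direct children, so nesting is followed level by level.
        for child in element.findall("category"):
            ident = child.get(XML_ID)
            path = f"{prefix}/{ident}" if prefix else ident
            paths[ident] = path
            walk(child, path)

    walk(ET.fromstring(taxonomy_xml), "")
    return paths


for ident, path in category_paths(TAXONOMY).items():
    print(ident, "->", path)
```

A flat list of identifiers like this is what the @ana pointers in the next section refer to, whatever the depth of nesting.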


5.9.5 Marking Up Part of Speech

Let’s mark up some words!

• We need to wrap a <w> element around each of the words in the first three utterances. Highlight the first one, and use ’control-e’ to surround it with this tag. Then highlight the second one, and press ’control-/’ to surround it with the tag you just used. Do that until ’changed over time?’

• Each word, though, needs to get an @ana attribute added to it. Either do this manually or think how you might search for <w> and replace it with <w ana="">.

• Now the problem is that each of those @ana attributes needs a value! It needs to be ’#’ followed by one of the @xml:id values in our <taxonomy>. We know that you might not automatically know what category each word is, so we’ve listed what our first three utterances look like below:

<u who="#SL" start="#fragmentStart" end="#TS2">
  <!--Lee (24.27-24.36):-->
  <w ana="#AV0">So</w>
  <w ana="#UNC">em</w>
  <w ana="#UNC">d-em</w>
  <w ana="#VHG">having</w>
  <w ana="#VVN">read</w>
  <incident dur="PT0.17S">
    <desc>clicking sound</desc>
  </incident>
  <w ana="#AT0">the</w>
  <title>
    <w ana="#NN2">Wipers</w>
    <w ana="#NN2">Times</w>
  </title>
  <w ana="#AV0">now</w>
  <w ana="#CJC">and</w>
  <w ana="#CJC">and</w>
  <w ana="#DPS">your</w>
  <pause dur="PT0.62S"/>
  <w ana="#NN1">view</w>
  <w ana="#CJT">that</w>
  <w ana="#NN0">th</w>
  <w ana="#NN0">th</w>
  <w ana="#AT0">the</w>
  <w ana="#CRD">thirties</w>
  <w ana="#WBD">was</w>
  <w ana="#CJT">that</w>
  <w ana="#CJT">that</w>
  <w ana="#NN1">prisonment</w>
  <w ana="#PNP">you</w>
  <w ana="#VVB">say</w>
</u>
<u who="#IH" start="#TS2">
  <!--Hislop (24.36):-->
  <w ana="#ITJ">Yeah.</w>
</u>
<u who="#SL" start="#TS2" end="#TS3">
  <!--Lee (24.36-24.42)-->
  <incident dur="PT1.28S">
    <desc>clicking sound</desc>
  </incident>
  <w ana="#VVG">looking</w>
  <w ana="#AVP">back</w>
  <w ana="#PRP">on</w>
  <w ana="#AT0">a</w>


  <w ana="#AJ0">failed</w>
  <w ana="#NN1">piece</w>
  <pause dur="PT0.35S"/>
  <w ana="#VHZ">has</w>
  <w ana="#DPS">your</w>
  <w ana="#NN1">attitude</w>
  <w ana="#PRP">to</w>
  <w ana="#AT0">the</w>
  <w ana="#PRP">to</w>
  <w ana="#AT0">the</w>
  <w ana="#NN1">War</w>
  <w ana="#NN2">Poets</w>
  <persName>
    <w ana="#NP0">Wilfred</w>
    <w ana="#NP0">Owen</w>
  </persName>
  <persName>
    <w ana="#NP0">Sassoon</w>
  </persName>
  <pause dur="PT0.30S"/>
  <w ana="#VVD">changed</w>
  <w ana="#PRP">over</w>
  <w ana="#NN1">time?</w>
</u>

5.9.6 How to Cheat

That was an awful lot of work. In fact, some of those entries might be wrong. Why is that? Well, it is because we fed a plain text version of the transcription to an automatic part-of-speech tagger for English at http://ucrel.lancs.ac.uk/claws/. This has some limitations, but makes good guesses. Go and skim-read the web page about it.

• We cheated in determining which parts of speech these words were, so we can hardly stop you cheating if you don’t want to manually mark up the rest of the spoken text in this file!

• We’d highly recommend, instead, that you save your current file (perhaps calling it ’exercise08.xml’?), and open up ’spoilers/ex08.xml’ which has a finished version of the file! (Just to save time, you understand.)

• In a real-world situation you probably wouldn’t manually tag a corpus like this in any case. You would run scripts over it (as we did) in order to automatically process it and convert the output of a part-of-speech tagger.
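To give a flavour of what such a script might do, here is a minimal sketch in Python. It is not the script used to prepare this exercise. It assumes, purely for illustration, that the tagger’s output has been saved as plain text with each token written as word_TAG and separated by spaces; the function name tagged_to_w is made up.

```python
from xml.sax.saxutils import escape


def tagged_to_w(tagged: str) -> str:
    """Turn 'word_TAG word_TAG ...' into one <w> element per token."""
    elements = []
    for token in tagged.split():
        # Split on the last underscore, so a word containing '_' keeps it.
        word, _, tag = token.rpartition("_")
        if not word or not tag:
            raise ValueError(f"expected word_TAG, got {token!r}")
        # escape() protects &, < and > so the result stays well-formed.
        elements.append(f'<w ana="#{tag}">{escape(word)}</w>')
    return "\n".join(elements)


print(tagged_to_w("having_VHG read_VVN the_AT0"))
```

The output still needs a human eye: the script copies the tagger’s guesses faithfully, including any wrong ones.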

5.9.7 Transforming Your File

But what can we do with this markup now that we have ... erm... added it? (OK, loaded it by opening ’spoilers/ex08.xml’.)

• Let’s transform this file with an XSLT stylesheet we have prepared! XSLT is a transformation language for XML which allows us to turn our XML files into other things (such as other XML, HTML, DOCX, PDF, TXT, etc.) and control what happens to them.

• In order to relate the XML file to a stylesheet we have to associate the two. Go to the ’Document’ -> ’XML Document’ -> ’Associate XSLT/CSS Stylesheet’ menu.

• Click on the ’XSLT’ tab, and click the folder icon to browse for a file.

• Choose ’spoilers/parts-of-speech.xsl’ as the XSLT file to use.


• You should notice that oXygen adds a new line to the top of your file that looks something like:

<?xml-stylesheet type="text/xsl" href="parts-of-speech.xsl"?>

• This allows the XML document to know what stylesheet it can use to transform the document.

• Select from the ’Document’ -> ’Transformation’ menu, ’Configure Transformation Scenario’.

• On the window that appears select ’XML Stylesheet Processing Instruction’, and then click ’Transform Now’.

• If everything has worked perfectly (sometimes settings change across versions of oXygen), then your web browser should open a web page containing the text of this interview. It should have a table-of-contents which allows you to see two different versions of the text. One as you might expect, the other with words grouped by part of speech. (If for any reason it does not open, simply open up ’spoilers/parts-of-speech.html’ in a web browser as a demonstration.)

• Have a look at both of these. Hover the mouse over the words in both cases and note the extra information you should get in a tooltip.
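The second view, with words grouped by part of speech, is the kind of thing the <w> markup makes easy. The prepared stylesheet is written in XSLT; purely as an illustration of the grouping idea, and not as a description of how that stylesheet works, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The sample is a small made-up excerpt and the function name words_by_pos is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI = "{http://www.tei-c.org/ns/1.0}"

# A small made-up excerpt in the TEI namespace, for illustration only.
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <u who="#SL">
    <w ana="#AT0">the</w>
    <title><w ana="#NN2">Wipers</w><w ana="#NN2">Times</w></title>
    <w ana="#AV0">now</w>
    <w ana="#AT0">a</w>
  </u>
</body>"""


def words_by_pos(xml_text: str) -> dict[str, list[str]]:
    """Group the text of every <w> by the category its @ana points to."""
    groups: dict[str, list[str]] = defaultdict(list)
    # iter() visits <w> elements at any depth, in document order.
    for w in ET.fromstring(xml_text).iter(f"{TEI}w"):
        category = w.get("ana", "").lstrip("#")
        groups[category].append(w.text or "")
    return dict(groups)


for category, words in words_by_pos(SAMPLE).items():
    print(category, words)
```

Words inside <title> are found as well, because the search does not care how deeply a <w> is nested.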

5.9.8 Saving Your Work

You don’t really have to save this exercise (though feel free to if you want) since we opened up ’spoilers/ex08.xml’.

5.9.9 Self-Assessment

Check if you understand some of the core principles of this exercise by answering the following questions:

• Where does a <taxonomy> go in the header?

• Can <category> elements nest inside each other?

• What element is used to mark words?

• How do you mark the part-of-speech of a word?

• How do you associate an XSLT stylesheet with an XML file?

5.9.10 Next and More Reading

Next we’ll move on to learning how to customise the TEI for your own purposes. However, before that, if you have time you may wish to:

• Look up the reference pages for each of the new elements you’ve used.

• Read more about linguistic markup in the TEI chapter on ’Simple Analytic Mechanisms’: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/AI.html.

• You may also be interested in the TEI chapter on ’Linking, Segmentation and Alignment’: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/SA.html.


5.10 Exercise 9: Customise the TEI with Roma

5.10.1 Learning Outcomes

When you successfully complete this exercise you should be able to:

• Analyse the TEI elements, attributes and values you need for your TEI XML document

• Tailor a TEI schema to your TEI XML file in Roma

• Use a different schema in oXygen

• Generate human-readable specifications of your TEI schema in Roma

• Set the value of existing attributes

• Be aware of the underlying TEI ODD XML format

5.10.2 Summary

In this exercise we will customise the TEI to remove those elements we do not think we’ll use. In order to customise a TEI schema you need to know which elements you want to use, and which you don’t, which sometimes involves a lengthy document analysis process. In our case we’ll shortcut that by telling you what to include or not include. You will learn to create a new schema, and download and use it in oXygen. You’ll learn how to constrain the acceptable values for an attribute, and require its presence. You’ll have a look at the underlying TEI ODD XML format which enables this customisation.

5.10.3 Starting Up

Load up the file ’spoilers/ex06.xml’ in oXygen and save it under a new name. Open up a web browser and go to http://www.tei-c.org/Roma/. (There is also a development version of this at http://tei.oucs.ox.ac.uk/Roma/.)

5.10.4 Your Current Schema

oXygen already knows about the TEI: it comes bundled with an open source TEI Framework (oxygen-tei) that helps it understand how TEI files are meant to work.

• In oXygen with ’spoilers/ex06.xml’ (or whatever you saved it as) loaded, move the cursor to just inside a paragraph after the opening <p>.

• If you type a ’<’ at this point, as you know, oXygen will give you a dropdown list of all the elements allowed inside a <p>.

• Scroll down the list of elements, referring to the pop-up tooltip if you want to know what the elements are for. Notice such elements as <address>, <camera>, <incident>, <metamark>, and <notatedMusic>.

• Hit escape to leave the dropdown menu and delete the ’<’ that you had added.

• You certainly have a lot of choices for elements you can add here! But in any project it is unlikely that you are going to want all those choices. Also, increased choice of what elements to add can lead to greater human error and inconsistency, and we don’t want that!

5.10.5 Roma: Starting a New Schema

Roma enables you to customise the TEI schema and remove those bits you are not going to use.

• Go to http://www.tei-c.org/Roma/ in your browser and note that you are given four options from which to start:

1. Build up: this allows you to create a new customisation by adding elements and modules to the smallest recommended schema


2. Reduce: this allows you to create a new customisation by removing elements and modules from the full tei_all (largest) schema

3. Template: this allows you to create a customisation from a template provided by the TEI as a starting point

4. Open: this allows you to open an existing customisation that you have saved previously.

• In our case, let’s start by choosing ’reduce’, and clicking ’start’.

• Set your parameters by changing the following things:

– Title: ’TEI with maximal setup’ is kind of boring, so why not call it something like ’My special TEI customisation’.

– Filename: change ’tei_all’ for something like ’myTEI’ (don’t include spaces).

– Author name: You aren’t Sebastian! Change this to your name!

– You can leave the description as it is for now.

• Click ’Save’ at the bottom of the page. Notice how the box in the upper right tells you which customisation you are working on.

5.10.6 Adding and Deleting Modules

Modules are groupings of TEI elements for structural or semantic reasons. For example there is a ’dictionary’ module which contains most of the elements needed for writing dictionaries. If you aren’t writing a dictionary, you probably don’t need that module. Below is a list of all the TEI modules:

Table 3: List of TEI Modules

analysis        Simple analytic mechanisms
certainty       Certainty and uncertainty
core            Elements common to all TEI documents
corpus          Corpus texts
dictionaries    Dictionaries
drama           Performance texts
figures         Tables, formulæ, notated music, and figures
gaiji           Character and glyph documentation
header          The TEI Header
iso-fs          Feature structures
linking         Linking, segmentation and alignment
msdescription   Manuscript Description
namesdates      Names and dates
nets            Graphs, networks, and trees
spoken          Transcribed Speech
tagdocs         Documentation of TEI modules
textcrit        Critical Apparatus
textstructure   Default text structure
transcr         Transcription of primary sources
verse           Verse structures

• Click on the ’Modules’ tab to go to the page that allows you to add/delete modules from your schema.

• Notice that because we’ve started with a ’maximal’ schema, the list of selected modules on the right is completely the same as the list of TEI modules on the left.


• Click ’remove’ next to ’analysis’ on the right-hand side. Note that it vanishes from this list, but remains on the left-hand side where you could add it back if you wanted it.

• Remove ’analysis’, ’certainty’, ’corpus’, ’dictionaries’, ’drama’, ’figures’, ’gaiji’, ’iso-fs’, ’linking’, ’nets’, ’spoken’, ’textcrit’, ’verse’, and ’tagdocs’!

• Well! Having removed that many, maybe we should have started by building up instead of reducing down? You should be left with: ’tei’ (you can’t remove this one in Roma), ’core’, ’header’, ’msdescription’, ’namesdates’, ’textstructure’, and ’transcr’. Why do you think we have left these modules?

5.10.7 Including or Excluding Elements

We have shrunk down the TEI to just a few modules, but those modules contain elements that we don’t want.

• Click on ’core’ (note: not ’remove’ but the word ’core’) on the right-hand side. This should take you to a page listing all of the elements in the ’core’ module.

• Each row of this table has:

– the element

– whether it is Included or Excluded

– the name being used for the element

– a question mark linking to the reference page for this element

– a description of the element

– a link to change its attributes

• It is possible to Include or Exclude all the elements by clicking this word in the table header.

• From ’core’ exclude the following elements: ’addrLine’, ’address’, ’analytic’, ’biblStruct’, ’binaryObject’, ’distinct’, ’divGen’, ’gb’, ’headItem’, ’headLabel’, ’imprint’, ’index’, ’listBibl’, ’measure’, ’measureGrp’, ’meeting’, ’mentioned’, ’monogr’, ’postBox’, ’postCode’, ’relatedItem’, ’rs’, ’said’, ’series’, ’sp’, ’speaker’, ’stage’, ’street’, ’teiCorpus’, ’term’, ’textLang’, and ’time’.

• Wow! That’s a lot fewer elements in your TEI schema. Remember to click ’Save’ at the bottom of the page!

• We could go through each of the other modules removing elements from there, but you get the idea. In a real life situation you would work through carefully, only including elements that you really needed. The tighter your schema, the more consistent your data!

5.10.8 Saving Your Schema

• If you click on the ’Schema’ tab you will see a drop down menu listing various schema formats to generate. The TEI uses a meta-schema format of its own called ODD which allows it to generate these different formats.

• Generate a schema either in Relax NG Compact Syntax, or Relax NG XML Syntax. These really are the best choice.

• When you click generate your browser should automatically download the schema file. Find wherever it has saved it, and move it (move, not copy) to the place you have saved the ’ex06.xml’ (or whatever you saved it as) file. They should be in the same directory.

• Do not close down your browser window or you’ll have to do that all again.


5.10.9 Associating Your Schema in oXygen

oXygen has been using the tei_all schema by default because it recognises (from the TEI element in the TEI namespace) what kind of files we have been creating.

• Go to oXygen and the file you have previously loaded (’ex06.xml’ or whatever you saved it as).

• With this file open go to the ’Document’ -> ’Schema’ menu and note the icon next to ’Associate Schema’. This icon should also be on your oXygen toolbar. Click either the icon, or ’Associate Schema’.

• Click on the little folder icon next to ’URL’ in order to ’Browse for local file’. Find the schema file you saved earlier, select it, and then click ’OK’ when back in the oXygen dialog box.

• When you click ’OK’ then oXygen should add a line that looks something like this:

<?xml-model href="myTEI.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>

at the top of your file.
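
For orientation, the top of your file might then look something like the sketch below. The name ’myTEI.rng’ is just the example name used above, so substitute whatever your schema file is called. The commented-out alternative assumes you generated Relax NG Compact Syntax instead; the file name there is hypothetical and the media type shown is the one we would expect oXygen to write for compact-syntax schemas, so check what your own copy actually inserts.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Added by oXygen when a Relax NG (XML syntax) schema is associated -->
<?xml-model href="myTEI.rng" type="application/xml"
  schematypens="http://relaxng.org/ns/structure/1.0"?>
<!-- Assumed equivalent for a compact-syntax schema (hypothetical file name):
     <?xml-model href="myTEI.rnc" type="application/relax-ng-compact-syntax"?> -->
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- teiHeader and text exactly as they were before -->
</TEI>
```

Because the href is a relative path, the association only works while the schema and the XML file stay in the same directory, which is why you moved the schema next to your document.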

5.10.10 Trying It Out

Remember those elements like ’address’ and ’camera’ that you could add within a paragraph?

• Go to somewhere just after a <p> opening tag, and insert an ’<’ to get a dropdown list from oXygen.

• Are any of the elements you excluded available? No? Good! If they are, then chances are you didn’t click ’Save’ after including/excluding them; go back and do it again!

5.10.11 Constraining the @type Attribute on <div>

Removing elements is all well and good and is the first step in customising your schema, but we want to do more. Let’s customise the @type attribute on <div> to only allow certain values.

• Go back to Roma in your browser (hopefully you didn’t shut it and lose all your work!).

• Click on the ’Modules’ tab.

• Click on the ’textstructure’ module name.

• On the row containing ’div’ click on ’Change Attributes’ on the far right-hand side.

• This should take you to a page listing all the possible attributes on <div>. This is also where you would include or exclude those attributes if you wanted to change that.

• Scroll down to ’type’ and click on it. This should take you to a page allowing you to set various options for the @type attribute. Set them as follows:

– Is it optional? This allows us to control whether the attribute is required or not. Let’s make our @type attribute required, so click ’no’: it is not optional.

– Contents This would allow us to change what datatype is allowed and how many times it should appear. Let’s leave that just as it is, as ’Text’.

– Default value would allow us to set a default value for the attribute if you didn’t supply one. Let’s force ourselves to supply one and so leave this blank.

– Closed list? enables us to say whether our list of values is fixed, or merely a suggestion. Let’s be rigorous and say that it is a closed list. Answer yes!

– List of values is where we give the values we want to supply to the schema as valid values for the @type attribute on <div>. We give this as a comma-separated list. So write in: prose,verse,drama,chapter,somethingElse.

– Description allows us to change the description of this attribute. Add the phrase ’Our modified type attribute ’ to the start of the description.

• Click ’Save’ at the bottom of the page.
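
As a rough illustration of what these settings mean, a required attribute with a closed list of values corresponds to a Relax NG pattern along the following lines (in the XML syntax). This is a simplified, hand-written sketch and not the literal output of Roma: the generated schema wraps such patterns in named definitions and adds documentation annotations.

```xml
<!-- Simplified sketch: @type is required (it is not wrapped in <optional>)
     and its value must be one of the five listed tokens -->
<attribute xmlns="http://relaxng.org/ns/structure/1.0" name="type">
  <choice>
    <value>prose</value>
    <value>verse</value>
    <value>drama</value>
    <value>chapter</value>
    <value>somethingElse</value>
  </choice>
</attribute>
```

Had we answered ’no’ to ’Closed list?’, the values would only be suggestions and any text would still be accepted as the value of @type.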

5.10.12 Trying It Out Again

Let’s go try out the changes we made. You know how to do this now:

• Click on the ’Schema’ tab.

• Choose one of the Relax NG formats from the dropdown list.

• Click ’Generate’

• Find where the file has been downloaded and copy it over the previous version you had there.

• Do not close down your browser!

• Go back to oXygen, and the ’ex06.xml’ (or whatever you renamed it as) file, and go to the ’Document’ -> ’Validate’ -> ’Reset Cache and Validate’ menu item.

• Your document should validate fine and you should have a happy green square.

• Go to the first <div> tag in the document that looks like <div type="prose"> and change it to be just <div>.

• Your document should now be invalid, and you should have an angry red square. If it is still valid, ’Reset Cache and Validate’ again, and ensure that it is pointing to the correct schema. The error message should be something like: element ’div’ missing required attribute ’type’.

• Put your cursor immediately after the ’v’ in <div> and press space. oXygen should provide a dropdown list of attributes available on <div>. Scroll down until you find @type and note that it is in bold. This is because we made it required.

• Select @type and notice that oXygen gives you another dropdown list of the possible values. This is because we provided the values and said that this was a closed value list.

• Choose one of the values, perhaps ’prose’. Your document should again be valid and have a happy green square.
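
To summarise what the customised schema now accepts, here are three illustrative fragments (not a complete document, and the paragraph text is made up). They assume the value list entered above.

```xml
<!-- Valid: @type is present and uses a value from our closed list -->
<div type="prose">
  <p>Some text.</p>
</div>

<!-- Invalid: the required @type attribute is missing -->
<div>
  <p>Some text.</p>
</div>

<!-- Invalid: 'poetry' is not one of the values we listed -->
<div type="poetry">
  <p>Some text.</p>
</div>
```

All three would have been valid against tei_all, which is a good reminder that validity always depends on which schema you are validating against.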

5.10.13 Saving Your Customisation

This is great, but what if you want to save your customisation, and come back later to do more work?

• Go back to your web browser and click on the ’Save Customization’ tab.

• Your browser should automatically start downloading an XML file. Move it to somewhere convenient, for example where you put the schema.

• Do not shut your web browser yet!

• This is the file that you could upload when going to the ’New’ tab on Roma (the very first page with the four choices), if you had selected ’Open existing customization’. (Don’t do this now though!)

• Open this XML file in oXygen. It might not be formatted or indented properly. If not, go to the ’Document’ -> ’Source’ -> ’Format and Indent’ menu, or click the Format and Indent icon on the toolbar, or press ’control-shift-p’.

• Read through the file to get a sense of how it relates to your customisation. Note how <moduleRef> includes those modules you have asked for, and how the ’core’ module is included except for the list of elements you excluded.

• Look at the <elementSpec> for <div> and see how we’ve changed it.

• Note that this file is a TEI file just like the ones you’ve been editing; it just uses special elements from the ’tagdocs’ module.
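
The part of the file to look for is the schema specification. The following is a hand-written sketch of roughly what to expect, not a copy of Roma’s output: the ident ’myTEI’, the particular modules listed and the shortened exclusion list are assumptions made for illustration, and your file will reflect the choices you actually made. Depending on the version of Roma, exclusions may be recorded differently (for example as individual <elementSpec> elements with mode="delete").

```xml
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0" ident="myTEI" start="TEI">
  <!-- modules included without change -->
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <moduleRef key="textstructure"/>
  <!-- 'core' is included except for the excluded elements (list shortened here) -->
  <moduleRef key="core" except="addrLine address analytic biblStruct"/>
  <!-- our change to <div>: @type is required and has a closed value list -->
  <elementSpec ident="div" module="textstructure" mode="change">
    <attList>
      <attDef ident="type" mode="change" usage="req">
        <valList type="closed" mode="replace">
          <valItem ident="prose"/>
          <valItem ident="verse"/>
          <valItem ident="drama"/>
          <valItem ident="chapter"/>
          <valItem ident="somethingElse"/>
        </valList>
      </attDef>
    </attList>
  </elementSpec>
</schemaSpec>
```

Notice that only the differences from the full TEI are recorded. Everything not mentioned is inherited unchanged from the modules you included.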

5.10.14 Generating Reference Documentation

Roma does not only generate schemas, but also customised reference documentation.

• Return to your web browser and click on the ’Documentation’ tab.

• Choose HTML web page from the dropdown menu and click ’Generate’.

• If your browser has downloaded the file instead of opening it, open the saved file with your web browser.

• You should get a web page starting with a table of contents listing the elements. Scroll down and click on <div>.

• Notice that this has the @type attribute as required, and lists the legal values. Notice, however, that the example has not changed and still says type="poetry".

• Try generating some PDF documentation as well. Which do you prefer?

5.10.15 More About Roma

The Roma web front-end is a somewhat dated interface to a command-line script and the OxGarage web service. When you generated the documentation, this used OxGarage and you didn’t even notice!

Some people write their TEI ODD customisation files entirely in XML and do not use the Roma web interface at all. There are a number of things that the Roma web interface can’t do which the TEI ODD language underneath is capable of. Notice, for example, that you weren’t able to provide descriptions of each of the attribute values you entered for @type? You can do that in the underlying XML. Some people do a combination of both Roma and hand editing.
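
As a small example of such hand editing, each <valItem> in the value list for @type can be given a <desc>. The sketch below shows the idea for the first three values; the wording of the descriptions is invented here purely for illustration.

```xml
<valList xmlns="http://www.tei-c.org/ns/1.0" type="closed" mode="replace">
  <valItem ident="prose">
    <!-- illustrative description text, not taken from the TEI Guidelines -->
    <desc>the division consists mainly of prose</desc>
  </valItem>
  <valItem ident="verse">
    <desc>the division consists mainly of verse</desc>
  </valItem>
  <valItem ident="drama">
    <desc>the division is part of a dramatic text</desc>
  </valItem>
  <!-- the remaining values follow the same pattern -->
</valList>
```

Descriptions like these should then appear alongside the legal values in the reference documentation generated from the edited customisation.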

There is also a ’Sanity Checker’ tab... click that and find out what happens! (It might warn about the element <term> being used in <keywords> but not being defined. That is fine!)

5.10.16 Self-Assessment

Check if you understand some of the core principles of this exercise by answering the following questions:

• What is Roma?

• How do you add and remove TEI modules using Roma?

• How do you include/exclude individual elements using Roma?

• How can you change attributes using Roma?

• Is it possible to save your customisation in Roma?

• What kinds of documentation can you generate in Roma?

• What kinds of schemas can you generate in Roma?

• What does an underlying TEI ODD customisation file look like? Is it a TEI file like the ones you’ve been working with?

5.10.17 Next and More Reading

Next we’ll move on to some of the other tools and utilities offered by the TEI Consortium. But first consider reading more about TEI ODD at:

• http://www.tei-c.org/release/doc/tei-p5-doc/en/html/USE.html#IM.

• and also the Documentation Elements chapter at http://www.tei-c.org/release/doc/tei-p5-doc/en/html/TD.html.

• See also, http://tbe.kantl.be/TBE/modules/TBED08v00.htm.

5.11 Exercise 10: OxGarage and the TEI Community

5.11.1 Learning Outcomes

When you successfully complete this exercise you should be able to:

• Use OxGarage to convert files of all sorts to and from TEI

• Understand the limitations of automatic conversion

• Explore the TEI Guidelines, Website, and Wiki

• Know how to read and submit bugs and feature requests on the TEI Sourceforge site

• Subscribe to the TEI-L mailing list

• Visit the TEI By Example website for more self-directed training

5.11.2 Summary

This exercise is designed to give you some exposure to some of the other TEI resources available online. Not only will you use OxGarage to convert files to and from TEI, but you will also be shown the limitations of such conversions. You’ll be directed to more parts of the TEI Guidelines, the TEI-C Website, and the community-developed Wiki. The process for submitting and reviewing bugs and feature requests will be reviewed, along with how to subscribe to the TEI-L mailing list. The TEI By Example website will be suggested as a good place for further self-directed study.

5.11.3 Starting Up

This exercise will primarily use a web browser; we recommend Google Chrome or a recent version of Mozilla Firefox.

5.11.4 OxGarage: Have a quick play

OxGarage is a pipelining transformation engine with a RESTful Web Service (which in this case means it can be used by programs as well as on the web) that converts documents from one format to another.

• Go to http://oxgarage.oucs.ox.ac.uk:8080/ege-webclient or if this isn’t working http://www.tei-c.org/oxgarage/.

• Click on ’Documents’ and select "TEI P5 XML Document" as your input. When you do so a list of possible conversion targets should appear on the right. Choose "Microsoft Word Document (.docx)".

• When you’ve done this a ’Choose File’ button should appear on the upper left. Click the button and navigate to your finished (if well-formed and valid) file from way back in Exercise 2. (If you didn’t finish this, choose ’spoilers/ex02.xml’ instead.)

• Click convert and open the document in Microsoft Word. Note the information that is retained and the information that is lost.

• Try this again, but use a more complex file such as the one from Exercise 6 (if you didn’t finish this, choose ’spoilers/ex06.xml’ instead). Note how the conversion to DOCX format attempts to interpret additions, deletions, unclear passages, and expansions, representing them with presentational markup.

• Try converting to a variety of other formats and see the results you get. (Note: Not all conversions are equal!)
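
If you would like a very small test case for seeing how editorial markup fares in conversion, a paragraph like the following could be pasted into the body of a copy of one of your files before converting it. The text is invented for illustration, and exactly how each element is rendered in the DOCX depends on the version of the TEI Stylesheets that OxGarage is running.

```xml
<p>The <del>quick</del> <add place="above">slow</add> brown fox
  <unclear>jumps</unclear> over the dog of
  <choice>
    <abbr>Dr</abbr>
    <expan>Doctor</expan>
  </choice> Smith.</p>
```

Compare the Word output with the markup: which distinctions survive only as formatting, and which disappear altogether?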

5.11.5 More on OxGarage

Of course, OxGarage isn’t just about converting TEI files to other formats, it can also convert other formats to TEI! See http://www.oucs.ox.ac.uk/oxgarage/ for more information.

• Take a file you have converted to one of the other formats (e.g. DOCX) and edit it in Microsoft Word. Add some new text and some divisions to it. Use the ’heading2’ style (or similar) to add a new division at some point and add a few lines of text.

• If they are available, use some of the in-line styles that Word provides, and mark some text with them.

• Try converting the file back to TEI and see what is preserved (and what isn’t!)

• Although most conversions are ’lossy’, this is a good mechanism for getting a large number of documents into a basic TEI P5 XML structure, on which further conversion work can then be done. One of the things we do for funded research projects is take on the work of ’up-converting’ their files from Word to TEI P5, deducing as much additional structure and markup as we can. Usually this kind of conversion is different for every project, but builds on the common base that we have put into OxGarage.
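
When you look at the TEI that comes back, the general shape to expect is that Word heading styles become nested divisions with headings, and in-line styling becomes highlighting. The following is only a hand-written sketch of that shape under those assumptions, not actual OxGarage output; the real result will differ in detail (attribute values, extra elements, and the header).

```xml
<body xmlns="http://www.tei-c.org/ns/1.0">
  <div>
    <head>Text that was styled as a first-level heading</head>
    <p>An ordinary paragraph.</p>
    <!-- a second-level heading style nests a new division inside the first -->
    <div>
      <head>The new division you added</head>
      <p>The lines you added, with <hi rend="italic">some styled text</hi>.</p>
    </div>
  </div>
</body>
```

Anything that Word only expressed visually has to be recorded generically in this way, which is why project-specific up-conversion is needed to recover richer markup.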

5.11.6 The TEI Guidelines

The TEI Guidelines are the main output of the TEI Consortium and contain chapters on a wide variety of TEI recommendations. Hopefully you’ve had a chance to read a bit of them already.

• Go to http://www.tei-c.org/release/doc/tei-p5-doc/en/html/index-toc.html.

• Note the division into ’Front Matter’, ’Text Body’, and ’Back Matter’, and look at the kinds of things present in each section.

• Choose a chapter from the ’Text Body’ section and note how the left-hand table of contents shows the general divisions (and previous/next chapters) and how the small right-hand navigation allows you to move forwards and backwards through the sections.

• Notice also the greyed-out ’¶’ character after any sub-division (or sub-sub-division) heading. This is a link to that section in particular. This is useful if you want to cite a section of the TEI Guidelines in conversation (e.g. on the TEI-L mailing list).

• Notice that all elements, attributes, classes, and datatypes are links through to the reference pages about that object.

• If you look at the examples provided, most of them will have green backgrounds, which means that this snippet is valid in a TEI file (assuming it was put in the right place). Some examples might have amber (feasibly valid if some missing elements were provided), or red (invalid against the default schema). Sometimes it is necessary to show examples which are invalid to demonstrate mixing of namespaces or when discussing XML itself.

• Click on any element name in the prose to go to the reference page for that element. The information here can be very useful when you want to look up what the definition of an element is, where it is allowed, or what is allowed inside it. At the bottom are one or more examples, and a link through to a list of all the examples in the TEI Guidelines which use this element.

5.11.7 The TEI Consortium Website

The TEI-C Website http://www.tei-c.org/ is the central location leading to all things TEI related.

• Click the ’Home’ link on the menu (if you are still on the TEI Guidelines), and look at the home page. Note the newsfeed that is provided on the left, and the menu bar.

• If you hover your mouse over the menus a drop-down menu should appear. Look at each menu and familiarise yourself with the kinds of resources on the TEI-C website.

• If you explore enough, you should also find items like the minutes of meetings of the Board and Technical Council. The TEI, where possible, attempts to conduct a large part of its business publicly since it is really a product of its own community (e.g. the Technical Council mailing list archives are openly available for public viewing, and most of the work is freely available on Sourceforge).

• The TEI Consortium provides XSLT stylesheets to transform TEI to various formats. These are what underlie the OxGarage conversions above. They are freely available for anyone to use from the TEI Sourceforge Subversion repository. You can read about them at Tools -> Stylesheets, or at: http://www.tei-c.org/Tools/Stylesheets/.

• Explore the rest of the TEI-C Website!

5.11.8 The TEI Wiki

The TEI Wiki is a community-developed location for all sorts of TEI-related information.

• Go to http://wiki.tei-c.org/ and read the main page.

• How many different XSLT Stylesheets are provided on the TEI wiki?

• How many pages on ’Tools’ are there?

• What Special Interest Groups (SIGs) have pages on the TEI Wiki? Look at the Manuscripts SIG for an example.

• Look on the Technical Council page to see its last agenda.

• In order to edit the wiki you need to request an account. If you think this is something you’ll need, why not do so now! (It may take a couple of days, as requests are approved by volunteers!)

5.11.9 The TEI Sourceforge Site

The TEI Sourceforge site is currently used to manage the work of the TEI Technical Council in maintaining the TEI Guidelines and various tools that accompany them (e.g. Roma and the Stylesheets). It is also the location that allows the community to submit bug reports and feature requests.

• Go to http://tei.sourceforge.net/ and notice that there are links to the project summary, bug reports, feature requests, file downloads and code repository. Explore each of these!

• Click feature requests or go to http://purl.org/tei/fr (with no trailing slash).

• Look through some recently submitted feature requests and click on one of them.

• Read the details of the ticket, noting whether it has been assigned to anyone, and whether there are any comments at the bottom. If there are, read them.

• Anyone who has registered for a Sourceforge account is able to comment on tickets, and the TEI encourages community participation. (So if you see a ticket you want to comment on, register/login and comment!)

• Returning to http://tei.sourceforge.net/ click on ’Code Repository’. This allows you to browse openly the Subversion repository used by the TEI for its development.

• If you click on ’P5’, ’Source’, then ’Specs’, you’ll find the folder of all of the specifications for individual elements. If you choose one of these you’ll see the revision history for this element.

• The Sourceforge site is a useful repository for the TEI that allows it to undertake ongoing development in an open and transparent manner. Being able to post bugs and feature requests here makes you part of this development effort.

5.11.10 The TEI-L Mailing List

The TEI-L mailing list is the main form of communication within the TEI Community. Those asking questions there range from people entirely new to the TEI to people who have been using the TEI for a couple of decades. You should not be afraid to post straightforward-sounding questions; just make sure you’ve checked the TEI Guidelines and Website first and are clear about what you want to do and why you are confused. You will almost certainly get an answer, sometimes several competing ones!

• Go to the TEI-L Archives at: http://listserv.brown.edu/archives/cgi-bin/wa?A0=TEI-L (or click on the link in the last paragraph on the TEI-C Website home page).

• Read through some TEI-L messages from June 2012, choosing some which sound interesting based on their subject line.

• Try sorting by Date (they are sorted by Subject by default).

• Try searching (on the right-hand side) for some keyword that interests you.

• Consider subscribing to the mailing list!

• Another archive of the TEI-L messages is available from http://blog.gmane.org/gmane.text.tei.general and another one at http://tei-l.970651.n3.nabble.com/.

5.11.11 TEI By Example

TEI By Example provides a variety of freely available online tutorials that demonstrate a number of different stages in encoding a TEI file. There is a general introduction to text encoding, and step-by-step tutorials provide introductions to eight different aspects of TEI markup with lots of examples. Real-life examples are provided for each tutorial, and the theory provided can be tested with tests and exercises. A tools section gives an annotated overview of XML encoding technology and a validator for fragments of TEI.

• If you have not already done so, visit TEI By Example at: http://tbe.kantl.be/TBE/.

• Choose a Tutorial that is interesting to you and skim read the tutorial (you do not have to read it in depth at this point, you may choose to do so later).

• Look at the corresponding Examples section for that Tutorial, and see what things you do not understand. (Look them up in the Tutorial section.)

• Have a look at the corresponding Exercise for the Tutorial section. (You don’t need to do it, just get a sense of the kind of exercises provided.)

• TEI By Example is a good resource that it would benefit you to return to at a later date and work through.

5.11.12 Self-Assessment

Check if you understand some of the core principles of this exercise by answering the following questions:

• What is OxGarage good at? What is it not good at?

• How do you get to an element’s reference page on the TEI Website?

• What kind of information do you find on the TEI Wiki?

• How do you submit a bug or feature request to the TEI?

• Have you joined the TEI-L mailing list?

• What do you think of TEI-By-Example?

5.11.13 Next?

This is the last exercise of this workshop, and we hope you feel like you’ve had a quick but broad overview of some of the things the TEI can do. Your learning is by no means complete! Read the TEI Guidelines! Use TEI-By-Example! Join the TEI-L mailing list and ask questions! If you have Oxford-specific TEI questions you can email us on [email protected], but you are more likely to get a wider range of answers on the TEI-L mailing list. All of the exercises will be made available from a link on the DHOXSS website, and http://tei.oucs.ox.ac.uk/.

5.12 TEI reference material: summary of elements

The sections in this document summarize all the TEI elements as of June 2012.

TEI  textstructure  (TEI document) contains a single TEI-conformant document, comprising a TEI header and a text, either in isolation or as part of a teiCorpus element.

ab  linking  (anonymous block) contains any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter level elements analogous to, but without the semantic baggage of, a paragraph.

abbr  core  (abbreviation) contains an abbreviation of any sort.

accMat  msdescription  (accompanying material) contains details of any significant additional material which may be closely associated with the manuscript being described, such as non-contemporaneous documents or fragments bound in with the manuscript at some earlier historical period.

acquisition  msdescription  contains any descriptive or other information concerning the process by which a manuscript or manuscript part entered the holding institution.

activity  corpus  contains a brief informal description of what a participant in a language interaction is doing other than speaking, if anything.

actor  drama  Name of an actor appearing within a cast list.

add  core  (addition) contains letters, words, or phrases inserted in the text by an author, scribe, annotator, or corrector.

addName  namesdates  (additional name) contains an additional name component, such as a nickname, epithet, or alias, or any other descriptive phrase used within a personal name.

addSpan  transcr  (added span of text) marks the beginning of a longer sequence of text added by an author, scribe, annotator or corrector (see also add).

additional  msdescription  groups additional information, combining bibliographic information about a manuscript, or surrogate copies of it with curatorial or administrative information.

additions  msdescription  contains a description of any significant additions found within a manuscript, such as marginalia or other annotations.

addrLine  core  (address line) contains one line of a postal address.

address  core  contains a postal address, for example of a publisher, an organization, or an individual.

adminInfo  msdescription  (administrative information) contains information about the present custody and availability of the manuscript, and also about the record description itself.

affiliation  namesdates  (affiliation) contains an informal description of a person’s present or past affiliation with some organization, for example an employer or sponsor.

age  namesdates  (age) specifies the age of a person.

alt  linking  (alternation) identifies an alternation or a set of choices among elements or passages.

altGrp  linking  (alternation group) groups a collection of alt elements and possibly pointers.

altIdent  tagdocs  (alternate identifier) supplies the recommended XML name for an element, class, attribute, etc. in some language.

altIdentifier  msdescription  (alternative identifier) contains an alternative or former structured identifier used for a manuscript, such as a former catalogue number.

am  transcr  (abbreviation marker) contains a sequence of letters or signs present in an abbreviation which are omitted or replaced in the expanded form of the abbreviation.

analytic  core  (analytic level) contains bibliographic elements describing an item (e.g. an article or poem) published within a monograph or journal and not as an independent publication.

anchor  linking  (anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element.

app  textcrit  (apparatus entry) contains one entry in a critical apparatus, with an optional lemma and at least one reading.

appInfo  header  (application information) records information about an application which has edited the TEI file.

application  header  provides information about an application which has acted upon the document.

arc  nets  encodes an arc, the connection from one node to another in a graph.

argument  textstructure  A formal list or prose description of the topics addressed by a subdivision of a text.

att  tagdocs  (attribute) contains the name of an attribute appearing within running text.

attDef  tagdocs  (attribute definition) contains the definition of a single attribute.

attList  tagdocs  contains documentation for all the attributes associated with this element, as a series of attDef elements.

attRef  tagdocs  (attribute pointer) points to the definition of an attribute or group of attributes.

author  core  in a bibliographic reference, contains the name(s) of an author, personal or corporate, of a work; for example in the same form as that provided by a recognized bibliographic name authority.

authority  header  (release authority) supplies the name of a person or other agency responsible for making an electronic file available, other than a publisher or distributor.

availability  header  supplies information about the availability of a text, for example any restrictions on its use or distribution, its copyright status, any licence applying to it, etc.

back  textstructure  (back matter) contains any appendixes, etc. following the main part of a text.

bibl  core  (bibliographic citation) contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged.

biblFull  header  (fully-structured bibliographic citation) contains a fully-structured bibliographic citation, in which all components of the TEI file description are present.

biblScope  core  (scope of citation) defines the scope of a bibliographic reference, for example as a list of page numbers, or a named subdivision of a larger work.

biblStruct  core  (structured bibliographic citation) contains a structured bibliographic citation, in which only bibliographic sub-elements appear and in a specified order.

bicond  iso-fs  (bi-conditional feature-structure constraint) defines a biconditional feature-structure constraint; both consequent and antecedent are specified as feature structures or groups of feature structures; the constraint is satisfied if both subsume a given feature structure, or if both do not.

binary  iso-fs  (binary value) represents the value part of a feature-value specification which can contain either of exactly two possible values.

binaryObject  core  provides encoded binary data representing an inline graphic or other object.

binding  msdescription  contains a description of one binding, i.e. type of covering, boards, etc. applied to a manuscript.

bindingDesc  msdescription  (binding description) describes the present and former bindings of a manuscript, either as a series of paragraphs or as a series of distinct binding elements, one for each binding of the manuscript.

birth  namesdates  (birth) contains information about a person’s birth, such as its date and place.

bloc  namesdates  (bloc) contains the name of a geo-political unit consisting of two or more nation states or countries.

body  textstructure  (text body) contains the whole body of a single unitary text, excluding any front or back matter.

broadcast  spoken  describes a broadcast used as the source of a spoken text.

byline  textstructure  contains the primary statement of responsibility given for a work on its title page or at the head or end of the work.

c  analysis  (character) represents a character.

cRefPattern  header  (canonical reference pattern) specifies an expression and replacement pattern for transforming a canonical reference into a URI.

caesura  verse  marks the point at which a metrical line may be divided.

calendar  header  describes a calendar or dating system used in a dating formula in the text.

calendarDesc  header  (calendar description) contains a description of the calendar system used in any dating expression found in the text.

camera  drama  describes a particular camera angle or viewpoint in a screen play.

caption  drama  contains the text of a caption or other text displayed as part of a film script or screenplay.

case  dictionaries  contains grammatical case information given by a dictionary for a given form.

castGroup  drama  (cast list grouping) groups one or more individual castItem elements within a cast list.

castItem  drama  (cast list item) contains a single entry within a cast list, describing either a single role or a list of non-speaking roles.

castList  drama  (cast list) contains a single cast list or dramatis personae.

catDesc  header  (category description) describes some category within a taxonomy or text typology, either in the form of a brief prose description or in terms of the situational parameters used by the TEI formal textDesc.

catRef  header  (category reference) specifies one or more defined categories within some taxonomy or text typology.

catchwords  msdescription  describes the system used to ensure correct ordering of the quires making up a codex or incunable, typically by means of annotations at the foot of the page.

category  header  contains an individual descriptive category, possibly nested within a superordinate category, within a user-defined taxonomy.

cb  core  (column break) marks the boundary between one column of a text and the next in a standard reference system.

cell  figures  contains one cell of a table.

certainty  certainty  indicates the degree of certainty associated with some aspect of the text markup.

change header documents a change or set of changes made during the production of a source document, or during the revision of an electronic file.

channel corpus (primary channel) describes the medium or channel by which a text is delivered or experienced. For a written text, this might be print, manuscript, e-mail, etc.; for a spoken one, radio, telephone, face-to-face, etc.

char gaiji (character) provides descriptive information about a character.

charDecl gaiji (character declarations) provides information about nonstandard characters and glyphs.

charName gaiji (character name) contains the name of a character, expressed following Unicode conventions.

charProp gaiji (character property) provides a name and value for some property of the parent character or glyph.

choice core groups a number of alternative encodings for the same point in a text.

cit core (cited quotation) contains a quotation from some other document, together with a bibliographic reference to its source. In a dictionary it may contain an example text with at least one occurrence of the word form, used in the sense being described, or a translation of the headword, or an example.

cl analysis (clause) represents a grammatical clause.

classCode header (classification code) contains the classification code used for this text in some standard classification system.

classDecl header (classification declarations) contains one or more taxonomies defining any classificatory codes used elsewhere in the text.

classRef tagdocs points to the specification for an attribute or model class which is to be included in a schema.

classSpec tagdocs (class specification) contains reference information for a TEI element class; that is a group of elements which appear together in content models, or which share some common attribute, or both.

classes tagdocs specifies all the classes of which the documented element or class is a member or subclass.

climate namesdates (climate) contains information about the physical climate of a place.

closer textstructure groups together salutations, datelines, and similar phrases appearing as a final group at the end of a division, especially of a letter.

code tagdocs contains literal code from some formal language such as a programming language.

collation msdescription contains a description of how the leaves or bifolia are physically arranged.

collection msdescription contains the name of a collection of manuscripts, not necessarily located within a single repository.

colloc dictionaries (collocate) contains a collocate of the headword.

colophon msdescription contains the colophon of a manuscript item: that is, a statement providing information regarding the date, place, agency, or reason for production of the manuscript.

cond iso-fs (conditional feature-structure constraint) defines a conditional feature-structure constraint; the consequent and the antecedent are specified as feature structures or feature-structure collections; the constraint is satisfied if both the antecedent and the consequent subsume a given feature structure, or if the antecedent does not.

condition msdescription contains a description of the physical condition of the manuscript.

constitution corpus describes the internal composition of a text or text sample, for example as fragmentary, complete, etc.

constraint tagdocs (constraint rules) the formal rules of a constraint.

constraintSpec tagdocs (constraint on schema) contains a constraint, expressed in some formal syntax, which cannot be expressed in the structural content model.

content tagdocs (content model) contains the text of a declaration for the schema documented.

corr core (correction) contains the correct form of a passage apparently erroneous in the copy text.

correction header (correction principles) states how and under what circumstances corrections have been made in the text.

country namesdates (country) contains the name of a geo-political unit, such as a nation, country, colony, or commonwealth, larger than or administratively superior to a region and smaller than a bloc.

creation header contains information about the creation of a text.

custEvent msdescription (custodial event) describes a single event during the custodial history of a manuscript.

custodialHist msdescription (custodial history) contains a description of a manuscript’s custodial history, either as running prose or as a series of dated custodial events.

damage transcr contains an area of damage to the text witness.

damageSpan transcr (damaged span of text) marks the beginning of a longer sequence of text which is damaged in some way but still legible.

datatype tagdocs specifies the declared value for an attribute, by referring to any datatype defined by the chosen schema language.

date core contains a date in any format.

dateline textstructure contains a brief description of the place, date, time, etc. of production of a letter, newspaper story, or other work, prefixed or suffixed to it as a kind of heading or trailer.

death namesdates (death) contains information about a person’s death, such as its date and place.

decoDesc msdescription (decoration description) contains a description of the decoration of a manuscript, either as a sequence of paragraphs, or as a sequence of topically organized decoNote elements.

decoNote msdescription (note on decoration) contains a note describing either a decorative component of a manuscript, or a fairly homogenous class of such components.

def dictionaries (definition) contains definition text in a dictionary entry.

default iso-fs (default feature value) represents the value part of a feature-value specification which contains a defaulted value.

defaultVal tagdocs (default value) specifies the default declared value for an attribute.

del core (deletion) contains a letter, word, or passage deleted, marked as deleted, or otherwise indicated as superfluous or spurious in the copy text by an author, scribe, annotator, or corrector.

delSpan transcr (deleted span of text) marks the beginning of a longer sequence of text deleted, marked as deleted, or otherwise signaled as superfluous or spurious by an author, scribe, annotator, or corrector.

depth msdescription contains a measurement measured across the spine of a book or codex, or (for other text-bearing objects) perpendicular to the measurement given by the width element.

derivation corpus describes the nature and extent of originality of this text.

desc core (description) contains a brief description of the object documented by its parent element, including its intended usage, purpose, or application where this is appropriate.

dictScrap dictionaries (dictionary scrap) encloses a part of a dictionary entry in which other phrase-level dictionary elements are freely combined.

dim msdescription contains any single measurement forming part of a dimensional specification of some sort.

dimensions msdescription contains a dimensional specification.

distinct core identifies any word or phrase which is regarded as linguistically distinct, for example as archaic, technical, dialectal, non-preferred, etc., or as forming part of a sublanguage.

distributor header supplies the name of a person or other agency responsible for the distribution of a text.

district namesdates contains the name of any kind of subdivision of a settlement, such as a parish, ward, or other administrative or geographic unit.

div textstructure (text division) contains a subdivision of the front, body, or back of a text.

div1 textstructure (level-1 text division) contains a first-level subdivision of the front, body, or back of a text.

div2 textstructure (level-2 text division) contains a second-level subdivision of the front, body, or back of a text.

div3 textstructure (level-3 text division) contains a third-level subdivision of the front, body, or back of a text.

div4 textstructure (level-4 text division) contains a fourth-level subdivision of the front, body, or back of a text.

div5 textstructure (level-5 text division) contains a fifth-level subdivision of the front, body, or back of a text.

div6 textstructure (level-6 text division) contains a sixth-level subdivision of the front, body, or back of a text.

div7 textstructure (level-7 text division) contains the smallest possible subdivision of the front, body or back of a text, larger than a paragraph.

divGen core (automatically generated text division) indicates the location at which a textual division generated automatically by a text-processing application is to appear.

docAuthor textstructure (document author) contains the name of the author of the document, as given on the title page (often but not always contained in a byline).

docDate textstructure (document date) contains the date of a document, as given (usually) on a title page.

docEdition textstructure (document edition) contains an edition statement as presented on a title page of a document.

docImprint textstructure (document imprint) contains the imprint statement (place and date of publication, publisher name), as given (usually) at the foot of a title page.

docTitle textstructure (document title) contains the title of a document, including all its constituents, as given on a title page.

domain corpus (domain of use) describes the most important social context in which the text was realized or for which it is intended, for example private vs. public, education, religion, etc.

eLeaf nets (leaf or terminal node of an embedding tree) provides explicitly for a leaf of an embedding tree, which may also be encoded with the eTree element.

eTree nets (embedding tree) provides an alternative to tree element for representing ordered rooted tree structures.

edition header (edition) describes the particularities of one edition of a text.

editionStmt header (edition statement) groups information relating to one edition of a text.

editor core secondary statement of responsibility for a bibliographic item, for example the name of an individual, institution or organization, (or of several such) acting as editor, compiler, translator, etc.

editorialDecl header (editorial practice declaration) provides details of editorial principles and practices applied during the encoding of a text.

education namesdates contains a description of the educational experience of a person.

eg tagdocs (example) contains any kind of illustrative example.

egXML tagdocs (example of XML) contains a single well-formed XML fragment demonstrating the use of some XML element or attribute, in which the egXML element itself functions as the root element.

elementRef tagdocs points to the specification for some element which is to be included in a schema.

elementSpec tagdocs (element specification) documents the structure, content, and purpose of a single element type.

email core (electronic mail address) contains an e-mail address identifying a location to which e-mail messages can be delivered.

emph core (emphasized) marks words or phrases which are stressed or emphasized for linguistic or rhetorical effect.

encodingDesc header (encoding description) documents the relationship between an electronic text and the source or sources from which it was derived.

entry dictionaries contains a single structured entry in any kind of lexical resource, such as a dictionary or lexicon.

entryFree dictionaries (unstructured entry) contains a single unstructured entry in any kind of lexical resource, such as a dictionary or lexicon.

epigraph textstructure contains a quotation, anonymous or attributed, appearing at the start or end of a section or on a title page.

epilogue drama contains the epilogue to a drama, typically spoken by an actor out of character, possibly in association with a particular performance or venue.

equipment spoken provides technical details of the equipment and media used for an audio or video recording used as the source for a spoken text.

equiv tagdocs (equivalent) specifies a component which is considered equivalent to the parent element, either by co-reference, or by external link.

etym dictionaries (etymology) encloses the etymological information in a dictionary entry.

event namesdates (event) contains data relating to any kind of significant event associated with a person, place, or organization.

ex transcr (editorial expansion) contains a sequence of letters added by an editor or transcriber when expanding an abbreviation.

exemplum tagdocs groups an example demonstrating the use of an element along with optional paragraphs of commentary.

expan core (expansion) contains the expansion of an abbreviation.

explicit msdescription contains the explicit of a manuscript item, that is, the closing words of the text proper, exclusive of any rubric or colophon which might follow it.

extent header describes the approximate size of a text as stored on some carrier medium, whether digital or non-digital, specified in any convenient units.

f iso-fs (feature) represents a feature value specification, that is, the association of a name with a value of any of several different types.

fDecl iso-fs (feature declaration) declares a single feature, specifying its name, organization, range of allowed values, and optionally its default value.

fDescr iso-fs (feature description (in FSD)) describes in prose what is represented by the feature being declared and its values.

fLib iso-fs (feature library) assembles a library of feature elements.

facsimile transcr contains a representation of some written source in the form of a set of images rather than as transcribed or encoded text.

factuality corpus describes the extent to which the text may be regarded as imaginative or non-imaginative, that is, as describing a fictional or a non-fictional world.

faith namesdates specifies the faith, religion, or belief set of a person.

figDesc figures (description of figure) contains a brief prose description of the appearance or content of a graphic figure, for use when documenting an image without displaying it.

figure figures groups elements representing or containing graphic information such as an illustration, formula, or figure.

fileDesc header (file description) contains a full bibliographic description of an electronic file.

filiation msdescription contains information concerning the manuscript’s filiation, i.e. its relationship to other surviving manuscripts of the same text, its protographs, antigraphs and apographs.

finalRubric msdescription contains the string of words that denotes the end of a text division, often with an assertion as to its author and title, usually set off from the text itself by red ink, by a different size or type of script, or by some other such visual device.

floatingText textstructure contains a single text of any kind, whether unitary or composite, which interrupts the text containing it at any point and after which the surrounding text resumes.

floruit namesdates contains information about a person’s period of activity.

foliation msdescription describes the numbering system or systems used to count the leaves or pages in a codex.

foreign core (foreign) identifies a word or phrase as belonging to some language other than that of the surrounding text.

forename namesdates contains a forename, given or baptismal name.

forest nets provides for groups of rooted trees.

form dictionaries (form information group) groups all the information on the written and spoken forms of one headword.

formula figures contains a mathematical or other formula.

front textstructure (front matter) contains any prefatory matter (headers, title page, prefaces, dedications, etc.) found at the start of a document, before the main body.

fs iso-fs (feature structure) represents a feature structure, that is, a collection of feature-value pairs organized as a structural unit.

fsConstraints iso-fs (feature-structure constraints) specifies constraints on the content of valid feature structures.

fsDecl iso-fs (feature structure declaration) declares one type of feature structure.

fsDescr iso-fs (feature system description (in FSD)) describes in prose what is represented by the type of feature structure declared in the enclosing fsDecl.

fsdDecl iso-fs (feature system declaration) provides a feature system declaration comprising one or more feature structure declarations or feature structure declaration links.

fsdLink iso-fs (feature structure declaration link) associates the name of a typed feature structure with a feature structure declaration for it.

funder header (funding body) specifies the name of an individual, institution, or organization responsible for the funding of a project or text.

fvLib iso-fs (feature-value library) assembles a library of reusable feature value elements (including complete feature structures).

fw transcr (forme work) contains a running head (e.g. a header, footer), catchword, or similar material appearing on the current page.

g gaiji (character or glyph) represents a non-standard character or glyph.

gap core (gap) indicates a point where material has been omitted in a transcription, whether for editorial reasons described in the TEI header, as part of sampling practice, or because the material is illegible, invisible, or inaudible.

gb core (gathering begins) marks the point in a transcribed codex at which a new gathering or quire begins.

gen dictionaries (gender) identifies the morphological gender of a lexical item, as given in the dictionary.

genName namesdates (generational name component) contains a name component used to distinguish otherwise similar names on the basis of the relative ages or generations of the persons named.

geo namesdates (geographical coordinates) contains any expression of a set of geographic coordinates, representing a point, line, or area on the surface of the earth in some notation.

geoDecl header (geographic coordinates declaration) documents the notation and the datum used for geographic coordinates expressed as content of the geo element elsewhere within the document.

geogFeat namesdates (geographical feature name) contains a common noun identifying some geographical feature contained within a geographic name, such as valley, mount, etc.

geogName namesdates (geographical name) a name associated with some geographical feature such as Windrush Valley or Mount Sinai.

gi tagdocs (element name) contains the name (generic identifier) of an element.

gloss core identifies a phrase or word used to provide a gloss or definition for some other word or phrase.

glyph gaiji (character glyph) provides descriptive information about a character glyph.

glyphName gaiji (character glyph name) contains the name of a glyph, expressed following Unicode conventions for character names.

gram dictionaries (grammatical information) within an entry in a dictionary or a terminological data file, contains grammatical information relating to a term, word, or form.

gramGrp dictionaries (grammatical information group) groups morpho-syntactic information about a lexical item, e.g. pos, gen, number, case, or iType (inflectional class).

graph nets encodes a graph, which is a collection of nodes, and arcs which connect the nodes.

graphic core indicates the location of an inline graphic, illustration, or figure.

group textstructure contains the body of a composite text, grouping together a sequence of distinct texts (or groups of such texts) which are regarded as a unit for some purpose, for example the collected works of an author, a sequence of prose essays, etc.

handDesc msdescription (description of hands) contains a description of all the different kinds of writing used in a manuscript.

handNote header (note on hand) describes a particular style or hand distinguished within a manuscript.

handNotes transcr contains one or more handNote elements documenting the different hands identified within the source texts.

handShift transcr marks the beginning of a sequence of text written in a new hand, or the beginning of a scribal stint.

head core (heading) contains any type of heading, for example the title of a section, or the heading of a list, glossary, manuscript description, etc.

headItem core (heading for list items) contains the heading for the item or gloss column in a glossary list or similar structured list.

headLabel core (heading for list labels) contains the heading for the label or term column in a glossary list or similar structured list.

height msdescription contains a measurement measured along the axis at right angles to the bottom of the written surface, i.e. parallel to the spine for a codex or book.

heraldry msdescription contains a heraldic formula or phrase, typically found as part of a blazon, coat of arms, etc.

hi core (highlighted) marks a word or phrase as graphically distinct from the surrounding text, for reasons concerning which no claim is made.

history msdescription groups elements describing the full history of a manuscript or manuscript part.

hom dictionaries (homograph) groups information relating to one homograph within an entry.

hyph dictionaries (hyphenation) contains a hyphenated form of a dictionary headword, or hyphenation information in some other form.

hyphenation header summarizes the way in which hyphenation in a source text has been treated in an encoded version of it.

iNode nets (intermediate (or internal) node) represents an intermediate (or internal) node of a tree.

iType dictionaries (inflectional class) indicates the inflectional class associated with a lexical item.

ident tagdocs (identifier) contains an identifier or name for an object of some kind in a formal language. ident is used for tokens such as variable names, class names, type names, function names etc. in formal programming languages.

idno header (identifier) supplies any form of identifier used to identify some object, such as a bibliographic item, a person, a title, an organization, etc. in a standardized way.

if iso-fs defines a conditional default value for a feature; the condition is specified as a feature structure, and is met if it subsumes the feature structure in the text for which a default value is sought.

iff iso-fs (if and only if) separates the condition from the consequence in a bicond element.

imprimatur textstructure contains a formal statement authorizing the publication of a work, sometimes required to appear on a title page or its verso.

imprint core groups information relating to the publication or distribution of a bibliographic item.

incident spoken any phenomenon or occurrence, not necessarily vocalized or communicative, for example incidental noises or other events affecting communication.

incipit msdescription contains the incipit of a manuscript item, that is the opening words of the text proper, exclusive of any rubric which might precede it, of sufficient length to identify the work uniquely; such incipits were, in former times, frequently used as a means of reference to a work, in place of a title.

index core (index entry) marks a location to be indexed for whatever purpose.

institution msdescription contains the name of an organization such as a university or library, with which a manuscript is identified, generally its holding institution.

interaction corpus describes the extent, cardinality and nature of any interaction among those producing and experiencing the text, for example in the form of response or interjection, commentary, etc.

interp analysis (interpretation) summarizes a specific interpretative annotation which can be linked to a span of text.

interpGrp analysis (interpretation group) collects together a set of related interpretations which share responsibility or type.

interpretation header describes the scope of any analytic or interpretive information added to the text in addition to the transcription.

item core contains one component of a list.

join linking identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it.

joinGrp linking (join group) groups a collection of join elements and possibly pointers.

keywords header contains a list of keywords or phrases identifying the topic or nature of a text.

kinesic spoken any communicative phenomenon, not necessarily vocalized, for example a gesture, frown, etc.

l core (verse line) contains a single, possibly incomplete, line of verse.

label core contains any label or heading used to identify part of a text, typically but not exclusively in a list or glossary.

lacunaEnd textcrit indicates the end of a lacuna in a mostly complete textual witness.

lacunaStart textcrit indicates the beginning of a lacuna in the text of a mostly complete textual witness.

lang dictionaries (language name) name of a language mentioned in etymological or other linguistic discussion.

langKnowledge namesdates (language knowledge) summarizes the state of a person’s linguistic knowledge, either as prose or by a list of langKnown elements.

langKnown namesdates (language known) summarizes the state of a person’s linguistic competence, i.e., knowledge of a single language.

langUsage header (language usage) describes the languages, sublanguages, registers, dialects, etc. represented within a text.

language header characterizes a single language or sublanguage used within a text.

layout msdescription describes how text is laid out on the page, including information about any ruling, pricking, or other evidence of page-preparation techniques.

layoutDesc msdescription (layout description) collects the set of layout descriptions applicable to a manuscript.

lb core (line break) marks the start of a new (typographic) line in some edition or version of a text.

lbl dictionaries (label) contains a label for a form, example, translation, or other piece of information, e.g. abbreviation for, contraction of, literally, approximately, synonyms:, etc.

leaf nets encodes the leaves (terminal nodes) of a tree.

lem textcrit (lemma) contains the lemma, or base text, of a textual variation.

lg core (line group) contains one or more verse lines functioning as a formal unit, e.g. a stanza, refrain, verse paragraph, etc.

licence header contains information about a licence or other legal agreement applicable to the text.

line transcr contains the transcription of a topographic line in the source document.

link linking defines an association or hypertextual link among elements or passages, of some type not more precisely specifiable by other elements.

linkGrp linking (link group) defines a collection of associations or hypertextual links.

list core (list) contains any sequence of items organized as a list.

listBibl core (citation list) contains a list of bibliographic citations of any kind.

listChange header groups a number of change descriptions associated with either the creation of a source text or the revision of an encoded text.

listEvent namesdates (list of events) contains a list of descriptions, each of which provides information about an identifiable event.

listForest nets provides for lists of forests.

listNym namesdates (list of canonical names) contains a list of nyms, that is, standardized names for any thing.

listOrg namesdates (list of organizations) contains a list of elements, each of which provides information about an identifiable organization.

listPerson namesdates (list of persons) contains a list of descriptions, each of which provides information about an identifiable person or a group of people, for example the participants in a language interaction, or the people referred to in a historical source.

listPlace namesdates (list of places) contains a list of places, optionally followed by a list of relationships (other than containment) defined amongst them.

listRef tagdocs (list of references) supplies a list of significant references to places where this element is discussed, in the current document or elsewhere.

listRelation namesdates provides information about relationships identified amongst people, places, and organizations, either informally as prose or as formally expressed relation links.

listTranspose transcr supplies a list of transpositions, each of which is indicated at some point in a document typically by means of metamarks.

listWit textcrit (witness list) lists definitions for all the witnesses referred to by a critical apparatus, optionally grouped hierarchically.

localName gaiji (locally-defined property name) contains a locally defined name for some property.

locale corpus contains a brief informal description of the kind of place concerned, for example: a room, a restaurant, a park bench, etc.

location namesdates defines the location of a place as a set of geographical coordinates, in terms of other named geo-political entities, or as an address.

locus msdescription defines a location within a manuscript or manuscript part, usually as a (possibly discontinuous) sequence of folio references.

locusGrp msdescription groups a number of locations which together form a distinct but discontinuous item within a manuscript or manuscript part, according to a specific foliation.

m analysis (morpheme) represents a grammatical morpheme.

macroRef tagdocs points to the specification for some pattern which is to be included in a schema.

macroSpec tagdocs (macro specification) documents the function and implementation of a pattern.

mapping gaiji (character mapping) contains one or more characters which are related to the parent character or glyph in some respect, as specified by the type attribute.

material msdescription contains a word or phrase describing the material of which the object being described is composed.

measure core contains a word or phrase referring to some quantity of an object or commodity, usually comprising a number, a unit, and a commodity name.

measureGrp core (measure group) contains a group of dimensional specifications which relate to the same object, for example the height and width of a manuscript page.

meeting core contains the formalized descriptive title for a meeting or conference, for use in a bibliographic description for an item derived from such a meeting, or as a heading or preamble to publications emanating from it.

memberOf tagdocs specifies class membership of the documented element or class.

mentioned core marks words or phrases mentioned, not used.

metDecl verse (metrical notation declaration) documents the notation employed to represent a metrical pattern when this is specified as the value of a met, real, or rhyme attribute on any structural element of a metrical text (e.g. lg, l, or seg).

metSym verse (metrical notation symbol) documents the intended significance of a particular character or character sequence within a metrical notation, either explicitly or in terms of other symbol elements in the same metDecl.

metamark transcr contains or describes any kind of graphic or written signal within a document the function of which is to determine how it should be read rather than forming part of the actual content of the document.

milestone core marks a boundary point separating any kind of section of a text, typically but not necessarily indicating a point at which some part of a standard reference system changes, where the change is not represented by a structural element.

mod transcr represents any kind of modification identified within a single document.

moduleRef tagdocs (module reference) references a module which is to be incorporated into a schema.

moduleSpec tagdocs (module specification) documents the structure, content, and purpose of a single module, i.e. a named and externally visible group of declarations.

monogr core (monographic level) contains bibliographic elements describing an item (e.g. a book or journal) published as an independent item (i.e. as a separate physical object).

mood dictionaries contains information about the grammatical mood of verbs (e.g. indicative, subjunctive, imperative).

move drama (movement) marks the actual entrance or exit of one or more characters on stage.

msContents msdescription (manuscript contents) describes the intellectual content of a manuscript or manuscript part, either as a series of paragraphs or as a series of structured manuscript items.

msDesc msdescription (manuscript description) contains a description of a single identifiable manuscript or other text-bearing object.

msIdentifier msdescription (manuscript identifier) contains the information required to identify the manuscript being described.

msItem msdescription (manuscript item) describes an individual work or item within the intellectual content of a manuscript or manuscript part.

msItemStruct msdescription (structured manuscript item) contains a structured description for an individual work or item within the intellectual content of a manuscript or manuscript part.

msName msdescription (alternative name) contains any form of unstructured alternative name used for a manuscript, such as an ocellus nominum, or nickname.

msPart msdescription (manuscript part) contains information about an originally distinct manuscript or part of a manuscript, now forming part of a composite manuscript.

musicNotation msdescription contains description of type of musical notation.

name core (name, proper noun) contains a proper noun or noun phrase.

nameLink namesdates (name link) contains a connecting phrase or link used within a name but not regarded as part of it, such as van der or of.

namespace header supplies the formal name of the namespace to which the elements documented by its children belong.

nationality namesdates contains an informal description of a person’s present or past nationality or citizenship.

node nets encodes a node, a possibly labeled point in a graph.

normalization header indicates the extent of normalization or regularization of the original source carried out in converting it to electronic form.

notatedMusic figures encodes the presence of music notation in a text.

note core contains a note or annotation.

notesStmt header (notes statement) collects together any notes providing information about a text additional to that recorded in other parts of the bibliographic description.

num core (number) contains a number, written in any form.

number dictionaries indicates grammatical number associated with a form, as given in a dictionary.

numeric iso-fs (numeric value) represents the value part of a feature-value specification which contains a numeric value or range.

nym namesdates (canonical name) contains the definition for a canonical name or name component of any kind.

oRef dictionaries (orthographic-form reference) in a dictionary example, indicates a reference to the orthographic form(s) of the headword.

oVar dictionaries (orthographic-variant reference) in a dictionary example, indicates a reference to variant orthographic form(s) of the headword.

objectDesc msdescription contains a description of the physical components making up the object which is being described.

objectType msdescription contains a word or phrase describing the type of object being referred to.

occupation namesdates contains an informal description of a person’s trade, profession or occupation.

offset namesdates that part of a relative temporal or spatial expression which indicates the direction of the offset between the two place names, dates, or times involved in the expression.

opener textstructure groups together dateline, byline, salutation, and similar phrases appearing as a preliminary group at the start of a division, especially of a letter.

org namesdates (organization) provides information about an identifiable organization such as a business, a tribe, or any other grouping of people.

orgName namesdates (organization name) contains an organizational name.

orig core (original form) contains a reading which is marked as following the original, rather than being normalized or corrected.

origDate msdescription (origin date) contains any form of date, used to identify the date of origin for a manuscript or manuscript part.

origPlace msdescription (origin place) contains any form of place name, used to identify the place of origin for a manuscript or manuscript part.

origin msdescription contains any descriptive or other information concerning the origin of a manuscript or manuscript part.

orth dictionaries (orthographic form) gives the orthographic form of a dictionary headword.

p core (paragraph) marks paragraphs in prose.

pRef dictionaries (pronunciation reference) in a dictionary example, indicates a reference to the pronunciation(s) of the headword.

pVar dictionaries (pronunciation-variant reference) in a dictionary example, indicates a reference to variant pronunciation(s) of the headword.

particDesc corpus (participation description) describes the identifiable speakers, voices, or other participants in any kind of text.

pause spoken a pause either between or within utterances.

pb core (page break) marks the boundary between one page of a text and the next in a standard reference system.

pc analysis (punctuation character) a character or string of characters regarded as constituting a single punctuation mark.

per dictionaries (person) contains an indication of the grammatical person (1st, 2nd, 3rd, etc.) associated with a given inflected form in a dictionary.

performance drama contains a section of front or back matter describing how a dramatic piece is to be performed in general or how it was performed on some specific occasion.

persName namesdates (personal name) contains a proper noun or proper-noun phrase referring to a person, possibly including one or more of the person’s forenames, surnames, honorifics, added names, etc.

person namesdates provides information about an identifiable individual, for example a participant in a language interaction, or a person referred to in a historical source.

personGrp namesdates (personal group) describes a group of individuals treated as a single person for analytic purposes.

phr analysis (phrase) represents a grammatical phrase.

physDesc msdescription (physical description) contains a full physical description of a manuscript or manuscript part, optionally subdivided using more specialized elements from the model.physDescPart class.

place namesdates contains data about a geographic location.

placeName namesdates contains an absolute or relative place name.

population namesdates contains information about the population of a place.

pos dictionaries (part of speech) indicates the part of speech assigned to a dictionary headword such as noun, verb, or adjective.

postBox core (postal box or post office box) contains a number or other identifier for some postal delivery point other than a street address.

postCode core (postal code) contains a numerical or alphanumeric code used as part of a postal address to simplify sorting or delivery of mail.

postscript textstructure contains a postscript, e.g. to a letter.

precision certainty indicates the numerical accuracy or precision associated with some aspect of the text markup.

preparedness corpus describes the extent to which a text may be regarded as prepared or spontaneous.

principal header (principal researcher) supplies the name of the principal researcher responsible for the creation of an electronic text.

profileDesc header (text-profile description) provides a detailed description of non-bibliographic aspects of a text, specifically the languages and sublanguages used, the situation in which it was produced, the participants and their setting.

projectDesc header (project description) describes in detail the aim or purpose for which an electronic file was encoded, together with any other relevant information concerning the process by which it was assembled or collected.

prologue drama contains the prologue to a drama, typically spoken by an actor out of character, possibly in association with a particular performance or venue.

pron dictionaries (pronunciation) contains the pronunciation(s) of the word.

provenance msdescription contains any descriptive or other information concerning a single identifiable episode during the history of a manuscript or manuscript part, after its creation but before its acquisition.

ptr core (pointer) defines a pointer to another location.

pubPlace core (publication place) contains the name of the place where a bibliographic item was published.

publicationStmt header (publication statement) groups information concerning the publication or distribution of an electronic or other text.

publisher core provides the name of the organization responsible for the publication or distribution of a bibliographic item.

purpose corpus characterizes a single purpose or communicative function of the text.

q core (quoted) contains material which is distinguished from the surrounding text using quotation marks or a similar method, for any one of a variety of reasons including, but not limited to: direct speech or thought, technical terms or jargon, authorial distance, quotations from elsewhere, and passages that are mentioned but not used.

quotation header specifies editorial practice adopted with respect to quotation marks in the original.

quote core (quotation) contains a phrase or passage attributed by the narrator or author to some agency external to the text.

rdg textcrit (reading) contains a single reading within a textual variation.

rdgGrp textcrit (reading group) within a textual variation, groups two or more readings perceived to have a genetic relationship or other affinity.

re dictionaries (related entry) contains a dictionary entry for a lexical item related to the headword, such as a compound phrase or derived form, embedded inside a larger entry.

recordHist msdescription (recorded history) provides information about the source and revision status of the parent manuscript description itself.

recording spoken (recording event) details of an audio or video recording event used as the source of a spoken text, either directly or from a public broadcast.

recordingStmt spoken (recording statement) describes a set of recordings used as the basis for transcription of a spoken text.

redo transcr indicates one or more cancelled interventions in a document which have subsequently been marked as reaffirmed or repeated.

ref core (reference) defines a reference to another location, possibly modified by additional text or comment.

refState header (reference state) specifies one component of a canonical reference defined by the milestone method.

refsDecl header (references declaration) specifies how canonical references are constructed for this text.

reg core (regularization) contains a reading which has been regularized or normalized in some sense.

region namesdates contains the name of an administrative unit such as a state, province, or county, larger than a settlement, but smaller than a country.

relatedItem core contains or references some other bibliographic item which is related to the present one in some specified manner, for example as a constituent or alternative version of it.

relation namesdates (relationship) describes any kind of relationship or linkage amongst a specified group of objects, places, events or people.

relationGrp namesdates (relation group) provides information about relationships identified amongst people, places, and organizations, either informally as prose or as formally expressed relation links.

remarks tagdocs contains any commentary or discussion about the usage of an element, attribute, class, or entity not otherwise documented within the containing element.

rendition header supplies information about the rendition or appearance of one or more elements in the source text.

repository msdescription contains the name of a repository within which manuscripts are stored, possibly forming part of an institution.

residence namesdates (residence) describes a person’s present or past places of residence.

resp core (responsibility) contains a phrase describing the nature of a person’s intellectual responsibility, or an organization’s role in the production or distribution of a work.

respStmt core (statement of responsibility) supplies a statement of responsibility for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply. May also be used to encode information about individuals or organizations which have played a role in the production or distribution of a bibliographic work.

respons certainty (responsibility) identifies the individual(s) responsible for some aspect of the markup of particular element(s).

restore transcr indicates restoration of text to an earlier state by cancellation of an editorial or authorial marking or instruction.

retrace transcr contains a sequence of writing which has been retraced, for example by over-inking, to clarify or fix it.

revisionDesc header (revision description) summarizes the revision history for a file.

rhyme verse marks the rhyming part of a metrical line.

role drama the name of a dramatic role, as given in a cast list.

roleDesc drama (role description) describes a character’s role in a drama.

roleName namesdates contains a name component which indicates that the referent has a particular role or position in society, such as an official title or rank.

root nets (root node) represents the root node of a tree.

row figures contains one row of a table.

rs core (referencing string) contains a general purpose name or referring string.

rubric msdescription contains the text of any rubric or heading attached to a particular manuscript item, that is, a string of words through which a manuscript signals the beginning of a text division, often with an assertion as to its author and title, which is in some way set off from the text itself, usually in red ink, or by use of different size or type of script, or some other such visual device.

s analysis (s-unit) contains a sentence-like division of a text.

said core (speech or thought) indicates passages thought or spoken aloud, whether explicitly indicated in the source or not, whether directly or indirectly reported, whether by real people or fictional characters.

salute textstructure (salutation) contains a salutation or greeting prefixed to a foreword, dedicatory epistle, or other division of a text, or the salutation in the closing of a letter, preface, etc.

samplingDecl header (sampling declaration) contains a prose description of the rationale and methods used in sampling texts in the creation of a corpus or collection.

schemaSpec tagdocs (schema specification) generates a TEI-conformant schema and documentation for it.

scriptDesc msdescription contains a description of the scripts used in a manuscript or similar source.

scriptNote header describes a particular script distinguished within the description of a manuscript or similar resource.

scriptStmt spoken (script statement) contains a citation giving details of the script used for a spoken text.

seal msdescription contains a description of one seal or similar attachment applied to a manuscript.

sealDesc msdescription (seal description) describes the seals or other external items attached to a manuscript, either as a series of paragraphs or as a series of distinct seal elements, possibly with additional decoNotes.

secFol msdescription (second folio) The word or words taken from a fixed point in a codex (typically the beginning of the second leaf) in order to provide a unique identifier for it.

seg linking (arbitrary segment) represents any segmentation of text below the chunk level.

segmentation header describes the principles according to which the text has been segmented, for example into sentences, tone-units, graphemic strata, etc.

sense dictionaries groups together all information relating to one word sense in a dictionary entry, for example definitions, examples, and translation equivalents.

series core (series information) contains information about the series in which a book or other bibliographic item has appeared.

seriesStmt header (series statement) groups information about the series, if any, to which a publication belongs.

set drama (setting) contains a description of the setting, time, locale, appearance, etc., of the action of a play, typically found in the front matter of a printed performance text (not a stage direction).

setting corpus describes one particular setting in which a language interaction takes place.

settingDesc corpus (setting description) describes the setting or settings within which a language interaction takes place, either as a prose description or as a series of setting elements.

settlement namesdates contains the name of a settlement such as a city, town, or village identified as a single geo-political or administrative unit.

sex namesdates specifies the sex of a person.

shift spoken marks the point at which some paralinguistic feature of a series of utterances by any one speaker changes.

sic core (Latin for thus or so) contains text reproduced although apparently incorrect or inaccurate.

signatures msdescription contains discussion of the leaf or quire signatures found within a codex.

signed textstructure (signature) contains the closing salutation, etc., appended to a foreword, dedicatory epistle, or other division of a text.

soCalled core contains a word or phrase for which the author or narrator indicates a disclaiming of responsibility, for example by the use of scare quotes or italics.

socecStatus namesdates (socio-economic status) contains an informal description of a person’s perceived social or economic status.

sound drama describes a sound effect or musical sequence specified within a screenplay or radio script.

source msdescription describes the original source for the information contained with a manuscript description.

sourceDesc header (source description) describes the source from which an electronic text was derived or generated, typically a bibliographic description in the case of a digitized text, or a phrase such as "born digital" for a text which has no previous existence.

sourceDoc transcr contains a transcription or other representation of a single source document potentially forming part of a dossier génétique or collection of sources.

sp core (speech) An individual speech in a performance text, or a passage presented as such in a prose or verse text.

spGrp drama (speech group) A group of speeches or songs in a performance text presented in a source as constituting a single unit or number.

space transcr indicates the location of a significant space in the copy text.

span analysis associates an interpretative annotation directly with a span of text.

spanGrp analysis (span group) collects together span tags.

speaker core A specialized form of heading or label, giving the name of one or more speakers in a dramatic text or fragment.

specDesc tagdocs (specification description) indicates that a description of the specified element or class should be included at this point within a document.

specGrp tagdocs (specification group) contains any convenient grouping of specifications for use within the current module.

specGrpRef tagdocs (reference to a specification group) indicates that the declarations contained by the specGrp referenced should be inserted at this point.

specList tagdocs (specification list) marks where a list of descriptions is to be inserted into the prose documentation.

sponsor header specifies the name of a sponsoring organization or institution.

stage core (stage direction) contains any kind of stage direction within a dramatic text or fragment.

stamp msdescription contains a word or phrase describing a stamp or similar device.

state namesdates contains a description of some status or quality attributed to a person, place, or organization often at some specific time or for a specific date range.

stdVals header (standard values) specifies the format used when standardized date or number values are supplied.

street core a full street address including any name or number identifying a building as well as the name of the street or route on which it is located.

stress dictionaries contains the stress pattern for a dictionary headword, if given separately.

string iso-fs (string value) represents the value part of a feature-value specification which contains a string.

subc dictionaries (subcategorization) contains subcategorization information (transitive/intransitive, countable/non-countable, etc.)

subst transcr (substitution) groups one or more deletions with one or more additions when the combination is to be regarded as a single intervention in the text.

substJoin transcr (substitution join) identifies a series of possibly fragmented additions, deletions or other revisions on a manuscript that combine to make up a single intervention in the text.

summary msdescription contains an overview of the available information concerning some aspect of an item (for example, its intellectual content, history, layout, typography etc.) as a complement or alternative to the more detailed information carried by more specific elements.

superEntry dictionaries groups a sequence of entries within any kind of lexical resource, such as a dictionary or lexicon which function as a single unit, for example a set of homographs.

supplied transcr signifies text supplied by the transcriber or editor for any reason, typically because the original cannot be read because of physical damage or loss to the original.

support msdescription contains a description of the materials etc. which make up the physical support for the written part of a manuscript.

supportDesc msdescription (support description) groups elements describing the physical support for the written part of a manuscript.

surface transcr defines a written surface as a two-dimensional coordinate space, optionally grouping one or more graphic representations of that space, zones of interest within that space, and transcriptions of the writing within them.

surfaceGrp transcr defines any kind of useful grouping of written surfaces, for example the recto and verso of a single leaf, which the encoder wishes to treat as a single unit.

surname namesdates contains a family (inherited) name, as opposed to a given, baptismal, or nickname.

surplus transcr marks text present in the source which the editor believes to be superfluous or redundant.

surrogates msdescription contains information about any representations of the manuscript being described which may exist in the holding institution or elsewhere.

syll dictionaries (syllabification) contains the syllabification of the headword.

symbol iso-fs (symbolic value) represents the value part of a feature-value specification which contains one of a finite list of symbols.

table figures contains text displayed in tabular form, in rows and columns.

tag tagdocs contains text of a complete start- or end-tag, possibly including attribute specifications, but excluding the opening and closing markup delimiter characters.

tagUsage header supplies information about the usage of a specific element within a text.

tagsDecl header (tagging declaration) provides detailed information about the tagging applied to a document.

taxonomy header defines a typology either implicitly, by means of a bibliographic citation, or explicitly by a structured taxonomy.

tech drama (technical stage direction) describes a special-purpose stage direction that is not meant for the actors.

teiCorpus core contains the whole of a TEI encoded corpus, comprising a single corpus header and one or more TEI elements, each containing a single text header and a text.

teiHeader header (TEI Header) supplies the descriptive and declarative information making up an electronic title page prefixed to every TEI-conformant text.

term core contains a single-word, multi-word, or symbolic designation which is regarded as a technical term.

terrain namesdates contains information about the physical terrain of a place.

text textstructure contains a single text of any kind, whether unitary or composite, for example a poem or drama, a collection of essays, a novel, a dictionary, or a corpus sample.

textClass header (text classification) groups information which describes the nature or topic of a text in terms of a standard classification scheme, thesaurus, etc.

textDesc corpus (text description) provides a description of a text in terms of its situational parameters.

textLang core (text language) describes the languages and writing systems identified within the bibliographic work being described, rather than its description.

then iso-fs separates the condition from the default in an if, or the antecedent and the consequent in a cond element.

time core contains a phrase defining a time of day in any format.

timeline linking (timeline) provides a set of ordered points in time which can be linked to elements of a spoken text to create a temporal alignment of that text.

title core contains a title for any kind of work.

titlePage textstructure (title page) contains the title page of a text, appearing within the front or back matter.

titlePart textstructure contains a subsection or division of the title of a work, as indicated on a title page.

titleStmt header (title statement) groups information about the title of a work and those responsible for its content.

tns dictionaries (tense) indicates the grammatical tense associated with a given inflected form in a dictionary.

trailer textstructure contains a closing title or footer appearing at the end of a division of a text.

trait namesdates contains a description of some status or quality attributed to a person, place, or organization typically, but not necessarily, independent of the volition or action of the holder and usually not at some specific time or for a specific date range.

transpose transcr describes a single textual transposition as an ordered list of at least two pointers specifying the order in which the elements indicated should be re-combined.

tree nets encodes a tree, which is made up of a root, internal nodes, leaves, and arcs from root to leaves.

triangle nets (underspecified embedding tree, so called because of its characteristic shape when drawn) Provides for an underspecified eTree, that is, an eTree with information left out.

typeDesc msdescription contains a description of the typefaces or other aspects of the printing of an incunable or other printed source.

typeNote header describes a particular font or other significant typographic feature distinguished within the description of a printed resource.

u spoken (utterance) a stretch of speech usually preceded and followed by silence or by a change of speaker.

unclear core contains a word, phrase, or passage which cannot be transcribed with certainty because it is illegible or inaudible in the source.

undo transcr indicates one or more marked-up interventions in a document which have subsequently been marked for cancellation.

unicodeName gaiji (unicode property name) contains the name of a registered Unicode normative or informative property.

usg dictionaries (usage) contains usage information in a dictionary entry.

vAlt iso-fs (value alternation) represents the value part of a feature-value specification which contains a set of values, only one of which can be valid.

vColl iso-fs (collection of values) represents the value part of a feature-value specification which contains multiple values organized as a set, bag, or list.

vDefault iso-fs (value default) declares the default value to be supplied when a feature structure does not contain an instance of f for this name; if unconditional, it is specified as one (or, depending on the value of the org attribute of the enclosing fDecl) more fs elements or primitive values; if conditional, it is specified as one or more if elements; if no default is specified, or no condition matches, the value none is assumed.

vLabel iso-fs (value label) represents the value part of a feature-value specification which appears at more than one point in a feature structure.

vMerge iso-fs (merged collection of values) represents a feature value which is the result of merging together the feature values contained by its children, using the organization specified by the org attribute.

vNot iso-fs (value negation) represents a feature value which is the negation of its content.

vRange iso-fs (value range) defines the range of allowed values for a feature, in the form of an fs, vAlt, or primitive value; for the value of an f to be valid, it must be subsumed by the specified range; if the f contains multiple values (as sanctioned by the org attribute), then each value must be subsumed by the vRange.

val tagdocs (value) contains a single attribute value.

valDesc tagdocs (value description) specifies any semantic or syntactic constraint on the value that an attribute may take, additional to the information carried by the datatype element.

valItem tagdocs documents a single attribute-value within a list of possible or mandatory items.

valList tagdocs (value list) contains one or more valItem elements defining possible values for an attribute.

value gaiji (value) contains a single value for some property, attribute, or other analysis.

variantEncoding textcrit declares the method used to encode text-critical variants.

view drama describes the visual context of some part of a screen play in terms of what the spectator sees, generally independent of any dialogue.

vocal spoken any vocalized but not necessarily lexical phenomenon, for example voiced pauses, non-lexical backchannels, etc.

w analysis (word) represents a grammatical (not necessarily orthographic) word.

watermark msdescription contains a word or phrase describing a watermark or similar device.

when linking indicates a point in time either relative to other elements in the same timeline tag, or absolutely.

width msdescription contains a measurement measured along the axis parallel to the bottom of the written surface, i.e. perpendicular to the spine of a book or codex.

wit textcrit contains a list of one or more sigla of witnesses attesting a given reading, in a textual variation.

witDetail textcrit (witness detail) gives further information about a particular witness, or witnesses, to a particular reading.

witEnd textcrit (fragmented witness end) indicates the end, or suspension, of the text of a fragmentary witness.

witStart textcrit (fragmented witness start) indicates the beginning, or resumption, of the text of a fragmentary witness.

witness textcrit contains either a description of a single witness referred to within the critical apparatus, or a list of witnesses which is to be referred to by a single sigil.

writing spoken a passage of written text revealed to participants in the course of a spoken text.

xr dictionaries (cross-reference phrase) contains a phrase, sentence, or icon referring the reader to some other location in this or another text.

zone transcr defines any two-dimensional area within a surface element.

5.13 Wilfred Owen: Letter To Leslie Gunston

5.14 Wilfred Owen: Preface MS

5.15 Stuart Lee interviews Ian Hislop (fragment)

[gap for sampling purposes]

Lee (24.27-24.36): So em d-em having read [clicking sound: 0.17s] the Wipers Times now and and your [pause: 0.62s] view that th th the thirties was that that prisonment you say

Hislop (24.36): Yeah.

Lee (24.36-24.42): [clicking sound: 1.28s] looking back on a failed piece [pause: 0.35s] has your attitude to the to the War Poets Wilfred Owen Sassoon [pause: 0.30s] changed over time?

Hislop (24.42-27.09): [pause: 0.76s] Not really I mean I’ve-I read the Owen again, em [pause: 0.20s] very recently and just thought how brilliant [laughs in -iant syllable] [pause: 0.40s] um those poems are [pause: 0.50s] um [pause: 0.65s] and it’s [pause: 0.70s] it’s not that um [pause: 0.70s] I don’t think they’re any good I just think there were other voices [pause: 0.50s] um and [pause: 0.50s] they have been taken as all there was I mean I think they do express the horror [pause: 0.63s] quite wonderfully [pause: 0.50s] um it’s like the [pause: 0.45s] the painting of the um [pause: 0.55s] ’The Gassed’ [Note: John Singer Sargent, 1918] [pause: 0.60s] um that’s um Sargent

Lee (25.13): Yes.

Hislop (25.13-26.35): um [pause: 0.50s] which I saw again and just was completely blown away by how brilliant he was and that [pause: 1.15s] that is what the First World War was about [pause: 0.75s] [clicking sound: 0.06s] but [pause: 0.95s]

there were some other [clicking sound: 0.5s] bits [pause: 0.20s] um, and it is [pause: 0.50s] I think MacDonald has done it really well in his ’Voices’ and again in those pictures you you [pause: 0.75s] yes there is the horror [pause: 0.40s] um and um horror like you haven’t seen before and that’s expressed but [pause: 0.60s] um [pause: 0.65s] you know it lasted for four years and [pause: 0.81s] not every day was the first day of the Battle of the Somme [pause: 0.70s] and there are [pause: 1.05s] other things that were happening [pause: 0.90s] in the trenches and whatever that were um [pause: 0.70s] worth remembering A and it was so [pause: 0.40s] certainly vital in changing how Britain was [pause: 1.80s] And when I came [pause: 0.65s] I was doing that programme about the memorials and I thought [pause: 1.30s] the first one I saw where [pause: 0.60s] there were no ranks [pause: 1.40s] the dead were all just listed [pause: 0.80s] um and this is the first time in British history [pause: 0.40s] um [pause: 0.35s] and it was because of the people in the trenches said "we fought together, [pause: 0.30s] we’re gonna die together" [pause: 0.35s] um [pause: 0.40s] and it was that feeling of [pause: 0.80s] well [pause: 0.75s] something really fundamental had changed [Note: syllables -damental while laughing] [pause: 0.45s] um, and when they came back [pause: 0.65s] [clicking sound: 0.8s] you know [pause: 0.82s] flawed for the rest of the century [pause: 0.45s] that’s when it all happened [pause: 0.30s]

Lee (26.35): hmm

Hislop (26.36-27.09): I think it’s difficult to read the history of the century certainly the middle bits of it [pause: 0.30s] without remembering that [pause: 1.20s] someone said there was an enormous black cloud over [pause: 1.23s] Britain [pause: 1.12s] of grief [pause: 1.00s] um [pause: 0.50s] and it’s just the fact that everyone had lost someone [pause: 0.90s] and um [pause: 0.60s] we tend to assume that everyone goes on and it’s normal and they do their things but actually for almost the entire country [pause: 0.83s]

everybody was spending every day some bit of it thinking [pause: 0.92s] "they’re dead" [pause: 1.45s] I always find that extraordinary [pause: 0.30s]

[gap for sampling purposes]

6 Workshop: Working with TEI Texts

This advanced workshop will teach how to do something practical with your TEI XML texts beyond simply converting them to HTML and putting them on the web. A mixture of talks and practical exercises will take participants through:

• Advanced validation and integrity checking using TEI ODD, Schematron and XSLT

• Transforming your TEI XML to formats other than HTML (Word, ePub, LaTeX etc.)

• Extracting data from TEI texts for further analysis (e.g. names and places)

• Processing some more complex TEI documents (e.g. genetic encoding and timelines)

• Storing TEI documents in an XML database and querying them

Our document sets will consist of some 18th century ECCO texts (full set from http://www.ota.ox.ac.uk/catalogue/), the diaries of William Godwin of the years 1788-1791 (full set from http://godwindiary.bodleian.ox.ac.uk/godwindiary.zip), a set of Greek epigraphical records (full set from http://irt.kcl.ac.uk/irt2009/redist/inscr/irt2009-P5.zip), and a poem of Wilfred Owen.

6.1 Timetable

When       Subject

Mon am     The ODD system and using constraints in Schematron

Mon pm     a) write ODD from scratch with embedded Schematron constraint and check document instances; b) write an XSLT stylesheet which analyzes some aspect of the document set.

Tues am    Processing some more complex TEI documents, and XSLT / XPath techniques needed

Tues pm    Write XSLT stylesheet to display facsimile encoding (<facsimile> and <sourceDoc> elements) in web page, using combination of HTML and CSS.

Wed am     Extracting and summarizing data in TEI texts, looking for names, dates and places. Understanding functions, grouping and sorting techniques in XSLT.

Wed pm     Extract catalogue of names and dates, and visualize the results by creating CSV file and loading it into spreadsheet.

Thurs am   XML databases and techniques for managing large-scale collections. Demonstration of systems including eXist, BaseX, Cocoon, Solr, etc. Understanding XQuery.

Thurs pm   Set up BaseX database, import XML files, and make XQueries against the system to generate HTML files

Fri am     Transforming TEI XML to and from formats other than HTML (Word, ePub, LaTeX etc)

Fri pm     Set up a local instance of the TEI stylesheet family; define a Word template; create a Word document; develop stylesheet to turn the Word into TEI XML. Experiment with round-tripping.

Requirements: You must already have a good basic knowledge of XML and TEI, and some familiarity with programming/scripting ideas. Most of the work will be based on XSLT and XPath.

6.2 Data samples

6.2.1 ECCO

<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader><fileDesc><titleStmt><title>The carpenter: or, the danger of evil

company.</title><author>More, Hannah, 1745-1833.</author>

</titleStmt><publicationStmt><publisher>University of Michigan Library</publisher><pubPlace>Ann Arbor, Michigan</pubPlace><date>2009 April</date><availability><p>These pages may be freely searched and displayed.

Permission must be received for subsequent distribution in print or electronically. Please go to http://www.lib.umich.edu/tcp/ecco for more information.</p>

</availability><idno type="STC">n001762</idno><idno type="TCP">K001133.000</idno><idno type="BIBNO">cw3317776505</idno><idno type="ECRP">0098301900</idno>

</publicationStmt><sourceDesc><biblFull><titleStmt><title>The carpenter: or, the danger of evil

company.</title><author>More, Hannah, 1745-1833.</author>

</titleStmt><extent>1 sheet : ill. ; 10.</extent><publicationStmt><pubPlace>[Bath] :</pubPlace><publisher>Sold by S. Hazard, at Bath; by J.

Marshall, Cheap-side, and Aldermary church-yard; R. White, London; and by all booksellers, newsmen, and hawkers, in town and country,</publisher>

<date>[1795?]</date></publicationStmt><notesStmt><note>Signed at end: Z, i.e. Hannah More.</note><note>Verse.</note><note>At head of title: Cheap repository.</note><note>Reproduction of original from the Harvard

University Houghton Library.</note><note>English Short Title Catalog, ESTCN1762.</note><note>Electronic data. Farmington Hills, Mich. :

Thomson Gale, 2003. Page image (PNG). Digitized image of the microfilm version produced in Woodbridge, CT by Research Publications, 1982-2002 (later known as Primary Source Microfilm, an imprint of the Gale Group).</note>

</notesStmt></biblFull>

</sourceDesc></fileDesc><encodingDesc><projectDesc>

<p>Created by converting TCP files to TEI P5 using tcp2tei.xsl, TEI @ Oxford. </p>

</projectDesc><editorialDecl n="4"><p>This electronic text file was keyed from page images and

partially proofread for accuracy. Character capture and encoding have been done following the guidelines of the ECCO Text Creation Partnership, which correspond roughly to the recommendations found in Level 4 of the TEI in Libraries Guidelines. Digital page images are linked to the text file.</p>

</editorialDecl></encodingDesc><profileDesc><langUsage><language ident="eng">eng</language>

</langUsage></profileDesc>

</teiHeader><text xml:lang="eng"><body><div type="poem"><pb facs="1" rend="none"/><head>THE CARPENTER; Or, the DANGER of EVIL COMPANY.</head><lg><l>THERE was a young West-country man,</l><l>A Carpenter by trade;</l><l>A skilful wheelwright too was he,</l><l>And few such Waggons made.</l>

</lg><lg><l>No Man a tighter Barn cou’d build,</l><l>Throughout his native town,</l><l>Thro’ many a village round was he,</l><l>The best of workmen known.</l>

</lg><lg><l>His father left him what he had,</l><l>In sooth it was enough;</l><l>His shining pewter, pots of brass,</l><l>And all his household stuff.</l>

</lg><lg><l>A little cottage too he had,</l><l>For ease and comfort plann’d,</l><l>And that he might not lack for ought,</l><l>An acre of good land.</l>

</lg></div>

</body><back><div type="colophon"><p>Sold by S. HAZARD, (PRINTER to the CHEAP REPOSITORY for

Religious and Moral Tracts) at BATH; By J. MARSHALL, PRINTER to the CHEAP REPOSITORIES No. 17, Queen-Street, Cheap-Side, and No. 4, Aldermary Church-Yard; R. WHITE, Piccadilly, LONDON; and by all Booksellers, Newsmen, and Hawkers, in Town and Country.--Great Allowance will be made to Shopkeepers and Hawkers.</p>

<p>Price an Half-penny, or 2s. 3d, per 100. 1s. 3d, for 50, 9d. for 25.</p>

</div>

</back></text>

</TEI>
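
As an illustration of analysing "some aspect of the document set", the following is a minimal sketch in Python rather than the XSLT used in the workshop. It counts the stanzas (lg) and lines (l) of each poem division. The SAMPLE string is a cut-down copy of the text above, and the function name summarize_poems is a hypothetical helper introduced only for this example.

```python
import xml.etree.ElementTree as ET

# Prefix mapping for the TEI namespace declared on the root element.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Cut-down copy of the ECCO sample above (first two stanzas only).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text xml:lang="eng"><body><div type="poem">
    <head>THE CARPENTER; Or, the DANGER of EVIL COMPANY.</head>
    <lg>
      <l>THERE was a young West-country man,</l>
      <l>A Carpenter by trade;</l>
      <l>A skilful wheelwright too was he,</l>
      <l>And few such Waggons made.</l>
    </lg>
    <lg>
      <l>No Man a tighter Barn cou'd build,</l>
      <l>Throughout his native town,</l>
      <l>Thro' many a village round was he,</l>
      <l>The best of workmen known.</l>
    </lg>
  </div></body></text>
</TEI>"""


def summarize_poems(xml_text):
    """Return (head, stanza_count, line_count) for each div of type 'poem'."""
    root = ET.fromstring(xml_text)
    rows = []
    for div in root.iterfind(".//tei:div[@type='poem']", TEI_NS):
        head = div.findtext("tei:head", default="", namespaces=TEI_NS)
        stanzas = div.findall("tei:lg", TEI_NS)   # direct child stanzas
        lines = div.findall(".//tei:l", TEI_NS)   # all verse lines below the div
        rows.append((" ".join(head.split()), len(stanzas), len(lines)))
    return rows


if __name__ == "__main__":
    for head, n_stanzas, n_lines in summarize_poems(SAMPLE):
        print(f"{head}: {n_stanzas} stanzas, {n_lines} lines")
```

The same counts could be obtained in XPath with count(.//lg) and count(.//l) evaluated on the poem division, with the TEI namespace bound to a prefix or set as the default.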

6.2.2 Godwin

<div xml:id="g1798" type="dYear"><div type="dMonth" xml:id="g1798-01"><ab type="dDay" xml:id="g1798-01-01"><date when="1798-01-01">Jan. 1. 1798. M.</date><ref type="dText" subtype="read" target="/bibl/te0807.html">Burke’s

3<hi rend="sup">rd</hi> Letter, p. 34</ref>:<ref type="dText" subtype="read" target="/bibl/te0808.html">Rival

Queens, acts 1, 2, 3</ref>. <seg type="dMeeting" subtype="CG"><persName ref="/people/FAW01.html">Fawcet</persName>

calls</seg>: <seg type="dMeal" subtype="SG"><persName ref="/people/MAR01.html">M</persName> sups</seg>.

<seg type="dMeeting" subtype="M">meet <persName>Barnes</persName></seg>.</ab>

<ab type="dDay" xml:id="g1798-01-02"><date when="1798-01-02">2. Tu.</date><ref type="dWrote" subtype="write" target="/works/leon01.html">O. M., p. 2,

3</ref>. <ref type="dText" subtype="read" target="/bibl/te0810.html">Burke’s Memorials, p. 40</ref>.

<seg type="dMeeting" subtype="CG"><persName ref="/people/COO05.html">Miss Cooper</persName>

</seg>, <seg type="dMeeting" subtype="CG"><persName ref="/people/HOL10.html">mrs Cole</persName>

</seg>, <seg type="dMeeting" subtype="CG"><persName ref="/people/HOL06.html">F Ht</persName>

</seg> &amp; <seg type="dMeeting" subtype="CG"><persName ref="/people/FEN01.html">F</persName> call</seg>:

<seg type="dMeal" subtype="D">dine at <persName ref="/people/JOH01.html"><placeName type="venue">Johnson’s</placeName>

</persName>, w. <persName ref="/people/FUS01.html">Fuseli</persName> &amp; <persName>Wilkinson</persName>. </seg>

<seg type="dMeeting" subtype="See"><persName ref="/people/CAR01.html">Carlisle</persName> &amp;

<persName ref="/people/COM01.html">Combe</persName></seg>.</ab>

<ab type="dDay" xml:id="g1798-01-03"><date when="1798-01-03">3. W.</date><ref type="dText" subtype="read" target="/bibl/te0810.html">Memorials, p.

122</ref>. <seg type="dMeeting" subtype="CG"><persName ref="/people/COM01.html">Combe</persName>

</seg> &amp; <seg type="dMeeting" subtype="CG"><persName ref="/people/WHI03.html">White</persName>

call</seg>: <seg type="dMeeting" subtype="C">call on <persName ref="/people/LES02.html" type="nah">

<placeName type="venue">Leslie</placeName></persName> n</seg>, <seg type="dMeeting" subtype="C"><persName ref="/people/KEA01.html" type="nah"><placeName type="venue">Kearsley</placeName>

</persName> n</seg>, &amp; <seg type="dMeeting" subtype="C"><persName ref="/people/NIC01.html" type="nah"><placeName type="venue">Nicholson</placeName>

</persName> n</seg>. <ref type="dEntertainment" subtype="Theat" target="/plays/cast01.html"><placeName type="DL"/>Theatre, 3/10 Castle

Spectre</ref>.</ab>

</div></div>
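
The Wednesday exercise of extracting a catalogue of names and dates into a CSV file can be sketched as follows. This is a hedged illustration in Python, not the XSLT solution used in the workshop. It assumes the fragment has no namespace declaration, as in the printed sample; complete TEI files would need a namespace mapping. The function names names_by_day and to_csv, and the CSV column names, are hypothetical choices made for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Cut-down copy of the diary fragment above. Like the printed sample it has
# no namespace declaration; full TEI files would need a namespace mapping.
SAMPLE = """<div type="dMonth">
  <ab type="dDay">
    <date when="1798-01-01">Jan. 1. 1798. M.</date>
    <seg type="dMeeting" subtype="CG"><persName
      ref="/people/FAW01.html">Fawcet</persName> calls</seg>:
    <seg type="dMeeting" subtype="M">meet <persName>Barnes</persName></seg>.
  </ab>
  <ab type="dDay">
    <date when="1798-01-02">2. Tu.</date>
    <seg type="dMeal" subtype="D">dine at <persName
      ref="/people/JOH01.html"><placeName
      type="venue">Johnson's</placeName></persName>, w. <persName
      ref="/people/FUS01.html">Fuseli</persName> &amp;
      <persName>Wilkinson</persName>. </seg>
  </ab>
</div>"""


def names_by_day(xml_text):
    """Yield (ISO date, event type, name as written, ref or '') per persName."""
    root = ET.fromstring(xml_text)
    for day in root.iter("ab"):
        if day.get("type") != "dDay":
            continue
        date = day.find("date")
        when = date.get("when", "") if date is not None else ""
        for seg in day.iter("seg"):
            for pers in seg.iter("persName"):
                # itertext() also picks up text of nested elements such as placeName
                name = " ".join("".join(pers.itertext()).split())
                yield when, seg.get("type", ""), name, pers.get("ref", "")


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date", "event", "name", "ref"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(names_by_day(SAMPLE)), end="")
```

Names without a ref attribute (Barnes, Wilkinson) are kept with an empty ref column, so unidentified people remain visible when the file is loaded into a spreadsheet.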

6.2.3 IRT

<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader><fileDesc><titleStmt><title><rs type="textType">Dedication</rs> to Geta </title>

<editor>J. M. Reynolds</editor><editor>J. B. Ward-Perkins</editor>

</titleStmt><publicationStmt><authority>Centre for Computing in the Humanities, King’s

College London</authority><idno type="filename">IRT036</idno><availability><p>Creative Commons licence Attribution UK 2.0

(<ref>http://creativecommons.org/licenses/by/2.0/uk/</ref>). </p><p>All reuse or distribution of this work must contain

somewhere a link back to the URL <ref>http://irt.kcl.ac.uk/</ref></p>

</availability></publicationStmt><sourceDesc><bibl xml:id="irt1952"><author>J. M. Reynolds</author> and <author>J. B.

Ward-Perkins</author>, <title level="m">The Inscriptions of Roman Tripolitania</title>,

<pubPlace>Rome</pubPlace>: <publisher>British School at Rome</publisher>, <date>1952</date>. </bibl>

<msDesc><msIdentifier/><physDesc><objectDesc><supportDesc><support><p> Impression left by a fragment from the lower

part of a lost <material>marble</material><rs type="objectType">panel</rs> (approx. <dimensions><width unit="metre">0.34</width><height unit="metre">0.39</height>

</dimensions>).</p></support>

</supportDesc><layoutDesc><layout>Original text inscribed within a moulded

border. </layout></layoutDesc>

</objectDesc><handDesc><handNote>Capitals: <height unit="metre">0.07</height>. </handNote>

</handDesc></physDesc><history><origin><p>Unknown</p><origDate notBefore="0198-12-10" notAfter="0199-12-09" evidence="titulature">Between

10th Dec. A.D. 198 and 9th Dec. 199. (titulature)</origDate>

</origin><provenance><listEvent><event type="found"><p><placeName

type="ancientFindspot" ref="http://atlantides.org/batlas/abrotonum-sabratha-35-e2" key="db659">Sabratha</placeName>: <rs type="monuList" key="db853">Office

Baths</rs>, re-used in the fourth century pavement of the S Caldarium. </p>

</event><event type="observed"><p>Findspot</p>

</event></listEvent>

</provenance></history>

</msDesc></sourceDesc>

</fileDesc><encodingDesc><p>Marked-up according to the EpiDoc Guidelines version 8</p>

</encodingDesc><profileDesc><langUsage><language ident="ar">Arabic</language><language ident="en">English</language><language ident="fr">French</language><language ident="de">German</language><language ident="grc">Ancient Greek</language><language ident="grc-Latn">Transliterated Greek</language><language ident="el">Modern Greek</language><language ident="he">Hebrew</language><language ident="it">Italian</language><language ident="la">Latin</language><language ident="phn-LY">Punic</language><language ident="ber-Latn">Native Libyan language in Latin

script</language></langUsage><textClass/><textClass><keywords scheme="IRCyr"><term><geogName type="ancientRegion" key="Tripolitania">Tripolitania</geogName>

</term><term><geogName type="modernCountry" key="LY">Libya</geogName>

</term><term><placeName

type="modernFindspot" key="http://www.geonames.org/2208578/marsa-zawaghah.html">Marsa

Zawaghah</placeName></term>

</keywords></textClass>

</profileDesc></teiHeader>

<text><body><div type="bibliography"><head>Bibliography</head><p>Not previously published.</p>

</div><div subtype="text-constituted-from" type="history"><head>Text constituted from</head><p>Transcription (Reynolds, Ward-Perkins)</p>

</div><div type="edition" xml:lang="la" xml:space="preserve"><head xml:lang="en">Text</head><ab><lb n="1"/><persName type="emperor"><supplied reason="lost"><name type="praenomen" nymRef="Publius"><expan><abbr>P</abbr><ex>ublio</ex>

</expan></name><name type="gentilicium" nymRef="Septimius">Septimio</name><name type="cognomen" nymRef="Geta">Getae</name><w lemma="nobilis">nobilissimo</w><name type="cognomen" nymRef="Caesar">Caesari</name>

</supplied></persName><lb n="2"/><supplied reason="lost"><w lemma="imperator"><expan><abbr>Imp</abbr><ex>eratoris</ex>

</expan></w>

</supplied><persName type="emperor"><supplied reason="lost"><name type="cognomen" nymRef="Caesar"><expan><abbr>Caes</abbr><ex>aris</ex>

</expan></name><name type="praenomen" nymRef="Lucius"><expan><abbr>L</abbr><ex>uci</ex>

</expan></name><name type="gentilicium" nymRef="Septimius">Septimi</name><name type="cognomen" nymRef="Seuerus">Seueri</name><name type="cognomen" nymRef="Pius">Pii</name><name type="cognomen" nymRef="Pertinax">Pertinacis</name>

</supplied><lb n="3"/><supplied reason="lost"><name type="cognomen" nymRef="Augustus"><expan><abbr>Aug</abbr><ex>usti</ex>

</expan></name>

</supplied><name><supplied reason="lost">Arab</supplied>ici</name>

<name type="cognomen" nymRef="Adiabenicus">Ad<supplied reason="lost">iabenici</supplied></name><supplied reason="lost"><name type="cognomen" nymRef="Parthicus">Parthici</name><w lemma="magnus">maximi</w>

</supplied></persName><lb n="4"/><supplied reason="lost"><w lemma="tribunicius"><expan><abbr>trib</abbr><ex>unicia</ex>

</expan></w><w lemma="potestas"><expan><abbr>pot</abbr><ex>estate</ex>

</expan></w>

</supplied><num value="7">VII</num><w lemma="imperator"><expan><abbr>imp</abbr><ex>eratoris</ex>

</expan></w><num value="11">X<supplied reason="lost">I</supplied></num><supplied reason="lost"><w lemma="consul"><expan><abbr>co</abbr><ex>n</ex><abbr>s</abbr><ex>ulis</ex>

</expan></w><num value="2">II</num><w lemma="pater"><expan><abbr>p</abbr><ex>atris</ex>

</expan></w><w lemma="patria"><expan><abbr>p</abbr><ex>atriae</ex>

</expan></w><w lemma="proconsul"><expan><abbr>proco</abbr><ex>n</ex>

<abbr>s</abbr><ex>ulis</ex>

</expan></w>

</supplied><supplied reason="lost"><w lemma="filius">filio</w><w lemma="et">et</w>

</supplied><supplied reason="lost"><w lemma="imperator"><expan><abbr>Imp</abbr><ex>eratoris</ex>

</expan></w>

</supplied><lb n="5"/><persName type="emperor"><supplied reason="lost"><name type="gentilicium" nymRef="Antoninus">Antonini</name>

</supplied><name type="cognomen" nymRef="Augustus"><expan><abbr><supplied reason="lost">Au</supplied>g</abbr>

<ex>usti</ex></expan>

</name></persName><w lemma="frater">fratri</w><gap reason="lost" extent="unknown" unit="character"/></ab>

</div><div type="translation" xml:space="preserve"><head>Translation</head><p><supplied reason="lost">To Publius Septimius Geta,

most noble Caesar, son of Emperor Caesar Lucius Septimius Severus, Pius, Pertinax, Augustus</supplied> Victor in Arabia,

Victor in Adiabene, <supplied reason="lost">greatest Victor in Parthia, holding tribunician power for the</supplied> seventh time, acclaimed

victor <supplied reason="lost">eleven times, consul twice, father of the country, proconsul and</supplied> brother

<supplied reason="lost">of Emperor Antoninus</supplied> Augustus<gap reason="lost"/></p>

</div><div type="commentary"><head>Commentary</head><p>l. 4. trib. pot. VII. 10 Dec. 198 to 9 Dec. 199.</p>

</div></body>

</text></TEI>
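
The nesting of expan, abbr, ex and supplied in the edition above can be hard to read in raw form. The following minimal Python sketch flattens one line of it into a reading text, using the common Leiden-style convention of round brackets for expanded abbreviations and square brackets for restored text. It is an assumption-laden illustration, not the behaviour of the EpiDoc stylesheets; the names leiden_text and render are hypothetical, and SAMPLE is a cut-down copy of line 4 of the inscription.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Cut-down copy of line 4 of the IRT edition above.
SAMPLE = """<ab xmlns="http://www.tei-c.org/ns/1.0"><lb n="4"/>
<supplied reason="lost"><w lemma="tribunicius"><expan><abbr>trib</abbr><ex>unicia</ex></expan></w>
<w lemma="potestas"><expan><abbr>pot</abbr><ex>estate</ex></expan></w></supplied>
<num value="7">VII</num>
<w lemma="imperator"><expan><abbr>imp</abbr><ex>eratoris</ex></expan></w>
<num value="11">X<supplied reason="lost">I</supplied></num></ab>"""

# Leiden-style brackets: (...) for editorial expansion, [...] for restored text.
BRACKETS = {TEI + "ex": ("(", ")"), TEI + "supplied": ("[", "]")}


def leiden_text(element):
    """Flatten an element to text, bracketing ex and supplied content."""
    opener, closer = BRACKETS.get(element.tag, ("", ""))
    parts = [opener, element.text or ""]
    for child in element:
        parts.append(leiden_text(child))   # recurse into nested markup
        parts.append(child.tail or "")     # text following the child element
    parts.append(closer)
    return "".join(parts)


def render(xml_text):
    """Return the single-line, whitespace-normalized Leiden-style reading."""
    return " ".join(leiden_text(ET.fromstring(xml_text)).split())


if __name__ == "__main__":
    print(render(SAMPLE))
    # prints: [trib(unicia) pot(estate)] VII imp(eratoris) X[I]
```

Because the function recurses through every element, a restored letter inside a partly preserved numeral is still bracketed on its own, as in X[I].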

6.2.4 Wilfred Owen

<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader type="text"><fileDesc><titleStmt><title>

<orgName ref="#TEI_Consortium">TEI</orgName> Learning Environment - <persName ref="#Wilfred_Owen">Wilfred

Owen</persName></title><principal>Encoded by <persName ref="#Elena_Pierazzo">Elena

Pierazzo</persName> and <persName ref="#Renée_van_Baalen">Renee van Baalen</persName>

</principal></titleStmt><publicationStmt><publisher><orgName ref="#TEI_Consortium">TEI Consortium</orgName>

</publisher><distributor><orgName ref="#OUCS"><placeName ref="#Oxford">Oxford</placeName>

University Computing Services</orgName></distributor><authority><persName ref="#Sebastian_Rahtz">Sebastian

Rahtz</persName></authority><pubPlace><placeName ref="#Oxford">Oxford</placeName>

</pubPlace><address><street>13 Banbury Road</street><postCode>OX2 6NN</postCode><placeName><settlement ref="#Oxford">Oxford</settlement><country ref="#UK">United Kingdom</country>

</placeName></address><availability><p><ref

target="http://creativecommons.org/licenses/by-nc-sa/3.0/"><orgName ref="#Creative_Commons">Creative

Commons</orgName> Attribution-NonCommercial-ShareAlike 3.0 Unported License.</ref>

</p><p>First draft <orgName ref="#TEI_Consortium">TEI</orgName> Learning

Environment by <persName ref="#Renée_van_Baalen">Renee van Baalen</persName>, 2012-01-10.</p>

</availability><date when="2012-01-10">10th January 2012</date>

</publicationStmt><sourceDesc><biblFull xml:id="poems"><titleStmt><title type="full"><title type="main">Joint Information System’s

Committee Technology Applications Program (JTAP)</title>

<title type="sub">Project ’Virtual Seminars for Teaching Literature’</title>

<title type="sub">Tutorial 4</title><title type="sub">’Strange Meeting’</title>

</title><author><persName ref="#Wilfred_Owen">Wilfred

Owen</persName></author><editor role="director"><persName ref="#Stuart_Lee">Stuart Lee</persName>

</editor><editor role="project_officer"><persName>Paul Groves</persName>

</editor><editor role="encoder"><persName ref="#Elena_Pierazzo">Elena

Pierazzo</persName></editor>

</titleStmt><publicationStmt><publisher><orgName ref="#University_of_Oxford">University of<placeName ref="#Oxford">Oxford</placeName></orgName>

</publisher><pubPlace><placeName ref="#Oxford">Oxford</placeName>

</pubPlace><date from="1996-10" to="1998-10">From October 1996

to October 1998</date></publicationStmt><sourceDesc><biblFull><titleStmt><title type="full"><title type="main">The First World War Poetry

Digital Archive</title><title type="sub">The <persName ref="#Wilfred_Owen">Wilfred

Owen</persName> Collection</title>

<title type="sub">Poems by <persName ref="#Wilfred_Owen">Wilfred Owen</persName>

</title><title type="sub">Strange Meeting</title>

</title><author><persName ref="#Wilfred_Owen">Wilfred

Owen</persName></author><editor role="director"><persName ref="#Stuart_Lee">Stuart Lee</persName>

</editor><editor role="project_manager"><persName ref="#Kate_Lindsay">Kate

Lindsay</persName></editor><editor role="technical_specialist"><persName ref="#Michael_Loizou">Michael

Loizou</persName></editor><editor role="cataloguer"><persName ref="#Everett_Sharp">Everett

Sharp</persName></editor><editor role="cataloguer"><persName ref="#Alisa_Miller">Alisa

Miller</persName></editor>

<editor role="research_officer"><persName ref="#Alun_Edwards">Alun

Edwards</persName></editor><editor role="web_developer"><persName ref="#Richard_Doe">Richard

Doe</persName></editor><editor role="web_designer"><persName ref="#Joseph_Talbot">Joseph

Talbot</persName></editor>

</titleStmt><editionStmt><edition>First edition</edition>

</editionStmt><publicationStmt><publisher><orgName ref="#University_of_Oxford">University of<placeName ref="#Oxford">Oxford</placeName></orgName>

</publisher><pubPlace><placeName ref="#Oxford">Oxford</placeName>

</pubPlace><date when="2008-11-11">11th November 2008</date>

</publicationStmt><sourceDesc><msDesc xml:id="strange_meeting"><msIdentifier><placeName ref="#London"><settlement>London</settlement>, <country ref="#UK">United

Kingdom</country></placeName><institution>The British Library</institution><repository>The Wilfred Owen Literary

Estate</repository><idno>This is no. 148 in ed. ’The Complete Poems

and Fragments’.</idno></msIdentifier><msContents><msItem><title>Strange Meeting</title><author><persName ref="#Wilfred_Owen">Wilfred

Owen</persName></author><docImprint><pubPlace><placeName ref="#Scarborough">Scarborough</placeName>

</pubPlace><address><placeName><country ref="#UK">United Kingdom</country>

</placeName></address><date from="1918-01" to="1918-03">January to March

1918</date></docImprint>

</msItem></msContents><physDesc>


<objectDesc form="poem"><supportDesc><support><dimensions type="leaves"/>

</support><extent>One leaf, handwritten double

sided.</extent></supportDesc>

</objectDesc><handDesc><p>Handwritten by <persName ref="#Wilfred_Owen">Wilfred

Owen</persName>.</p></handDesc>

</physDesc></msDesc>

</sourceDesc></biblFull><listPlace><place xml:id="London"><placeName><settlement>London</settlement><country ref="#UK">United Kingdom</country>

</placeName></place><place xml:id="Oxford"><placeName><settlement>Oxford</settlement><country ref="#UK">United Kingdom</country>

</placeName></place><place xml:id="Belgium"><placeName><country ref="#Belgium">Belgium</country>

</placeName></place><place xml:id="Flanders"><placeName><region ref="#Flanders">Flanders</region><country ref="#Belgium">Belgium</country>

</placeName></place><place xml:id="France"><placeName><country>France</country>

</placeName></place><place xml:id="Germany"><placeName><country ref="#Germany">Germany</country>

</placeName></place><place xml:id="Hell"><placeName><geogName>Hades</geogName><geogName>Hell</geogName><geogName>Underworld</geogName>

</placeName><note><p>Fictional geographic location.</p>

</note>


</place><place xml:id="Heytesbury"><placeName><settlement ref="#Heytesbury">Heytesbury</settlement><country ref="#UK">United Kingdom</country>

</placeName></place><place xml:id="Ireland"><placeName>Ireland</placeName><note>Kingdom of the <placeName>United

Kingdom</placeName></note>

</place><place xml:id="Ors"><placeName><settlement>Ors</settlement><country ref="#France">France</country>

</placeName></place><place xml:id="Oswestry"><placeName><settlement>Oswestry</settlement><country>United Kingdom</country>

</placeName></place><place xml:id="Prussia"><placeName><region ref="#Prussia">Prussia</region><country ref="#Germany">Germany</country>

</placeName></place><place xml:id="Ripon"><placeName><settlement ref="#Ripon">Ripon</settlement><country ref="#UK">United Kingdom</country>

</placeName></place><place xml:id="Scarborough"><placeName><settlement>Scarborough</settlement><country ref="#UK">United Kingdom</country>

</placeName></place><place xml:id="Tipperary"><placeName><settlement ref="#Tipperary">Tipperary</settlement><country ref="#Ireland">Ireland</country>

</placeName></place><place xml:id="UK"><placeName><country ref="#UK">United Kingdom</country>

</placeName><placeName type="alt">British colonies and

kingdoms</placeName></place><place xml:id="USA"><placeName><country ref="#USA">United States of

America</country></placeName>

</place>


<place xml:id="Weirleigh"><placeName><settlement>Weirleigh</settlement><country ref="#UK">United Kingdom</country>

</placeName></place>

</listPlace><listPerson><person role="cataloguer" xml:id="Alisa_Miller"><persName>Alisa Miller</persName><note><ref

target="http://www.oucs.ox.ac.uk/ww1lit/about/staff.html">Alisa Miller’s biography at the First World War

Poetry Digital Archive</ref></note>

</person><person role="research_officer" xml:id="Alun_Edwards"><persName>Alun Edwards</persName><note><ref

target="http://www.oucs.ox.ac.uk/ww1lit/about/staff.html">Alun Edwards’ biography at the First World War

Poetry Digital Archive</ref></note>

</person><person role="publisher" sex="1" xml:id="Andrew_Chatto"><persName>Chatto</persName><persName>Andrew Chatto</persName><note><ref

target="http://www.randomhouse.co.uk/about-us/about-us/companies/uk-companies-and-imprints/vintage-publishing/chatto-windus">Andrew Chatto’s biography at Random House</ref>

</note></person><person role="mythological_hero" sex="1" xml:id="Antaeus"><persName xml:lang="latin" ref="#Antaeus">Antaeus</persName><persName xml:lang="greek" ref="#Antaeus">Antaios</persName><note><ref

target="http://www.theoi.com/Gigante/GiganteAntaios.html">Antaeus at theoi.com</ref>

</note></person><person role="editor" xml:id="Elena_Pierazzo"><persName>Elena Pierazzo</persName><note><ref

target="http://www.kcl.ac.uk/artshums/depts/ddh/people/core/pierazzo/index.aspx">Elena Pierazzo’s biography at King’s

College</ref></note>

</person><person role="cataloguer" xml:id="Everett_Sharp"><persName>Everett Sharp</persName><note><ref

target="http://www.oucs.ox.ac.uk/ww1lit/about/staff.html">Everett Sharp’s biography at the First World War

Poetry Digital Archive</ref></note>

</person>


<person role="mythological_hero" sex="1" xml:id="Heracles"><persName xml:lang="greek" ref="#Heracles">Heracles</persName><persName xml:lang="latin" ref="#Heracles">Hercules</persName><persName>Herk</persName><note><ref

target="http://www.theoi.com/greek-mythology/heracles.html">Heracles at theoi.com</ref>

</note></person><person role="manager" xml:id="James_Cummings"><persName>James Cummings</persName><note><ref

target="http://digital.humanities.ox.ac.uk/PeopleProfile/person_profile_page.aspx?pid=119">James Cummings’ biography at Oxford

University</ref></note>

</person><person role="philosopher" sex="1" xml:id="John_Locke"><persName>John Locke</persName><persName>Locke</persName><note><ref

target="http://plato.stanford.edu/entries/locke/">John Locke at Stanford Encyclopedia of

Philosophy</ref></note>

</person><person role="editor" xml:id="Jon_Stallworthy"><persName>Jon Stallworthy</persName><note><ref

target="http://www.english.ox.ac.uk/about-faculty/faculty-members/other-members/stallworthy-professor-jon">Jon Stallworthy’s biography at Oxford

University</ref></note>

</person><person role="web_designer" xml:id="Joseph_Talbot"><persName>Joseph Talbot</persName><note><ref

target="http://www.oucs.ox.ac.uk/ww1lit/about/staff.html">Joseph Talbot’s biography at the First World War

Poetry Digital Archive</ref></note>

</person><person role="project_manager" xml:id="Kate_Lindsay"><persName>Kate Lindsay</persName><note><ref

target="http://www.oucs.ox.ac.uk/ww1lit/about/staff.html">Kate Lindsay’s biography at the First World War

Poetry Digital Archive</ref></note>

</person><person role="technical_specialist" xml:id="Michael_Loizou"><persName>Michael Loizou</persName><note><ref

target="http://www.oucs.ox.ac.uk/ww1lit/about/staff.html">Michael Loizou’s biography at the First World War


Poetry Digital Archive</ref></note>

</person><person role="god" xml:id="Pluto"><persName xml:lang="latin" ref="#Pluto">Pluto</persName><persName xml:lang="greek" ref="#Pluto">Hades</persName><note><ref

target="http://www.theoi.com/Khthonios/Haides.html">Pluto at theoi.com</ref>

</note></person><person role="editor" xml:id="Renée_van_Baalen"><persName>Renée van Baalen</persName>

</person><person role="web_developer" xml:id="Richard_Doe"><persName>Richard Doe</persName><note><ref

target="http://www.oucs.ox.ac.uk/ww1lit/about/staff.html">Richard Doe’s biography at the First World War

Poetry Digital Archive</ref></note>

</person><person xml:id="SG_Partridge"><persName>Captain S.G. Partridge</persName><note><ref

target="http://alihollington.typepad.com/historic_battlefields/2008/01/how-did-they-do.html">Historic Battlefields on the S.S. 143 and S.G.

Partridge</ref></note>

</person><person role="head" xml:id="Sebastian_Rahtz"><persName>Sebastian Rahtz</persName><note><ref

target="http://digital.humanities.ox.ac.uk/PeopleProfile/person_profile_page.aspx?pid=157">Sebastian Rahtz’ biography at Oxford

University</ref></note>

</person><person role="director" xml:id="Stuart_Lee"><persName>Stuart Lee</persName><note><ref

target="http://www.oucs.ox.ac.uk/ww1lit/about/staff.html">Stuart Lee’s biography at the First World War

Poetry Digital Archive</ref></note>

</person><person role="god" xml:id="Titan"><persName xml:lang="greek" ref="#Titan">Titanes</persName><note><ref

target="http://www.theoi.com/Titan/Titanes.html">Titans on theoi.com</ref>

</note></person><person role="author" xml:id="Victor_Hugo"><persName>Victor Hugo</persName><birth when="1802-02-26">


<placeName><settlement>Besançon</settlement>, <country ref="#France">France</country>,

<date>26th February, 1802</date>. </placeName>

</birth><death when="1885-05-22"><placeName>Paris</placeName>, <country ref="#France">France</country>,

<date>22nd May, 1885</date>. </death>

<note> Author of the essay <ref target="http://www.gavroche.org/vhugo/shakespeare/">

<title ref="#Shakespeare_by_Victor_Hugo">’Shaksperés’, or ’William Shakespeare’</title>

</ref>. </note></person><person role="publisher" sex="1" xml:id="W.E._Windus"><persName>Windus</persName><persName>W.E. Windus</persName><note><ref

target="http://www.randomhouse.co.uk/about-us/about-us/companies/uk-companies-and-imprints/vintage-publishing/chatto-windus">W.E. Windus’ biography at Random House</ref>

</note></person><person sex="1" role="poet" xml:id="Wilfred_Owen"><persName>Wilfred Edward Salter Owen</persName><persName>Wilfred Owen</persName><birth when="1893-03-18"><placeName><settlement ref="#Oswestry">Oswestry</settlement>,

<country ref="#UK">United Kingdom</country>, <date when="1893-03-18">18th March, 1893</date>.

</placeName></birth><death when="1918-11-04"><placeName><settlement ref="#Ors">Ors</settlement>, <country ref="#France">France</country>,

<date when="1918-11-04">4th November, 1918</date>.</placeName>

</death><note><ref

target="http://www.oucs.ox.ac.uk/ww1lit/collections/owen">The Wilfred Owen Collection - Biography</ref>

</note></person>

</listPerson><listOrg><org xml:id="Chatto_and_Windus"><orgName>Chatto &amp; Windus</orgName><note><ref

target="http://www.randomhouse.co.uk/about-us/about-us/companies/uk-companies-and-imprints/vintage-publishing/chatto-windus">Chatto &amp; Windus’ about page at Random

House</ref></note>

</org><org xml:id="Creative_Commons"><orgName>Creative Commons</orgName><note><ref target="http://creativecommons.org/">Creative

Commons home page</ref>


</note></org><org xml:id="OUCS"><orgName>Oxford University Computing

Services</orgName><orgName>OUCS</orgName><note><ref target="http://www.oucs.ox.ac.uk/">OUCS home

page</ref></note>

</org><org xml:id="TEI_Consortium"><orgName>TEI Consortium</orgName><orgName>TEI</orgName><note><ref target="http://www.tei-c.org/index.xml">TEI

Consortium home page</ref></note>

</org><org xml:id="University_of_Oxford"><orgName>University of Oxford</orgName><note><ref target="http://www.ox.ac.uk/">University of

Oxford’s home page</ref></note>

</org></listOrg><listBibl><bibl xml:id="Shakespeare_by_Victor_Hugo"><title level="m">Shakespeare</title><author><persName ref="#Victor_Hugo">Victor

Hugo</persName></author><note><ref

target="http://www.gavroche.org/vhugo/shakespeare/">Victor Hugo’s essay ’Shakespeare’</ref>

</note></bibl>

</listBibl></sourceDesc>

</fileDesc><encodingDesc><charDecl><glyph xml:id="v_stroke"><glyphName>V stroke</glyphName><charProp><localName>entity</localName><value>V shaped stroke signifying an

addition.</value></charProp>

</glyph></charDecl>

</encodingDesc></teiHeader><facsimile><surfaceGrp n="leaf1"><surface xml:id="leaf1_surface1_Strange_Meeting"><graphic url="Strange-Meeting-manuscript-1.jpg"/>

</surface><surface xml:id="leaf1_surface2_Strange_Meeting">


<graphic url="Strange-Meeting-manuscript-2.jpg"/></surface>

</surfaceGrp></facsimile><sourceDoc><surfaceGrp n="leaf1"><surface

facs="#leaf1_surface1_Strange_Meeting" type="letter" subtype="handwritten">

<zone><zone><line>Strange Meeting.</line>

</zone><zone><line>3</line>

</zone><line>It seemed that <del rend="stroked">from my

dug-out</del><add place="above">out of <del rend="stroked">the</del> battle</add> I

escaped</line><line>Down some profound<del rend="waving">er</del><del rend="stroked"><add place="above">earth-</add>

</del><add place="below">dull</add> tunnel, <del rend="stroked">older</del><del rend="stroked"><add place="above">nether</add>

</del><add place="below">long since</add> scooped</line>

<line>Through granites which <del rend="stroked">the nether flames</del>

<del rend="stroked"><add place="below">plutonic</add>

</del><add place="below">titanic wars</add> had

groined.</line><line>Yet also there <metamark place="inline" function="add" target="#ad1">

<g ref="#v_stroke" rend="v_stroke"/></metamark><del rend="stroked"><add place="above" xml:id="ad1">Down all its

length</add></del> encumbered sleepers groaned,</line>

<line>Too fast in thought or death to be bestirred.</line>

<line>Then, as I probed them, one sprang up, and stared</line>

<line>With piteous recognition in fixed eyes,</line><line>Lifting <del rend="stroked">his</del> distressful

hands, as if to bless.</line><line>And by his smile, I knew that sullen hall,--</line><line xml:id="alt0">By his dead smile <metamark function="join" target="#ad2">

<g rend="arrow"/></metamark><metamark function="join" target="#ad2"><g rend="arrow"/>

</metamark><hi rend="circled">I knew we stood in

hell</hi>.</line><line xml:id="alt1"><del rend="waving">And</del>


<del rend="waving"><del>b</del><add place="overwritten">B</add>y</del> his <del rend="oblique_strokes">dead</del>

smile <seg xml:id="ad2">I knew that sullen hall</seg></line><line xml:id="alt2"><del rend="stroked">Yet slumber droned <seg xml:id="alt3">all</seg><seg xml:id="alt4" rend="above">pains</seg> down

that sullen hall</del></line><alt targets="#alt0 #alt1 #alt2" mode="excl" weights="1 0 0"/><line>With a thousand <del rend="stroked">fears</del><del rend="stroked"><add place="above"><unclear>ways</unclear>

</add></del><add place="above_above"><ptr target="#alt4"/>

</add> that <del rend="stroked">creature</del><add place="above">vision</add>’s face was

grained;</line><line>Yet no blood <del rend="stroked">sumped</del><add place="above">reached <del rend="stroked">him</del></add><del rend="stroked">here</del><add place="above">there</add> from the upper

ground,</line><line>And no <del rend="stroked">shell</del><add place="above">guns</add> thumped, or down the

flues made moan.</line><line><del rend="stroked">But all was sleep. And no voice

called for men.</del></line><line>“<del rend="stroked">My</del><add place="inspace">Strange</add> friend,” I said,

“here is no cause to mourn.”</line><line>“None”, said that other, “save the undone

years,</line><line>The hopelessness.<add place="above">

<del rend="stroked">unachieved.</del></add><add place="below"><del rend="stroked"><gap reason="unreadable"/>

</del></add> Whatever hope is yours,</line>

<line>Was my life also; <del>comrade.</del><add place="above"><del rend="stroked">for</del>

</add><add place="below">I went</add><del rend="stroked">I ran</del><add place="above">hunting</add> wild.</line>

<line>After the wildest beauty in the world,</line><line>Which lies not calm in eyes, or braided

hair,</line></zone>

</surface><surface

facs="#leaf1_surface2_Strange_Meeting"


type="letter" subtype="handwritten">

<zone><zone>4</zone><line>But mocks the steady running of the hour.</line><line>And if it grieves, grieves richlier than

here.</line><line>For by my glee might many men have laughed,</line><line>And of my weeping something had been left,</line><line>Which must die now. I mean the truth

untold,</line><line>The pity of war, the <del rend="stroked">one

thing</del><add place="above">pity</add> war distilled.</line>

<line>Now men will go content with what we spoiled,</line>

<line>Or, discontent, boil bloody, and be spilled.</line>

<line>They will be swift with swiftness of the tigress.</line>

<line>None will break ranks, though nations trek from<add place="below">progress</add>.</line><line>Courage was mine, and I had mystery,</line><line>Wisdom was mine, and I had mastery:</line><line>To miss the march of this retreating world</line><line>Into vain citadels that are not walled.</line><line>Then, when much blood had clogged their

chariot-<add place="above"><metamark function="add"><g rend="stroke"/>

</metamark>wheels</add></line><line>I would go up and wash them from sweet

wells,</line><line>Even <del rend="stroked">

<unclear>the wells</unclear></del><add place="above"><del rend="stroked">the truths</del>

</add><add place="below">with truths</add><del rend="stroked">I sank</del><add place="above"><del rend="stroked">that</del> lie</add> too deep

for taint.</line><line>I would have poured my spirit without stint</line><line>But not <del rend="waving">by my blood into</del><add place="below">through wounds; not on</add> the <restore><metamark function="restore" target="#del1"/><del xml:id="del1">cess</del>

</restore><add place="below"><del rend="stroked"><unclear>mure</unclear>

</del></add>of war.</line>

<line><hi rend="circled"><metamark function="point" target="#l1"><g rend="arrow"/>

</metamark>Foreheads of men have bled where no wounds <add place="below">were</add>.</hi>


</line><line>I <del rend="stroked">I was a German conscript,

and your </del><add place="above">am the <del rend="stroked">German</del><add place="above">enemy</add><del rend="stroked">when</del> you killed,

my</add> friend.</line><line>I knew you in this dark: for so you frowned</line><line>Yesterday through me as you jabbed and

killed.</line><line><del rend="stroked" xml:id="del2"><undo target="#del2"/>I parried; but my hands were

loath and</del> cold.</line><line xml:id="l1">Let us sleep now....</line>

</zone></surface>

</surfaceGrp></sourceDoc><text><body><div type="verse"><head><hi rend="capitalize"> STRANGE MEETING</hi>

</head><lg type="stanza"><l> It seemed that out of battle I escaped</l><l> Down some profound dull tunnel, long since scooped</l><l> Through granites which titanic wars had groined. </l>

</lg><lg type="stanza"><l> Yet also there encumbered sleepers groaned, </l><l> Too fast in thought or death to be bestirred. </l><l> Then, as I probed them, one sprang up, and stared</l><l> With piteous recognition in fixed eyes, </l><l> Lifting distressful hands, as if to bless. </l><l> And by his smile, I knew that sullen hall,- </l><l> By his <seg>dead</seg> smile I knew we stood in Hell. </l>

</lg><lg type="stanza"><l> With a thousand pains that vision ’s face was grained ; </l><l> Yet no blood reached there from the upper ground, </l><l> And no guns thumped, or down the flues made moan. </l><l> ’Strange friend, ’ I said, ’here is no cause to mourn.’ </l><l> ’None, ’ said that other, ’save the undone years, </l><l> The hopelessness. Whatever hope is yours, </l><l> Was my life also; I went hunting wild</l><l> After the wildest beauty in the world, </l><l> Which lies not calm in eyes, or braided hair, </l><l> But mocks the steady running of the hour, </l><l> And if it grieves, grieves richlier than here. </l><l> For by my glee might many men have laughed, </l><l> And of my weeping something had been left, </l><l> Which must die now. I mean the truth untold, </l><l> The pity of war, the pity war distilled. </l><l> Now men will go content with what we spoiled, </l><l> Or, discontent, boil bloody, and be spilled. </l><l> They will be swift with swiftness of the tigress. </l><l> None will break ranks, though nations trek from progress. </l><l> Courage was mine, and I had mystery, </l><l> Wisdom was mine, and I had mastery: </l><l> To miss the march of this retreating world</l>


<l> Into vain citadels that are not walled. </l><l> Then, when much blood had clogged their chariot-wheels, </l><l> I would go up and wash them from sweet wells, </l><l> Even with truths <seg>that</seg> lie too deep for taint. </l><l> I would have poured my spirit without stint</l><l> But not through wounds; not on the cess of war. </l><l> Foreheads of men have bled where no wounds were. </l>

</lg><lg type="stanza"><l> ’I am the enemy you killed, my friend. </l><l> I knew you in this dark: for so you frowned</l><l> Yesterday through me as you jabbed and killed. </l><l> I parried; but my hands were loath and cold. </l><l>Let us sleep now....’ </l>

</lg></div>

</body></text>

</TEI>

6.3 Getting better quality TEI XML

6.3.1 A more complex ODD

Write an ODD from scratch, or use Roma to create a skeleton, and then edit the result. Use Roma to generate a schema, and then validate any of the ECCO files against the result. The ODD should have the following features:

1. There should be a <valList> for the @type attribute on <div> which limits it to a few fixed values; provide a <desc> for each <valItem>

2. There should be a Schematron constraint which checks that the <publicationStmt> is not empty

3. There should be a Schematron constraint which checks that all <div> elements have a <head>, unless they have a @type attribute with the value ’title_page’.

4. The examples for some elements should be replaced with ones from the ECCO texts

5. Mathematics using MathML should be allowed as a child of <formula> (you’ll need to study the Exemplars for this)

6.3.2 Work with XSLT

Write an XSLT stylesheet which analyzes all the ECCO files and generates a closed <valList> for <div>/@type.

Write an XSLT stylesheet which checks that each pointer in a @resp attribute has a corresponding ID in the file.
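Before writing the stylesheet, it can help to see the logic of the @resp check in isolation. The following is a minimal sketch in Python rather than XSLT, assuming that pointers take the form #name and that IDs are carried by xml:id; the function name and the sample fragment are hypothetical and are not taken from the ECCO files.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the fixed XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def dangling_resp_pointers(xml_text):
    """Return the sorted @resp pointers (#name) that match no xml:id."""
    root = ET.fromstring(xml_text)
    ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID) is not None}
    missing = set()
    for el in root.iter():
        # @resp may hold several whitespace-separated pointers.
        for pointer in el.get("resp", "").split():
            if pointer.startswith("#") and pointer[1:] not in ids:
                missing.add(pointer)
    return sorted(missing)


# Hypothetical fragment: #ed1 resolves, #ed2 does not.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <respStmt xml:id="ed1"><name>Editor One</name></respStmt>
  <p><add resp="#ed1">ok</add> <del resp="#ed2">dangling</del></p>
</TEI>"""

print(dangling_resp_pointers(SAMPLE))
```

In XSLT the same two steps (collect the IDs, then test each pointer against them) are usually expressed with a key on @xml:id.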

6.4 XSLT transformations for genetic editions

Your task is to write an XSLT transformation to make a plausible rendition of the encoding of Wilfred Owen’s poem Strange Meeting as a web page. Study the input XML carefully, and consider the difference between the teiHeader/fileDesc/sourceDesc/msDesc, the <text>, the <facsimile>, and the <sourceDoc>. Then consider what sort of HTML you want to make. This could be

• four separate pages for header, facsimile, genetic editing and edited edition

• four sections in the same document

• side by side sections for some parts

You’ll need to decide what these look like in HTML and start creating the right structures in your XSL.


6.5 Grouping Exercises

6.5.1 Grouping, part 1

The TEI files from the Godwin project encode each type of event recorded in the diaries with <seg> elements. A number of people can be mentioned in each event and they are encoded with <persName> elements. A @ref attribute provides a unique key for each person.

Find all names of people (<persName>) in some of the TEI files from Godwin and group them by the type of event within which they are mentioned (<seg>/@type). Sort the names alphabetically.

You can either produce a web page with the results, or a simple text file.
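The grouping asked for here can be pictured independently of XSLT. The sketch below uses Python rather than <xsl:for-each-group>, and assumes a hypothetical fragment in the TEI namespace; the event types and names in it are invented for illustration and do not come from the Godwin files.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI = "{http://www.tei-c.org/ns/1.0}"


def names_by_event_type(xml_text):
    """Group the text of each <persName> by the @type of its enclosing <seg>."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(set)
    for seg in root.iter(TEI + "seg"):
        event_type = seg.get("type")
        if event_type is None:
            continue
        for pers in seg.iter(TEI + "persName"):
            # Normalise whitespace so line breaks inside a name do not matter.
            name = " ".join("".join(pers.itertext()).split())
            groups[event_type].add(name)
    # Sort the names alphabetically within each event type.
    return {t: sorted(names) for t, names in sorted(groups.items())}


# Hypothetical sample data.
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <seg type="meal"><persName ref="#p2">Mary Smith</persName> and
    <persName ref="#p1">Ann Jones</persName></seg>
  <seg type="call"><persName ref="#p1">Ann Jones</persName></seg>
</body>"""

print(names_by_event_type(SAMPLE))
```

The dictionary keys play the role of the grouping key, and each sorted list corresponds to the members of one group.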

6.5.2 Grouping, part 2

Find all names of people (<persName>) in some of the TEI files from Godwin and group them by the type of event within which they are mentioned (<seg>/@type). Return the number of times each unique name (<persName>/@ref) is mentioned for each type of event.

You can either produce a web page with the results, or a simple text file.

6.5.3 Grouping, part 3

Find all names of people (<persName>) in some of the TEI files from Godwin and group them by the type of event within which they are mentioned (<seg>/@type), then organize them by the event’s subtype (<seg>/@subtype).

Sort the names alphabetically. You can either produce a web page with the results, or a simple text file.

6.5.4 Grouping, part 4

Each of the TEI files from Inscriptions of Roman Tripolitania (IRT) encodes the text from one epigraphic inscription, along with a substantial amount of metadata.

Imagine that you are working on a table of contents for the inscription and want to include a "snippet" of the text, for example the first two lines (each line is identified by a <lb>/@n).

Using <xsl:for-each-group> and @group-starting-with or @group-adjacent, return an XML file containing all the elements of the first two lines from an IRT file, that is, those in div[@type=’edition’]/ab between lb[@n=’1’] and lb[@n=’3’].
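The idea behind @group-starting-with is that each <lb> milestone opens a new group and everything up to the next <lb> belongs to it. The sketch below shows that idea in Python rather than XSLT; the sample markup (including the <w> element and its content) is hypothetical and much simpler than a real IRT file.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"


def first_lines(xml_text, how_many=2):
    """Return a new <ab> holding the children of the first `how_many` lines."""
    root = ET.fromstring(xml_text)
    ab = root.find(f".//{TEI}div[@type='edition']/{TEI}ab")
    groups = []                       # one list of child elements per <lb>
    for child in ab:
        if child.tag == TEI + "lb":
            groups.append([child])    # an <lb> starts a new group
        elif groups:
            groups[-1].append(child)  # other children join the current line
    snippet = ET.Element(TEI + "ab")
    for group in groups[:how_many]:
        # Text following an element travels with it as its tail.
        snippet.extend(group)
    return snippet


# Hypothetical three-line inscription.
SAMPLE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><div type="edition"><ab>'
    '<lb n="1"/>alpha <w>beta</w> <lb n="2"/>gamma <lb n="3"/>delta'
    '</ab></div></TEI>'
)

snippet = first_lines(SAMPLE)
print(ET.tostring(snippet, encoding="unicode"))
```

Taking the first two groups is equivalent to keeping everything from lb[@n=’1’] up to, but not including, lb[@n=’3’], provided the lines are numbered consecutively from 1.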

6.6 Using XQuery

6.6.1 XQuery 1

The ECCO corpus contains many kinds of text. Use XQuery to retrieve all texts containing poetry (<lg>). Produce an HTML file containing the id of each file (tei:TEI//tei:idno[@type="TCP"]) and the first line of poetry.

6.6.2 XQuery 2

Each TEI file from Inscriptions of Roman Tripolitania (IRT) encodes the text from one epigraphic inscription. Metadata in the teiHeader contains information about the place where the object carrying the inscription was found.

Your task is to write some XQuery code to create an HTML page containing an index of "find spots" (tei:provenance//tei:event[@type=’found’]//tei:placeName[@type=’ancientFindspot’]) from all the IRT TEI files. Each location should contain a list of all the files where it is mentioned. The name of each file is found under tei:TEI//tei:idno[@type="filename"].
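The shape of the index can be tried out before turning to XQuery. The following is a sketch in Python, not XQuery, that follows the element paths given above; the helper names, the file names and the place names are all hypothetical, and the sample documents are reduced to the few elements the index needs.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from html import escape

TEI = "{http://www.tei-c.org/ns/1.0}"


def findspot_index(documents):
    """Map each ancient find spot to the sorted file names that mention it."""
    index = defaultdict(set)
    for xml_text in documents:
        root = ET.fromstring(xml_text)
        filename = next(
            (i.text for i in root.iter(TEI + "idno") if i.get("type") == "filename"),
            None,
        )
        if filename is None:
            continue
        for prov in root.iter(TEI + "provenance"):
            for event in prov.iter(TEI + "event"):
                if event.get("type") != "found":
                    continue
                for place in event.iter(TEI + "placeName"):
                    if place.get("type") == "ancientFindspot":
                        name = " ".join("".join(place.itertext()).split())
                        index[name].add(filename)
    return {place: sorted(files) for place, files in sorted(index.items())}


def as_html(index):
    """Render the index as a simple HTML list, one item per find spot."""
    items = "".join(
        f"<li>{escape(place)}: {escape(', '.join(files))}</li>"
        for place, files in index.items()
    )
    return f"<ul>{items}</ul>"


def sample(filename, place):
    """Build a hypothetical, heavily reduced IRT-like document."""
    return (
        '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader>'
        f'<idno type="filename">{filename}</idno>'
        '<provenance><event type="found">'
        f'<placeName type="ancientFindspot">{place}</placeName>'
        '</event></provenance></teiHeader></TEI>'
    )


DOCS = [sample("a.xml", "Town A"), sample("b.xml", "Town B"), sample("c.xml", "Town A")]
index = findspot_index(DOCS)
print(as_html(index))
```

In XQuery the outer loop would typically run over the distinct place names, with an inner query collecting the file names for each one.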

6.7 Using TEI stylesheet family

6.7.1 Introduction

This is a set of XSLT 2.0 specifications to transform TEI XML documents to XHTML, to LaTeX, to XSL Formatting Objects, to/from OOXML (docx), to/from OpenOffice (odt) and to ePub format. The files can be downloaded from the Releases area of http://tei.sf.net. They concentrate on the simpler TEI modules, but adding support for other modules is fairly easy. In the main, the setup has been used on ‘new’ documents, ie reports and web pages that have been authored from scratch, rather than traditional TEI-encoded existing material.

There is a change log file available.

The XSL FO style sheets were developed for use with PassiveTeX (http://projects.oucs.ox.ac.uk/passivetex/), a system using XSL formatting objects to render XML to PDF via LaTeX. They have not been extensively tested with the other XSL FO implementations.

6.7.2 File organisation

The main stylesheets are divided into four directories:

common2 templates which are independent of output type

fo2 templates for making XSL FO output

xhtml2 templates for making HTML output

latex2 templates for making LaTeX output

Within each directory there is a separate file for the templates which implement each of the TEI modules (eg textstructure.xsl, linking.xsl, or drama.xsl); these are included by a master file tei.xsl. This also includes a parameterization layer in the file tei-param.xsl, and the parameterization file from the common directory. The tei.xsl file does any necessary declaration of constants and XSL keys.

There are further directories for special-purpose conversions:

epub conversion to ePub

odt conversion to and from OpenOffice Writer format

docx conversion to and from Word OOXML format

odds2 processing of TEI ODD files

rdf conversion to RDF

txt conversion to plain text

The final important directory is profiles, which has a set of predefined project starting points, each of which may have a file to.xsl for one or more of the supported output formats (csv, dtd, html, odt, docbook, epub, latex, p4, docx, fo, lite, and relaxng). There may also be a from.xsl to go from the selected format to TEI XML.

For example, to convert TEI to HTML in the default manner, the user may run profiles/default/html/to.xsl on the selected input file. Other starting points are listed below.

For the brave, there are Linux/OSX command-line shell scripts docxtotei, odttotei, teitodocx, teitodtd, teitoepub, teitoepub3, teitohtml, teitoodt, teitordf, teitorelaxng, teitornc, teitotxt, and teitoxsd for converting to/from Word, to/from OpenOffice, and to DTD, ePub, HTML, RDF, Relax NG, plain text, W3C schema etc. These are implemented using Ant tasks, which are also available within the oXygen XML editor as part of the TEI framework.

Any other use of the stylesheets, eg by referencing individual modules, is not supported and requires a good understanding of XSL.

6.7.3 Trying the PDF rendering

• Load an ECCO text into oXygen and choose the TEI P5 to PDF transform scenario (press the ‘Configure Transformation Scenario’ icon). If all goes well, your browser will load a PDF rendering in due course.


• Now set the parameter Institution to ‘Oxford Summer School’ and rerun the transformation.See the difference?

• More dramatically, change columnCount to have the value 2, and see what happens then.

• Set parIndent to ‘0em’ and parSkip to ‘2pt’

• Finally, change pageWidth to ‘1755mm’, change columnCount back to 1, run the transform, and check that the page width is lessened.

6.7.4 Going further with parameters of the HTML

Try some of these changes to the HTML rendering of Punch, by setting parameters, and check the results:

• Set autoToc to ‘false’

• Set numberHeadings to ‘false’

• Set pageLayout to ‘CSS’

• Set numberParagraphs to ‘true’
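
As a sketch of how these four settings might be collected in one place, the wrapper below imports xhtml2/tei.xsl (the path used in section 6.8.2) and overrides the parameters. The version number 2.0 is an assumption; the parameter names and values are those given in the list above.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:import href="xhtml2/tei.xsl"/>

  <!-- No automatically generated table of contents -->
  <xsl:param name="autoToc">false</xsl:param>
  <!-- No section numbers -->
  <xsl:param name="numberHeadings">false</xsl:param>
  <!-- Lay the page out with CSS -->
  <xsl:param name="pageLayout">CSS</xsl:param>
  <!-- Number every paragraph -->
  <xsl:param name="numberParagraphs">true</xsl:param>
</xsl:stylesheet>
```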

You can see the catalogue of parameters at http://www.tei-c.org/release/doc/tei-xsl-common/customize.html.

6.7.5 Using OxGarage

Now it is time to work with OxGarage, to check that you can create word-processor and ebook files. Visit oxgarage.oucs.ox.ac.uk:8080/ege-webclient and:

• Upload one of the ECCO files. Try conversions to Word or OpenOffice format, and check that they load into the relevant application properly.

• Make an ePub file, if you have an eBook reader to hand (Firefox users can download a good add-on from http://www.epubread.com/en/)

• Open Word or OpenOffice and write a simple document. Upload this to OxGarage and ask for TEI P5 XML to be sent back. Load it into oXygen and see if it is valid or usable. Do not expect miracles. OxGarage cannot read your mind...

• Edit the generated TEI file a little, then upload it back into OxGarage and ask for a Word or OpenOffice file. How does that compare with the one you started with?

6.7.6 Rolling our own

It is time to set up our own profile in the stylesheets, and add some custom code. You will need to

• Set up a copy of the stylesheet family by downloading it from Sourceforge, or checking it out using Subversion

• Adapt your local processing setup to point to the copy of the XSL

• Add a new profile, and subdirectories for your chosen format

• Put in support for some elements you use which are not properly covered by the existing setup
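
A new profile can be as small as one to.xsl file that imports the default profile and adds templates of its own. The sketch below assumes a hypothetical profile directory profiles/myproject/html/ alongside profiles/default/html/. The choice of <persName> as the element needing extra support, and the class name used, are purely illustrative.

```xml
<!-- Saved as profiles/myproject/html/to.xsl (hypothetical profile name) -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:tei="http://www.tei-c.org/ns/1.0"
                xmlns:html="http://www.w3.org/1999/xhtml"
                exclude-result-prefixes="tei"
                version="2.0">

  <!-- Start from the behaviour of the default HTML profile -->
  <xsl:import href="../../default/html/to.xsl"/>

  <!-- Illustrative addition: wrap personal names in a span for CSS styling -->
  <xsl:template match="tei:persName">
    <html:span class="persName">
      <xsl:apply-templates/>
    </html:span>
  </xsl:template>
</xsl:stylesheet>
```

Templates in the importing file take precedence over those in the imported stylesheets, so the rest of the default behaviour is left untouched.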


6.8 TEI reference material: XSL stylesheets

6.8.1 Introduction

This section describes how to produce a customization of the TEI stylesheets. It describes all the parameters which you can set, the templates which are designed to be changed, and the empty templates provided into which you can add your own code.

There are 13 areas for customization. In most cases there are parameters and templates which are specific to one of the three output methods (HTML, FO and LaTeX), and those which are common to all three.

6.8.2 Making HTML: example

You can simply refer to the specification xhtml2/tei.xsl directly with your XSL processor, or install it locally on your own server. For more flexibility, you may prefer to reference the specifications from an XSL wrapper of your own. The minimal specification would look like this:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:import href="xhtml2/tei.xsl"/>
</xsl:stylesheet>

You can customize the result by adding to this wrapper file. The normal result will be a single stream of HTML which you can save in a file. You can also configure it to produce multiple output files, one per top-level <div> or <div1>.

6.8.3 Standard page features

The default behaviour of the system is to construct each HTML page with per-page navigation bars top and bottom, and a standard set of navigation links underneath.

Variables

| Type | Name | Description | Default |
|---|---|---|---|
| | department | Name of department within institution [string] | |
| | homeLabel | Name of link to home page of application [string] | Home |
| | homeURL | Project Home [anyURI] | http://www.tei-c.org/ |
| | homeWords | Project [string] | TEI |
| | institution | Institution [string] | A TEI Project |
| | parentURL | Institution link [anyURI] | http://www.tei-c.org/ |
| | parentWords | Name of overall institution [string] | Parent Institution |
| | searchURL | Link to search application [anyURI] | http://www.google.com |
| xhtml | alignNavigationPanel | How to align the navigation panel at the bottom of the page [string] | right |
| xhtml | bottomNavigationPanel | Display navigation panel at bottom of pages [boolean] | true |
| xhtml | feedbackURL | Link for feedback [anyURI] | mailto:feedback |
| xhtml | htmlTitlePrefix | Fixed string to insert before normal page title in HTML meta <title> element [string] | |
| xhtml | linkPanel | Make a panel with next page/previous page links. [boolean] | true |
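
As an illustration of how the page-feature variables might be set for a project, here is a sketch of a wrapper that overrides several of them. All the values (project name, URLs, labels) are invented for the example, and version 2.0 is an assumption.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:import href="xhtml2/tei.xsl"/>

  <!-- Hypothetical project details -->
  <xsl:param name="homeLabel">Start</xsl:param>
  <xsl:param name="homeURL">http://www.example.org/project/</xsl:param>
  <xsl:param name="homeWords">Example Project</xsl:param>
  <xsl:param name="institution">Example University</xsl:param>
  <xsl:param name="parentURL">http://www.example.org/</xsl:param>
  <xsl:param name="parentWords">Example University</xsl:param>

  <!-- Switch off the navigation panel at the foot of each page -->
  <xsl:param name="bottomNavigationPanel">false</xsl:param>
</xsl:stylesheet>
```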


6.8.4 Layout

There are three ways to provide a constant navigation aid. You can either make the whole page into a table, where the first column has a table of contents, or you can make an HTML frameset, or you can just have a table of links on the left or right.

Hypertext links present special problems, as we have to choose whether they should start a new window, occupy all of the current window, or stay within the frame. These stylesheets implement the following rules:

1. Any <ref> or <ptr> link stays within the frame

2. Any link containing ‘://’ uses the whole browser window

3. Any link starting ‘.’ uses the whole browser window

4. If the stylesheet sets no splitting of the document, any <ref> or <ptr> link uses the whole browser window

5. If a <ref> or <ptr> link has a rend attribute value of ‘noframe’, the whole browser window is used

6. If a <ref> or <ptr> link has a rend attribute value of ‘new’, a new browser window is started
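
The fragment below sketches how these rules might apply to links in a source document. The targets are invented, and the comments simply restate the rule that appears to apply in each case; how the rules interact when more than one matches is not covered here.

```xml
<p xmlns="http://www.tei-c.org/ns/1.0">
  <!-- Rule 1: an ordinary internal link stays within the frame -->
  <ref target="#intro">the introduction</ref>

  <!-- Rule 2: the target contains '://', so the whole browser window is used -->
  <ref target="http://www.tei-c.org/">the TEI website</ref>

  <!-- Rule 3: the target starts with '.', so the whole browser window is used -->
  <ref target="../index.html">the site index</ref>

  <!-- Rule 5: rend="noframe" asks for the whole browser window -->
  <ref rend="noframe" target="#appendix">the appendix</ref>

  <!-- Rule 6: rend="new" asks for a new browser window -->
  <ptr rend="new" target="#notes"/>
</p>
```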

Variables

| Type | Name | Description | Default |
|---|---|---|---|
| | oddWeaveLite | Whether to make simplified display of ODD [boolean] | false |
| | parIndent | Paragraph indentation [string] | 1em |
| | biblioStyle | Style for formatted bibliography [string] | |
| | parSkip | Default spacing between paragraphs [string] | 0pt |
| xhtml | filePerPage | Whether we should construct a separate file for each page (based on page breaks) [boolean] | false |
| xhtml | viewPortWidth | When making fixed format epub, width of viewport [number] | 1200 |
| xhtml | viewPortHeight | When making fixed format epub, height of viewport [number] | 1700 |
| xhtml | consecutiveFNs | Number footnotes consecutively [boolean] | false |
| xhtml | footnoteBackLink | Link back from footnotes to reference [boolean] | false |
| xhtml | contentStructure | How to use the front/body/back matter in creating columns. The choice is between all: use <front> for left-hand column, use <body> for centre column, and use <back> for right-hand column; body: use <body> for right-hand column, generate left-hand with a TOC or whatever [string] | body |


| xhtml | divOffset | The difference between TEI div levels and HTML headings. TEI <div>s are implicitly or explicitly numbered from 0 upwards; this offset is added to that number to produce an HTML <Hn> element. So a value of 2 here means that a <div1> will generate an <h2> [integer] | 2 |
| xhtml | footnoteFile | Make a separate file for footnotes [boolean] | false |
| xhtml | linksWidth | Width of left-hand column when $pageLayout is "Table" [string] | 15% |
| xhtml | navbarFile | XML resource defining a navigation bar. The XML should provide a <list> containing a series of <item> elements, each containing an <xref> link. [anyURI] | |
| xhtml | autoEndNotes | Make all notes into endnotes [boolean] | false |
| fo | backMulticolumns | Put back matter in multiple columns [boolean] | false |
| fo | bodyMarginBottom | Margin at bottom of text body [string] | 24pt |
| fo | bodyMarginTop | Margin at top of text body [string] | 24pt |
| fo | bodyMulticolumns | Put body matter in multiple columns [boolean] | false |
| fo | bulletFour | Symbol for 4th level itemized list [string] | + |
| fo | bulletOne | Symbol for top-level itemized list [string] | |
| fo | bulletThree | Symbol for 3rd level itemized list [string] | * |
| fo | bulletTwo | Symbol for 2nd level itemized list [string] | |
| fo | columnCount | Number of columns, when multiple-column work is requested [integer] | 1 |
| fo | betweenStarts | XSL FO "provisional-distance-between-starts" [string] | 18pt |
| fo | betweenGlossStarts | XSL FO "provisional-distance-between-starts" for gloss lists [string] | 42pt |
| fo | betweenBiblStarts | XSL FO "provisional-distance-between-starts" for bibliographies [string] | 14pt |
| fo | divRunningheads | Display section headings in running heads [boolean] | false |
| fo | exampleAfter | Space below examples [string] | 4pt |
| fo | exampleBefore | Space above examples [string] | 4pt |
| fo | exampleMargin | Left margin for examples [string] | 12pt |


| fo | flowMarginLeft | Left margin of flow [string] | |
| fo | forcePageMaster | Which named page master name to use [string] | |
| fo | formatBackpage | How to format page numbers in back matter (use XSLT number format) [string] | 1 |
| fo | formatBodypage | How to format page numbers in main matter (use XSLT number format) [string] | 1 |
| fo | formatFrontpage | How to format page numbers in front matter (use XSLT number format) [string] | i |
| fo | frontMulticolumns | Put front matter in multiple columns [boolean] | false |
| fo | labelSeparation | XSL FO "provisional-label-separation" [string] | 6pt |
| fo | listAbove-1 | Space above lists at top level [string] | 6pt |
| fo | listAbove-2 | Space above lists at 2nd level [string] | 4pt |
| fo | listAbove-3 | Space above lists at 3rd level [string] | 0pt |
| fo | listAbove-4 | Space above lists at 4th level [string] | 0pt |
| fo | listBelow-1 | Space below lists at top level [string] | 6pt |
| fo | listBelow-2 | Space below lists at 2nd level [string] | 4pt |
| fo | listBelow-3 | Space below lists at 3rd level [string] | 0pt |
| fo | listBelow-4 | Space below lists at 4th level [string] | 0pt |
| fo | listItemsep | Spacing between list items [string] | 4pt |
| fo | listLeftGlossIndent | Left margin for gloss lists [string] | 0.5in |
| fo | listLeftGlossInnerIndent | Left margin for nested gloss lists [string] | 0.25in |
| fo | listLeftIndent | Indentation for lists [string] | 0pt |
| fo | listRightMargin | Right margin for lists [string] | 10pt |
| fo | pageHeight | Paper height [string] | 297mm |
| fo | pageMarginBottom | Margin at bottom of text area [string] | 100pt |
| fo | pageMarginLeft | Left margin [string] | 80pt |
| fo | pageMarginRight | Right margin [string] | 150pt |
| fo | pageMarginTop | Margin at top of text area [string] | 75pt |
| fo | pageWidth | Paper width [string] | 211mm |
| fo | parSkipmax | Maximum space allowed between paragraphs [string] | 12pt |


| fo | readColSpecFile | External XML file containing specifications for column sizes for tables in document [anyURI] | |
| fo | regionAfterExtent | Region after [string] | 14pt |
| fo | regionBeforeExtent | Region before [string] | 14pt |
| fo | sectionHeaders | Construct running headers from page number and section headings [boolean] | true |
| fo | spaceAfterBibl | Space after bibliography [string] | 0pt |
| fo | spaceAroundTable | Space above and below a table [string] | 8pt |
| fo | spaceBeforeBibl | Space above bibliography [string] | 4pt |
| fo | spaceBelowCaption | Space below caption of figure or table [string] | 4pt |
| fo | titlePage | Make title page [boolean] | true |
| fo | twoSided | Make 2-page spreads [boolean] | true |
| latex | classParameters | Optional parameters for documentclass [string] | 11pt,twoside |
| latex | latexLogo | Logo graphics file [string] | |
| latex | pagebreakStyle | When processing a "pb" element, decide what to generate: "active" generates a page break; "visible" generates a bracketed number (with scissors), and "bracketsonly" generates a bracketed number (without scissors). [float] | |
| latex | tableMaxWidth | When making a table, what width must be constrained to fit, as a proportion of the page width. [float] | 0.85 |
| latex | verseNumbering | Whether to number lines of poetry [boolean] | false |
| latex | everyHowManyLines | When numbering poetry, how often to put in a line number [integer] | 5 |
| latex | resetVerseLineNumbering | When numbering poetry, when to restart the sequence; this must be the name of a TEI element [string] | div1 |
| latex | latexPaperSize | LaTeX paper size [] | a4paper |

Templates

columnHeader (for xhtml) [html] Banner for top of column

hdr (for xhtml) [html] Header section across top of page

<xsl:call-template name="pageHeader">
  <xsl:with-param name="mode"/>
</xsl:call-template>

hdr2 (for xhtml) [html] Navigation bar


<xsl:call-template name="navbar"/>

preBreadCrumbPath (for xhtml) [html] Text or action to take at the start of the breadcrumb trail

hdr3 (for xhtml) [html] Breadcrumb trail

<html:a href="#rh-col" title="Go to main page content" class="skiplinks">Skip links</html:a>
<html:a class="hide">|</html:a>
<xsl:call-template name="crumbPath"/>
<html:a class="hide">|</html:a>
<html:a class="bannerright" href="{$parentURL}" title="Go to home page">
  <xsl:value-of select="$parentWords"/>
</html:a>

lh-col-bottom (for xhtml) [html] Bottom of left-hand column. Parameter currentID: ID of selected section

<xsl:param name="currentID"/>
<xsl:call-template name="leftHandFrame">
  <xsl:with-param name="currentID" select="$currentID"/>
</xsl:call-template>

lh-col-top (for xhtml) [html] Top of left-hand column

<xsl:call-template name="searchbox"/>
<xsl:call-template name="printLink"/>

logoPicture (for xhtml) [html] Logo

<html:a class="framelogo" href="http://www.tei-c.org/Stylesheets/">
  <html:img src="http://www.tei-c.org/release/common2/doc/tei-xsl-common/teixsl.png"
            vspace="5" width="124" height="161" border="0"
            alt="created by TEI XSL Stylesheets"/>
</html:a>
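
To replace the logo, a wrapper can redefine the logoPicture template, since a named template in the importing stylesheet takes precedence over the imported one. This is a sketch only: the link target, image file name and alt text are invented, and version 2.0 is an assumption.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:html="http://www.w3.org/1999/xhtml"
                version="2.0">
  <xsl:import href="xhtml2/tei.xsl"/>

  <!-- Overrides the default logoPicture template shown above -->
  <xsl:template name="logoPicture">
    <html:a class="framelogo" href="http://www.example.org/">
      <!-- Hypothetical local image file -->
      <html:img src="project-logo.png" alt="Example Project logo"/>
    </html:a>
  </xsl:template>
</xsl:stylesheet>
```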

metaHTML (for xhtml) [html] Making elements in HTML <head>. Parameter title: the text used to create the DC.Title field in the HTML header

<xsl:param name="title"/>
<html:meta name="author">
  <xsl:attribute name="content">
    <xsl:call-template name="generateAuthor"/>
  </xsl:attribute>
</html:meta>
<xsl:if test="$filePerPage='true'">
  <html:meta name="viewport"
             content="width={$viewPortWidth}, height={$viewPortHeight}"/>
</xsl:if>
<html:meta name="generator"
           content="Text Encoding Initiative Consortium XSLT stylesheets"/>
<xsl:choose>
  <xsl:when test="$outputTarget='html5' or $outputTarget='epub3'">
    <html:meta charset="utf-8"/>
  </xsl:when>
  <xsl:otherwise>
    <html:meta http-equiv="Content-Type"
               content="text/html; charset={$outputEncoding}"/>
    <html:meta name="DC.Title">
      <xsl:attribute name="content">
        <xsl:value-of select="normalize-space($title)"/>
      </xsl:attribute>
    </html:meta>
    <html:meta name="DC.Type" content="Text"/>
    <html:meta name="DC.Format" content="text/html"/>
  </xsl:otherwise>
</xsl:choose>

navbar (for xhtml) [html] Construction of navigation bar. A file is looked for relative to the stylesheet (the second parameter of the document function), which is expected to contain a TEI <list> where each <item> has an embedded <xref>

<xsl:choose>
  <xsl:when test="$navbarFile=''">
    <xsl:comment>no nav bar</xsl:comment>
  </xsl:when>
  <xsl:otherwise>
    <xsl:element name="{if ($outputTarget='html5') then 'nav' else 'div'}">
      <xsl:for-each select="document($navbarFile,document(''))">
        <xsl:for-each select="tei:list/tei:item">
          <html:span class="navbar">
            <html:a href="{$URLPREFIX}{tei:xref/@url}" class="navbar">
              <xsl:apply-templates select="tei:xref/text()"/>
            </html:a>
          </html:span>
          <xsl:if test="following-sibling::tei:item"> | </xsl:if>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:element>
  </xsl:otherwise>
</xsl:choose>
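
A navigation file that this template could read might look like the sketch below. The template selects tei:list/tei:item from the loaded document, so the <list> is assumed to be the root element and to be in the TEI namespace; the file name navbar.xml, the link targets and the labels are invented.

```xml
<!-- navbar.xml (hypothetical name), stored relative to the stylesheet -->
<list xmlns="http://www.tei-c.org/ns/1.0">
  <item><xref url="index.html">Home</xref></item>
  <item><xref url="about.html">About</xref></item>
  <item><xref url="texts.html">Texts</xref></item>
</list>
```

Setting the navbarFile parameter to navbar.xml should then produce three links separated by ‘|’, each prefixed with the value of $URLPREFIX.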

pageHeader (for xhtml) [html] Banner for top of page. Parameter mode: layout mode

<xsl:param name="mode"/>
<xsl:choose>
  <xsl:when test="$mode='table'">
    <html:table width="100%" border="0">
      <html:tr>
        <html:td height="98" class="bgimage"
                 onclick="window.location='{$homeURL}'" cellpadding="0">
          <xsl:call-template name="makeHTMLHeading">
            <xsl:with-param name="class">subtitle</xsl:with-param>
            <xsl:with-param name="text">
              <xsl:call-template name="generateSubTitle"/>
            </xsl:with-param>
            <xsl:with-param name="level">2</xsl:with-param>
          </xsl:call-template>
          <xsl:call-template name="makeHTMLHeading">
            <xsl:with-param name="class">title</xsl:with-param>
            <xsl:with-param name="text">
              <xsl:call-template name="generateTitle"/>
            </xsl:with-param>
            <xsl:with-param name="level">1</xsl:with-param>
          </xsl:call-template>
        </html:td>
        <html:td style="vertical-align:top;"/>
      </html:tr>
    </html:table>
  </xsl:when>
  <xsl:otherwise>
    <xsl:call-template name="makeHTMLHeading">
      <xsl:with-param name="class">subtitle</xsl:with-param>
      <xsl:with-param name="text">
        <xsl:call-template name="generateSubTitle"/>
      </xsl:with-param>
      <xsl:with-param name="level">2</xsl:with-param>
    </xsl:call-template>
    <xsl:call-template name="makeHTMLHeading">
      <xsl:with-param name="class">title</xsl:with-param>
      <xsl:with-param name="text">
        <xsl:call-template name="generateTitle"/>
      </xsl:with-param>
      <xsl:with-param name="level">1</xsl:with-param>
    </xsl:call-template>
  </xsl:otherwise>
</xsl:choose>

rh-col-bottom (for xhtml) [html] Bottom of right-hand column. Parameter currentID: ID of selected section

<xsl:param name="currentID"/>
<xsl:call-template name="mainFrame">
  <xsl:with-param name="currentID" select="$currentID"/>
</xsl:call-template>

rh-col-top (for xhtml) [html] Top of right-hand column

<xsl:call-template name="columnHeader"/>

searchbox (for xhtml) [html] Make a search box

singleFileLabel (for xhtml) [html] Construct a label for the link which makes a printable version of the document.

For Printing


latexPackages (for latex) LaTeX package setup. Declaration of the LaTeX packages needed to implement this markup

<xsl:text>\usepackage[</xsl:text>
<xsl:value-of select="$latexPaperSize"/>
<xsl:text>,</xsl:text>
<xsl:value-of select="$latexGeometryOptions"/>
<xsl:text>]{geometry}
\usepackage{framed}
</xsl:text>
<xsl:text>
\definecolor{shadecolor}{gray}{0.95}
\usepackage{longtable}
\usepackage[normalem]{ulem}
\usepackage{fancyvrb}
\usepackage{fancyhdr}
\usepackage{graphicx}
</xsl:text>
<xsl:if test="key('ENDNOTES',1)">
\usepackage{endnotes}
  <xsl:choose>
    <xsl:when test="key('FOOTNOTES',1)">
\def\theendnote{\@alph\c@endnote}
    </xsl:when>
    <xsl:otherwise>
\def\theendnote{\@arabic\c@endnote}
    </xsl:otherwise>
  </xsl:choose>
</xsl:if>
<xsl:text>
\def\Gin@extensions{.pdf,.png,.jpg,.mps,.tif}
</xsl:text>
<xsl:choose>
  <xsl:when test="$reencode='true'">
    <xsl:text>
\IfFileExists{tipa.sty}{\usepackage{tipa}}{}
\usepackage{times}
</xsl:text>
  </xsl:when>
</xsl:choose>
<xsl:if test="not($userpackage='')">
\usepackage{<xsl:value-of select="$userpackage"/>}
</xsl:if>
<xsl:text>
\pagestyle{fancy}
</xsl:text>
\usepackage[pdftitle={<xsl:call-template name="generateSimpleTitle"/>},
 pdfauthor={<xsl:call-template name="generateAuthor"/>}]{hyperref}
\hyperbaseurl{<xsl:value-of select="$baseURL"/>}
<xsl:if test="count(key('APP',1))>0">
\usepackage{ledmac}
  <xsl:call-template name="ledmacOptions"/>
</xsl:if>

latexSetup (for latex) LaTeX setup. The basic LaTeX setup which you should not really tinker with unless you really understand why and how. Note that we need to set up a mapping here for Unicode 8421, 10100 and 10101 to glyphs for backslash and the two curly brackets, to provide literal characters. The normal characters remain active for LaTeX commands. Note that if $reencode is set to false, no input or output encoding packages are loaded, since it is assumed you are using a TeX variant capable of dealing with UTF-8 directly.

<xsl:call-template name="latexSetupHook"/>\IfFileExists{xcolor.sty}%{\RequirePackage{xcolor}}%{\RequirePackage{color}}\usepackage{colortbl}<xsl:choose>


<xsl:when test="$reencode=’true’">\IfFileExists{utf8x.def}%{\usepackage[utf8x]{inputenc}\PrerenderUnicode{-}}%{\usepackage[utf8]{inputenc}}

<xsl:call-template name="latexBabel"/>\usepackage[T1]{fontenc}\usepackage{float}\usepackage[]{ucs}\uc@dclc{8421}{default}{\textbackslash }\uc@dclc{10100}{default}{\{}\uc@dclc{10101}{default}{\}}\uc@dclc{8491}{default}{\AA{}}\uc@dclc{8239}{default}{\,}\uc@dclc{20154}{default}{ }\uc@dclc{10148}{default}{>}\def\textschwa{\rotatebox{-90}{e}}\def\textJapanese{}\def\textChinese{}

</xsl:when><xsl:otherwise>\usepackage{fontspec}

\usepackage{xunicode}\catcode‘\=\active \def\{\textbackslash}\catcode‘{=\active \def{{\{}\catcode‘}=\active \def}{\}}\def\textJapanese{\fontspec{Kochi Mincho}}\def\textChinese{\fontspec{HAN NOM A}\XeTeXlinebreaklocale

"zh"\XeTeXlinebreakskip = 0pt plus 1pt }\def\textKorean{\fontspec{Baekmuk Gulim} }\setmonofont{<xsl:value-of select="$typewriterFont"/>}

<xsl:if test="not($sansFont=”)"> \setsansfont{<xsl:value-of select="$sansFont"/>}</xsl:if><xsl:if test="not($romanFont=”)"> \setromanfont{<xsl:value-of select="$romanFont"/>}</xsl:if>

</xsl:otherwise></xsl:choose>\DeclareTextSymbol{\textpi}{OML}{25}\usepackage{relsize}\def\textsubscript#1{%\@textsubscript{\selectfont#1}}\def\@textsubscript#1{%{\m@th\ensuremath{_{\mbox{\fontsize\sf@size\z@#1}}}}}\def\textquoted#1{‘#1’}\def\textsmall#1{{\small #1}}\def\textlarge#1{{\large #1}}\def\textoverbar#1{\ensuremath{\overline{#1}}}\def\textgothic#1{{\fontspec{<xsl:value-of select="$gothicFont"/>}#1}}\def\textcal#1{{\fontspec{<xsl:value-of select="$calligraphicFont"/>}#1}}\RequirePackage{array}\def\@testpach{\@chclass\ifnum \@lastchclass=6 \@ne \@chnum \@ne \else\ifnum \@lastchclass=7 5 \else\ifnum \@lastchclass=8 \tw@ \else\ifnum \@lastchclass=9 \thr@@\else \z@\ifnum \@lastchclass = 10 \else\edef\@nextchar{\expandafter\string\@nextchar}%\@chnum\if \@nextchar c\z@ \else\if \@nextchar l\@ne \else\if \@nextchar r\tw@ \else


\z@ \@chclass\if\@nextchar |\@ne \else\if \@nextchar !6 \else\if \@nextchar @7 \else\if \@nextchar (8 \else\if \@nextchar )9 \else10\@chnum\if \@nextchar m\thr@@\else\if \@nextchar p4 \else\if \@nextchar b5 \else\z@ \@chclass \z@ \@preamerr \z@ \fi \fi \fi \fi\fi \fi \fi \fi \fi \fi \fi \fi \fi \fi \fi \fi}

\gdef\arraybackslash{\let\\=\@arraycr}\def\textxi{\ensuremath{\xi}}\def\Panel#1#2#3#4{\multicolumn{#3}{){\columncolor{#2}}#4}{#1}}

<xsl:text disable-output-escaping="yes">\newcolumntype{L}[1]{){\raggedright\arraybackslash}p{#1}}\newcolumntype{C}[1]{){\centering\arraybackslash}p{#1}}\newcolumntype{R}[1]{){\raggedleft\arraybackslash}p{#1}}\newcolumntype{P}[1]{){\arraybackslash}p{#1}}\newcolumntype{B}[1]{){\arraybackslash}b{#1}}\newcolumntype{M}[1]{){\arraybackslash}m{#1}}\definecolor{label}{gray}{0.75}\DeclareRobustCommand*{\xref}{\hyper@normalise\xref@}\def\xref@#1#2{\hyper@linkurl{#2}{#1}}\def\Div[#1]#2{\section*{#2}}\begingroup\catcode‘\_=\active\gdef_#1{\ensuremath{\sb{\mathrm{#1}}}}\endgroup\mathcode‘\_=\string"8000\catcode‘\_=12\relax</xsl:text>

latexBabel (for latex) LaTeX babel setup. LaTeX loading of babel with options

\usepackage[english]{babel}

latexLayout (for latex) LaTeX layout preamble All the LaTeX setup which affects page layout

<xsl:choose><xsl:when test="$latexPaperSize=’a3paper’">\paperwidth297mm

\paperheight420mm</xsl:when><xsl:when test="$latexPaperSize=’a5paper’">

\paperwidth148mm\paperheight210mm

</xsl:when><xsl:when test="$latexPaperSize=’a4paper’">\paperwidth210mm

\paperheight297mm</xsl:when><xsl:when test="$latexPaperSize=’letterpaper’">\paperwidth216mm

\paperheight279mm</xsl:when><xsl:otherwise/>

</xsl:choose>\def\@pnumwidth{1.55em}\def\@tocrmarg {2.55em}\def\@dotsep{4.5}


\setcounter{tocdepth}{3}\clubpenalty=8000\emergencystretch 3em\hbadness=4000\hyphenpenalty=400\pretolerance=750\tolerance=2000\vbadness=4000\widowpenalty=10000<xsl:if test="not($docClass=’letter’)">\renewcommand\section{\@startsection{section}{1}{\z@}%{-1.75ex \@plus -0.5ex \@minus -.2ex}%{0.5ex \@plus .2ex}%{\reset@font\Large\bfseries\sffamily}}\renewcommand\subsection{\@startsection{subsection}{2}{\z@}%{-1.75ex\@plus -0.5ex \@minus- .2ex}%{0.5ex \@plus .2ex}%{\reset@font\Large\sffamily}}\renewcommand\subsubsection{\@startsection{subsubsection}{3}{\z@}%{-1.5ex\@plus -0.35ex \@minus -.2ex}%{0.5ex \@plus .2ex}%{\reset@font\large\sffamily}}\renewcommand\paragraph{\@startsection{paragraph}{4}{\z@}%{-1ex \@plus-0.35ex \@minus -0.2ex}%{0.5ex \@plus .2ex}%{\reset@font\normalsize\sffamily}}\renewcommand\subparagraph{\@startsection{subparagraph}{5}{\parindent}%{1.5ex \@plus1ex \@minus .2ex}%{-1em}%{\reset@font\normalsize\bfseries}}

</xsl:if>\def\l@section#1#2{\addpenalty{\@secpenalty} \addvspace{1.0em plus 1pt}\@tempdima 1.5em \begingroup\parindent \z@ \rightskip \@pnumwidth\parfillskip -\@pnumwidth\bfseries \leavevmode #1\hfil \hbox to\@pnumwidth{\hss #2}\par\endgroup}\def\l@subsection{\@dottedtocline{2}{1.5em}{2.3em}}\def\l@subsubsection{\@dottedtocline{3}{3.8em}{3.2em}}\def\l@paragraph{\@dottedtocline{4}{7.0em}{4.1em}}\def\l@subparagraph{\@dottedtocline{5}{10em}{5em}}\@ifundefined{c@section}{\newcounter{section}}{}\@ifundefined{c@chapter}{\newcounter{chapter}}{}\newif\if@mainmatter\@mainmattertrue\def\chaptername{Chapter}\def\frontmatter{%\pagenumbering{roman}\def\thechapter{\@roman\c@chapter}\def\theHchapter{\alph{chapter}}\def\@chapapp{}%}\def\mainmatter{%\cleardoublepage\def\thechapter{\@arabic\c@chapter}\setcounter{chapter}{0}\setcounter{section}{0}\pagenumbering{arabic}\setcounter{secnumdepth}{6}\def\@chapapp{\chaptername}%\def\theHchapter{\arabic{chapter}}


}\def\backmatter{%\cleardoublepage\setcounter{chapter}{0}\setcounter{section}{0}\setcounter{secnumdepth}{0}\def\@chapapp{\appendixname}%\def\thechapter{\@Alph\c@chapter}\def\theHchapter{\Alph{chapter}}\appendix}\newenvironment{bibitemlist}[1]{%\list{\@biblabel{\@arabic\c@enumiv}}%{\settowidth\labelwidth{\@biblabel{#1}}%\leftmargin\labelwidth\advance\leftmargin\labelsep\@openbib@code\usecounter{enumiv}%\let\p@enumiv\@empty\renewcommand\theenumiv{\@arabic\c@enumiv}%}%\sloppy\clubpenalty4000\@clubpenalty \clubpenalty\widowpenalty4000%\sfcode‘\.\@m}%{\def\@noitemerr{\@latex@warning{Empty ‘bibitemlist’ environment}}%\endlist}

\def\tableofcontents{\section*{\contentsname}\@starttoc{toc}}\parskip<xsl:value-of select="$parSkip"/>\parindent<xsl:value-of select="$parIndent"/>\def\Panel#1#2#3#4{\multicolumn{#3}{){\columncolor{#2}}#4}{#1}}\newenvironment{reflist}{%\begin{raggedright}\begin{list}{}{%\setlength{\topsep}{0pt}%\setlength{\rightmargin}{0.25in}%\setlength{\itemsep}{0pt}%\setlength{\itemindent}{0pt}%\setlength{\parskip}{0pt}%\setlength{\parsep}{2pt}%\def\makelabel##1{\itshape ##1}}%}{\end{list}\end{raggedright}}\newenvironment{sansreflist}{%\begin{raggedright}\begin{list}{}{%\setlength{\topsep}{0pt}%\setlength{\rightmargin}{0.25in}%\setlength{\itemindent}{0pt}%\setlength{\parskip}{0pt}%\setlength{\itemsep}{0pt}%\setlength{\parsep}{2pt}%\def\makelabel##1{\upshape\sffamily ##1}}%}{\end{list}\end{raggedright}}\newenvironment{specHead}[2]%{\vspace{20pt}\hrule\vspace{10pt}%\hypertarget{#1}{}%\markright{#2}%


<xsl:text> \pdfbookmark[</xsl:text><xsl:value-of select="$specLinkDepth"/><xsl:text>]{#2}{#1}%\hspace{-0.75in}{\bfseries\fontsize{16pt}{18pt}\selectfont#2}%}{}</xsl:text><xsl:call-template name="latexPreambleHook"/>

ledmacOptions (for latex) LaTeX setup commands for ledmac package

\renewcommand{\notenumfont}{\bfseries}
\lineation{page}
\linenummargin{inner}
\footthreecol{A}
\foottwocol{B}

latexBegin (for latex) LaTeX setup before start of document. All the LaTeX setup which is executed before the start of the document

<xsl:text>\makeatletter\thispagestyle{empty}\markright{\@title}\markboth{\@title}{\@author}\renewcommand\small{\@setfontsize\small{9pt}{11pt}\abovedisplayskip 8.5\p@plus3\p@ minus4\p@\belowdisplayskip \abovedisplayskip\abovedisplayshortskip \z@ plus2\p@\belowdisplayshortskip 4\p@ plus2\p@ minus2\p@\def\@listi{\leftmargin\leftmargini\topsep 2\p@ plus1\p@ minus1\p@\parsep 2\p@ plus\p@ minus\p@\itemsep 1pt}}\makeatother\fvset{frame=single,numberblanklines=false,xleftmargin=5mm,xrightmargin=5mm}\fancyhf{}\setlength{\headheight}{14pt}\fancyhead[LE]{\bfseries\leftmark}\fancyhead[RO]{\bfseries\rightmark}\fancyfoot[RO]{}\fancyfoot[CO]{\thepage}\fancyfoot[LO]{\TheID}\fancyfoot[LE]{}\fancyfoot[CE]{\thepage}\fancyfoot[RE]{\TheID}\hypersetup{linkbordercolor=0.75 0.75 0.75,urlbordercolor=0.75 0.750.75,bookmarksnumbered=true}\fancypagestyle{plain}{\fancyhead{}\renewcommand{\headrulewidth}{0pt}}</xsl:text>

latexEnd (for latex) LaTeX setup at end of document. All the LaTeX setup which is executed at the end of the document

6.8.5 Headings

Headings for sections can be customized in various ways.

Variables

| Type | Name | Description | Default |
|---|---|---|---|
| | autoHead | Construct a heading for <div> elements with no <head> [boolean] | |
| | numberSpacer | Character to put after number of section header [string] | space |


Templates

autoMakeHead (for common) [common] How to make a heading for section if there is no <head>

<xsl:param name="display"/>
<xsl:choose>
  <xsl:when test="tei:head and $display='full'">
    <xsl:apply-templates select="tei:head" mode="makeheading"/>
  </xsl:when>
  <xsl:when test="tei:head">
    <xsl:apply-templates select="tei:head" mode="plain"/>
  </xsl:when>
  <xsl:when test="tei:front/tei:head">
    <xsl:apply-templates select="tei:front/tei:head" mode="plain"/>
  </xsl:when>
  <xsl:when test="@n">
    <xsl:value-of select="@n"/>
  </xsl:when>
  <xsl:when test="@type">
    <xsl:text>[</xsl:text>
    <xsl:value-of select="@type"/>
    <xsl:text>]</xsl:text>
  </xsl:when>
  <xsl:otherwise>
    <xsl:text>></xsl:text>
  </xsl:otherwise>
</xsl:choose>

headingNumberSuffix (for common) Punctuation to insert after a section number

<xsl:text>.</xsl:text><xsl:value-of select="$numberSpacer"/>
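
To change what follows a section number, a wrapper can redefine headingNumberSuffix. The sketch below drops the full stop and emits two non-breaking spaces instead; the choice of suffix is only an example, and version 2.0 is an assumption.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:import href="xhtml2/tei.xsl"/>

  <!-- Replaces the default '.' plus $numberSpacer with two fixed spaces -->
  <xsl:template name="headingNumberSuffix">
    <xsl:text>&#160;&#160;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```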

6.8.6 Numbering

Section headings, figures, tables and notes can be numbered automatically. We can set the numbering of front matter and back matter separately. If you prefer to supply your own numbering, using the n attribute, you can choose this over automatic numbering.

Normally, heading numbers are followed by ‘. ’, but you can vary this. This would let you use, e.g., fixed spaces.

Variables

| Type | Name | Description | Default |
|---|---|---|---|
| | numberBackFigures | Automatically number figures in back matter [boolean] | false |
| | numberBackHeadings | How to construct heading numbering in back matter [string] | A.1 |
| | numberBackTables | Automatically number tables in back matter [boolean] | true |
| | numberBodyHeadings | How to construct heading numbering in main matter [string] | 1.1.1.1 |
| | numberFigures | Automatically number figures [boolean] | true |
| | numberFrontFigures | Automatically number figures in front matter [boolean] | false |

142

Page 143: Digital.Humanities@Oxford Summer School 2012tei.oucs.ox.ac.uk/Talks/2012-07-dhoxss/booklet.pdf · 2012-06-21 · Introduction 2 Introduction The Digital.Humanities@Oxford Summer School

6.8 TEI reference material: XSL stylesheets

numberFrontHeadings How to construct heading numberingin front matter [string]

numberFrontTables Automatically number tables in frontmatter [boolean]

true

numberHeadings Automatically number sections[boolean]

true

numberHeadingsDepth Depth to which sections should benumbered [integer]

9

numberTables Automatically number tables[boolean]

true

numberParagraphs Use value of "n" attribute to numbersections [boolean]

false

numberParagraphs Automatically number paragraphs.[boolean]

false

Templates

numberBackDiv (for common) [common] How to number sections in back matter

    <xsl:if test="not($numberBackHeadings='')">
      <xsl:number count="tei:div|tei:div1|tei:div2|tei:div3|tei:div4|tei:div5|tei:div6"
                  format="A.1.1.1.1.1"
                  level="multiple"/>
    </xsl:if>

numberBodyDiv (for common) [common] How to number sections in main matter

    <xsl:if test="$numberHeadings='true'">
      <xsl:number count="tei:div|tei:div1|tei:div2|tei:div3|tei:div4|tei:div5|tei:div6"
                  level="multiple"/>
    </xsl:if>

numberFrontDiv (for common) [common] How to number sections in front matter

    <xsl:param name="minimal"/>
    <xsl:number count="tei:div|tei:div1|tei:div2|tei:div3|tei:div4|tei:div5|tei:div6"
                level="multiple"/>
    <xsl:if test="$minimal='false'">
      <xsl:value-of select="$numberSpacer"/>
    </xsl:if>

6.8.7 Output

You can set a name for the output file(s); if you ask for multiple output files, this name will be used to create unique filenames for each section. By default, results will go to wherever your XSLT processor normally writes (usually standard output). If you opt to have files created, you can specify the name of the directory where the output is to be placed.

If you are making HTML, do you want a single output page, or a separate one for each section of the document? You can decide to have a different splitting policy for front and back matter.
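A minimal sketch of a split-output configuration, assuming the parameters documented in this section behave as described: files are written to a directory rather than to standard output, each top-level section gets its own page, and front matter is kept on a single page. The directory name html, the base name guide, and the import location are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Assumption: location of the TEI XHTML stylesheet; adjust to your installation -->
  <xsl:import href="http://www.tei-c.org/release/xml/tei/stylesheet/xhtml2/tei.xsl"/>

  <!-- Write files instead of using the standard output channel -->
  <xsl:param name="STDOUT">false</xsl:param>

  <!-- Hypothetical directory and base name for the generated files -->
  <xsl:param name="outputDir">html</xsl:param>
  <xsl:param name="outputName">guide</xsl:param>

  <!-- 0 splits top-level sections apart; the default of -1 does not split at all -->
  <xsl:param name="splitLevel">0</xsl:param>

  <!-- Different policy for front and back matter -->
  <xsl:param name="splitFrontmatter">false</xsl:param>
  <xsl:param name="splitBackmatter">true</xsl:param>

</xsl:stylesheet>
```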

Variables


| Type | Name | Description | Default |
|------|------|-------------|---------|
| | outputTarget | Type of output being generated [string] | html |
| | REQUEST | The complete URL when the document is being delivered from a web server (normally set by Apache or Cocoon) [string] | |
| | STDOUT | Write to standard output channel [boolean] | true |
| xhtml | ID | An ID passed to the stylesheet to indicate which section to display [string] | |
| xhtml | requestedID | A wrapper around the ID, to allow for other ways of getting it [string] | <xsl:value-of select="$ID"/> |
| xhtml | URLPREFIX | A path fragment to put before all internal URLs [string] | |
| xhtml | outputName | The name of the output file [string] | |
| xhtml | outputDir | Directory in which to place generated files. [string] | |
| xhtml | outputEncoding | Encoding of output file(s). [string] | utf-8 |
| xhtml | outputMethod | Output method for output file(s). [string] | xhtml |
| xhtml | outputSuffix | Suffix of output file(s). [string] | .html |
| xhtml | doctypePublic | Public Doctype of output file(s). [string] | -//W3C//DTD XHTML 1.0 Transitional//EN |
| xhtml | doctypeSystem | System Doctype of output file(s). [string] | http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd |
| xhtml | pageLayout | The style of HTML (Simple, CSS or Table) which creates the layout for generated pages. The choice is between Simple (a linear presentation is created), CSS (the page is created as a series of nested <div>s which can be arranged using CSS into a multi-column layout) and Table (the page is created as an HTML table) [string] | Simple |
| xhtml | splitBackmatter | Break back matter into separate HTML pages (if splitting enabled). [boolean] | true |
| xhtml | splitFrontmatter | Break front matter into separate HTML pages (if splitting enabled). [boolean] | true |
| xhtml | splitLevel | Level at which to split sections. When processing a <div> or <div[0-5]>, compare the nesting depth and see whether to start a new HTML page. Since the TEI starts with <div1>, setting this parameter to 0 will cause top-level sections to be split apart. The default is not to split at all. [integer] | -1 |
| xhtml | standardSuffix | Suffix for generated output files. [string] | <xsl:choose> <xsl:when test="tei:teiCorpus">.html</xsl:when> <xsl:when test="$STDOUT='true'"/> <xsl:otherwise> <xsl:value-of select="$outputSuffix"/> </xsl:otherwise> </xsl:choose> |
| xhtml | topNavigationPanel | Display navigation panel at top of pages. [boolean] | true |
| xhtml | urlChunkPrefix | How to specify intra-document links. When a document is split, links need to be constructed between parts of the document. The default is to use a query parameter on the URL. [string] | ?ID= |
| xhtml | useIDs | Construct links using existing ID values. It is often nice if, when making separate files, their names correspond to the ID attribute of the <div>. Alternatively, you can let the system choose names. [boolean] | true |
| xhtml | autoBlockQuote | Whether it should be attempted to make quotes into block quotes if they are over a certain length [boolean] | false |
| xhtml | autoBlockQuoteLength | Length beyond which a quote is a block quote [integer] | 150 |
| fo | language | Language (for hyphenation) [string] | en_US |
| fo | foEngine | Name of intended XSL FO engine. This is used to tailor the result for different XSL FO processors. By default, no special measures are taken, so there are no bookmarks or other such features. Possible values are passivetex (the TeX-based PassiveTeX processor), xep (XEP), fop (FOP), antenna (Antenna House) [string] | |
| latex | baseURL | URL root where referenced documents are located [string] | |
| latex | reencode | Whether or not to load LaTeX packages which attempt to process the UTF-8 characters. Set to "false" if you are using XeTeX or similar. [boolean] | true |
| latex | realFigures | Use real name of graphics files rather than pointers [boolean] | true |

Templates

6.8.8 Table of contents generation

You probably want tables of contents built for your document, using the <div> structure. However, if you have used a <divGen type="toc"> explicitly, that will also create a table of contents, so you can suppress the automatic one. When a table of contents is created, you choose how many levels of headings it will show. You can choose whether or not the front and back matter appear in the table of contents.
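The choices described above could be expressed along the following lines. This is only a sketch: it assumes a document that already contains its own <divGen type="toc"/>, and the import location is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Assumption: location of the TEI XHTML stylesheet; adjust to your installation -->
  <xsl:import href="http://www.tei-c.org/release/xml/tei/stylesheet/xhtml2/tei.xsl"/>

  <!-- The document supplies <divGen type="toc"/>, so suppress the automatic TOC -->
  <xsl:param name="autoToc">false</xsl:param>

  <!-- Show only two levels of headings -->
  <xsl:param name="tocDepth">2</xsl:param>

  <!-- Leave the front matter out of the TOC, keep the back matter in -->
  <xsl:param name="tocFront">false</xsl:param>
  <xsl:param name="tocBack">true</xsl:param>

</xsl:stylesheet>
```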

Variables

| Type | Name | Description | Default |
|------|------|-------------|---------|
| xhtml | autoToc | Make an automatic table of contents [boolean] | true |
| xhtml | class_subtoc | CSS class for second-level TOC entries [string] | subtoc |
| xhtml | subTocDepth | Depth at which to stop doing a recursive table of contents. You can have a mini table of contents at the start of each section. The default is only to construct a TOC at the top level; a value of -1 here means no subtoc at all. [integer] | -1 |
| xhtml | tocBack | Include the back matter in the table of contents. [boolean] | true |
| xhtml | tocDepth | Depth to which table of contents is constructed. [string] | 5 |
| xhtml | tocFront | Include the front matter in the table of contents. [boolean] | true |
| xhtml | tocElement | Which HTML element to wrap each TOC entry in. [string] | p |
| xhtml | tocContainerElement | Which HTML element to wrap each TOC section in. [string] | div |
| xhtml | refDocFooterText | Text to link back to from foot of ODD reference pages [string] | TEI Guidelines |
| xhtml | refDocFooterURL | URL to link back to from foot of ODD reference pages [anyURI] | index.html |
| fo | div0Tocindent | Indentation for level 0 TOC entries [string] | 0in |
| fo | div1Tocindent | Indentation for level 1 TOC entries [string] | 0.25in |
| fo | div2Tocindent | Indentation for level 2 TOC entries [string] | 0.5in |
| fo | div3Tocindent | Indentation for level 3 TOC entries [string] | 0.75in |
| fo | div4Tocindent | Indentation for level 4 TOC entries [string] | 1in |
| fo | div5Tocindent | Indentation for level 5 TOC entries [string] | 1.25in |
| fo | tocBack | Make TOC for sections in <back> [boolean] | true |
| fo | tocFront | Make TOC for sections in <front> [boolean] | true |
| fo | tocNumberSuffix | Punctuation to insert after a section number in a TOC [string] | . |
| fo | tocStartPage | Page number on which TOC should start [integer] | 1 |

Templates

navInterSep (for xhtml) Gap between elements in navigation list

    <xsl:text>: </xsl:text>

6.8.9 Internationalization

At various places, the system has to create text. You can choose the words it uses (eg translate them to another language).

Variables

Type Name Description Default

Templates

copyrightStatement (for xhtml) [html] Make a copyright claim

    This page is copyrighted
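To illustrate how generated text of this kind might be translated, the sketch below redefines copyrightStatement with a German wording. The replacement sentence and the import location are assumptions made for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Assumption: location of the TEI XHTML stylesheet; adjust to your installation -->
  <xsl:import href="http://www.tei-c.org/release/xml/tei/stylesheet/xhtml2/tei.xsl"/>

  <!-- Hypothetical German replacement for "This page is copyrighted" -->
  <xsl:template name="copyrightStatement">
    <xsl:text>Diese Seite ist urheberrechtlich geschützt</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```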

6.8.10 CSS

Setting up material for the CSS file to accompany HTML output.
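For instance, a project that ships its own style files might set the CSS parameters of this section roughly as follows. The css/ paths and the class name contents are hypothetical, and the import location is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Assumption: location of the TEI XHTML stylesheet; adjust to your installation -->
  <xsl:import href="http://www.tei-c.org/release/xml/tei/stylesheet/xhtml2/tei.xsl"/>

  <!-- Hypothetical project-local style files replacing the tei-c.org defaults -->
  <xsl:param name="cssFile">css/project.css</xsl:param>
  <xsl:param name="cssPrintFile">css/project-print.css</xsl:param>

  <!-- Screen-only layout rules, so that printing is not affected -->
  <xsl:param name="cssSecondaryFile">css/layout.css</xsl:param>

  <!-- Hypothetical class name for TOC entries, matched by a rule in project.css -->
  <xsl:param name="class_toc">contents</xsl:param>

</xsl:stylesheet>
```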

Variables

| Type | Name | Description | Default |
|------|------|-------------|---------|
| | class_toc | CSS class for TOC entries [string] | toc |
| xhtml | class_ptr | CSS class for links derived from <ptr> [string] | ptr |
| xhtml | class_ref | CSS class for links derived from <ref> [string] | ref |
| xhtml | cssFile | CSS style file to be associated with output file(s) [anyURI] | http://www.tei-c.org/release/xml/tei/stylesheet/tei.css |
| xhtml | cssPrintFile | CSS style file for print; this will be given a media=print attribute. [anyURI] | http://www.tei-c.org/release/xml/tei/stylesheet/tei-print.css |
| xhtml | cssSecondaryFile | Secondary CSS style file; this will be given a media=screen attribute, so that it does not affect printing. It should be used for screen layout. [anyURI] | |
| xhtml | cssInlineFile | CSS file to include in the output file directly [anyURI] | |


Templates

6.8.11 Tables

Default behaviour of table elements.

Variables

| Type | Name | Description | Default |
|------|------|-------------|---------|
| | cellAlign | Default alignment of table cells [string] | left |
| | tableAlign | Default alignment of tables [string] | left |
| fo | defaultCellLabelBackground | Default colour for background of table cells which are labelling rows or columns [string] | silver |
| fo | inlineTables | Force tables to appear inline [boolean] | false |
| fo | makeTableCaption | Put a caption on tables [boolean] | true |
| fo | tableCaptionAlign | Alignment of table captions [string] | center |
| fo | tableCellPadding | Default padding on table cells [string] | 2pt |

Templates

tableCaptionstyle (for fo) [fo] Set attributes for display of table

    <xsl:attribute name="text-align">center</xsl:attribute>
    <xsl:attribute name="font-style">italic</xsl:attribute>
    <xsl:attribute name="end-indent">
      <xsl:value-of select="$exampleMargin"/>
    </xsl:attribute>
    <xsl:attribute name="start-indent">
      <xsl:value-of select="$exampleMargin"/>
    </xsl:attribute>
    <xsl:attribute name="space-before">
      <xsl:value-of select="$spaceAroundTable"/>
    </xsl:attribute>
    <xsl:attribute name="space-after">
      <xsl:value-of select="$spaceBelowCaption"/>
    </xsl:attribute>
    <xsl:attribute name="keep-with-next">always</xsl:attribute>

6.8.12 Figures and graphics

Sometimes you need to prefix the names of all graphics files with a directory name or a URL, or provide a default suffix. You can also tell <figure> elements whether or not to produce anything.
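A small sketch of such a setup, assuming all graphics live in one directory and are referenced without a file suffix. The directory name images/ and the .jpg suffix are hypothetical, and the import location is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Assumption: location of the TEI XHTML stylesheet; adjust to your installation -->
  <xsl:import href="http://www.tei-c.org/release/xml/tei/stylesheet/xhtml2/tei.xsl"/>

  <!-- Hypothetical directory put before the names of graphics files -->
  <xsl:param name="graphicsPrefix">images/</xsl:param>

  <!-- Suffix used when the source does not specify one (the default is .png) -->
  <xsl:param name="graphicsSuffix">.jpg</xsl:param>

  <!-- Set to false to stop <figure> elements producing anything -->
  <xsl:param name="showFigures">true</xsl:param>

</xsl:stylesheet>
```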

Variables

| Type | Name | Description | Default |
|------|------|-------------|---------|
| | graphicsPrefix | Directory specification to put before names of graphics files, unless they start with "./" [string] | |
| | graphicsSuffix | Default file suffix for graphics files, if not directly specified [string] | .png |
| | standardScale | Scaling of imported graphics [decimal] | 1 |
| | headInXref | [common] Whether cross-reference to a figure or table includes its caption [boolean] | true |
| xhtml | dpi | Resolution of images. This is needed to calculate HTML width and height (in pixels) from supplied dimensions. [integer] | 96 |
| xhtml | showFigures | Display figures. [boolean] | true |
| fo | autoScaleFigures | How to scale figures if no width and height specified (pass to XSL FO content-width) [string] | |
| fo | captionInlineFigures | Put captions on inline figures [boolean] | false |
| fo | showFloatHead | Show the contents of <head> in a cross-reference to table or figure [boolean] | false |
| fo | showFloatLabel | Show a title for figures or tables (eg Table or Figure) in a cross-reference [boolean] | false |
| fo | xrefShowPage | Show the page number in a cross-reference to table or figure [boolean] | false |

Templates

figureCaptionstyle (for fo) [fo] Set attributes for display of figures

    <xsl:attribute name="text-align">center</xsl:attribute>
    <xsl:attribute name="font-style">italic</xsl:attribute>
    <xsl:attribute name="end-indent">
      <xsl:value-of select="$exampleMargin"/>
    </xsl:attribute>
    <xsl:attribute name="start-indent">
      <xsl:value-of select="$exampleMargin"/>
    </xsl:attribute>

6.8.13 Style

You can choose lots of features which affect the font, size, etc.:

• What font to use for URLs.

• Whether titles, dates and authors are shown.

• Whether headings of objects are included in cross-references.
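As an illustration for XSL FO output, the sketch below changes the base font size, the body font and the text alignment, and redefines linkStyle so that links are coloured rather than underlined. Because several sizes in the table below are calculated from $bodyMaster, changing that one value should scale them together. The import location, the font name and the colour are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Assumption: location of the TEI XSL FO stylesheet; adjust to your installation -->
  <xsl:import href="http://www.tei-c.org/release/xml/tei/stylesheet/fo2/tei.xsl"/>

  <!-- Base size without dimension; bodySize, quoteSize etc. are derived from it -->
  <xsl:param name="bodyMaster">11</xsl:param>

  <!-- Hypothetical font choice; it must be available to your FO processor -->
  <xsl:param name="bodyFont">Palatino</xsl:param>

  <!-- Ragged-right text without hyphenation -->
  <xsl:param name="alignment">left</xsl:param>
  <xsl:param name="hyphenate">false</xsl:param>

  <!-- Replace the default underline with a colour -->
  <xsl:template name="linkStyle">
    <xsl:attribute name="color">blue</xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```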

Variables

| Type | Name | Description | Default |
|------|------|-------------|---------|
| | pagebreakStyle | Display of <pb> element. Choices are "visible", "active" and "none". [string] | visible |
| | displayMode | How to display Relax NG schema fragments (rnc or rng) [string] | rnc |
| | minimalCrossRef | Provide minimal context for a link [boolean] | false |
| | postQuote | Character to insert at end of quote. [string] | |
| | preQuote | Character to insert at start of quote [string] | |
| xhtml | urlMarkup | HTML element to put around visible text of display URLs [string] | span |
| fo | activeLinebreaks | Make <lb> active (ie cause a line break) [boolean] | true |
| fo | alignment | Alignment of text (ie justified or ragged) [string] | justify |
| fo | authorSize | Font size for display of author name [string] | 14pt |
| fo | biblSize | Font size for bibliography [string] | 16pt |
| fo | bodyFont | Default font for body [string] | Times |
| fo | bodyMaster | Default font size for body (without dimension) [string] | 10 |
| fo | bodySize | Calculation of normal body font size (add dimension) [string] | <xsl:value-of select="$bodyMaster"/><xsl:text>pt</xsl:text> |
| fo | dateSize | Font size for display of date [string] | 14pt |
| fo | divFont | Font for section headings [string] | Times |
| fo | exampleColor | Colour for display of <eg> blocks. [string] | black |
| fo | exampleBackgroundColor | Colour for background display of <eg> blocks. [string] | gray |
| fo | exampleSize | Calculation of font size for examples (add dimension) [string] | <xsl:value-of select="$bodyMaster * 0.6"/><xsl:text>pt</xsl:text> |
| fo | quoteSize | Calculation of font size for quotations [string] | <xsl:value-of select="$bodyMaster * 0.9"/><xsl:text>pt</xsl:text> |
| fo | footnoteSize | Font size for footnotes [string] | <xsl:value-of select="$bodyMaster * 0.8"/> |
| fo | footnotenumSize | Font size for footnote numbers [string] | <xsl:value-of select="$bodyMaster * 0.7"/> |
| fo | giColor | Colour for display of element names [string] | black |
| fo | headingOutdent | Indentation of headings [string] | 0em |
| fo | hyphenate | Hyphenate text [boolean] | true |
| fo | identColor | Colour for display of <ident> values. Customization parameter. [string] | black |
| fo | runFont | Font family for running header and footer [string] | sans-serif |
| fo | runSize | Font size for running header and footer [string] | 9pt |
| fo | sansFont | Sans-serif font [string] | Helvetica |
| fo | smallSize | Calculation of small font size (add dimension) [string] | <xsl:value-of select="$bodyMaster * 0.9"/><xsl:text>pt</xsl:text> |
| fo | tableSize | Create font size for tables, by reference to $bodyMaster [string] | <xsl:value-of select="$bodyMaster * 0.9"/><xsl:text>pt</xsl:text> |
| fo | titleSize | Font size for display of title [string] | 16pt |
| fo | tocSize | Font size for TOC heading [string] | 16pt |
| fo | typewriterFont | Font for literal code [string] | Courier |
| latex | typewriterFont | Font for literal code [string] | DejaVu Sans Mono |
| latex | sansFont | Font for sans-serif [string] | |
| latex | romanFont | Font for serif [string] | |
| latex | gothicFont | Font for gothic [string] | Lucida Blackletter |
| latex | calligraphicFont | Font for calligraphic [string] | Lucida Calligraphy |

Templates

divXRefHeading (for fo) [fo] How to display section headings in a cross-reference. Parameter head: section title

    <xsl:param name="head">
      <xsl:apply-templates mode="section" select="tei:head"/>
    </xsl:param>
    <xsl:text> (</xsl:text>
    <xsl:value-of select="normalize-space($head)"/>
    <xsl:text>)</xsl:text>

linkStyle (for fo) [fo] Set attributes for display of links

    <xsl:attribute name="text-decoration">underline</xsl:attribute>

setupDiv0 (for fo) [fo] Set attributes for display of heading for chapters (level 0)

    <xsl:attribute name="font-size">18pt</xsl:attribute>
    <xsl:attribute name="text-align">left</xsl:attribute>
    <xsl:attribute name="font-weight">bold</xsl:attribute>
    <xsl:attribute name="space-after">6pt</xsl:attribute>
    <xsl:attribute name="space-before.optimum">12pt</xsl:attribute>
    <xsl:attribute name="text-indent">
      <xsl:value-of select="$headingOutdent"/>
    </xsl:attribute>

setupDiv1 (for fo) [fo] Set attributes for display of heading for 1st level sections

    <xsl:attribute name="font-size">14pt</xsl:attribute>
    <xsl:attribute name="text-align">left</xsl:attribute>
    <xsl:attribute name="font-weight">bold</xsl:attribute>
    <xsl:attribute name="space-after">3pt</xsl:attribute>
    <xsl:attribute name="space-before.optimum">9pt</xsl:attribute>
    <xsl:attribute name="text-indent">
      <xsl:value-of select="$headingOutdent"/>
    </xsl:attribute>

setupDiv2 (for fo) [fo] Set attributes for display of heading for 2nd level sections

    <xsl:attribute name="font-size">12pt</xsl:attribute>
    <xsl:attribute name="text-align">left</xsl:attribute>
    <xsl:attribute name="font-weight">bold</xsl:attribute>
    <xsl:attribute name="font-style">italic</xsl:attribute>
    <xsl:attribute name="space-after">2pt</xsl:attribute>
    <xsl:attribute name="space-before.optimum">4pt</xsl:attribute>
    <xsl:attribute name="text-indent">
      <xsl:value-of select="$headingOutdent"/>
    </xsl:attribute>

setupDiv3 (for fo) [fo] Set attributes for display of heading for 3rd level sections

    <xsl:attribute name="font-size">10pt</xsl:attribute>
    <xsl:attribute name="text-align">left</xsl:attribute>
    <xsl:attribute name="font-style">italic</xsl:attribute>
    <xsl:attribute name="space-after">0pt</xsl:attribute>
    <xsl:attribute name="space-before.optimum">4pt</xsl:attribute>
    <xsl:attribute name="text-indent">
      <xsl:value-of select="$headingOutdent"/>
    </xsl:attribute>

setupDiv4 (for fo) [fo] Set attributes for display of heading for 4th level sections

    <xsl:attribute name="font-size">10pt</xsl:attribute>
    <xsl:attribute name="text-align">left</xsl:attribute>
    <xsl:attribute name="font-style">italic</xsl:attribute>
    <xsl:attribute name="space-after">0pt</xsl:attribute>
    <xsl:attribute name="space-before.optimum">4pt</xsl:attribute>
    <xsl:attribute name="text-indent">
      <xsl:value-of select="$headingOutdent"/>
    </xsl:attribute>

setupDiv5 (for fo) [fo] Set attributes for display of heading for 5th level sections

    <xsl:attribute name="font-size">10pt</xsl:attribute>
    <xsl:attribute name="text-align">left</xsl:attribute>
    <xsl:attribute name="font-style">italic</xsl:attribute>
    <xsl:attribute name="space-after">0pt</xsl:attribute>
    <xsl:attribute name="space-before.optimum">4pt</xsl:attribute>
    <xsl:attribute name="text-indent">
      <xsl:value-of select="$headingOutdent"/>
    </xsl:attribute>

setupDiv6 (for fo) [fo] Set attributes for display of heading for 6th level sections

    <xsl:attribute name="font-size">10pt</xsl:attribute>
    <xsl:attribute name="text-align">left</xsl:attribute>
    <xsl:attribute name="font-style">italic</xsl:attribute>
    <xsl:attribute name="space-after">0pt</xsl:attribute>
    <xsl:attribute name="space-before.optimum">4pt</xsl:attribute>
    <xsl:attribute name="text-indent">
      <xsl:value-of select="$headingOutdent"/>
    </xsl:attribute>

showXrefURL (for fo) [fo] How to display the link text of a <ptr>. Parameter dest: the URL being linked to

    <xsl:param name="dest"/>
    <xsl:value-of select="$dest"/>

6.8.14 Hooks

A set of templates which are empty by default; they can be used to add code at strategic points. The content must be valid XSLT.
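The following sketch shows how two of the hooks listed below might be filled in: headHook adds a <meta> tag to the HTML <head>, and bodyEndHook adds a page-wide footer block. The metadata, the footer text and the footer class are hypothetical, and the import location is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns="http://www.w3.org/1999/xhtml"
                version="2.0">

  <!-- Assumption: location of the TEI XHTML stylesheet; adjust to your installation -->
  <xsl:import href="http://www.tei-c.org/release/xml/tei/stylesheet/xhtml2/tei.xsl"/>

  <!-- Hypothetical metadata inserted into the HTML <head> -->
  <xsl:template name="headHook">
    <meta name="description" content="Example TEI edition"/>
  </xsl:template>

  <!-- Hypothetical footer inserted just before the <body> ends -->
  <xsl:template name="bodyEndHook">
    <div class="footer">
      <p>Generated from TEI XML.</p>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

The default namespace declaration puts the literal result elements in the XHTML namespace, so that they match the rest of the generated page.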

Variables

Type Name Description Default

Templates

sectionHeadHook (for common) [common] Hook where actions can be inserted when making a heading

bodyHook (for xhtml) [html] Hook where HTML can be inserted just after <body>

bodyEndHook (for xhtml) [html] Hook where HTML can be inserted just before the <body> ends. This can be used to add a page-wide footer block.

bodyJavascriptHook (for xhtml) [html] Hook where Javascript calls can be inserted just after <body>

cssHook (for xhtml) [html] Hook where extra CSS can be inserted

headHook (for xhtml) [html] Hook where code can be added to the HTML <head>. This would be used to insert <meta> tags.

imgHook (for xhtml) [html] Hook where HTML can be inserted when creating an <img>

figureHook (for xhtml) [html] Hook where HTML can be inserted when processing a figure

javascriptHook (for xhtml) [html] Hook where extra Javascript functions can be defined

preAddressHook (for xhtml) [html] Hook where HTML can be inserted just before the <address>

startDivHook (for xhtml) [html] Hook where HTML can be inserted at the start of processing each section

startHook (for xhtml) [html] Hook where HTML can be inserted at the beginning of the main text, after the header

teiEndHook (for xhtml) [html] Hook where HTML can be inserted after processing <TEI>

teiStartHook (for xhtml) [html] Hook where HTML can be inserted before processing <TEI>

xrefHook (for xhtml) [html] Hook where HTML can be inserted when creating an <a> element

egXMLStartHook (for xhtml) [html] Hook where HTML can be inserted when processing an <egXML> element

afterBodyHook (for fo) [fo] Hook where extra material can be inserted after the <body> has been processed

blockStartHook (for fo) [fo] Hook where work can be done at the start of each block

pageMasterHook (for fo) [fo] Hook where extra page masters can be defined

beginDocumentHook (for latex) [latex] Hook where LaTeX commands can be inserted after the beginning of the document

latexSetupHook (for latex) [latex] Hook where LaTeX commands can be inserted at the start of setup

latexPreambleHook (for latex) [latex] Hook where LaTeX commands can be inserted in the preamble before the beginning of the document

6.8.15 Miscellaneous and advanced

Finally, some miscellaneous or advanced features which you probably won’t use much.

Variables

| Type | Name | Description | Default |
|------|------|-------------|---------|
| | teixslHome | The home page for these stylesheets [anyURI] | http://www.tei-c.org/Stylesheets/ |
| | teiP4Compat | Process elements according to assumptions of TEI P4 [boolean] | false |
| | useHeaderFrontMatter | Title, author and date are taken from the <teiHeader> rather than looked for in the front matter [boolean] | false |
| | useFixedDate | Whether to attempt to work out a current date (set to true for test results which won’t differ) [boolean] | false |
| xhtml | generateParagraphIDs | Generate a unique ID for all paragraphs [boolean] | false |
| xhtml | rendSeparator | Character separating values in a rend attribute. Some projects use multiple values in rend attributes. These are handled, but the separator character(s) must be specified. [string] | ; |
| xhtml | showTitleAuthor | Show a title and author at start of document [boolean] | false |
| xhtml | verbose | Be talkative while working. [boolean] | false |

Templates


6.9 Quick reference cards for XSLT, XQuery, XPath, Regular Expressions, and Schematron


Text/

Str

ing F

uncti

ons

codepoin

t-equal(xs:s

trin

g?,

xs:s

trin

g?)

as

xs:b

oole

an?

codepoin

ts-to

-str

ing(x

s:inte

ger*

) as x

s:s

trin

g

com

pare

(xs:s

trin

g?,

xs:s

trin

g?)

as x

s:inte

ger?

com

pare

(xs:s

trin

g?,

xs:s

trin

g?,

xs:s

trin

g)

as

xs:inte

ger?

concat(

xs:a

nyA

tom

icType?,

xs:a

nyA

tom

icType?,

)

as x

s:s

trin

g

conta

ins(x

s:s

trin

g?,

xs:s

trin

g?)

as x

s:b

oole

an

conta

ins(x

s:s

trin

g?,

xs:s

trin

g?,

xs:s

trin

g)

as

xs:b

oole

an

curr

ent-

date

() a

s x

s:d

ate

curr

ent-

date

Tim

e()

as x

s:d

ate

Tim

e

curr

ent-

tim

e()

as x

s:t

ime

defa

ult

-collati

on()

as x

s:s

trin

g

encode-fo

r-uri

(xs:s

trin

g?)

as x

s:s

trin

g

ends-w

ith(x

s:s

trin

g?,

xs:s

trin

g?)

as x

s:b

oole

an

ends-w

ith(x

s:s

trin

g?,

xs:s

trin

g?,

xs:s

trin

g)

as

xs:b

oole

an

escape-htm

l-uri

(xs:s

trin

g?)

as x

s:s

trin

g

low

er-

case(x

s:s

trin

g?)

as x

s:s

trin

g

norm

alize-space()

as x

s:s

trin

g

norm

alize-space(x

s:s

trin

g?)

as x

s:s

trin

g

norm

alize-unic

ode(x

s:s

trin

g?)

as x

s:s

trin

g

norm

alize-unic

ode(x

s:s

trin

g?,

xs:s

trin

g)

as

xs:s

trin

g

sta

rts-w

ith(x

s:s

trin

g?,

xs:s

trin

g?)

as x

s:b

oole

an

sta

rts-w

ith(x

s:s

trin

g?,

xs:s

trin

g?,

xs:s

trin

g) as

xs:b

oole

an

str

ing()

as x

s:s

trin

g

str

ing(ite

m()

?) a

s x

s:s

trin

g

str

ing-jo

in(x

s:s

trin

g*,

xs:s

trin

g)

as x

s:s

trin

g

str

ing-le

ngth

() a

s x

s:inte

ger

str

ing-le

ngth

(xs:s

trin

g?)

as x

s:inte

ger

str

ing-to

-codepoin

ts(x

s:s

trin

g?)

as x

s:inte

ger*

substr

ing(x

s:s

trin

g?,

xs:d

ouble

) as x

s:s

trin

g

substr

ing(x

s:s

trin

g?,

xs:d

ouble

, xs:d

ouble

) as

xs:s

trin

g

substr

ing-aft

er(

xs:s

trin

g?,

xs:s

trin

g?)

as x

s:s

trin

g

substr

ing-aft

er(

xs:s

trin

g?,

xs:s

trin

g?,

xs:s

trin

g)

as

xs:s

trin

g

substr

ing-befo

re(x

s:s

trin

g?,

xs:

str

ing?)

as x

s:s

trin

g

substr

ing-befo

re(x

s:s

trin

g?,

xs:

str

ing?,

xs:s

trin

g)

as x

s:s

trin

g

transla

te(x

s:s

trin

g?,

xs:s

trin

g, xs:s

trin

g)

as x

s:s

trin

g

upper-

case(x

s:s

trin

g?)

as x

s:s

trin

g

XSL-Lis

t:

htt

p:/

/w

ww

.mulb

err

yte

ch.c

om

/xsl/

xsl-

list

REG

EX F

uncti

ons

matc

hes(x

s:s

trin

g?,

xs:s

trin

g) as x

s:b

oole

an

matc

hes(x

s:s

trin

g?,

xs:s

trin

g,

xs:s

trin

g) as

xs:b

oole

an

rep

lace(x

s:s

trin

g?,

xs:s

trin

g,

xs:

str

ing) as

xs:s

trin

g

rep

lace(x

s:s

trin

g?,

xs:s

trin

g,

xs:

str

ing, xs:s

trin

g)

as x

s:s

trin

g

tokeniz

e(x

s:s

trin

g?,

xs:s

trin

g) as x

s:s

trin

g*

tokeniz

e(x

s:s

trin

g?,

xs:s

trin

g,

xs:s

trin

g) as

xs:s

trin

g*

Ari

thm

eti

c O

pera

tors

+

(num

eri

c) as ~

num

eri

c

(num

eri

c) +

(num

eri

c)

as ~

num

eri

c

- (num

eri

c)

as ~

num

eri

c

(num

eri

c) - (num

eri

c) as ~

num

eri

c

(num

eri

c) *

(num

eri

c)

as ~

num

eri

c

(num

eri

c) d

iv (num

eri

c) as ~

num

eri

c

(num

eri

c)

idiv

(num

eri

c)

as x

s:inte

ger

(num

eri

c) m

od

(num

eri

c)

as ~

num

eri

c

Ari

thm

eti

c F

uncti

ons

abs(n

um

eri

c?)

as ~

num

eri

c?

avg(x

s:a

nyA

tom

icType*)

as ~

xs:

anyA

tom

icType?

ceilin

g(n

um

eri

c?)

as ~

num

eri

c?

floor(

num

eri

c?)

as ~

num

eri

c?

num

ber(

) as x

s:d

ouble

num

ber(

xs:a

nyA

tom

icType?)

as x

s:d

ouble

round(n

um

eri

c?)

as ~

num

eri

c?

round-half

-to

-even(n

um

eri

c?)

as ~

num

eri

c?

round-half

-to

-even(n

um

eri

c?,

xs:inte

ger)

as

~num

eri

c?

sum

(xs:a

nyA

tom

icType*)

as ~

xs:a

nyA

tom

icType

sum

(xs:a

nyA

tom

icType*,

xs:a

nyA

tom

icType?)

as

~xs:a

nyA

tom

icType?

The e

q, ne, lt

, gt,

le a

nd g

e c

om

pari

sons a

re

support

ed f

or

the n

um

eri

c t

ypes.

Sequence O

pera

tors

(ite

m()

*) ,

(it

em

()*)

as ~

item

()*

(node()

*) u

nio

n (node()

*) a

s ~

node()

*

(node()

*) inte

rsect

(node()

*) a

s ~

node()

*

(node()

*) e

xcept

(node()

*) a

s ~

node()

*

(xs:inte

ger)

to (

xs:inte

ger)

as x

s:inte

ger*

Node C

om

pari

sons

(node()

) is

(node()

) as x

s:b

oole

an

(node()

) <

< (node()

) as x

s:b

oole

an

(node()

) >

> (node()

) as x

s:b

oole

an

Sequence a

nd N

ode F

uncti

ons

collecti

on()

as n

ode()

*

collecti

on(x

s:s

trin

g?)

as n

ode()

*

count(

item

()*)

as x

s:inte

ger

data

(ite

m()

*) a

s ~

xs:a

nyA

tom

icType*

deep-equal(it

em

()*,

ite

m()

*) a

s x

s:b

oole

an

deep-equal(it

em

()*,

ite

m()

*, s

trin

g)

as x

s:b

oole

an

dis

tinct-

valu

es(x

s:a

nyA

tom

icType*)

as

~xs:a

nyA

tom

icType*

dis

tinct-

valu

es(x

s:a

nyA

tom

icType*,

xs:s

trin

g)

as

~xs:a

nyA

tom

icType*

doc(x

s:s

trin

g?)

as d

ocum

ent-

node()

?

em

pty

(ite

m()

*) a

s x

s:b

oole

an

exactl

y-one(ite

m()

*) a

s ~

item

()

exis

ts(ite

m()

*) a

s x

s:b

oole

an

index-of(

xs:a

nyA

tom

icType*,

xs:a

nyA

tom

icType)

as x

s:inte

ger*

index-of(

xs:a

nyA

tom

icType*,

xs:a

nyA

tom

icType,

xs:s

trin

g) as x

s:inte

ger*

insert

-befo

re(ite

m()

*, x

s:inte

ger,

ite

m()

*) a

s

~it

em

()*

last(

) as x

s:inte

ger

nille

d(n

ode()

?) a

s x

s:b

oole

an?

node-nam

e(n

ode()

?) a

s x

s:Q

Nam

e?

one-or-

more

(ite

m()

*) a

s ~

item

()+

posit

ion()

as x

s:inte

ger

rem

ove(ite

m()

*, x

s:inte

ger)

as ~

item

()*

revers

e(ite

m()

*) a

s ~

item

()*

root(

) as n

ode()

root(

node()

?) a

s n

ode()

?

subsequence(ite

m()

*, x

s:d

ouble

) as ~

item

()*

subsequence(ite

m()

*, x

s:d

ouble

, xs:d

ouble

) as

~it

em

()*

unord

ere

d(ite

m()

*) a

s ~

item

()*

zero

-or-

one(ite

m()

*) a

s ~

item

()?

Miscellaneous Functions

error() as none
error(xs:QName) as none
error(xs:QName?, xs:string) as none
error(xs:QName?, xs:string, item()*) as none
lang(xs:string?) as xs:boolean
lang(xs:string?, node()) as xs:boolean
max(xs:anyAtomicType*) as ~xs:anyAtomicType?
max(xs:anyAtomicType*, string) as ~xs:anyAtomicType?
min(xs:anyAtomicType*) as ~xs:anyAtomicType?
min(xs:anyAtomicType*, string) as ~xs:anyAtomicType?
trace(item()*, xs:string) as ~item()*

Boolean Functions

boolean(item()*) as xs:boolean
false() as xs:boolean
not(item()*) as xs:boolean
true() as xs:boolean

The eq, ne, lt, gt, le and ge comparisons are supported for the xs:boolean type.

URI, ID and XML Name Functions

base-uri() as xs:anyURI?
base-uri(node()?) as xs:anyURI?
document-uri(node()?) as xs:anyURI?
doc-available(xs:string?) as xs:boolean
in-scope-prefixes(element()) as xs:string*
id(xs:string*) as element()*
id(xs:string*, node()) as element()*
idref(xs:string*) as node()*
idref(xs:string*, node()) as node()*
iri-to-uri(xs:string?) as xs:string
local-name() as xs:string
local-name(node()?) as xs:string
local-name-from-QName(xs:QName?) as xs:NCName?
name() as xs:string
name(node()?) as xs:string
namespace-uri() as xs:anyURI
namespace-uri(node()?) as xs:anyURI
namespace-uri-for-prefix(xs:string?, element()) as xs:anyURI?
namespace-uri-from-QName(xs:QName?) as xs:anyURI?
prefix-from-QName(xs:QName?) as xs:NCName?
QName(xs:string?, xs:string) as xs:QName
resolve-QName(xs:string?, element()) as xs:QName?
resolve-uri(xs:string?) as xs:anyURI?
resolve-uri(xs:string?, xs:string) as xs:anyURI?
static-base-uri() as xs:anyURI?

Built-In Schema Types

These types are available in all implementations.

xs:anyAtomicType
xs:anySimpleType
xs:anyURI
xs:anyType
xs:base64Binary
xs:boolean
xs:date
xs:dateTime
xs:dayTimeDuration
xs:decimal
xs:double
xs:duration
xs:float
xs:gDay
xs:gMonth
xs:gMonthDay
xs:gYear
xs:gYearMonth
xs:hexBinary
xs:integer
xs:QName
xs:string
xs:time
xs:untyped
xs:untypedAtomic
xs:yearMonthDuration

XPath Functions


Date/Time Functions

adjust-date-to-timezone(xs:date?) as xs:date?
adjust-date-to-timezone(xs:date?, xs:dayTimeDuration?) as xs:date?
adjust-dateTime-to-timezone(xs:dateTime?) as xs:dateTime?
adjust-dateTime-to-timezone(xs:dateTime?, xs:dayTimeDuration?) as xs:dateTime?
adjust-time-to-timezone(xs:time?) as xs:time?
adjust-time-to-timezone(xs:time?, xs:dayTimeDuration?) as xs:time?
dateTime(xs:date?, xs:time?) as xs:dateTime?
day-from-date(xs:date?) as xs:integer?
day-from-dateTime(xs:dateTime?) as xs:integer?
days-from-duration(xs:duration?) as xs:integer?
hours-from-dateTime(xs:dateTime?) as xs:integer?
hours-from-duration(xs:duration?) as xs:integer?
hours-from-time(xs:time?) as xs:integer?
implicit-timezone() as xs:dayTimeDuration
minutes-from-dateTime(xs:dateTime?) as xs:integer?
minutes-from-duration(xs:duration?) as xs:integer?
minutes-from-time(xs:time?) as xs:integer?
month-from-date(xs:date?) as xs:integer?
month-from-dateTime(xs:dateTime?) as xs:integer?
months-from-duration(xs:duration?) as xs:integer?
seconds-from-dateTime(xs:dateTime?) as xs:decimal?
seconds-from-duration(xs:duration?) as xs:decimal?
seconds-from-time(xs:time?) as xs:decimal?
timezone-from-date(xs:date?) as xs:dayTimeDuration?
timezone-from-dateTime(xs:dateTime?) as xs:dayTimeDuration?
timezone-from-time(xs:time?) as xs:dayTimeDuration?
year-from-date(xs:date?) as xs:integer?
year-from-dateTime(xs:dateTime?) as xs:integer?
years-from-duration(xs:duration?) as xs:integer?

XPath 2.0: http://www.w3.org/TR/xpath20/
XQuery 1.0: http://www.w3.org/TR/xquery/
XQuery 1.0 & XPath 2.0 Functions & Operators: http://www.w3.org/TR/xpath-functions/

XSLT-Only Functions

current() as item()
current-group() as item()*
current-grouping-key() as xs:anyAtomicType?
document(item()*) as node()*
document(item()*, node()) as node()*
element-available(xs:string) as xs:boolean
format-dateTime(xs:dateTime?, xs:string, xs:string?, xs:string?, xs:string?) as xs:string?
format-dateTime(xs:dateTime?, xs:string) as xs:string?
format-date(xs:date?, xs:string, xs:string?, xs:string?, xs:string?) as xs:string?
format-date(xs:date?, xs:string) as xs:string?
format-number(numeric?, xs:string) as xs:string
format-number(numeric?, xs:string, xs:string) as xs:string
format-time(xs:time?, xs:string, xs:string?, xs:string?, xs:string?) as xs:string?
format-time(xs:time?, xs:string) as xs:string?
function-available(xs:string) as xs:boolean
function-available(xs:string, xs:integer) as xs:boolean
generate-id() as xs:string
generate-id(node()?) as xs:string
key(xs:string, xs:anyAtomicType*) as node()*
key(xs:string, xs:anyAtomicType*, node()) as node()*
regex-group(xs:integer) as xs:string
system-property(xs:string) as xs:string
type-available(xs:string) as xs:boolean
unparsed-text(xs:string?) as xs:string?
unparsed-text(xs:string?, xs:string) as xs:string?
unparsed-text-available(xs:string?) as xs:boolean
unparsed-text-available(xs:string?, xs:string?) as xs:boolean
unparsed-entity-uri(xs:string) as xs:anyURI
unparsed-entity-public-id(xs:string) as xs:string

Argument Notation

numeric   Any of xs:integer, xs:decimal, xs:float or xs:double.
*         A sequence of the indicated type.
?         The indicated type or empty sequence.
~         The result type varies depending on the arguments.
xs:       http://www.w3.org/2001/XMLSchema

2008-07-21

XQuery 1.0 & XPath 2.0 Functions & Operators Quick Reference

Sam Wilmott
sam@wilmott.ca
http://www.wilmott.ca
and
Mulberry Technologies, Inc.
17 West Jefferson Street, Suite 207
Rockville, MD 20850 USA
Phone: +1 301/315-9631
Fax: +1 301/315-8285
info@mulberrytech.com
http://www.mulberrytech.com

© 2007-2008 Sam Wilmott and Mulberry Technologies, Inc.

Date/Time Operators

(xs:date) + (xs:dayTimeDuration) as xs:date
(xs:date) + (xs:yearMonthDuration) as xs:date
(xs:dateTime) + (xs:dayTimeDuration) as xs:dateTime
(xs:dateTime) + (xs:yearMonthDuration) as xs:dateTime
(xs:dayTimeDuration) + (xs:dayTimeDuration) as xs:dayTimeDuration
(xs:time) + (xs:dayTimeDuration) as xs:time
(xs:yearMonthDuration) + (xs:yearMonthDuration) as xs:yearMonthDuration
(xs:date) - (xs:date) as xs:dayTimeDuration
(xs:date) - (xs:dayTimeDuration) as xs:date
(xs:date) - (xs:yearMonthDuration) as xs:date
(xs:dateTime) - (xs:dateTime) as xs:dayTimeDuration
(xs:dateTime) - (xs:dayTimeDuration) as xs:dateTime
(xs:dateTime) - (xs:yearMonthDuration) as xs:dateTime
(xs:dayTimeDuration) - (xs:dayTimeDuration) as xs:dayTimeDuration
(xs:time) - (xs:dayTimeDuration) as xs:time
(xs:time) - (xs:time) as xs:dayTimeDuration
(xs:yearMonthDuration) - (xs:yearMonthDuration) as xs:yearMonthDuration
(xs:dayTimeDuration) * (xs:double) as xs:dayTimeDuration
(xs:yearMonthDuration) * (xs:double) as xs:yearMonthDuration
(xs:dayTimeDuration) div (xs:dayTimeDuration) as xs:decimal
(xs:dayTimeDuration) div (xs:double) as xs:dayTimeDuration
(xs:yearMonthDuration) div (xs:double) as xs:yearMonthDuration
(xs:yearMonthDuration) div (xs:yearMonthDuration) as xs:decimal

The eq, ne, lt, gt, le and ge comparisons are supported for the types: xs:date and xs:time.

The eq and ne (only) comparisons are supported for the types: xs:duration, xs:gDay, xs:gMonth, xs:gMonthDay, xs:gYear and xs:gYearMonth.

The lt, gt, le and ge (only) comparisons are supported for the types: xs:dayTimeDuration and xs:yearMonthDuration.
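The shape of these operator signatures (a date minus a date gives a duration, a date plus a duration gives a date, a duration divided by a duration gives a plain number) is shared by many date libraries. As a rough illustration only, the sketch below uses Python's datetime module as a stand-in for an XPath 2.0 processor; that substitution is an assumption made so the example can be run. Python's timedelta corresponds loosely to xs:dayTimeDuration, and Python has no counterpart to xs:yearMonthDuration.

```python
from datetime import date, timedelta

# (xs:date) - (xs:date) as xs:dayTimeDuration
gap = date(2012, 7, 6) - date(2012, 7, 2)

# (xs:date) + (xs:dayTimeDuration) as xs:date
next_day = date(2012, 7, 2) + timedelta(days=1)

# (xs:dayTimeDuration) * (xs:double) as xs:dayTimeDuration
doubled = timedelta(hours=12) * 2.0

# (xs:dayTimeDuration) div (xs:dayTimeDuration) as a plain number
ratio = timedelta(days=1) / timedelta(hours=6)

print(gap, next_day, doubled, ratio)
```

In XPath itself the first line would be written along the lines of xs:date('2012-07-06') - xs:date('2012-07-02'), using the constructor functions for the built-in schema types.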

Other Comparisons

The eq and ne (only) comparisons are supported for the types: xs:base64Binary, xs:hexBinary, xs:NOTATION and xs:QName.


Category Escapes

A category escape matches a character from a set specified by a property or using a block:

\p   indicates match any character in the set.
\P   indicates match any character not in the set.

Categories and Properties

Any character can be matched by its properties using a category escape consisting of a Category code followed by an optional Property code:

\p{L}    Any Letter
\p{Lu}   Any Upper-case Letter
\p{Ll}   Any Lower-case Letter
\p{Lt}   Any Title-case Letter
\p{Lm}   Any Letter Modifier
\p{Lo}   Any “Other” Letter
\p{M}    Any Mark
\p{Mn}   Any Non-Spacing Mark
\p{Mc}   Any Combining Mark
\p{Me}   Any Enclosing Mark
\p{N}    Any Digit
\p{Nd}   Any Decimal Digit
\p{Nl}   Any Letter Digit
\p{No}   Any “Other” Digit
\p{P}    Any Punctuation Character
\p{Pc}   Any Connector Character
\p{Pd}   Any Dash Character
\p{Ps}   Any Open Character
\p{Pe}   Any Close Character
\p{Pi}   Any Initial Quote Character
\p{Pf}   Any Final Quote Character
\p{Po}   Any “Other” Punctuation
\p{Z}    Any Separator Character
\p{Zs}   Any Space Separator
\p{Zl}   Any Line Separator
\p{Zp}   Any Paragraph Separator
\p{S}    Any Symbol Character
\p{Sm}   Any Math Symbol
\p{Sc}   Any Currency Symbol
\p{Sk}   Any Modifier Symbol
\p{So}   Any “Other” Symbol
\p{C}    Any “Other” Character
\p{Cc}   Any Control Character
\p{Cf}   Any Format Character
\p{Co}   Any Private Use Character
\p{Cn}   Any “Not Assigned” Character

Character Blocks

Any character within a Unicode character block can be matched using a category escape consisting of “Is” followed by the block’s name. For example: \p{IsBasicLatin}

Block Start   Block End   Block Name
0000          007F        BasicLatin
0080          00FF        Latin-1Supplement
0100          017F        LatinExtended-A
0180          024F        LatinExtended-B
0250          02AF        IPAExtensions
02B0          02FF        SpacingModifierLetters
0300          036F        CombiningDiacriticalMarks
0370          03FF        Greek
0400          04FF        Cyrillic
0530          058F        Armenian
0590          05FF        Hebrew
0600          06FF        Arabic
0700          074F        Syriac
0780          07BF        Thaana
0900          097F        Devanagari
0980          09FF        Bengali
0A00          0A7F        Gurmukhi
0A80          0AFF        Gujarati
0B00          0B7F        Oriya
0B80          0BFF        Tamil
0C00          0C7F        Telugu
0C80          0CFF        Kannada
0D00          0D7F        Malayalam
0D80          0DFF        Sinhala
0E00          0E7F        Thai
0E80          0EFF        Lao
0F00          0FFF        Tibetan
1000          109F        Myanmar
10A0          10FF        Georgian
1100          11FF        HangulJamo
1200          137F        Ethiopic
13A0          13FF        Cherokee
1400          167F        UnifiedCanadianAboriginalSyllabics
1680          169F        Ogham
16A0          16FF        Runic
1780          17FF        Khmer
1800          18AF        Mongolian
1E00          1EFF        LatinExtendedAdditional
1F00          1FFF        GreekExtended
2000          206F        GeneralPunctuation
2070          209F        SuperscriptsandSubscripts
20A0          20CF        CurrencySymbols
20D0          20FF        CombiningMarksforSymbols
2100          214F        LetterlikeSymbols
2150          218F        NumberForms
2190          21FF        Arrows
2200          22FF        MathematicalOperators
2300          23FF        MiscellaneousTechnical
2400          243F        ControlPictures
2440          245F        OpticalCharacterRecognition
2460          24FF        EnclosedAlphanumerics
2500          257F        BoxDrawing
2580          259F        BlockElements
25A0          25FF        GeometricShapes
2600          26FF        MiscellaneousSymbols
2700          27BF        Dingbats
2800          28FF        BraillePatterns
2E80          2EFF        CJKRadicalsSupplement
2F00          2FDF        KangxiRadicals
2FF0          2FFF        IdeographicDescriptionCharacters
3000          303F        CJKSymbolsandPunctuation
3040          309F        Hiragana
30A0          30FF        Katakana
3100          312F        Bopomofo
3130          318F        HangulCompatibilityJamo
3190          319F        Kanbun
31A0          31BF        BopomofoExtended
3200          32FF        EnclosedCJKLettersandMonths
3300          33FF        CJKCompatibility
3400          4DB5        CJKUnifiedIdeographsExtensionA
4E00          9FFF        CJKUnifiedIdeographs
A000          A48F        YiSyllables
A490          A4CF        YiRadicals
AC00          D7A3        HangulSyllables
E000          F8FF        PrivateUse
F900          FAFF        CJKCompatibilityIdeographs
FB00          FB4F        AlphabeticPresentationForms
FB50          FDFF        ArabicPresentationForms-A
FE20          FE2F        CombiningHalfMarks
FE30          FE4F        CJKCompatibilityForms
FE50          FE6F        SmallFormVariants
FE70          FEFE        ArabicPresentationForms-B
FEFF          FEFF        Specials
FF00          FFEF        HalfwidthandFullwidthForms
FFF0          FFFD        Specials

XSLT 2.0: http://www.w3.org/TR/xslt20/
XQuery 1.0: http://www.w3.org/TR/xquery/
XPath 2.0: http://www.w3.org/TR/xpath20/
Unicode: http://www.unicode.org

Regular Expression Examples

^[A-Za-z]
    An Ascii letter at the start of a string or line.
^\p{Lu}
    An upper-case Unicode letter at the start of a string or line.
\.$
    A period at the end of a string or line.
\p{IsGreek}+
    One or more Greek letters.
\p{IsGreek}{1,}
    One or more Greek letters.
.*?;
    Up to and including the next semicolon.
.*;
    Up to and including the last semicolon.
^\c+$
    Match only if the string consists entirely of XML name characters.
[ -~-[\[\]]]+
    Any Ascii printable character except the square brackets.
\w+
    A "word".
[^\s]+
    Non-white-space characters.
\S+
    Non-white-space characters.
(['"])(.*?)\1
    A string delimited by single or double quotes. $2 or regex-group(2) will return the unquoted substring. (\1 is the quote character used.)
\s*(\i\c*)\s*=\s*(["'])(.*?)\2
    An XML-attribute-like name, equal and quoted value (with optional leading and intervening white space). $1 is the name and $3 is the value.
\((\d+|\p{L}+)\)
    A parenthesized sequence either of digits or of letters (but not a mixture of both).
\p{Sc}(\d+(\.\d*)?|\.\d+)
    A decimal number with a leading currency symbol.

Regular expressions


Escaping Characters

Characters that have special meaning in regular expressions need to be escaped if they are to be represented “as is”. These characters are:

\ | . ? * + ( ) { } [ ] - ^ $

In addition, the following escapes represent single characters:

\n   newline or line-feed character (&#x0A;)
\r   carriage return character (&#x0D;)
\t   tab character (&#x09;)

Multi-Character Escapes

. (dot)   Any Non-Line-End Character
\s        Any Space Character
\i        Any Initial Name Character (including '_' and ':')
\c        Any Name Character (including '.', '-', '_' and ':')
\d        Any Decimal Digit
\w        Any “Word” Character (anything other than Punctuation, Separator or “Other”)

An upper-case multi-character escape matches any character not described by the lower-case escape. The upper-case escapes are:

\S \I \C \D \W

Character Class Expressions

A character class expression matches a single character. It’s wrapped in square brackets and consists of three parts:

1. an optional negation indicator, ^,
2. one or more characters or ranges, and
3. an optional character class subtraction.

If the negation indicator is used, the single character matched is any character not given following it or in a given range.

A character range consists of two characters separated by a dash, as in:

[-a-zA-Z0-9_]

A leading dash (-) is a dash, not a range.

A character class subtraction consists of a dash followed by a character, category escape or nested character class expression, as in:

[a-z-[aeiou]]

i.e. Match lower-case letters but not the vowels.

XPath 2.0 and XQuery 1.0 Functions That Use Regular Expressions

matches(xs:string?, xs:string) as xs:boolean
matches(xs:string?, xs:string, xs:string) as xs:boolean
replace(xs:string?, xs:string, xs:string) as xs:string
replace(xs:string?, xs:string, xs:string, xs:string) as xs:string
tokenize(xs:string?, xs:string) as xs:string*
tokenize(xs:string?, xs:string, xs:string) as xs:string*

XSLT 2.0 Instructions That Use Regular Expressions

<xsl:analyze-string select = expression
                    regex = { string }
                    flags = { string }>
  <xsl:matching-substring>
    sequence-constructor
  </xsl:matching-substring>
  <xsl:non-matching-substring>
    sequence-constructor
  </xsl:non-matching-substring>
  xsl:fallback*
</xsl:analyze-string>

One but not both of xsl:matching-substring and xsl:non-matching-substring can be omitted.

Inside xsl:matching-substring, the regex-group(N) function returns the Nth group captured by the regular expression.

Regular Expression Matching Flags

Flags are letters used to indicate how Regular Expression matching is to be done:

s   Dot (.) matches any character, line-end characters included.
m   ^ and $ match at the start and end of all lines, not just the start and end of the selected string as a whole.
i   Match case insensitive.
x   Remove white-space (space, tab and line-end) characters from the regular expression before using it.

Zero or more flags are specified as a string using the optional flags= attribute of xsl:analyze-string or the optional last argument of the matches, replace and tokenize functions.
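The four flags have close counterparts in most regular expression engines. As a rough illustration only, the sketch below uses Python's re module as a stand-in for an XPath 2.0 or XSLT 2.0 processor; that substitution is an assumption made so the example can be run, and the two dialects differ in places (Python's re has no \p{...} escapes or class subtraction, and its re.X also treats # as a comment marker). The flags s, m, i and x are mapped to re.S, re.M, re.I and re.X.

```python
import re

text = "alpha\nBeta\ngamma"

# s flag: dot also matches line-end characters.
no_s = re.search(r"alpha.Beta", text)          # no match without the flag
with_s = re.search(r"alpha.Beta", text, re.S)

# m flag: ^ and $ match at every line, not only at the ends of the string.
line_starts_default = re.findall(r"^\w+", text)
line_starts_m = re.findall(r"^\w+", text, re.M)

# i flag: case-insensitive matching.
with_i = re.search(r"beta", text, re.I)

# x flag: white space in the pattern is ignored before matching.
with_x = re.search(r"g a m m a", text, re.X)

print(no_s, line_starts_default, line_starts_m, with_i.group(), with_x.group())
```

In XPath the same choices would be passed as a flags string in the last argument, for example matches($text, '^\w+', 'm').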

2008-07-21

Regular Expressions in XSLT 2.0, XQuery 1.0 and XPath 2.0

Sam Wilmott
sam@wilmott.ca
http://www.wilmott.ca
and
Mulberry Technologies, Inc.
17 West Jefferson Street, Suite 207
Rockville, MD 20850 USA
Phone: +1 301/315-9631
Fax: +1 301/315-8285
info@mulberrytech.com
http://www.mulberrytech.com

© 2007-2008 Sam Wilmott and Mulberry Technologies, Inc.

Regular Expression Basics

A regular expression is:

oneThing | anotherThing | yetAnother
    Match one thing or another or another (one or more things).
oneThing anotherThing yetAnother
    Match one thing followed by another etc. (one or more things)
atom quantifier
    Match atom the number of times indicated by quantifier; once if quantifier is omitted.

Where atom is any of:
    an unescaped character,
    an escaped character,
    a parenthesized regular expression, or
    a character class expression.

Where quantifier is any of:
    ?       zero or one times (i.e. optional)
    *       zero or more times
    +       one or more times
    {N}     exactly N times
    {N,}    N or more times
    {N,M}   between N and M times inclusive.

An extra trailing ?, as in ??, +? or {N,M}? means match the shortest possible number of repetitions rather than the (default) longest.
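The difference between the default (longest) and the trailing-? (shortest) quantifiers is easiest to see on a concrete string. The sketch below is illustrative only and uses Python's re module as a stand-in for an XPath 2.0 processor (an assumption made for runnability); for these particular patterns the two dialects behave alike.

```python
import re

s = "a;b;c"

# Default quantifier: as many repetitions as possible,
# so .*; runs up to and including the last semicolon.
longest = re.match(r".*;", s).group()

# Trailing ?: as few repetitions as possible,
# so .*?; stops at the first semicolon.
shortest = re.match(r".*?;", s).group()

# {N,M}: between N and M repetitions inclusive.
three_digits_ok = re.fullmatch(r"\d{2,3}", "123") is not None
four_digits_ok = re.fullmatch(r"\d{2,3}", "1234") is not None

print(longest, shortest, three_digits_ok, four_digits_ok)
```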

Line Starts and Ends

A regular expression can be anchored at the start and/or end of a string using ^ (the start) and $ (the end).

If a regular expression is used with the m flag, ^ and $ match at the start and end of each line.

In the absence of ^ or $, a regular expression matches unanchored: anywhere within the string.

Subexpressions and Back References

Each parenthesized group in a regular expression is assigned a group number counting unescaped left parentheses starting from the left.

Group numbers can be used in three ways:

1. Within a regular expression, to match what was matched by a previous subexpression. A previously matched group is identified by backslash and a number: \1, \2 etc.
2. Within the replacement expression of replace, to refer to what was matched by a previous subexpression. A group is identified by a numeric name: $1, $2 etc. As well, $0 identifies the whole matched substring.
3. Within XSLT, as the argument of regex-group(N), to access the matched subexpression.
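As a small worked illustration of group numbering and back references, the sketch below again borrows Python's re module as a stand-in for an XPath 2.0 processor (an assumption made for runnability). One real difference is worth noting: Python writes group references in a replacement string as \1 or \g<1>, where XPath's replace() writes $1.

```python
import re

# Group 1 is the opening quote, group 2 the quoted text;
# \1 requires the closing quote to be the same character.
quoted = re.compile(r"""(['"])(.*?)\1""")

m = quoted.search("""say "it's fine" now""")
quote_char = m.group(1)
unquoted = m.group(2)

# Swap the two captured groups in the replacement.
# XPath equivalent (for comparison): replace('key=value', '(\w+)=(\w+)', '$2=$1')
swapped = re.sub(r"(\w+)=(\w+)", r"\2=\1", "key=value")

print(quote_char, unquoted, swapped)
```

Because the closing delimiter is the back reference \1 rather than another ['"] class, the apostrophe inside the double-quoted text does not end the match.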


Simple Expressions

$VarName
( Expr )
( )
. (one dot: self)
QName ( Expr , ... )
QName ( )
IntegerLiteral
DecimalLiteral
DoubleLiteral
StringLiteral

Arithmetic Expressions

+ Expr
Expr + Expr
- Expr
Expr - Expr
Expr * Expr
Expr div Expr
Expr idiv Expr
Expr mod Expr

Creating Sequences

Create a sequence from a list of items:
    Expr , ...
Note: A sequence list must usually be parenthesized.

Repeat over one or more sequences, returning a sequence of results:
    for VariableBinding , ... return Expr
where a VariableBinding is:
    $VarName in Expr

Create a numeric sequence, from lower bound to upper bound:
    Expr to Expr

All the items appearing in either sequence:
    Expr union Expr
    Expr | Expr

Only items appearing in both sequences:
    Expr intersect Expr

All items in the first sequence not in the second:
    Expr except Expr

Comments in XPath Expressions

(: This is a comment within an XPath expr :)

Testing

Test if the condition is satisfied for at least one combination of the bound expressions:
    some VariableBinding , ... satisfies Expr

Test if the condition is satisfied for all of the bound expressions:
    every VariableBinding , ... satisfies Expr

Select one or the other of two possibilities:
    if ( Expr ) then Expr else Expr

Either or both of two tests:
    Expr or Expr
    Expr and Expr

Test if they are the same node:
    Expr is Expr

Test if a node appears before or after another:
    Expr << Expr
    Expr >> Expr

Test an expression’s dynamic type:
    Expr instance of SequenceType

Test if an expression can be converted to a type:
    Expr castable as AtomicType
    Expr castable as AtomicType?

Compare two atomic values:
    Expr eq Expr
    Expr ne Expr
    Expr lt Expr
    Expr le Expr
    Expr gt Expr
    Expr ge Expr

Compare all items in one sequence to all items in a second, and return true if the comparison holds for any pair of values:
    Expr = Expr
    Expr != Expr
    Expr < Expr
    Expr <= Expr
    Expr > Expr
    Expr >= Expr
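The "any pair of values" rule for =, != and the other general comparisons often surprises newcomers, because = and != can both be true for the same two sequences. The sketch below models that rule, together with some and every, using plain Python lists as a stand-in for XPath sequences; this is an illustrative assumption, not how an XPath processor is implemented.

```python
left = [1, 2, 3]
right = [3, 4]

# (1, 2, 3) = (3, 4): true if ANY pair of items is equal.
general_eq = any(a == b for a in left for b in right)

# (1, 2, 3) != (3, 4): true if ANY pair of items differs,
# so it is not simply the negation of the line above.
general_ne = any(a != b for a in left for b in right)

# some $x in (1, 2, 3) satisfies $x gt 2
some_gt_2 = any(x > 2 for x in left)

# every $x in (1, 2, 3) satisfies $x gt 2
every_gt_2 = all(x > 2 for x in left)

print(general_eq, general_ne, some_gt_2, every_gt_2)
```

The value comparisons eq, ne, lt, le, gt and ge avoid this behaviour because they compare exactly two atomic values.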

Type Modification Expressions

Use as without converting:
    Expr treat as SequenceType

Use as, converting as needed and doable:
    Expr cast as AtomicType
    Expr cast as AtomicType?

XPath 2.0: http://www.w3.org/TR/xpath20/
XSL-List: http://www.mulberrytech.com/xsl/xsl-list

Path Expressions

/               Top level, document root
/ Step          At top level
Step            Relative to current node
// Step         Anywhere within document
Path / Step     Immediately within Path
Path // Step    Anywhere within Path

Where a Step is one of:
    Expr
    AxisName::NameTest
    AxisName::KindTest
    @NameTest (attribute test)
    NameTest (child element test)
    KindTest (child node test)
    .. (two dots: parent test)
Followed by zero or more predicates:
    [ Expr ]

Where an AxisName is one of:
    ancestor
    ancestor-or-self
    attribute
    child
    descendant
    descendant-or-self
    following
    following-sibling
    namespace
    parent
    preceding
    preceding-sibling
    self

Where a NameTest is one of:
    QName
    *
    NCName:*
    *:NCName

Where a KindTest is one of:
    attribute ( AttributeName )
    attribute ( AttributeName , TypeName )
    attribute ( * )
    attribute ( * , TypeName )
    attribute ( )
    comment ( )
    document-node ( element ... )
    document-node ( schema-element ... )
    document-node ( )
    element ( ElementName )
    element ( ElementName , TypeName )
    element ( * )
    element ( * , TypeName )
    element ( )
    node ( )
    processing-instruction ( NCName )
    processing-instruction ( StringLiteral )
    processing-instruction ( )
    schema-attribute ( AttributeName )
    schema-element ( ElementName )
    text ( )

Names and Types

XML QNames, with or without a colon-separated prefix, are used for all of:
    VarName
    AttributeName
    ElementName
    TypeName
    AtomicType

A SequenceType is one of:
    empty-sequence ( )
    KindTest
    item ( )
    AtomicType

Where KindTest, item() or AtomicType can be optionally followed by:
    ?   (may be empty sequence)
    +   (is a non-empty sequence of the type)
    *   (is a sequence of the type, empty or not)

Operator Precedence:

1    , (comma)
2    for some every if
3    or
4    and
5    = != < <= > >= eq ne lt le gt ge is << >>
6    to
7    (two-argument) + -
8    * div idiv mod
9    union |
10   intersect except
11   instance of
12   treat as
13   castable as
14   cast as
15   (one-argument) + -
16   / //
17   step node-test $name ( Expr ) function-call literal

XPath2


Relative Location Paths

Relative Location Paths traverse the document from the context node

para
    para element children
    Also - child::para
@type
    the type attribute
    Also - attribute::type
../title
    the title element children of the parent
* except title
    child elements except title elements
    Also - *[not(self::title)] (works in XPath 1.0)
ancestor::sec
    all sec ancestor elements
ancestor::sec/@n
    all n attributes on sec ancestor elements
list/(item | step)
    item and step element children of list children, in document order
list/item, list/step
    item element children of list children followed by step children of list children
preceding-sibling::step
    all preceding sibling step elements
preceding-sibling::*[1][self::step]
    the directly preceding sibling element, if it is a step (otherwise nothing)
descendant::div[last()]
    the last div descendant of the current node
.//div[last()]
    div descendants that are the last child div of each of their parents
preceding::pb[1]
    the first (most immediate) preceding pb
ancestor::sec//pb intersect preceding::pb
    pb elements inside the same sec element as the context node, preceding it
p[normalize-space()]
    p child elements that have a non-whitespace value (text content)
*[not(node())]
    empty element children (i.e., element children with no node children)
*[not(node() except (comment()|processing-instruction()))]
    element children that are empty (have no children) except for comments or processing instructions
step[position() gt 1]
    all step element children but the first
step except *[1]
    step element children but the first
step[position() le 4]
    the first four step element children
    Also - step[position() = (1 to 4)]
step[position() mod 2]
    odd-numbered step children
step[not(position() mod 2)]
    even-numbered step children
*[position() le 4] intersect step
    from the first four element children, the step children
ancestor-or-self::*[exists(@lang)][1]/@lang
    the closest lang attribute on the context node or an ancestor element
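The difference between descendant::div[last()] and .//div[last()] is easy to miss, so here is a minimal sketch of it. It assumes Python's standard xml.etree.ElementTree, which supports only a small XPath 1.0 subset (not XPath 2.0); the sample document and the variable names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the n attributes only label the div elements.
doc = ET.fromstring(
    "<body>"
    "<div n='1'><div n='1.1'/><div n='1.2'/></div>"
    "<div n='2'><div n='2.1'/></div>"
    "</body>"
)

# .//div[last()]: every div descendant that is the last div child of its parent.
last_child_divs = [d.get("n") for d in doc.findall(".//div[last()]")]

# descendant::div[last()]: the single last div descendant in document order.
# ElementTree has no descendant axis, so walk the tree in document order instead.
last_descendant = list(doc.iter("div"))[-1].get("n")

print(last_child_divs)   # ['1.2', '2', '2.1']
print(last_descendant)   # 2.1
```

The first expression can return several nodes (one per parent), while the second returns at most one.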

Expressions that are not Location Paths

(@class,'none')[1]
    the class attribute, or if it does not exist, the string "none".
    Also - if (exists(@class)) then @class else "none"
//*/name()
    the names of all elements, in document order
distinct-values(//*/name())
    the names of all elements, in document order, with duplicates removed
//name/string-join((first, last),' ')
    a sequence of strings constructed from the name elements in the document, each one concatenating the values of its first and last element children, in that order, joining them with spaces
    Also - for $n in //name return string-join(($n/first, $n/last),' ')
//*/count(ancestor-or-self::*)
    a sequence of numbers representing the depth of each element in the document
max(//*/count(ancestor-or-self::*))
    the maximum depth of all elements in the document (a number in a singleton sequence)
for $stooge in ('Moe','Larry','Curly') return count(//p[contains(.,$stooge)])
    the counts of all p elements in the document mentioning each of "Moe", "Larry" and "Curly", in that order
index-of(('Moe','Larry','Curly'), speaker[1])
    if the first speaker element child has the value "Moe", then 1; if "Larry", then 2; if "Curly", then 3; otherwise the empty sequence (i.e., no value)
(: You’ve got to be kidding me. :)
    do nothing. A comment is just a comment.
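To make the name and depth expressions above concrete, the following sketch mimics them with plain Python over xml.etree.ElementTree. It is an illustration under assumptions, not an XPath 2.0 engine: the sample document and the helper names (element_names, distinct, depths) are hypothetical, and keeping first-seen order when removing duplicates is a choice made here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring(
    "<book><title/><chapter><title/><para><emph/></para></chapter></book>"
)

def element_names(root):
    """Like //*/name(): the names of all elements, in document order."""
    return [elem.tag for elem in root.iter()]

def distinct(values):
    """Drop duplicates, keeping the first occurrence of each value."""
    return list(dict.fromkeys(values))

def depths(elem, depth=1):
    """Like //*/count(ancestor-or-self::*): depth of each element, root = 1."""
    yield depth
    for child in elem:
        yield from depths(child, depth + 1)

names = element_names(doc)
unique_names = distinct(names)
max_depth = max(depths(doc))

print(names)         # ['book', 'title', 'chapter', 'title', 'para', 'emph']
print(unique_names)  # ['book', 'title', 'chapter', 'para', 'emph']
print(max_depth)     # 4
```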

2008-07-21
XPath 2.0 Quick Reference
See also the “XQuery 1.0 & XPath 2.0 Functions & Operators Quick Reference”

Sam Wilmott
sam@wilmott.ca
http://www.wilmott.ca
and
Mulberry Technologies, Inc.
17 West Jefferson Street, Suite 207
Rockville, MD 20850 USA
Phone: +1 301/315-9631
Fax: +1 301/315-8285
info@mulberrytech.com
http://www.mulberrytech.com

© 2007-2008 Sam Wilmott and Mulberry Technologies, Inc.

Absolute Location Paths

Absolute Location Paths traverse the document starting at the top (the root), and can be recognized by their initial / (forward slash).

/book/bookinfo/abstract
    an abstract element child of a bookinfo child of the book document element
    Also - /child::book/child::bookinfo/child::abstract
//para
    all para elements in the document
    Also - /descendant-or-self::*/child::para
    Also - /descendant::para
/descendant::para[1]
    the first para element in the document
    Also - (//para)[1]
//@order-by
    all order-by attributes in the document
//list[exists(ancestor::list)]
    all list elements that have ancestor list elements
//list[not(ancestor::list)]
    all list elements that do not have ancestor list elements
    Also - //list[not(exists(ancestor::list))]
    Also - //list[empty(ancestor::list)]
//(* except title)
    all elements except title elements
    Also - //*[not(self::title)] (works in XPath 1.0)
//processing-instruction()[not(ancestor::sec/@n = 1)]
    all processing instructions with no sec ancestor elements with n attributes equal to 1
//para[matches(.,'[X|x]{3}')]
    all para elements whose value includes the regular expression [X|x]{3}
    Tip - [X|x]{3} matches three X or x characters appearing in a row (inside the brackets the | is a literal character, so it is matched too; [Xx]{3} matches only X and x)
//sec[@id = //@rid/tokenize(.,'\s+')]
    all sec elements with id attributes whose values are also given as a value by a tokenized rid attribute anywhere in the document
    Also - //sec[@id = $rid-values] where $rid-values is distinct-values(//@rid/tokenize(.,'\s+'))
    Tip - use distinct-values(//@rid/tokenize(.,'\s+')) to remove duplicates from the list of tokenized @rid values
    Tip - the regular expression \s+ matches any contiguous sequence of spaces (space, linefeed or tab characters)
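The tokenized-rid lookup above can be sketched in plain Python as follows. This is only an approximation under stated assumptions: the sample document is hypothetical, and str.split() is used in place of tokenize(., '\s+'), which differs in that it silently ignores leading and trailing whitespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: three sections and two cross-references.
doc = ET.fromstring(
    "<doc>"
    "<sec id='s1'/><sec id='s2'/><sec id='s3'/>"
    "<xref rid='s1 s3'/><xref rid='s3'/>"
    "</doc>"
)

# distinct-values(//@rid/tokenize(., '\s+')): every whitespace-separated token
# of every rid attribute, with duplicates removed (a set does that here).
rid_values = {
    token
    for elem in doc.iter()
    for token in elem.get("rid", "").split()
}

# //sec[@id = $rid-values]: sec elements whose id is one of those tokens.
referenced = [
    sec.get("id") for sec in doc.iter("sec") if sec.get("id") in rid_values
]

print(sorted(rid_values))  # ['s1', 's3']
print(referenced)          # ['s1', 's3']
```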


XQuery Scripts

An XQuery script consists of:

1. A Version Declaration
    xquery version StringLiteral
    followed, optionally, by:
    encoding StringLiteral
    followed, optionally, by a semicolon (";").

2. If an XQuery script is a Library Module, then its module namespace declaration comes next:
    module namespace NCName = URILiteral ;

3. Default Declarations and Imports: zero or more of:
    declare default element namespace URILiteral ;
    declare default function namespace URILiteral ;
    declare boundary-space preserve ;
    declare boundary-space strip ;
    declare default collation URILiteral ;
    declare base-uri URILiteral ;
    declare construction strip ;
    declare construction preserve ;
    declare ordering ordered ;
    declare ordering unordered ;
    declare default order empty greatest ;
    declare default order empty least ;
    declare copy-namespaces preserve , inherit ;
    declare copy-namespaces preserve , no-inherit ;
    declare copy-namespaces no-preserve , inherit ;
    declare copy-namespaces no-preserve , no-inherit ;
    declare namespace NCName = URILiteral ;
    import schema namespace NCName = URILiteralList ;
    import schema default element namespace URILiteralList ;
    import schema URILiteralList ;
    import module namespace NCName = URILiteralList ;
    import module URILiteralList ;

XQuery 1.0: http://www.w3.org/TR/xquery/

4. Variable, Function and Option Declarations: zero or more of:
    declare variable VariableDeclaration := ExprSingle ;
    declare variable VariableDeclaration external ;
    declare function QName ParameterDeclarations ;
    declare function QName ParameterDeclarations external;
    declare function QName ParameterDeclarations as SequenceType external ;
    declare option QName StringLiteral ;
  where ParameterDeclarations is one of:
    ( )   (i.e. empty if no parameters)
    ( VariableDeclaration )   (for one parameter)
    ( VariableDeclaration , ... )   (when two or more)
  where VariableDeclaration is one of:
    $QName
    $QName as SequenceType
  and where URILiteralList is one of:
    URILiteral
    URILiteral at URILiteral
    URILiteral at URILiteral , ...   (two or more)

5. Finally, if the XQuery script is a Main module, not a Library module, an XQuery expression is required to specify the query being made:
    Expr

Creating Sequences

Create a sequence from a list of items:
    Expr , ...
Note: A sequence list must usually be parenthesized.

Repeat over one or more sequences, returning a sequence of results:
    for VariableBinding , ... return Expr

Create a numeric sequence, from lower bound to upper bound:
    Expr to Expr

All the items appearing in either sequence:
    Expr union Expr
    Expr | Expr

Only items appearing in both sequences:
    Expr intersect Expr

All items in the first sequence not in second:
    Expr except Expr
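The sequence operators above can be modelled roughly in Python. The sketch below is an illustration under assumptions, not XQuery: nodes are xml.etree.ElementTree elements compared by identity, results are returned in document order with duplicates removed, and the helper names (union, intersect, except_, to) and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring("<list><item/><step/><item/><step/></list>")

# Position of every node in document order, keyed by object identity.
order = {id(node): pos for pos, node in enumerate(doc.iter())}

def in_document_order(nodes):
    """Remove duplicate nodes (by identity) and sort into document order."""
    unique = {id(n): n for n in nodes}
    return sorted(unique.values(), key=lambda n: order[id(n)])

def union(a, b):
    return in_document_order(list(a) + list(b))

def intersect(a, b):
    ids = {id(n) for n in b}
    return in_document_order(n for n in a if id(n) in ids)

def except_(a, b):
    ids = {id(n) for n in b}
    return in_document_order(n for n in a if id(n) not in ids)

def to(lower, upper):
    """Expr to Expr: both bounds are included; empty if lower > upper."""
    return list(range(lower, upper + 1))

items = doc.findall("item")
steps = doc.findall("step")
children = list(doc)

tags_union = [n.tag for n in union(steps, items)]
tags_intersect = [n.tag for n in intersect(children, steps)]
tags_except = [n.tag for n in except_(children, steps)]

print(tags_union)      # ['item', 'step', 'item', 'step']
print(tags_intersect)  # ['step', 'step']
print(tags_except)     # ['item', 'item']
print(to(1, 4))        # [1, 2, 3, 4]
```

Note that the union comes back in document order even though the step elements were listed first.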

Arithmetic Expressions
    + Expr
    Expr + Expr
    - Expr
    Expr - Expr
    Expr * Expr
    Expr div Expr
    Expr idiv Expr
    Expr mod Expr

Type Modification Expressions
Use as without converting:
    Expr treat as SequenceType
Use as, converting as needed and doable:
    Expr cast as AtomicType
    Expr cast as AtomicType?

Simple Expressions
    $ VarName
    .   (one dot: self)
    ( )
    ( Expr )
    QName ( Expr , ... )
    QName ( )
    IntegerLiteral  DecimalLiteral  DoubleLiteral  StringLiteral

Validating Nodes
    validate { Expr }   (defaults to strict)
    validate lax { Expr }
    validate strict { Expr }

Ordering Mode for Sequences
    ordered { Expr }
    unordered { Expr }

Implementation-Defined Instructions
    (# QName ... #) … { OptionalExpr }

Path Expressions
    /               Top level, document root
    / Step          At top level
    Step            Relative to current node
    // Step         Anywhere within document
    Path / Step     Immediately within Path
    Path // Step    Anywhere within Path

Where a Step is one of:
    Expr
    AxisName :: NameTest
    AxisName :: KindTest
    @NameTest   (attribute test)
    NameTest    (child element test)
    KindTest    (child node test)
    ..          (two dots: parent test)
Followed by zero or more predicates:
    [ Expr ]

Where an AxisName is one of:
    ancestor  ancestor-or-self  attribute  child  descendant  descendant-or-self  following  following-sibling  namespace  parent  preceding  preceding-sibling  self

A NameTest is one of:
    QName
    *
    NCName:*
    *:NCName

And a KindTest is one of:
    attribute ( AttributeName )
    attribute ( AttributeName , TypeName )
    attribute ( * , TypeName )
    attribute ( * )
    attribute ( )
    comment ( )
    document-node ( element ... )
    document-node ( schema-element ... )
    document-node ( )
    element ( ElementName )
    element ( ElementName , TypeName )
    element ( * , TypeName )
    element ( * )
    element ( )
    node ( )
    processing-instruction ( NCName )
    processing-instruction ( StringLiteral )
    processing-instruction ( )
    schema-attribute ( AttributeName )
    schema-element ( ElementName )
    text ( )


Testing

Select based on the type of an expression (one or more cases plus a default):
    typeswitch ( Expr ) case ... default ...
  where case and default are:
    case SequenceType return Expr
    case $VarName as SequenceType return Expr
    default return Expr
    default $VarName return Expr

Test if the condition is satisfied for at least one combination of the bound expressions:
    some VariableBinding , ... satisfies Expr

Test if the condition is satisfied for all of the bound expressions:
    every VariableBinding , ... satisfies Expr
  where a VariableBinding is:
    $VarName in Expr
    $VarName as SequenceType in Expr

Select one or the other of two possibilities:
    if ( Expr ) then Expr else Expr

Either or both of two tests:
    Expr or Expr
    Expr and Expr

Test if they are the same node:
    Expr is Expr

Test if a node appears before or after another:
    Expr << Expr
    Expr >> Expr

Test an expression’s dynamic type:
    Expr instance of SequenceType

Test if an expression can be converted to a type:
    Expr castable as AtomicType
    Expr castable as AtomicType?

Compare two item values:
    Expr eq Expr
    Expr ne Expr
    Expr lt Expr
    Expr le Expr
    Expr gt Expr
    Expr ge Expr

Compare all items in one sequence to all items in a second, and return if true for any pair of values:
    Expr = Expr
    Expr != Expr
    Expr < Expr
    Expr <= Expr
    Expr > Expr
    Expr >= Expr
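The "true for any pair of values" rule for =, != and friends is what most often surprises newcomers, so here is a small sketch of it. It models sequences as Python lists of already-atomized values; the function names are hypothetical and type promotion is ignored.

```python
def general_eq(left, right):
    """Expr = Expr: true if any pair of items, one from each side, is equal."""
    return any(a == b for a in left for b in right)

def general_ne(left, right):
    """Expr != Expr: true if any pair of items, one from each side, differs."""
    return any(a != b for a in left for b in right)

def value_eq(left, right):
    """Expr eq Expr: compares a single item with a single item."""
    if not left or not right:
        return None  # stands in for the empty sequence
    if len(left) > 1 or len(right) > 1:
        raise TypeError("eq needs at most one item on each side")
    return left[0] == right[0]

print(general_eq([1, 2], [2, 3]))  # True: the pair (2, 2) is equal
print(general_ne([1, 2], [1, 2]))  # True: the pair (1, 2) differs
print(general_eq([], [1]))         # False: there are no pairs at all
print(value_eq([1], [1]))          # True
```

One consequence is that both A = B and A != B can be true at once when either side holds more than one item.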

Names and Types
    VarName
    AttributeName
    ElementName
    TypeName
    AtomicType
are all XML QNames, with or without a colon-separated prefix.

A SequenceType is one of:
    empty-sequence ( )
    KindTest
    item ( )
    AtomicType
Where KindTest, item() or AtomicType can be optionally followed by:
    ?   (may be empty sequence)
    +   (is a non-empty sequence of the type)
    *   (is a sequence of the type, empty or not)

Operator Precedence:
     1   , (comma)
     2   for  let  some  every  if  typeswitch
     3   or
     4   and
     5   =  !=  <  <=  >  >=  eq  ne  lt  le  gt  ge  is  <<  >>
     6   to
     7   (two-argument) +  -
     8   *  div  idiv  mod
     9   union  |
    10   intersect  except
    11   instance of
    12   treat as
    13   castable as
    14   cast as
    15   (one-argument) +  -
    16   /  //
    17   step  node-test  $name  ( Expr )  function-call  literal  validate  (# … #)  constructor  ordered  unordered

Predefined Namespace Names:
    xml = http://www.w3.org/XML/1998/namespace
    xs = http://www.w3.org/2001/XMLSchema
    xsi = http://www.w3.org/2001/XMLSchema-instance
    fn = http://www.w3.org/2005/xpath-functions
    local = http://www.w3.org/2005/xquery-local-functions

2008-07-21
XQuery 1.0 Quick Reference
See also the “XQuery 1.0 & XPath 2.0 Functions & Operators Quick Reference”


FLWOR Expressions

FLWOR Expressions start with one or more for or let:
    for SequenceVariableBinding , ...
    let AssignedVariableBinding , ...
followed by:
    where Expr   (optional)
    OrderingInfo , …   (one or more, optional)
    return Expr

where SequenceVariableBinding is one of:
    $VarName in Expr
    $VarName as SequenceType in Expr
    $VarName at $VarName in Expr
    $VarName as SequenceType at $VarName in Expr

where AssignedVariableBinding is one of:
    $VarName := Expr
    $VarName as SequenceType := Expr

and where OrderingInfo consists of, in order:
    stable   (optional)
    order by Expr
    ascending or descending   (optional)
    empty greatest or empty least   (optional)
    collation URILiteral   (optional)
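The clauses of a FLWOR expression map fairly directly onto a loop, a filter, a stable sort and a projection. The sketch below shows that mapping in Python under stated assumptions: the book records, the flwor function and the XQuery fragments in the comments are hypothetical illustrations, not part of the reference card.

```python
# Hypothetical data standing in for a sequence of book elements.
books = [
    {"title": "C", "year": 2005, "price": 25},
    {"title": "A", "year": 2001, "price": 40},
    {"title": "B", "year": 2003, "price": 15},
]

def flwor(books, max_price):
    # for $b at $i in $books   (positions start at 1)
    numbered = enumerate(books, start=1)
    # let $label := concat($i, ':', $b/title)
    bound = ((b, f"{i}:{b['title']}") for i, b in numbered)
    # where $b/price lt $max-price
    kept = [(b, label) for b, label in bound if b["price"] < max_price]
    # stable order by $b/year ascending   (list.sort is a stable sort)
    kept.sort(key=lambda pair: pair[0]["year"])
    # return $label
    return [label for _, label in kept]

print(flwor(books, 30))  # ['3:B', '1:C']
```

The positional variable keeps the position from the original sequence, which is why the labels read 3 and 1 after sorting.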

Constructors
    < QName ... />
    < QName ... > ... </ QName >
    <![CDATA[ ... ]]>
    <!-- ... -->
    <? PITarget ... ?>
    document { Expr }
    element QName { OptionalExpr }
    element { Expr } { OptionalExpr }
    attribute QName { OptionalExpr }
    attribute { Expr } { OptionalExpr }
    text { Expr }
    comment { Expr }
    processing-instruction NCName { OptionalExpr }
    processing-instruction { Expr } { OptionalExpr }

Within a constructor’s attribute values and element content, literal "{" and "}" need doubling. Anything within single "{" ... "}" is evaluated as an Expr.


Top-Level Declarations

<xsl:attribute-set name = qname
    use-attribute-sets = qnames>
  xsl:attribute*
</xsl:attribute-set>

<xsl:character-map name = qname
    use-character-maps = qnames>
  xsl:output-character*
  <xsl:output-character character = char
      string = string />
</xsl:character-map>
One or more xsl:output-character is allowed.

<xsl:decimal-format name = qname
    decimal-separator = char
    grouping-separator = char
    infinity = string
    minus-sign = char
    NaN = string
    percent = char
    per-mille = char
    zero-digit = char
    digit = char
    pattern-separator = char />

<xsl:function name = qname
    as = sequence-type
    override = "yes" | "no">
  xsl:param*, sequence-constructor
</xsl:function>

<xsl:import-schema namespace = uri
    schema-location = uri>
  xs:schema?
</xsl:import-schema>

<xsl:include href = uri />

<xsl:key name = qname
    match = pattern
    use = expression
    collation = uri>
  sequence-constructor
</xsl:key>

<xsl:namespace-alias
    stylesheet-prefix = prefix | "#default"
    result-prefix = prefix | "#default" />

Content Specification Options
    ?                      optional
    *                      zero or more
    +                      one or more
    #PCDATA                just text
    sequence-constructor   Instructions and text

<xsl:output name = qname
    method = "xml" | "html" | "xhtml" | "text" | qname-but-not-ncname
    byte-order-mark = "yes" | "no"
    cdata-section-elements = qnames
    doctype-public = string
    doctype-system = string
    encoding = string
    escape-uri-attributes = "yes" | "no"
    include-content-type = "yes" | "no"
    indent = "yes" | "no"
    media-type = string
    normalization-form = "NFC" | "NFD" | "NFKC" | "NFKD" | "none" | "fully-normalized" | nmtoken
    omit-xml-declaration = "yes" | "no"
    standalone = "yes" | "no" | "omit"
    undeclare-prefixes = "yes" | "no"
    use-character-maps = qnames
    version = nmtoken />

<xsl:param name = qname
    select = expression
    as = sequence-type
    required = "yes" | "no"
    tunnel = "yes" | "no">
  sequence-constructor
</xsl:param>
xsl:param is also allowed in xsl:function and xsl:template.

<xsl:preserve-space elements = tokens />

<xsl:strip-space elements = tokens />

<xsl:template match = pattern
    name = qname
    priority = number
    mode = tokens
    as = sequence-type>
  xsl:param*, sequence-constructor
</xsl:template>

<xsl:variable name = qname
    select = expression
    as = sequence-type>
  sequence-constructor
</xsl:variable>
xsl:variable is also allowed in sequence-constructor contexts.

Attribute Specification Options
    { }          specified using an attribute value template
    bold =       required attribute
    non-bold =   optional attribute

Node Constructing Instructions

<xsl:attribute name = { qname }
    namespace = { uri }
    select = expression
    separator = { string }
    type = qname
    validation = "strict" | "lax" | "preserve" | "strip">
  sequence-constructor
</xsl:attribute>

<xsl:comment select = expression>
  sequence-constructor
</xsl:comment>

<xsl:document type = qname
    validation = "strict" | "lax" | "preserve" | "strip" >
  sequence-constructor
</xsl:document>

<xsl:element name = { qname }
    namespace = { uri}
    inherit-namespaces = "yes" | "no"
    use-attribute-sets = qnames
    type = qname
    validation = "strict" | "lax" | "preserve" | "strip">
  sequence-constructor
</xsl:element>
Element nodes can also be constructed using XML elements not in the xsl: namespace, which can also specify xsl:type, xsl:validation and xsl:use-attribute-sets attributes.

<xsl:namespace name = { ncname }
    select = expression>
  sequence-constructor
</xsl:namespace>

<xsl:processing-instruction
    name = { ncname }
    select = expression>
  sequence-constructor
</xsl:processing-instruction>

<xsl:sequence select = expression>
  xsl:fallback*
</xsl:sequence>

<xsl:text
    disable-output-escaping = "yes" | "no" >
  #PCDATA
</xsl:text>
disable-output-escaping is deprecated.
Text also constructs text nodes.

XSL-List: http://www.mulberrytech.com/xsl/xsl-list

<xsl:result-document format = { qname }
    href = { uri }
    validation = "strict" | "lax" | "preserve" | "strip"
    type = qname
    method = { "xml" | "html" | "xhtml" | "text" | qname-but-not-ncname }
    byte-order-mark = { "yes" | "no" }
    cdata-section-elements = { qnames }
    doctype-public = { string }
    doctype-system = { string }
    encoding = { string }
    escape-uri-attributes = { "yes" | "no" }
    include-content-type = { "yes" | "no" }
    indent = { "yes" | "no" }
    media-type = { string }
    normalization-form = { "NFC" | "NFD" | "NFKC" | "NFKD" | "none" | "fully-normalized" | nmtoken }
    omit-xml-declaration = { "yes" | "no" }
    standalone = { "yes" | "no" | "omit" }
    undeclare-prefixes = { "yes" | "no" }
    use-character-maps = qnames
    output-version = { nmtoken } >
  sequence-constructor
</xsl:result-document>

Allowed Attribute Values:
    char                   a single character
    expression             an XPath expression
    id                     an ID attribute value
    ncname                 a name with no namespace prefix
    nmtoken                a number token
    number                 a number (only digits)
    pattern                an XPath expression conforming to pattern syntax
    prefix                 a namespace prefix
    qname-but-not-ncname   a name with a namespace prefix
    qname                  a name with or without a namespace prefix
    sequence-type          an XML Schema sequence type (with *)
    string                 just text
    token                  specific to its use
    uri-list               white-space separated list of URIs
    uri                    a uniform resource identifier


Conditional and Looping Instructions

<xsl:analyze-string select = expression
    regex = { string }
    flags = { string }>
  <xsl:matching-substring>
    sequence-constructor
  </xsl:matching-substring>
  <xsl:non-matching-substring>
    sequence-constructor
  </xsl:non-matching-substring>
  xsl:fallback*
</xsl:analyze-string>
One but not both of xsl:matching-substring and xsl:non-matching-substring can be omitted.
regex-group(N) returns the Nth group matched by the regex within xsl:matching-substring.
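The way xsl:analyze-string walks a string can be sketched with Python's re module. This is an illustration only: Python's regular-expression syntax differs from the XPath 2.0 one, zero-length matches are not handled, and the analyze_string helper and the sample string are hypothetical.

```python
import re

def analyze_string(text, regex):
    """Split text into ('match', groups) and ('non-match', substring) parts."""
    parts = []
    pos = 0
    for m in re.finditer(regex, text):
        if m.start() > pos:
            # Text between two matches goes to the non-matching branch.
            parts.append(("non-match", text[pos:m.start()]))
        # Index 0 is the whole match, index N the Nth group, like regex-group(N).
        parts.append(("match", (m.group(0),) + m.groups()))
        pos = m.end()
    if pos < len(text):
        parts.append(("non-match", text[pos:]))
    return parts

parts = analyze_string("due 2012-07-02, not 2012-07-06", r"(\d{4})-(\d{2})-(\d{2})")
for kind, value in parts:
    print(kind, value)
```

Each part is handed to one branch or the other in the order it occurs in the string, so the output alternates between the two kinds.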

<xsl:choose>
  <xsl:when test = expression>
    sequence-constructor
  </xsl:when>
  <xsl:otherwise>
    sequence-constructor
  </xsl:otherwise>
</xsl:choose>
One or more xsl:when and zero or one xsl:otherwise are allowed.

<xsl:for-each select = expression>
  xsl:sort*, sequence-constructor
</xsl:for-each>

<xsl:for-each-group select = expression
    group-by = expression
    group-adjacent = expression
    group-starting-with = pattern
    group-ending-with = pattern
    collation = { uri } >
  xsl:sort*, sequence-constructor
</xsl:for-each-group>
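The difference between group-by and group-adjacent is sketched below in Python. The helper names and the word list are hypothetical; group-starting-with, group-ending-with and collations are left out of this sketch.

```python
from itertools import groupby

def group_by(items, key):
    """group-by: one group per distinct key, in order of first appearance."""
    groups = {}
    for item in items:
        groups.setdefault(key(item), []).append(item)
    return list(groups.items())

def group_adjacent(items, key):
    """group-adjacent: a new group starts whenever the key changes."""
    return [(k, list(g)) for k, g in groupby(items, key=key)]

def first_letter(word):
    return word[0]

words = ["apple", "avocado", "banana", "apricot", "blueberry"]

print(group_by(words, first_letter))
# [('a', ['apple', 'avocado', 'apricot']), ('b', ['banana', 'blueberry'])]
print(group_adjacent(words, first_letter))
# [('a', ['apple', 'avocado']), ('b', ['banana']), ('a', ['apricot']), ('b', ['blueberry'])]
```

With group-adjacent the same key can head several groups, which is why it suits input that is already in a meaningful order.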

<xsl:if test = expression>
  sequence-constructor
</xsl:if>

Standard Attributes

Standard attributes are allowed on all elements. When not on xsl: elements, the xsl: prefix is required on the attribute name.
    [xsl:]default-collation = uri
    [xsl:]exclude-result-prefixes = tokens
    [xsl:]extension-element-prefixes = tokens
    [xsl:]use-when = expression
    [xsl:]version = "1.0" | "2.0"
    [xsl:]xpath-default-namespace = uri

Value/Copy Instructions

<xsl:copy copy-namespaces = "yes" | "no"
    inherit-namespaces = "yes" | "no"
    use-attribute-sets = qnames
    type = qname
    validation = "strict" | "lax" | "preserve" | "strip">
  sequence-constructor
</xsl:copy>

<xsl:copy-of select = expression
    copy-namespaces = "yes" | "no"
    type = qname
    validation = "strict" | "lax" | "preserve" | "strip" />

<xsl:number value = expression
    select = expression
    level = "single" | "multiple" | "any"
    count = pattern
    from = pattern
    format = { string }
    lang = { nmtoken }
    letter-value = { "alphabetic" | "traditional" }
    ordinal = { string }
    grouping-separator = { char }
    grouping-size = { number } />

<xsl:perform-sort select = expression>
  xsl:sort+, sequence-constructor
</xsl:perform-sort>

<xsl:value-of select = expression
    separator = { string }
    disable-output-escaping = "yes" | "no" >
  sequence-constructor
</xsl:value-of>
disable-output-escaping is deprecated.

<xsl:sort select = expression
    lang = { nmtoken }
    order = { "ascending" | "descending"}
    collation = { uri }
    stable = { "yes" | "no" }
    case-order = { "upper-first" | "lower-first" }
    data-type = { "text" | "number" | qname-but-not-ncname } >
  sequence-constructor
</xsl:sort>
xsl:sort is used in xsl:for-each, xsl:for-each-group, xsl:apply-templates and xsl:perform-sort.

XSLT 2.0: http://www.w3.org/TR/xslt20/
XPath 2.0: http://www.w3.org/TR/xpath20/

2008-07-21
XSLT 2.0 Quick Reference


The Stylesheet Element

<xsl:stylesheet id = id
    extension-element-prefixes = tokens
    exclude-result-prefixes = tokens
    version = "1.0" | "2.0"
    xpath-default-namespace = uri
    default-validation = "preserve" | "strip"
    default-collation = uri-list
    input-type-annotations = "preserve" | "strip" | "unspecified"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  xsl:import*, top-level-declarations
</xsl:stylesheet>
xsl:transform is a synonym for xsl:stylesheet.

<xsl:import href = uri />

A literal result element can be used in place of xsl:stylesheet, so long as it specifies attribute xsl:version and namespace xmlns:xsl.

Template Invocation Instructions

<xsl:apply-imports>
  xsl:with-param*
</xsl:apply-imports>

<xsl:apply-templates select = expression
    mode = token>
  (xsl:sort | xsl:with-param)*
</xsl:apply-templates>

<xsl:call-template name = qname>
  xsl:with-param*
</xsl:call-template>

<xsl:next-match>
  (xsl:with-param | xsl:fallback)*
</xsl:next-match>

<xsl:with-param name = qname
    select = expression
    as = sequence-type
    tunnel = "yes" | "no">
  sequence-constructor
</xsl:with-param>

Exception-Handling Instructions

<xsl:fallback>
  sequence-constructor
</xsl:fallback>

<xsl:message select = expression
    terminate = { "yes" | "no" }>
  sequence-constructor
</xsl:message>


Top-Level Schema

This Quick Reference primarily describes ISO Schematron. See the “Difference” panel for Schematron 1.5 and 1.6.

<schema id="ID" icon="URI" see="URI"
    fpi="FORMAL-PUBLIC-ID" xml:lang="LANG"
    xml:space={"preserve" | "default"}
    schemaVersion="VERSION"
    defaultPhase="IDREF"
    queryBinding="BINDING-NAME"
    xmlns="http://purl.oclc.org/dsdl/schematron">
  <title>?, <ns>*, <p>*, <let>*, <phase>*, <pattern>+, <p>*, <diagnostics>?, plus interspersed <include>
</schema>

<ns prefix="NMTOKEN" uri="URI"/>
All namespaces used in validated documents, and referenced in the schema, must be declared using <ns>.

<let name="NAME" value="VALUE"/>

<include href="URI"/>

Patt

erns

<

patt

ern

abst

ract

="f

alse

" id=

"ID"

icon

="U

RI" s

ee=

"URI

"

fp

i=”F

ORM

AL-P

UBLI

C-ID

” xm

l:lan

g="L

ANG"

xm

l:spa

ce=

{"pre

serv

e" |

"def

ault"

}>

<

p>*,

<le

t>*,

<ru

le>

*, p

lus

inte

rspe

rsed

<

incl

ude>

</p

atte

rn>

With

in e

ach

patt

ern,

onl

y th

e fir

st n

on-a

bstr

act

<ru

le>

who

se @

cont

ext

mat

ches

is u

sed.

Abs

trac

t pa

tter

ns

<pa

tter

n ab

stra

ct=

"tru

e" id

="ID

"

ic

on=

"URI

" see

="U

RI"

fpi=

”FO

RMAL

-PUB

LIC-

ID” x

ml:l

ang=

"LAN

G"

xml:s

pace

={"p

rese

rve"

| "d

efau

lt"}>

<p>

*, <

let>

*, <

rule

>*,

plu

s in

ters

pers

ed

<in

clud

e>

<

/pat

tern

>

Usi

ng a

bstr

act

pat

tern

s <

patt

ern

abst

ract

="f

alse

" is-

a="ID

REF"

id=

"ID"

icon

="U

RI" s

ee=

"URI

"

fp

i=”F

ORM

AL-P

UBLI

C-ID

” xm

l:lan

g="L

ANG"

xm

l:spa

ce=

{"pre

serv

e" |

"def

ault"

}>

<

p>*,

<pa

ram

>*,

and

inte

rspe

rsed

<in

clud

e>

<

/pat

tern

>

<pa

ram

nam

e="N

CNAM

E" v

alue

="V

ALUE

"/>

@va

lue

mus

t be

non-

empt

y-st

ring

Phas

es

<ph

ase

id=

"ID" i

con=

"URI

" see

="U

RI"

fpi=

”FO

RMAL

-PUB

LIC-

ID” x

ml:l

ang=

"LAN

G"

xml:s

pace

={"p

rese

rve"

| "d

efau

lt"}>

<p>

*, <

let>

*, <

activ

e>*,

plu

s in

ters

pers

ed

<in

clud

e>

<

/pha

se>

<ac

tive

pat

tern

="ID

REF"

>

an

y nu

mbe

r of t

ext,

<di

r>, <

emph

> a

nd

<sp

an>

</a

ctiv

e>

Rule

s, A

sser

tion

s an

d Re

port

s <

rule

flag

=”N

AME”

abs

trac

t="f

alse

"?

cont

ext=

”PAT

H” i

d="ID

" ico

n="U

RI"

fpi=

”FO

RMAL

-PUB

LIC-

ID” x

ml:l

ang=

"LAN

G"

xml:s

pace

={"p

rese

rve"

| "d

efau

lt"}

see=

"URI

" rol

e="R

OLE

" sub

ject

="P

ATH

">

an

y nu

mbe

r of <

let>

, fol

low

ed b

y an

y nu

mbe

r

(a

t lea

st o

ne) o

f <as

sert

>, <

repo

rt>

and

<

exte

nds>

, plu

s in

ters

pers

ed <

incl

ude>

</r

ule>

<ex

tend

s ru

le=

"IDRE

F"/>

plus

any

fore

ign

attr

ibut

es

<as

sert

tes

t="E

XPR"

flag

=”N

AME”

id=

"ID"

diag

nost

ics=

"IDRE

FS" i

con=

"URI

" see

="U

RI"

fpi=

”FO

RMAL

-PUB

LIC-

ID” x

ml:l

ang=

"LAN

G"

xml:s

pace

={"p

rese

rve"

| "d

efau

lt"}

role

="R

OLE

" sub

ject

="P

ATH

">

an

y nu

mbe

r of t

ext,

<na

me>

, <va

lue-

of>

,

<

emph

>, <

dir>

and

<sp

an>

</a

sser

t>

<re

port

tes

t="E

XPR"

flag

=”N

AME”

id=

"ID"

diag

nost

ics=

"IDRE

FS" i

con=

"URI

" see

="U

RI"

fpi=

”FO

RMAL

-PUB

LIC-

ID” x

ml:l

ang=

"LAN

G"

xml:s

pace

={"p

rese

rve"

| "d

efau

lt"}

role

="R

OLE

" sub

ject

="P

ATH

">

an

y nu

mbe

r of t

ext,

<na

me>

, <va

lue-

of>

,

<

emph

>, <

dir>

and

<sp

an>

</r

epor

t>

Abs

trac

t ru

les

(use

d to

<ex

tend

s> o

ther

s)

<ru

le fl

ag=

”NAM

E” a

bstr

act=

"tru

e"

id=

"ID" i

con=

"URI

"

fp

i=”F

ORM

AL-P

UBLI

C-ID

” xm

l:lan

g="L

ANG"

xm

l:spa

ce=

{"pre

serv

e" |

"def

ault"

}

se

e="U

RI" r

ole=

"RO

LE" s

ubje

ct=

"PAT

H">

any

num

ber o

f <le

t>, f

ollo

wed

by

any

num

ber

(at l

east

one

) of <

asse

rt>

, <re

port

> a

nd

<ex

tend

s>, p

lus

inte

rspe

rsed

<in

clud

e>

<

/rul

e>

XSL-

List

: ht

tp:/

/ww

w.m

ulbe

rryt

ech.

com

/xsl

/xsl

-lis

t

Dia

gnos

tics

<

diag

nost

ics>

any

num

ber o

f <di

agno

stic

> a

nd <

incl

ude>

</d

iagn

osti

cs>

<di

agno

stic

id=

"ID" i

con=

"URI

" see

="U

RI"

fpi=

”FO

RMAL

-PUB

LIC-

ID” x

ml:l

ang=

"LAN

G"

xml:s

pace

={"p

rese

rve"

| "d

efau

lt"}>

any

num

ber o

f tex

t, <

valu

e-of

>, <

emph

>,

<di

r> a

nd <

span

>

<

/dia

gnos

tic>

Form

atti

ng O

utpu

t <

titl

e>

an

y nu

mbe

r of <

dir>

and

text

</t

itle

>

<p

id=

"ID" c

lass

="C

LASS

" ico

n="U

RI">

any

num

ber o

f tex

t, <

dir>

, <em

ph>

and

<

span

>

<

/p>

<di

r va

lue=

{"ltr

" | "r

tl"}>

text

</d

ir>

<em

ph>

text

</e

mph

>

<sp

an c

lass

="C

LASS

">

te

xt

<

/spa

n>

<va

lue-

of s

elec

t="P

ATH

"/>

<

nam

e pa

th=

"PAT

H"/

>

If @

path

not

spe

cifie

d, <

nam

e> re

turn

s th

e na

me

of th

e cu

rren

t nod

e.

Att

ribu

te S

peci

fica

tion

Opt

ions

{

}

alte

rnat

e al

low

ed v

alue

s bo

ld =

re

quire

d at

trib

ute

non-

bold

=

optio

nal a

ttrib

ute

W3C

XSL

T 1.

0 Sp

ecif

icat

ion:

ht

tp:/

/ww

w.w

3.or

g/TR

/xsl

t

W3C

XPa

th 1

.0 S

peci

fica

tion

: ht

tp:/

/ww

w.w

3.or

g/TR

/xpa

th

W3C

XSL

T 2.

0 Sp

ecif

icat

ion:

ht

tp:/

/ww

w.w

3.or

g/TR

/xsl

t20

W3C

XPa

th 2

.0 S

peci

fica

tion

: ht

tp:/

/ww

w.w

3.or

g/TR

/xpa

th20

Whi

ch P

atte

rns

Are

Use

d?

All n

on-a

bstr

act <

patt

ern>

s ar

e us

ed if

:

• th

ere’

s no

<ph

ase>

in th

e <

sche

ma>

,

• th

ere’

s no

<ph

ase>

sel

ecte

d by

its

@id

at

trib

ute,

or

• th

e <

sche

ma>

is in

voke

d w

ith th

e #A

LL o

ptio

n.

If th

ere’

s a

@de

faul

tPha

se, a

nd th

e <

sche

ma>

is

invo

ked

with

the

#DEF

AULT

opt

ion,

then

all

<pa

tter

n>s

refe

renc

ed in

the

<ac

tive>

chi

ldre

n of

the

defa

ult <

phas

e> a

re u

sed.

If th

e im

plem

enta

tion

sele

cts

a <

phas

e> u

sing

its

@id

att

ribut

e, th

en a

ll <

patt

ern>

s re

fere

nced

in

the

<ac

tive>

chi

ldre

n of

that

<ph

ase>

are

use

d.

How

#AL

L, #

DEF

AULT

and

nam

ed p

hase

s ar

e sp

ecifi

ed is

impl

emen

tatio

n-de

term

ined

.

Mor

e A

bout

Att

ribu

tes

@ab

stra

ct in

dica

tes

whe

ther

a <

patt

ern>

or

<ru

le>

is to

be

used

as-

is (i

f “fa

lse”

) or b

y an

othe

r <pa

tter

n> o

r <ru

le>

(if “

true

”).

@de

faul

tPha

se (o

n <

sche

ma>

) ind

icat

es w

hich

<

phas

e> is

use

d to

det

erm

ine

whi

ch <

patt

ern>

s ar

e se

lect

ed b

y th

e #D

EFAU

LT o

ptio

n.

@fl

ag o

n a

fired

<ru

le>

, on

a fa

iling

<as

sert

> o

r on

a s

ucce

edin

g <

repo

rt>

set

s a

flag

for f

urth

er

proc

essi

ng.

@fp

i is

a pu

blic

iden

tifie

r ass

ocia

ted

with

the

elem

ent i

t app

ears

on.

@ic

on is

the

URI o

f the

loca

tion

of a

gra

phic

.

@qu

eryB

indi

ng (

on <

sche

ma>

) ind

icat

es w

hich

qu

ery

lang

uage

is to

be

used

. Th

e de

faul

t is

“xsl

t” —

for X

SLT/

XPat

h 1.

0. O

ther

app

ropr

iate

va

lues

are

: “st

x”, “

xslt1

.1”,

“exs

lt”, “

xslt2

”, “x

path

”, “x

path

2”, “

xque

ry”.

@ro

le is

a n

ame

clas

sify

ing

the

<ru

le>

, <as

sert

>

or <

repo

rt>

, or t

he @

subj

ect,

if a

ny.

@se

e is

the

URI o

f inf

orm

atio

n ab

out t

he s

chem

a its

elf.

@su

bjec

t is

a p

ath

desc

ribin

g re

late

d el

emen

ts

and/

or a

ttrib

utes

, if o

ther

than

the

cont

ext o

f the

cu

rren

t <ru

le>

.

Fore

ign

Elem

ents

and

Att

ribu

tes

Sche

ma

elem

ents

can

hav

e “f

orei

gn” a

ttrib

utes

, an

d no

n-em

pty

sche

ma

elem

ents

can

con

tain

“f

orei

gn” c

hild

ele

men

ts.

Fore

ign

attr

ibut

es a

nd

elem

ents

are

thos

e in

a n

ames

pace

oth

er th

an

"htt

p://

purl.

oclc

.org

/dsd

l/sc

hem

atro

n”.

ISO Schematron


Schematron 1.5

Schematron 1.5 differs from ISO Schematron in the following ways:

Overall: The namespace for Schematron 1.5 is: "http://www.ascc.net/xml/schematron"

• <let> and <include> elements are not supported.

• <key> element is supported:

    <key name="NAME" path="PATH"
        icon="URI"/>

  <key> is allowed anywhere in the content of <rule>. (In ISO Schematron implementations supporting the use of XSLT "foreign" elements, <xsl:key> can be used in place of Schematron 1.5's <key>.)

• Abstract <pattern>s are not supported.

• Attribute pattern/@name is used to name <pattern>s rather than @id. It's a required attribute.

Unsupported Attributes: These attributes are not supported anywhere: @xml:space, @flag.

• These attributes are not supported on <rule>: @see, @xml:lang, @icon, @fpi, @subject.

• These attributes are not supported on <diagnostics>: @see, @fpi.

• In addition, attribute @see is not supported on <schema>, <assert> or <report>.

Other Differences:

• <value-of> isn't allowed as a child of <assert> or <report>.

• Attribute @version is allowed on <schema>. (Default value is "1.5".)

• The following attributes are optional: ns/@uri, dir/@value and span/@class.

Schematron 1.6

Schematron 1.6 differs from Schematron 1.5 in supporting most ISO Schematron features, including <let>, <include>, abstract <pattern>s and <value-of> in <assert> and <report>.

Schematron 1.5/1.6 Resources: http://xml.ascc.net/schematron/

Schematron Validation Report Language

The Schematron Validation Report Language is the standard for the output of an ISO Schematron processor. It can be post-processed to produce more readable output, if required.

<schematron-output title="TEXT"
    phase="NMTOKEN" schemaVersion="TEXT"
    xmlns="http://purl.oclc.org/dsdl/svrl">
  <text>*, <ns-prefix-in-attribute-values>*,
  (<active-pattern>, (<fired-rule>,
  (<failed-assert> |
  <successful-report>)*)+)+
</schematron-output>

<ns-prefix-in-attribute-values
    prefix="NMTOKEN" uri="URI"/>

Only namespaces from <ns> need to be reported.

<active-pattern id="ID" name="TEXT"
    role="NMTOKEN"/>

Only active <pattern>s are reported.

<fired-rule id="ID" context="TEXT"
    role="NMTOKEN" flag="NMTOKEN"/>

Only <rule>s that are fired are reported.

<diagnostic-reference
    diagnostic="NMTOKEN">
  <text>
</diagnostic-reference>

Only references are reported, not the <diagnostic>.

<failed-assert id="ID" location="TEXT"
    test="TEXT" role="NMTOKEN"
    flag="NMTOKEN">
  <diagnostic-reference>*, <text>
</failed-assert>

Only failed <assert>s are reported.

<successful-report id="ID" location="TEXT"
    test="TEXT" role="NMTOKEN"
    flag="NMTOKEN">
  <diagnostic-reference>*, <text>
</successful-report>

Only successful <report>s are reported.

<text>text</text>

See other Quick References at: http://www.mulberrytech.com/quickref

2012-03-05

ISO Schematron Quick Reference

Sam Wilmott
sam@wilmott.ca
http://www.wilmott.ca

and

Mulberry Technologies, Inc.
17 West Jefferson Street, Suite 207
Rockville, MD 20850 USA
Phone: +1 301/315-9631
Fax: +1 301/315-8285
info@mulberrytech.com
http://www.mulberrytech.com

© 2009-2012 Sam Wilmott and Mulberry Technologies, Inc.

ISO Schematron Examples

Checking a document for good practice:

<schema xmlns="http://purl.oclc.org/dsdl/schematron"
    queryBinding="xslt2">
  <pattern>
    <title>Check paragraphs and titles for content</title>
    <rule context="title">
      <report test="*">A title can only contain text.</report>
      <assert test="normalize-space()">A title must have content.</assert>
    </rule>
    <rule context="p">
      <assert test="* or normalize-space()">A p must have content.</assert>
    </rule>
  </pattern>
  <pattern>
    <title>Report use of HTML formatting elements.</title>
    <rule context="b | i">
      <report test="true()">HTML <name/> elements shouldn't be used (found in <name path=".."/>).</report>
    </rule>
  </pattern>
  <pattern>
    <title>Check that titles precede something.</title>
    <rule context="title">
      <assert test="following-sibling::*[1][not(self::title)]">A title should be followed by a non-title element.</assert>
    </rule>
  </pattern>
</schema>

ISO Schematron: Go to: http://www.iso.org/PubliclyAvailableStandards and search for "Schematron".

Other Schematron resources: http://www.schematron.com
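The logic of the first pattern in the example schema can be sketched in Python for readers who want to see what the tests do. This is only an illustration using the standard library's xml.etree module, not a Schematron processor; the sample document and the function name `check` are assumptions made for the sketch. A `<report>` produces its message when its test is true, and an `<assert>` produces its message when its test is false.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this sketch.
DOC = """<doc>
  <title>Intro</title>
  <p>Some text.</p>
  <title><b>Bold</b></title>
  <p></p>
</doc>"""

def check(root):
    """Mimic the 'Check paragraphs and titles for content' pattern."""
    messages = []
    for el in root.iter():
        has_children = len(el) > 0                           # XPath test "*"
        has_text = bool("".join(el.itertext()).strip())      # XPath "normalize-space()"
        if el.tag == "title":
            if has_children:          # <report test="*"> fires when true
                messages.append("A title can only contain text.")
            if not has_text:          # <assert test="normalize-space()"> fails when false
                messages.append("A title must have content.")
        elif el.tag == "p":
            if not (has_children or has_text):  # <assert test="* or normalize-space()">
                messages.append("A p must have content.")
    return messages

print(check(ET.fromstring(DOC)))
```

A real processor would also report the location of each failure, as the Validation Report Language above describes.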



7 Workshop: An Introduction to Digital Humanities Tools and Approaches

7.1 Corpus Linguistics and Text Analysis

7.1.1 Corpus and Text Analysis for Research in the Humanities

As more and more large digital datasets of modern and historical texts become available, it is becoming increasingly important for scholars in the humanities and beyond to be able to search, sort and analyze electronic texts. Traditional scholarship and research methods have taught us how to read texts closely and critically, and how to evaluate and use printed sources. But what do we do when we have a million books readily available to us online? Corpus linguistics and related areas of electronic text analysis have pioneered techniques to deal with this ‘data deluge’ and have transformed many areas of literary and linguistic study. This short course will aim to promote the use of the techniques developed in these domains to address a much wider range of research questions from across the disciplines of the Humanities.

The first session will be a lecture giving an overview of some of the most relevant and useful resources, tools and methods, some suggestions of investigations based on them, and an exploration of some of the ongoing barriers and problems.

7.1.2 Dealing with the Data Deluge: Corpus Linguistics for Text-Based Research

The second session will be an opportunity to work ‘hands-on’ with some of these corpora and tools, exploring exercises based on finding and evaluating evidence relevant to a variety of primarily non-linguistic topics, such as:

• Do men swear more than women in conversation?

• How can we trace changes in meaning in political terminology?

• Can we find when a word was first used, and how its meaning has changed?

• In my writing, can I find out if someone overuses some words and phrases compared to the norms for the language?

• How much can we trust Google Books? How can it be used for scholarship?

• How many new words or meanings did Shakespeare invent?
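As a rough illustration of the kind of comparison behind the "overuse" question above, the sketch below compares relative word frequencies in a sample text with those in a reference text. It is a minimal example under stated assumptions: the two short texts, the function names and the threshold of 2.0 are invented for illustration, and real corpus tools use much larger reference corpora and proper statistical measures.

```python
from collections import Counter
import re

def relative_frequencies(text):
    """Return each word's frequency per 1,000 words of the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words)
    return {word: 1000 * count / total for word, count in counts.items()}

def overused(sample, reference, min_ratio=2.0):
    """List words whose rate in the sample is at least min_ratio times
    their rate in the reference text (words absent from the reference
    are skipped, since no ratio can be computed for them)."""
    sample_rates = relative_frequencies(sample)
    reference_rates = relative_frequencies(reference)
    result = []
    for word, rate in sample_rates.items():
        ref_rate = reference_rates.get(word)
        if ref_rate is not None and rate / ref_rate >= min_ratio:
            result.append((word, round(rate / ref_rate, 2)))
    return sorted(result, key=lambda item: -item[1])

# Hypothetical toy texts; a real study would use full corpora.
sample = "really really good and really nice"
reference = "a really good day and a nice good evening and a good night"
print(overused(sample, reference))
```

With texts this small the ratios mean very little; the point is only that "overuse" is measured relative to text length and to a chosen norm.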


Spatial awareness: a brief introduction to ArcGIS

This practical is designed to give a brief introduction to some of the analyses you can perform when working with GIS. It is intentionally over length: I do not expect you to necessarily get every stage done. Feel free to either work through as much as you can, or to pick and choose the parts of most interest. However, sections 1 to 4 should be considered the most important.

The data used is completely invented (if you plotted it against the Ordnance Survey basemap, you would find that it is located somewhere in the uplands east of Manchester). This means you can feel free to play around with it however you wish after you leave, but also means that you should not use it for any kind of real world application or in any published material.

The back story to this GIS analysis is that the development of a small tourist attraction has been proposed in the region of a rural village. You are taking the role of the local authority planning archaeologist (the techniques used are of wide application), who has brought together a collection of different data in order to assess what impact this development might have on any buried archaeological remains in the local area. You have already added most of this data to your GIS map, and you will now try to study the data available to you to discover what past remains might still exist under the ground.

1. Exploring the map

We will begin by learning how to explore the map, zooming in and out, and turning the visibility of data layers on and off. Here is an image of the ArcMap user interface:


You will not have this exact configuration of toolbars, but don’t worry about that. The main things to take note of are the map exploration tools: The magnifying glasses enable you to zoom in and out by either clicking on the map or dragging out a box. The hand lets you grab and pan the map. The globe lets you zoom out to the full extent of all layers on the map. And the two sets of arrows are fixed zooms in and out. Try all of these tools now. Then, zoom to the full extent of the map.

Another useful tool is the identify tool: This little ‘i’ allows you to click on any object on the map to get more information on it. The gridded features on the map are field survey results: use the identify tool to click on some of the transects to see what data was gathered. When you are done with the identify tool, click on the pointer icon next to it to get back your normal mouse pointer (the same applies to the magnification and pan tools).

On the left is the Table of Contents, which shows you which geographic data layers have been added to the map document. When you are working in ArcMap, the document you are working on is essentially just a container, which links through to other files containing the actual data and which defines the appearance and ordering of layers. As a result, when you undertake most operations in ArcMap, you are not changing anything in the original data (unless working with editing tools or creating new data). However, this also means that if you move files around on your hard drive, then ArcMap will lose its links to them: being disciplined about maintaining file locations and directory structures is very important.

If you drag layers up and down in the Table of Contents, you will see that their drawing order is changed accordingly on the map. If you untick the boxes next to the layers, you will switch off their visibility (tick to turn back on). Try experimenting with this now. One thing to note is that there are several different views available in the Table of Contents; we want the ‘List by drawing order’ view, which should already be selected at the top with the left-hand icon:

While we are looking at the user interface, here are a few other things to note which will be of use later. If you want to add new data to the map, you click on the + icon: If you want to open up ArcCatalog or ArcToolbox (see below), you click on the filing cabinet or toolbox icons:

These should then appear as new windows or as tabs along the right hand edge of the user interface (image rotated):

Finally, to switch easily between the data and layout views (see below), we click on these small icons at the bottom left of the map window: The map icon is for data view (which is our normal view) and the page icon is for layout view (which is for creating map outputs, more on which later). The little refresh icon will refresh the displayed map and the pause icon will stop the map re-drawing until clicked again.

Here is a quick guide to what you are looking at on the map. The layer named ‘dem’ is a Digital Elevation Model of the local area: this shows the terrain. We then have a rectified (i.e. fitted to the map) satellite image and the rectified results of a geophysical survey (showing buried features based upon their magnetic properties). These are all raster data, whereas the rest of the data is vector. The lakes, rivers, roads and buildings layers should be self-explanatory. The divisions layer shows local field divisions. As already briefly stated, the survey fields layers show the transects surveyed by fieldwalking to discover surface pottery finds. Finally, the development layer shows the extent of the proposed new development. As you can see, we have quite a large amount of data here with which to assess the potential archaeological impact of this planned new tourist attraction.

2. Creating a layer from a table


Next, we will add one more data layer to the map. This is in the form of a .csv table, which is essentially a form of spreadsheet (Excel files will also work similarly with modern versions of ArcGIS). This spreadsheet contains x and y coordinates for each entry, so when we plot the contents on the map, it will appear as a series of points.

Open up the ArcCatalog window. This will either be a tab at the right hand side of the window or will open as a window by itself when clicking on the ArcCatalog yellow filing cabinet button. You should see a list of folders and files. In the list, there will be an entry called SMR next to a page icon. Right click on this item and select ‘Create feature class’ and then ‘From XY table’. A new window should appear. Select ‘Easting’ from the drop down box below ‘X Field:’ and ‘Northing’ from the drop down box below ‘Y Field:’. Then click on the button labelled ‘Coordinate System of Input Coordinates…’. This is where we set the projection for our new data, which in this case is the British National Grid. In the new window, click on the ‘Select…’ button and then navigate to Projected Coordinate Systems\National Grids\Europe\ and double click on the British National Grid.prj file. Then click on ‘OK’ in the two previous windows in turn. This has now created a new ‘shapefile’ from our table.
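To make clear what this step does with the table, here is a minimal Python sketch that reads a CSV with ‘Easting’ and ‘Northing’ columns and turns each row into a point with attributes. It is an illustration only, not what ArcGIS runs internally: the sample rows, the ‘Name’ column and the function name `read_points` are assumptions, and the real SMR table will have its own columns and values.

```python
import csv
import io

# Hypothetical rows standing in for the SMR table; only the Easting,
# Northing and Period column names are taken from the practical.
SAMPLE_CSV = """Name,Period,Easting,Northing
Site A,Roman,400100,400250
Site B,Medieval,400320,399980
"""

def read_points(csv_file, x_field="Easting", y_field="Northing"):
    """Turn each table row into a point: coordinates plus remaining attributes."""
    points = []
    for row in csv.DictReader(csv_file):
        attributes = {key: value for key, value in row.items()
                      if key not in (x_field, y_field)}
        points.append({"x": float(row[x_field]),
                       "y": float(row[y_field]),
                       "attributes": attributes})
    return points

points = read_points(io.StringIO(SAMPLE_CSV))
for point in points:
    print(point["x"], point["y"], point["attributes"])
```

The coordinates only become positions on the ground once a coordinate system is attached, which is why the dialogue asks for the British National Grid.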

The term shapefile is a slight misnomer, in that it actually consists of a series of related files. It is a very common type of GIS data object, and as such is a good option to store your data, as it is very portable to other software. The files that make up a shapefile must always be kept together in the same location on your disk, otherwise it may become broken. Other common types of GIS data object include TAB files (used by MapInfo) and KML or KMZ files (particularly associated with Google Earth). These are all vector objects. Raster files are probably most commonly seen in TIFF or IMG format, or sometimes JPEG.

The new layer may have appeared automatically on your map or may not have. If it did not, click on the Add Data button and add the new layer (which will be called XYSMR.shp unless you renamed it). You should then see a series of points appear on your map. These represent records kept by the local authority of previously discovered archaeological features in the area.
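Because a shapefile is really a set of files sharing one name, it can be worth checking that the set is complete before moving or sharing it. The sketch below checks a list of file names for the three mandatory parts (.shp, .shx and .dbf); the function name and the example folder listing are assumptions for illustration, and optional parts such as the .prj projection file are deliberately not required here.

```python
from pathlib import Path

# The three mandatory components of a shapefile.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")

def missing_parts(shp_name, names_in_folder):
    """Return the mandatory extensions missing for the given shapefile name."""
    stem = Path(shp_name).stem
    present = {Path(name).suffix.lower()
               for name in names_in_folder
               if Path(name).stem == stem}
    return [ext for ext in REQUIRED_EXTENSIONS if ext not in present]

# Hypothetical folder listing in which the .shx index file has been lost.
folder = ["XYSMR.shp", "XYSMR.dbf", "XYSMR.prj", "notes.txt"]
print(missing_parts("XYSMR.shp", folder))
```

In practice the list of names could come from `os.listdir` on the folder that holds the data.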

3. Displaying attributes

Now, we will learn how to change the symbols for our different data layers, particularly in regard to displaying attributes associated with our data.


In the Table of Contents, right click on the layer you just added (on the layer name, not the symbol) and select ‘Open Attribute Table’. You can see from this table that there are a series of attributes associated with each of our points. We can query these attributes (in the conventional database fashion) or we can use them to define the symbols used to display our objects on the map. Close the table in the normal way.

For example, you might wish to use different symbols to display which period each point dated to. To do so, either double click on the layer name or right click on the name and select ‘Properties…’. The window that appears is used to set the various properties associated with this layer. You will see a series of tabs along the top: select the ‘Symbology’ tab. Here is where we set the symbols used. Under ‘Show:’ on the left select ‘Categories’ and then ‘Unique values’. This allows us to draw different symbols depending upon the different categories in any of the columns in the layer’s attribute table. Under ‘Value Field’, select the ‘Period’ entry and then click on the ‘Add All Values’ button. You should see a series of different symbols appear for each category of period. If you double click on a symbol icon, you can change it using the symbol selector window. If you right click on a symbol icon and select ‘Properties for all symbols…’, you can set the symbol for all of the categories in the list. Experiment with this and try to create a symbology that you like.
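The idea behind the ‘Unique values’ option can be sketched in a few lines of Python: find the distinct values in the chosen field and give each one its own symbol. This is only an illustration of the principle; the period names and the colour palette below are invented, and ArcMap chooses its own symbols.

```python
from itertools import cycle

def unique_value_symbols(values, palette):
    """Assign one palette entry to each distinct value, in sorted order.
    The palette is reused from the start if there are more values than colours."""
    colours = cycle(palette)
    return {value: next(colours) for value in sorted(set(values))}

# Hypothetical 'Period' values and colours.
periods = ["Roman", "Medieval", "Roman", "Prehistoric"]
symbols = unique_value_symbols(periods, ["red", "green", "blue"])
print(symbols)
```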

When you are satisfied, click on ‘OK’ and you should see that the points on the map are now drawn using your new symbology. Next, try setting the symbology of the various field survey layers so that they show the number of mid-Roman pottery sherds recovered from each transect: to make this work, you will have to use the ‘Quantities’ rather than the ‘Categories’ symbology option (hint: graduated colours will display the number of sherds using a colour range). Unfortunately, you may discover a difficulty in this in that these surveys were undertaken by different people who quantified their pottery counts differently, but do the best that you can: this is the type of difficulty that you might encounter quite regularly when dealing with other people’s data.

Another problem with the output is that the different layers use their own colour scaling according to the different maximum sherd counts. If you wish to try to counter this, note which layer has the highest sherd count, then open up the properties for the other layers (one at a time), click on the ‘Import…’ button and import the symbology from the layer with the highest maximum value, making sure to select the relevant value field. Don’t worry if this is getting a little too complicated, just move onto the next item.
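The scaling problem described above can be illustrated with a small sketch: if every layer is classified against the same maximum, the same colour class means the same sherd count everywhere. The example uses simple equal-interval classes, which is an assumption made for clarity and not necessarily the classification method ArcMap applies; the layer names and counts are invented.

```python
def graduated_class(count, max_count, n_classes=5):
    """Return an equal-interval class index (0 to n_classes - 1)
    for a count on a scale running from 0 to max_count."""
    if max_count <= 0:
        return 0
    index = int(count / max_count * n_classes)
    return min(index, n_classes - 1)  # the maximum itself falls in the top class

# Hypothetical sherd counts per transect for two survey layers.
layers = {"survey_a": [0, 3, 12], "survey_b": [1, 40]}

# Use one shared maximum so that colours are comparable across layers.
shared_max = max(max(counts) for counts in layers.values())
classes = {name: [graduated_class(count, shared_max) for count in counts]
           for name, counts in layers.items()}
print(shared_max, classes)
```

Scaled against its own maximum of 12, the busiest transect in survey_a would have been drawn in the top colour class, which is exactly the misleading effect that importing a common symbology avoids.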

4. Creating map outputs

We are currently in the data view, but certain things are lacking to turn this into an acceptable map for publication or wider dissemination: the so-called map furniture. A map should be considered incomplete if it lacks a scale, an indication of the direction of north and a legend (albeit this final one is not always entirely necessary). We will now learn how to add these to our map and export the results to an image file.

First we need to switch to Layout view by clicking on the page symbol at the bottom left of the map. You can also do this from the ‘View’ menu. Next, we need to switch the page layout to landscape to better fit the shape of our map. As with most software, this is done by selecting ‘Page and print setup…’ from the ‘File’ menu. Select ‘Landscape’ and click ‘OK’. You will see that the frame in which our map is drawn now overlaps the edges of the page, so resize it to fit and then click on the zoom to extent button (the globe icon).

Next, we will add a north arrow. This is easily accomplished via the ‘Insert’ menu by selecting the ‘North arrow’ option. Select an arrow you like from the window that appears and click on ‘OK’. Drag the arrow to a sensible place on your page.

Then add a scale bar by selecting ‘Scale bar’ from the ‘Insert’ menu. Again, choose a design you like and then click ‘OK’. You will probably see that the result looks something of a mess. If we drag to resize the scale bar, the units will adjust, so drag it out until it is 1 kilometre in length. However, kilometres is hardly a sensible unit for a map on this scale, so double click on the scale bar to open up its properties. Set the number of divisions to 5, the units to metres, and then on the next tab, set the marks to only appear next to divisions (rather than subdivisions). Once you click ‘OK’, you will probably find that the scale bar again needs resizing to use sensible units. Set it to 1,000m.

You should now have a map that looks much more useful. It still lacks a legend, however. Try adding a legend in the same way. This is quite complicated, but a bit of experimentation should result in something acceptable. Think about which layers require a legend to explain them and which ought to be pretty self-explanatory based upon their form: you do not have to add every layer to the legend. If you cannot get your legend to look pretty, then feel free to remove it: you can explain what is on the map in your figure caption when putting together your document if your map is not too complex.

Finally, we now want to export our map as an image so that we can insert it into a document (or for whatever else you might wish to do with it). To do so, go to the ‘File’ menu and select ‘Export map…’. Navigate to a sensible location to save your file in the usual way, select a file type (I usually use PNG as it results in a smaller file size than a TIFF but is much crisper than a JPEG). Set the resolution to 300dpi and tick the box at the bottom of the window that says ‘Clip output to graphics extent’. Then click on the ‘Save’ button to save your image. Once it has finished exporting (watch for messages in the bar at the bottom of the ArcMap window), minimise ArcMap and find your image where you saved it. You should hopefully end up with something like this (preferably prettier though!):
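To get a feel for what a 300dpi export means, the short sketch below converts a page size into pixel dimensions. It assumes an A4 landscape page (297 × 210 mm), which the practical does not specify, and the function name is invented; with ‘Clip output to graphics extent’ ticked, the exported image will be somewhat smaller than the full page.

```python
MM_PER_INCH = 25.4

def export_pixels(width_mm, height_mm, dpi=300):
    """Pixel dimensions of a page of the given size exported at the given dpi."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px

# Assumed page size: A4 landscape.
print(export_pixels(297, 210))
```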

5. Performing spatial queries

You can query a GIS database in similar ways to conventional databases, e.g. based on attributes. However, one of the great strengths of GIS is its ability to perform spatial queries. We will now learn how to undertake these.

Firstly, switch back to the Data view by clicking on the map icon below the map window. You can see the area of the proposed development on the map as a partially transparent red polygon. We can see that one of the points we added to the map falls within this area. However, because the records in this dataset are not necessarily precisely recorded (spatially) and because they are points which might represent features of greater extent than a point implies, we cannot be certain that none of the other material represented by the other records is likely to be impacted by the development. Therefore, we need to find out which records fall within 500 metres of the development area.

To do so, we go to the ‘Select’ menu and choose ‘Select by location…’: when you do this, note that the ‘Select by attributes…’ option in the same menu is what we would choose to perform a standard query. A new window should appear. Working downwards, the ‘Selection method:’ should be ‘select features from’, the ‘Target layer(s):’ should be the SMR point layer added earlier, the ‘Source layer:’ should be ‘Development’, the ‘Spatial selection method:’ should be ‘Target layer(s) features are within a distance of the Source layer feature’, ‘Apply a search distance’ should be ticked, and the search distance needs to be 500 metres. When you have this set correctly, click ‘OK’.

You should see that all of the features in our point layer that fall within 500 metres of the development area are selected on the map (probably in cyan). If you open the attribute table for the layer (right click on the layer name in the Table of Contents), you will see that these records are also highlighted in cyan here. If you read the descriptions, you will see that some of these are indeed features which might be of sufficient extent to fall within the development area.
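The idea behind this ‘within a distance’ selection can be sketched in plain Python. This is only an illustration of the geometry involved and not how ArcGIS implements the tool: the coordinates, the record names and all of the function names below are invented for the example, and the units are assumed to be metres in a projected coordinate system.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Position of the closest point along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def point_in_polygon(p, polygon):
    """Ray-casting test: count polygon edges crossed by a ray heading east."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def distance_to_polygon(p, polygon):
    """Zero inside the polygon, otherwise the distance to its nearest edge."""
    if point_in_polygon(p, polygon):
        return 0.0
    n = len(polygon)
    return min(point_segment_distance(p, polygon[i], polygon[(i + 1) % n])
               for i in range(n))

def select_within_distance(records, polygon, search_distance):
    """Names of the records lying within search_distance of the polygon."""
    return [name for name, p in records.items()
            if distance_to_polygon(p, polygon) <= search_distance]

# Hypothetical development area (a rectangle) and hypothetical point records.
development = [(1000, 1000), (1400, 1000), (1400, 1300), (1000, 1300)]
smr_points = {
    "A": (1200, 1100),  # inside the development area
    "B": (1700, 1100),  # 300 m east of the boundary
    "C": (2100, 1100),  # 700 m east of the boundary
    "D": (700, 700),    # about 424 m from the south-west corner
}

selected = select_within_distance(smr_points, development, 500)
print(selected)  # ['A', 'B', 'D']
```

Record C is the only one left unselected, because it lies more than 500 metres from the nearest edge of the polygon.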

If you wish to try a slightly more complex query, try to find out how many SMR point records fall within 250 metres of the church seen in the buildings layer. To do this, you will first need to either construct an attribute query to select the church or use the selection tool on the toolbars to select it by hand.

The final exercise is more complicated and, as a result, you may not wish to work through it. At the end of these instructions are suggestions for how you might explore this data further, so if you do not want to try the more complex analysis, feel free to move ahead to the end.

6. Visibility analysis (viewshed) [OPTIONAL]

One form of more complex GIS analysis which is often used by archaeologists and other GIS users is the assessment of what is visible (or not visible) from a point in the landscape. We shall give this a try now.

First, add the layer named Observer.shp. This is a point representing a person standing within the end of the rectangular feature seen in the geophysics and satellite image layers. We need to add an entry to the attribute table for this layer to represent the height of the person represented by the point. Open the attribute table. At the top left of the table is an icon that looks like a record card with a drop down arrow next to it. Click on the arrow to open the table menu and then select ‘Add field…’: the field needs to be called OFFSETA and should be of ‘Float’ type, i.e. a floating point number. Set these and then click ‘OK’. Once you have added the new OFFSETA field, right click on its heading (i.e. where it says OFFSETA) and select the ‘Field Calculator’ option. In the large textbox below ‘OFFSETA =’, type 1.7 and press ‘OK’. This has set the height of our person to 1.7 metres. If this has worked, close the attribute table.

Next, we will venture into the ArcToolbox, where all of the tools built into ArcGIS are kept. However, we need to make sure that the Spatial Analyst extension is turned on, so under the ‘Customize’ menu, select ‘Extensions…’ and then make sure that Spatial Analyst is ticked.

Open ArcToolbox. This will either be in a tab at the right hand side of the window or can be accessed by clicking on the icon that looks like a red toolbox. In the window that opens, you should see a lot of red toolboxes. Expand ‘Spatial Analyst’ and then ‘Surface’: you should now see a tool called ‘Viewshed’. Double click on it.

In the window that opens, under ‘Input raster’ open the drop down box and select the ‘dem’ layer: this is the surface which will be used to determine what our observer could see. Under ‘Input point or polyline observer features’ open the drop down box and select the ‘Observer’ layer: this is, naturally, the observing person. Click on the little yellow folder icon next to the ‘Output raster’ text box, make sure you are in the correct data directory (i.e. the one with all the other files in it) and name the new layer view1.tif, then press the ‘Save’ button. Then click on ‘OK’ in the viewshed window.

The viewshed will probably take up to a minute to calculate, but should appear on the map when it is done. The resulting layer will be transparent (i.e. hold a value of ‘NODATA’) for areas which the person could not see and coloured for areas which he or she could see. Due to the relatively flat country and the slight local variation in pixel values, this is quite a messy picture, but it does raise some interesting questions in that it would appear that this monument was situated in an area which provided good views of the surrounding hills and the river valley, but which provided poor views of the middle distance around the observation point. Of course, this would only truly be interpretatively interesting if this were a real-world case!
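To see why nearby ground can be hidden while distant hills remain visible, here is a minimal line-of-sight sketch in plain Python. It is a simplified illustration and not the algorithm used by the ArcGIS Viewshed tool: the elevation values are invented, the function names are hypothetical, hidden cells are marked 0 rather than ‘NODATA’, and the only detail borrowed from the exercise is the 1.7 metre observer offset stored in OFFSETA.

```python
def line_of_sight(dem, observer, target, offset_a=1.7):
    """True if the target cell can be seen from the observer cell.

    dem is a list of rows of elevations in metres; observer and target
    are (row, column) pairs. The observer's eye sits offset_a metres
    above the ground at the observer cell.
    """
    (r0, c0), (r1, c1) = observer, target
    eye = dem[r0][c0] + offset_a
    steps = max(abs(r1 - r0), abs(c1 - c0))
    if steps == 0:
        return True
    target_height = dem[r1][c1]
    # Walk the cells between observer and target; the view is blocked
    # if the terrain rises above the straight sight line at any of them.
    for i in range(1, steps):
        t = i / steps
        r = round(r0 + t * (r1 - r0))
        c = round(c0 + t * (c1 - c0))
        sight_height = eye + t * (target_height - eye)
        if dem[r][c] > sight_height:
            return False
    return True

def viewshed(dem, observer, offset_a=1.7):
    """Grid of 1 (visible) and 0 (hidden) with the same shape as dem."""
    return [[1 if line_of_sight(dem, observer, (r, c), offset_a) else 0
             for c in range(len(dem[0]))]
            for r in range(len(dem))]

# A single-row DEM (a transect) keeps the result easy to check by hand:
# flat ground, a low rise, a dip behind it, then a distant hill.
dem = [[10, 10, 15, 10, 10, 25]]
visible = viewshed(dem, observer=(0, 0))
print(visible)  # [[1, 1, 1, 0, 0, 1]]
```

The two cells behind the low rise are hidden, while the distant hill is tall enough to be seen over it. The same functions accept any rectangular grid of elevations.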

If you liked doing this, you could try creating viewsheds for some of the points in our SMR layer. To do so, you would need to select one of the points (using the manual select tool, via a query, or by selecting one of the rows in the attribute table) and then right click on the layer in the Table of Contents and select ‘Data’ then ‘Export data…’; then export the selected record to a new layer and repeat from the start of this section with your new observer layer. Of course, because this map covers quite a small area, any sites towards the top of the hills will have good views of most of the valley!

If you do finish early or do not want to try exercise 6, try to create some images to show what factors would cause concern about the archaeological impact of the development area seen on the map. Would you allow the developer to go ahead with their scheme? The geophysics and satellite image layers have features in them that appear to be archaeological (the long rectangular feature and also the partial circles): can you work out what date these might be from looking at the dates of pottery seen in the survey of those and nearby fields, particularly pottery recovered near the features?

Chris Green - 05/2012

8 Workshop: A Humanities Web of Data: Publishing, Linking, Querying and Visualisation on the Semantic Web

[Materials will be provided to students at the workshop]
