Open XML Developer Workshop Office Open XML Architecture A developer’s introduction to the file formats

Open XML Developer Workshop

Office Open XML Architecture

A developer’s introduction to the file formats

Disclaimer

The information contained in this slide deck represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This slide deck is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this slide deck may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this slide deck. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this slide deck does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, email address, logo, person, place or event is intended or should be inferred.

© 2006 Microsoft Corporation. All rights reserved.

Microsoft, 2007 Microsoft Office System, .NET Framework 3.0, Visual Studio, and Windows Vista are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

Objectives

• In this module, we will learn about the architecture of the Office Open XML formats.

• Primary focus is on concepts that apply to all three main document types.

• Details specific to word processing documents, spreadsheets, or presentations will be covered in separate modules for each of those document types.

Evolution of Document Authoring

Old approach: linear, static
• Temporary electronic document, permanent paper document
• Face-to-face collaboration using paper documents; requires physical presence
• Binary formats optimized for the high cost of storage and bandwidth; proprietary

New approach: dynamic, interactive
• Permanent electronic document, temporary paper document
• Digital collaboration using electronic documents; participants in many locations
• XML-based formats optimized for flexibility, reusability, and maintainability; open standards

[Diagram labels. Old approach: Create, Print, Archive. New approach: Generate, Edit, Send, Edit, Receive.]

Document Formats and Applications

• Formats describe information
  • Define content appearance
  • Structure content for business processes
  • Enable machines (software) to use information
• Software applications use information
  • Provide functionality for authoring, organizing, developing, representing, evaluating, reviewing, collaborating, validating, calculating, protecting, and printing information
• Formats can influence application design, and vice versa
  • Open Document Format (ODF) and OpenOffice functionality
  • Office Open XML and Microsoft Office functionality

Features of Office Open XML

Open
• Open XML Formats standardization: Ecma, ISO/IEC in process
• XML-based formats for predictable long-term interoperability
• Royalty-free licenses enable broad access to technology

Forward-looking
• ZIP compression of the format reduces file sizes
• Segmented storage improves data recovery and programmatic access
• Full accessibility support

Compatible
• Full support for Microsoft Office functionality
• Compatibility with Office 2000, XP, and 2003
• Bulk document conversion tools available

Demo: Open XML In Action

[Diagram: a server (Linux OS in a VM, with a web server, a Tomcat JSP application, and a DB) and a desktop (Windows OS with IE and Word 2007).]

1. Generate Document
2. Download
3. Edit Document
4. Upload
5. Publish to Web
6. View in Browser

Levels of Interoperability: Reference and Custom-defined Schemas

Custom-defined Schemas
• Data-oriented (e.g., Price, Invoice)
• Business information
• Enable System Integration

XML Reference Schemas
• Display-oriented (Bold, Italics, Tables, Paragraphs, Styles, …)
• Document Format
• Enable Archival and File Formats Interoperability

Levels of Interoperability: Technical Interoperability

<w:p>
  <w:r>
    <w:rPr><w:b /></w:rPr>
    <w:t>John Doe</w:t>
  </w:r>
  <w:r>
    <w:rPr><w:i /></w:rPr>
    <w:t>Health Agency</w:t>
  </w:r>
</w:p>

XML Reference Schemas
• Display-oriented (for example, Bold, Italics, Tables, Paragraphs, Styles)
• Document Format
• Enable Archival and File Formats Interoperability
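
As a rough illustration of what "display-oriented" means for a consumer, the sketch below reads the paragraph from this slide and reports which runs are bold or italic. The namespace URI is the one used by the published WordprocessingML schema and is an assumption here, since the slide omits the declaration. The function name describe_runs is hypothetical. The sketch treats the mere presence of w:b or w:i as "on" and ignores the w:val attribute that can turn a property off.

```python
import xml.etree.ElementTree as ET

# Assumed namespace: the slide shows the "w" prefix without its declaration.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

PARAGRAPH = f"""
<w:p xmlns:w="{W_NS}">
  <w:r><w:rPr><w:b/></w:rPr><w:t>John Doe</w:t></w:r>
  <w:r><w:rPr><w:i/></w:rPr><w:t>Health Agency</w:t></w:r>
</w:p>
"""


def describe_runs(paragraph_xml):
    """Return (text, is_bold, is_italic) for each run (w:r) in a paragraph."""
    ns = {"w": W_NS}
    paragraph = ET.fromstring(paragraph_xml)
    runs = []
    for run in paragraph.findall("w:r", ns):
        text = "".join(t.text or "" for t in run.findall("w:t", ns))
        # Simplification: presence of the element counts as "on".
        bold = run.find("w:rPr/w:b", ns) is not None
        italic = run.find("w:rPr/w:i", ns) is not None
        runs.append((text, bold, italic))
    return runs


print(describe_runs(PARAGRAPH))
# [('John Doe', True, False), ('Health Agency', False, True)]
```

Note that the markup says how "John Doe" looks, not that it is a person's name; that is the gap the custom-defined schemas are meant to fill.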

Levels of Interoperability: Semantic Interoperability

Custom-defined Schemas
• Data-oriented (for example, Price, Invoice)
• Business information
• Enable System Integration

<ConferenceReport>
  <Date>3/24/2004</Date>
  <Attendees>
    <Attendee Name="John Doe">
      <Department>Health Agency</Department>
      <Potential>
        <Sales>100</Sales>
        <Growth>25%</Growth>
        …
    </Attendee>

Word: a 24-year evolution

• 1983: Multi-Tool Word (Xenix); Word 1.0 (DOS)
• 1985: Word for Mac (Mac)
• 1987: Word 3.0
• 1989: Word for Windows 1.0 (Windows)
• 1991: Word 5.5
• 1992: Word 5.1; Word for OS/2; Word 5.1 (UNIX)
• 1993: Word 6.0
• Office 97
• Office 2000
• 2001: Word v.X (OS X)
• Office XP
• Office 2003
• Office 2007 (Word 12)

File formats: .DOC; .RTF 1990 (by DEC 1987); XML 2003

XML in Office: a 10-year evolution

• Office 97: Binary formats
• Office 2000: XML Document Properties
• Office XP: Spreadsheet XML
• Office 2003: WordprocessingML, SpreadsheetML, Custom schemas
• 2007 Office system: PresentationML, XML-based formats

User View of Open XML Files

• Single file
• Compact
• Corruption resistant
  • Segmented architecture
  • Corruption of any one part would not prevent the file from opening
• Separation of macro-enabled content
  • Macro-enabled extensions end with "m" instead of "x" (e.g., .docm)
  • VBA, Excel Macro-Sheets, PowerPoint Action Commands
  • Enforced at runtime by 2007 Office programs

Programmer View of Open XML Files

• ZIP Archive
• Document Parts
  • XML Parts
  • Binary Parts
  • Typed (RFC 2616)
• Relationships
  • Connections between parts
• Content Type Stream
  • A specially-named stream
  • Defines mappings from part names to content types
  • Not itself a part, not URI addressable
• Folder structure for convenience only
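
The programmer's view above can be explored with nothing more than a ZIP library and an XML parser. The following is a minimal sketch, not a full Open Packaging Convention implementation: it builds a small hypothetical package in memory (sample_package is an illustrative helper, and its parts are placeholders), then resolves each part's content type from the [Content_Types].xml stream and lists the package-level relationships from _rels/.rels. The namespace URIs and content type strings are those of the published standard and are assumptions relative to this deck. Part names are compared case-sensitively here, which is a simplification.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def sample_package():
    """Build a tiny, hypothetical package in memory so the sketch is runnable."""
    types = (
        f'<Types xmlns="{CT_NS}">'
        '<Default Extension="rels" ContentType='
        '"application/vnd.openxmlformats-package.relationships+xml"/>'
        '<Default Extension="png" ContentType="image/png"/>'
        '<Override PartName="/word/document.xml" ContentType='
        '"application/vnd.openxmlformats-officedocument'
        '.wordprocessingml.document.main+xml"/>'
        "</Types>"
    )
    rels = (
        f'<Relationships xmlns="{REL_NS}">'
        '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
        'officeDocument/2006/relationships/officeDocument" '
        'Target="word/document.xml"/>'
        "</Relationships>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", types)
        package.writestr("_rels/.rels", rels)
        package.writestr("word/document.xml", "<placeholder/>")
        package.writestr("word/media/image1.png", b"not a real image")
    return buffer.getvalue()


def part_content_types(package_bytes):
    """Map each part name to its content type (overrides beat defaults)."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        types = ET.fromstring(package.read("[Content_Types].xml"))
        names = package.namelist()
    defaults = {d.get("Extension").lower(): d.get("ContentType")
                for d in types.findall(f"{{{CT_NS}}}Default")}
    overrides = {o.get("PartName"): o.get("ContentType")
                 for o in types.findall(f"{{{CT_NS}}}Override")}
    result = {}
    for name in names:
        # The content type stream is not itself a part, so skip it.
        if name == "[Content_Types].xml" or name.endswith("/"):
            continue
        part_name = "/" + name
        extension = name.rsplit(".", 1)[-1].lower()
        result[part_name] = overrides.get(part_name, defaults.get(extension))
    return result


def package_relationships(package_bytes):
    """Return (id, type, target) for each package-level relationship."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("_rels/.rels"))
    return [(r.get("Id"), r.get("Type"), r.get("Target"))
            for r in root.findall(f"{{{REL_NS}}}Relationship")]


for part, content_type in part_content_types(sample_package()).items():
    print(part, "->", content_type)
print(package_relationships(sample_package()))
```

A consumer would normally start from the relationships rather than from folder names, since the folder structure is for convenience only.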

Hello World

Demo: creating the minimal WordprocessingML document
• 3 parts: document body, content types, relationships
• Each part is simple XML (text)
• Parts are packaged in a ZIP archive
• Result: a well-formed Open XML document
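
The demo itself is not captured in these slides, so the following is only a sketch of the same idea under stated assumptions: three small XML texts (document body, content types, package relationships) written into a ZIP archive with Python's standard library. The namespace URIs and content type strings come from the published standard rather than from this deck, and the function name build_hello_world and the output file name are hypothetical.

```python
import io
import zipfile

# Content type stream: maps part names and extensions to content types.
CONTENT_TYPES = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""

# Package relationships: point from the package to the main document part.
RELATIONSHIPS = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
</Relationships>"""

# Document body: one paragraph containing one run of text.
DOCUMENT = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p><w:r><w:t>Hello World</w:t></w:r></w:p>
  </w:body>
</w:document>"""


def build_hello_world():
    """Return the bytes of a minimal WordprocessingML package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", RELATIONSHIPS)
        package.writestr("word/document.xml", DOCUMENT)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical output name; any name ending in .docx would do.
    with open("hello.docx", "wb") as output:
        output.write(build_hello_world())
```

The path word/document.xml is a convention only; what matters is that the relationship target and the content type override name the same part.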

Ecma Office Open XML Specifications

• Markup Languages: WordprocessingML, SpreadsheetML, PresentationML
• Vocabularies: DrawingML, Custom XML, Bibliography, Metadata, VML (legacy), Equations
• Open Packaging Convention: Content Types, Relationships, Digital Signatures
• Core Technologies: ZIP, XML + Unicode

Ecma Office Open XML Specifications

[Same diagram as the previous slide, annotated with the workshop modules that cover each area. Module labels shown: Module 01, 09; Module 02 (three labels); Module 03, 04; Module 05; Module 06, 07A; Module 07B; Module 08.]

Office Open XML File Formats Extensions

              Macro-Free              Macro-Enabled
              Document   Template     Document   Template
Word          docx       dotx         docm       dotm
PowerPoint    pptx       potx         pptm       potm
Excel         xlsx       xltx         xlsm       xltm

Open Packaging Convention
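
Because the extension encodes both the document kind and whether macros are allowed, a program can triage files before opening them. The sketch below simply encodes the table above as a lookup; the names EXTENSIONS and describe are hypothetical. An extension is only a hint, so a careful consumer would still check the package's content types.

```python
import os

# (family, kind, macro_enabled) for each extension in the table above.
EXTENSIONS = {
    ".docx": ("word processing", "document", False),
    ".dotx": ("word processing", "template", False),
    ".docm": ("word processing", "document", True),
    ".dotm": ("word processing", "template", True),
    ".pptx": ("presentation", "document", False),
    ".potx": ("presentation", "template", False),
    ".pptm": ("presentation", "document", True),
    ".potm": ("presentation", "template", True),
    ".xlsx": ("spreadsheet", "document", False),
    ".xltx": ("spreadsheet", "template", False),
    ".xlsm": ("spreadsheet", "document", True),
    ".xltm": ("spreadsheet", "template", True),
}


def describe(filename):
    """Return (family, kind, macro_enabled), or None for other extensions."""
    extension = os.path.splitext(filename)[1].lower()
    return EXTENSIONS.get(extension)


print(describe("Report.DOCM"))   # ('word processing', 'document', True)
print(describe("Budget.xlsx"))   # ('spreadsheet', 'document', False)
print(describe("notes.txt"))     # None
```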

Developer Scenario: Styling Content

Enforce organizational standards for document formatting.

[Diagram: a style part is applied to a document to give it a standardized look and feel.]

Developer Scenario: Content Inspection

• Remove confidential information, tracked changes, or metadata from outbound documents.
• Remove macros, inappropriate language, or other content from inbound documents.

[Diagram: documents pass through Open XML processing in each direction.]
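
One small piece of such an inspection step can be sketched as follows: counting tracked insertions and deletions in a document body before it leaves the organization. This is an illustration under assumptions, not a complete inspector. It relies on the w:ins and w:del elements of WordprocessingML, uses a made-up sample document, and ignores other kinds of revision marks such as formatting changes. The function name tracked_changes is hypothetical.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Made-up document body with one tracked deletion and one tracked insertion.
DOCUMENT = f"""
<w:document xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:r><w:t>Budget is </w:t></w:r>
      <w:del w:id="1" w:author="A"><w:r><w:delText>100</w:delText></w:r></w:del>
      <w:ins w:id="2" w:author="A"><w:r><w:t>120</w:t></w:r></w:ins>
    </w:p>
  </w:body>
</w:document>
"""


def tracked_changes(document_xml):
    """Count tracked insertions (w:ins) and deletions (w:del)."""
    root = ET.fromstring(document_xml)
    return {
        "insertions": sum(1 for _ in root.iter(f"{{{W}}}ins")),
        "deletions": sum(1 for _ in root.iter(f"{{{W}}}del")),
    }


counts = tracked_changes(DOCUMENT)
print(counts)  # {'insertions': 1, 'deletions': 1}
if counts["insertions"] or counts["deletions"]:
    print("Document still contains tracked changes")
```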

Development Scenario: Consuming Documents

Create expense reports as spreadsheet documents, which are loaded into a back-end system on the server.

[Diagram: authoring environment (Microsoft Office, etc.), then Open XML processing, then back-end system (LOB/CRM/etc.).]
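
To make the server-side step concrete, here is a minimal sketch that totals one column of a worksheet part. It assumes the SpreadsheetML namespace from the published standard, uses a made-up expense sheet with inline strings, and handles only plain numeric cells; real workbooks usually keep text in a shared strings part, which this sketch ignores. The function name column_total is hypothetical.

```python
import xml.etree.ElementTree as ET

S = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Made-up worksheet part: a label in column A, an amount in column B.
SHEET = f"""
<worksheet xmlns="{S}">
  <sheetData>
    <row r="1">
      <c r="A1" t="inlineStr"><is><t>Taxi</t></is></c>
      <c r="B1"><v>23.50</v></c>
    </row>
    <row r="2">
      <c r="A2" t="inlineStr"><is><t>Hotel</t></is></c>
      <c r="B2"><v>180</v></c>
    </row>
  </sheetData>
</worksheet>
"""


def column_total(sheet_xml, column):
    """Sum the numeric cells of one column, e.g. column 'B'."""
    ns = {"s": S}
    root = ET.fromstring(sheet_xml)
    total = 0.0
    for cell in root.findall("s:sheetData/s:row/s:c", ns):
        # A cell reference such as "B12" is column letters plus a row number.
        letters = cell.get("r", "").rstrip("0123456789")
        value = cell.find("s:v", ns)
        # Cells without a type attribute (or with t="n") hold numbers.
        if letters == column and cell.get("t") in (None, "n") and value is not None:
            total += float(value.text)
    return total


print(column_total(SHEET, "B"))  # 203.5
```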

Development Scenario: Document Assembly

Create sales reports from financial and forecast data stored in a CRM system.

[Diagram: a web client or rich client allows the user to select or enter content criteria; the document is assembled from calculated data, manual entries, and existing content.]

Development Scenario: Custom XML Markup

Tagging document content with custom semantics for processing by a back-end system.

[Diagram: authoring environment, then Open XML processing.]

Custom XML Data Store

• Custom-defined XML part
  • Stored separately from document body
  • Any XML can be stored
    • Document properties
    • WSS metadata
    • Custom XML (with or without XML schema)
• External applications can easily read or write the custom XML part
• True separation of data and presentation

[Diagram: a Doc/Template containing doc parts and XML; an external app reads and writes the XML.]
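
The point about external applications can be sketched with the standard library alone: read a value from the custom XML part, then write a new value back without touching the document body. Everything specific here is an assumption for illustration: the part path customXml/item1.xml is a common convention rather than a requirement, the Invoice data is made up, and sample_package builds a placeholder package rather than a complete document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CUSTOM_PART = "customXml/item1.xml"  # assumed location of the custom XML part


def sample_package():
    """Build a hypothetical package holding a body and a custom XML part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", "<placeholder/>")
        package.writestr(CUSTOM_PART, "<Invoice><Price>100</Price></Invoice>")
    return buffer.getvalue()


def read_value(package_bytes, path):
    """Return the text of the element at `path` inside the custom XML part."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read(CUSTOM_PART))
    return root.findtext(path)


def write_value(package_bytes, path, new_text):
    """Return a copy of the package with one custom XML value replaced."""
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as source, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for info in source.infolist():
            data = source.read(info.filename)
            if info.filename == CUSTOM_PART:
                root = ET.fromstring(data)
                root.find(path).text = new_text
                data = ET.tostring(root, encoding="utf-8")
            # Every other part is copied through unchanged.
            target.writestr(info.filename, data)
    return output.getvalue()


original = sample_package()
updated = write_value(original, "Price", "120")
print(read_value(original, "Price"), "->", read_value(updated, "Price"))
# 100 -> 120
```

ZIP archives cannot be edited in place with this library, so the sketch rewrites the whole package; for large documents a dedicated packaging library may be preferable.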

XML Data Binding

• Link content controls to nodes in the XML data store
• Mappings use standard XPath expressions
• Office offers built-in support for mapping to metadata
• Developers can bind custom XML to content controls
• 2-way binding between user changes and custom XML

[Diagram label: Customers]
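
The mapping idea can be illustrated with a simplified, one-directional sketch: read the XPath stored on a content control, evaluate it against the custom XML, and copy the result into the control's text. Several things here are assumptions. The markup follows the w:sdt and w:dataBinding elements of the published standard, the Customers data is made up, and the XPath is written relative to the root element because Python's ElementTree supports only a subset of XPath, whereas real bindings typically use absolute paths. The function name refresh is hypothetical.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Made-up custom XML data store content.
CUSTOM_XML = "<Customers><Customer><Name>Example Corp</Name></Customer></Customers>"

# A content control (w:sdt) whose binding points at the first customer's name.
CONTENT_CONTROL = f"""
<w:sdt xmlns:w="{W}">
  <w:sdtPr>
    <w:dataBinding w:xpath="./Customer[1]/Name"/>
  </w:sdtPr>
  <w:sdtContent>
    <w:r><w:t>placeholder</w:t></w:r>
  </w:sdtContent>
</w:sdt>
"""


def refresh(control_xml, data_xml):
    """Copy the bound node's text into the control and return the new text."""
    control = ET.fromstring(control_xml)
    data = ET.fromstring(data_xml)
    binding = control.find("w:sdtPr/w:dataBinding", NS)
    xpath = binding.get(f"{{{W}}}xpath")
    node = data.find(xpath)  # simplified: relative path, limited XPath subset
    text = control.find("w:sdtContent/w:r/w:t", NS)
    text.text = node.text
    return text.text


print(refresh(CONTENT_CONTROL, CUSTOM_XML))  # Example Corp
```

The reverse direction, pushing a user's edit in the control back into the data store, would run the same lookup and assign to the data node instead.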

XML Data Binding

Demo

Open XML Interoperability

Platform     ZIP Library                                     XML Library
Linux        Minizip, zlib                                   Apache Xerces
Java         J2SE java.util.zip                              JAXP
Microsoft    .NET Framework 3.0 System.IO.Packaging *        .NET Framework 3.0 System.Xml
             Microsoft SDK for Open XML Formats **
             Xceed .NET controls
COM          Xceed ActiveX controls                          MSXML

* Includes abstractions for OPC concepts
** Includes classes for package parts (strongly typed parts)

The Ecma Spec

• Where to get the final draft
  • OpenXmlDeveloper.org home page has latest link
• Organization of the spec
  1. Fundamentals
  2. Open Packaging Conventions
  3. Primer
  4. Markup Language Reference
  5. Markup Compatibility and Extensibility
• Reference Schemas (XSD, RelaxNG)

The Ecma Spec: Where To Start

[Same outline as the previous slide, annotated with the reading-order labels "Read 1st", "Read 2nd", and "Reference materials".]

OpenXmlDeveloper.org

Formed by 40 companies to share developer information about the Office Open XML file formats

Articles with full source code for C#, VB, Java, XSLT

Forums for posting technical questions
