Deep Dive Open XML and the Open XML SDK
Zeyad Rajabi, Program Manager, Microsoft Corporation



Agenda

• Open XML Formats
• Open XML SDK overview
• Open XML SDK architecture + roadmap
• Lots! of demos
• Open XML SDK tools
• Resources + links
• Q&A

OPEN XML FORMATS
What are they?


Open XML Formats Architecture

Users see a single file

MyDoc.docx

Developers see a zip file with xml parts

Document properties

File container

Comments

WordML/SpreadsheetML, etc.

Custom-defined XML

Images, video, sound

Styles

Charts

Default format in Office 2007 and 2010
• Word (.docx)
• Excel (.xlsx)
• PowerPoint (.pptx)

Open XML is an ISO standard

Document Parts

Most parts are XML
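The "zip file with XML parts" view can be checked with any zip library. The following is a minimal sketch in Python using only the standard library, not the Open XML SDK (which is .NET). It builds a tiny stand-in package in memory, so the part list is illustrative and far from a complete document; the helper names are made up for this example.

```python
import io
import zipfile


def build_sample_docx() -> bytes:
    """Build a tiny, illustrative .docx-like package in memory (not a full document)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as pkg:
        pkg.writestr("[Content_Types].xml", "<Types/>")
        pkg.writestr("_rels/.rels", "<Relationships/>")
        pkg.writestr("word/document.xml", "<document/>")
        pkg.writestr("word/media/image1.png", b"\x89PNG")
    return buf.getvalue()


def list_parts(package_bytes: bytes) -> list[str]:
    """Return the names of all entries stored in the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as pkg:
        return pkg.namelist()


def xml_parts(package_bytes: bytes) -> list[str]:
    """Return only the entries that hold XML (names ending in .xml or .rels)."""
    return [n for n in list_parts(package_bytes) if n.endswith((".xml", ".rels"))]


if __name__ == "__main__":
    for name in list_parts(build_sample_docx()):
        print(name)
```

Pointing `zipfile.ZipFile` at a real MyDoc.docx on disk lists its parts the same way.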

Open XML Formats

Zeyad Rajabi, Program Manager, Office

demo

Office Open XML Formats

Allows developers to access Office files without needing the Office applications

Current toolset for Open XML
• WinZip, MSXML & Notepad
• System.IO.Packaging, System.Xml & LINQ

Future of Open XML development: Open XML SDK
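As a rough illustration of the "zip tool plus XML parser" style of working, here is a sketch that reads paragraph text straight out of a WordprocessingML main part. It is written in Python with the standard library as an analogy to the toolset listed above, and it is not SDK code. The sample package is built in memory and contains only the one part the example needs; the part name `word/document.xml` is the conventional location and is assumed here rather than resolved through relationships.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

DOCUMENT_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>Hello</w:t></w:r><w:r><w:t> world</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
    "</w:body></w:document>"
)


def build_sample_docx() -> bytes:
    """Build an in-memory package holding only the main document part."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as pkg:
        pkg.writestr("word/document.xml", DOCUMENT_XML)
    return buf.getvalue()


def paragraph_texts(package_bytes: bytes) -> list[str]:
    """Return the text of each w:p element, joining the w:t runs inside it."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as pkg:
        root = ET.fromstring(pkg.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        paragraphs.append("".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t")))
    return paragraphs


if __name__ == "__main__":
    print(paragraph_texts(build_sample_docx()))
```

Everything here is untyped string matching on element names, which is the kind of friction a strongly typed API is meant to remove.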

OPEN XML SDK
What it is and what it’s not?

Open XML SDK Overview

Allows you to create and modify Open XML documents

SDK will support both Office 2007 SP2 and Office 2010 file formats

Based on .NET (C# and VB)
• Compatible with LINQ

Provides a unified platform for solutions
• Consistent client and server solutions

This SDK does NOT
• Replace Office application Object Models
• Perform layout + recalculation tasks
• Perform file conversions to other formats, like PDF or XPS

Open XML SDK Road Map

Version 1.0 of the SDK
• Provides part level manipulation
• Final bits released June 2008
• “Go-Live” license: free to use and build/deploy solutions

Version 2.0 of the SDK
• Provides content level manipulation
• 1st CTP released September 2008
• 2nd CTP released April 2009
• 3rd CTP released August 2009
• Final release around same time Office 2010 ships

Open XML SDK Architecture

System Support
• .NET 3.5
• System.IO.Packaging
• Open XML Schemas

Open XML File Format Base Level
• Reading/Writing
• Low Level DOM
• Packaging API

Open XML File Format Higher Level
• Schema Level Validation
• Semantic Level Validation
• Helper Functions

Open XML SDK Base Level

Base level is the foundation of the SDK

Provides strongly typed access to:
1. Parts within an Open XML Format
2. XML contained within a part

Provides DOM-like and SAX-like reading and writing capability

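The difference between DOM-like and SAX-like access can be sketched outside the SDK. The example below uses Python's standard library as an analogy only: `ET.fromstring` loads a whole part into a tree (DOM-like), while `ET.iterparse` streams through it element by element (SAX-like), which matters for large spreadsheet parts. The sheet XML is generated in memory and is deliberately minimal; the function names are made up for this example.

```python
import io
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace.
S_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def make_sheet_xml(row_count: int) -> bytes:
    """Generate a minimal worksheet part with one numeric cell per row."""
    rows = "".join(
        f'<row r="{i}"><c r="A{i}"><v>{i}</v></c></row>'
        for i in range(1, row_count + 1)
    )
    xml = f'<worksheet xmlns="{S_NS}"><sheetData>{rows}</sheetData></worksheet>'
    return xml.encode("utf-8")


def sum_dom(xml_bytes: bytes) -> int:
    """DOM-like: parse the whole part into memory, then walk the tree."""
    root = ET.fromstring(xml_bytes)
    return sum(int(v.text) for v in root.iter(f"{{{S_NS}}}v"))


def sum_streaming(xml_bytes: bytes) -> int:
    """SAX-like: handle elements as they close and discard finished rows."""
    total = 0
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag == f"{{{S_NS}}}v":
            total += int(elem.text)
        elif elem.tag == f"{{{S_NS}}}row":
            elem.clear()  # drop the row's children to keep memory use flat
    return total


if __name__ == "__main__":
    sheet = make_sheet_xml(100)
    print(sum_dom(sheet), sum_streaming(sheet))
```

Both functions return the same total; the streaming version trades convenience for a bounded memory footprint.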

TODO

Open XML SDK Higher Level

The SDK is able to validate:
• Against Open XML schemas
• Against set of semantic constraints defined in standard
• Against package constraints
• Against Office specific constraints

Helper functions: code snippets

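To give a feel for what a package-level constraint looks like, here is a small hand-rolled check in Python. It is only a sketch of the idea and does not reproduce the SDK's validation: it verifies that `[Content_Types].xml` is present and that every internal relationship in `_rels/.rels` points at a part that exists. Schema and semantic validation are out of scope, and `build_package` and `check_package` are hypothetical helper names.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
DOC_REL_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
)


def build_package(part_names: list[str], rel_target: str) -> bytes:
    """Build a stand-in package with one root relationship (illustrative only)."""
    rels = (
        f'<Relationships xmlns="{REL_NS}">'
        f'<Relationship Id="rId1" Type="{DOC_REL_TYPE}" Target="{rel_target}"/>'
        "</Relationships>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as pkg:
        pkg.writestr("_rels/.rels", rels)
        for name in part_names:
            pkg.writestr(name, "<placeholder/>")
    return buf.getvalue()


def check_package(package_bytes: bytes) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as pkg:
        names = set(pkg.namelist())
        if "[Content_Types].xml" not in names:
            problems.append("missing [Content_Types].xml")
        if "_rels/.rels" not in names:
            problems.append("missing _rels/.rels")
            return problems
        root = ET.fromstring(pkg.read("_rels/.rels"))
        for rel in root.iter(f"{{{REL_NS}}}Relationship"):
            if rel.get("TargetMode") == "External":
                continue  # external targets are not parts of the package
            target = posixpath.normpath(rel.get("Target", "").lstrip("/"))
            if target not in names:
                problems.append(
                    f"relationship {rel.get('Id')} points at missing part {target}"
                )
    return problems


if __name__ == "__main__":
    print(check_package(build_package(["word/document.xml"], "word/missing.xml")))
```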

Potential Scenario Solutions

Push data into Office files
• From a database, other files, etc.

Pull data from Office files
• Query, extract, etc.

Manipulate Office files
• Make a change to a file

Validate Office files
• Make sure files work in Office

Pushing Data into Open XML
Automated reporting in PresentationML

Zeyad Rajabi, Program Manager, Office

demo

Microsoft Confidential

Key Takeaways

• Fast
• IntelliSense really helps
• Starting from a template is always easiest
• Easy to search for specific content
• Manipulation is easy with access to strongly typed objects
• Use the tools…they help!

I will talk more about the tools later on
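The "start from a template" approach can be sketched as follows. This is not the demo code and not the SDK API; it is a Python standard-library illustration that copies every part of a template package and swaps placeholder tokens in slide text. The tokens `__TITLE__` and `__TOTAL__`, the part name `ppt/slides/slide1.xml`, and the helper names are assumptions made for this example, and the template itself is a stripped-down stand-in rather than a presentation PowerPoint would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"
# Keep the conventional prefixes when a part is serialized again.
ET.register_namespace("a", A_NS)
ET.register_namespace("p", P_NS)

SLIDE_PART = "ppt/slides/slide1.xml"  # hypothetical part name for this sketch


def build_template() -> bytes:
    """Build a tiny stand-in template holding one slide with two text runs."""
    slide = (
        f'<p:sld xmlns:p="{P_NS}" xmlns:a="{A_NS}"><p:cSld><p:spTree>'
        "<p:sp><p:txBody><a:p><a:r><a:t>__TITLE__</a:t></a:r></a:p></p:txBody></p:sp>"
        "<p:sp><p:txBody><a:p><a:r><a:t>__TOTAL__</a:t></a:r></a:p></p:txBody></p:sp>"
        "</p:spTree></p:cSld></p:sld>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as pkg:
        pkg.writestr(SLIDE_PART, slide)
    return buf.getvalue()


def fill_template(template_bytes: bytes, values: dict[str, str]) -> bytes:
    """Copy the package, replacing any a:t text that exactly matches a token."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename.startswith("ppt/slides/") and info.filename.endswith(".xml"):
                root = ET.fromstring(data)
                for t in root.iter(f"{{{A_NS}}}t"):
                    if t.text in values:
                        t.text = values[t.text]
                data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            dst.writestr(info.filename, data)
    return out.getvalue()


def slide_texts(package_bytes: bytes) -> list[str]:
    """Return the text runs of the sample slide, in document order."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as pkg:
        root = ET.fromstring(pkg.read(SLIDE_PART))
    return [t.text or "" for t in root.iter(f"{{{A_NS}}}t")]


if __name__ == "__main__":
    report = fill_template(build_template(), {"__TITLE__": "Q3 Sales", "__TOTAL__": "1,250 units"})
    print(slide_texts(report))
```

In a real report the values would come from a database query, and the template would be designed in PowerPoint so that layout and styling never have to be generated in code.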

Pulling Data Out of Open XML
Querying an Excel spreadsheet

Zeyad Rajabi, Program Manager, Office

demo


Key Takeaways

• LINQ to SQL…why not LINQ to Excel?
• LINQ built into SDK makes querying easy
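The idea of querying a spreadsheet like a data source can be approximated without the SDK. The sketch below is Python, not LINQ, and stands in for the demo rather than reproducing it: it resolves shared strings, reads each row into a list, and filters rows with a comprehension. The part names are the conventional ones and are assumed here; the sample workbook is generated in memory, and sparse rows and rich-text strings are not handled.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

S_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
SHEET_PART = "xl/worksheets/sheet1.xml"  # assumed conventional part names
STRINGS_PART = "xl/sharedStrings.xml"


def build_sample_xlsx() -> bytes:
    """Build a stand-in workbook: a header row plus three data rows."""
    strings = ["Region", "Sales", "North", "South", "East"]
    sst = f'<sst xmlns="{S_NS}">' + "".join(f"<si><t>{s}</t></si>" for s in strings) + "</sst>"
    data = [(2, 120), (3, 80), (4, 200)]  # (shared string index, sales value)
    rows = '<row r="1"><c r="A1" t="s"><v>0</v></c><c r="B1" t="s"><v>1</v></c></row>'
    for i, (name_idx, sales) in enumerate(data, start=2):
        rows += (
            f'<row r="{i}"><c r="A{i}" t="s"><v>{name_idx}</v></c>'
            f'<c r="B{i}"><v>{sales}</v></c></row>'
        )
    sheet = f'<worksheet xmlns="{S_NS}"><sheetData>{rows}</sheetData></worksheet>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as pkg:
        pkg.writestr(STRINGS_PART, sst)
        pkg.writestr(SHEET_PART, sheet)
    return buf.getvalue()


def read_rows(package_bytes: bytes) -> list[list]:
    """Return rows as lists: text for shared-string cells, float for numbers."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as pkg:
        sst = ET.fromstring(pkg.read(STRINGS_PART))
        sheet = ET.fromstring(pkg.read(SHEET_PART))
    strings = [
        "".join(t.text or "" for t in si.iter(f"{{{S_NS}}}t"))
        for si in sst.iter(f"{{{S_NS}}}si")
    ]
    rows = []
    for row in sheet.iter(f"{{{S_NS}}}row"):
        values = []
        for c in row.iter(f"{{{S_NS}}}c"):
            v = c.find(f"{{{S_NS}}}v")
            if v is None:
                values.append(None)
            elif c.get("t") == "s":
                values.append(strings[int(v.text)])  # cell stores an index
            else:
                values.append(float(v.text))
        rows.append(values)
    return rows


def regions_over(package_bytes: bytes, threshold: float) -> list[str]:
    """Query: names of regions whose sales exceed the threshold."""
    _header, *data = read_rows(package_bytes)
    return [region for region, sales in data if sales > threshold]


if __name__ == "__main__":
    print(regions_over(build_sample_xlsx(), 100))
```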

Manipulating Content Within Open XML
Sanitizing a Word document on the server

Zeyad Rajabi, Program Manager, Office

demo


Key Takeaways

• Again, fast!
• Personally identifiable information removed without client

Multiple types of solutions easily integrated into SharePoint
• Workflow based
• Custom action based
• Ribbon based
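One narrow piece of server-side sanitizing, clearing author names from the core properties part, can be sketched without Office installed. This is a Python standard-library illustration under stated assumptions, not the demo's code: it only blanks `dc:creator` and `cp:lastModifiedBy` in `docProps/core.xml` (the conventional part name, assumed here), and leaves comments, tracked changes, and other places where personal information can live untouched. The helper names are made up for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CP_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("cp", CP_NS)
ET.register_namespace("dc", DC_NS)

CORE_PART = "docProps/core.xml"  # conventional location, assumed for this sketch
PERSONAL_TAGS = (f"{{{DC_NS}}}creator", f"{{{CP_NS}}}lastModifiedBy")


def build_sample_docx() -> bytes:
    """Build a stand-in package containing only a core properties part."""
    core = (
        f'<cp:coreProperties xmlns:cp="{CP_NS}" xmlns:dc="{DC_NS}">'
        "<dc:title>Budget</dc:title><dc:creator>Jane Doe</dc:creator>"
        "<cp:lastModifiedBy>John Roe</cp:lastModifiedBy></cp:coreProperties>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as pkg:
        pkg.writestr(CORE_PART, core)
    return buf.getvalue()


def strip_author_metadata(package_bytes: bytes) -> bytes:
    """Copy the package, blanking author fields in the core properties part."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == CORE_PART:
                root = ET.fromstring(data)
                for tag in PERSONAL_TAGS:
                    for elem in root.iter(tag):
                        elem.text = ""
                data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            dst.writestr(info.filename, data)
    return out.getvalue()


def core_properties(package_bytes: bytes) -> dict[str, str]:
    """Return core properties as {local element name: text}."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as pkg:
        root = ET.fromstring(pkg.read(CORE_PART))
    return {child.tag.split("}")[1]: child.text or "" for child in root}


if __name__ == "__main__":
    print(core_properties(strip_author_metadata(build_sample_docx())))
```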

OPEN XML SDK TOOLS
Use the tools!

Open XML SDK Tools

SDK provides the following tools
1. Open XML Diff
2. Class Explorer
3. Document Reflector

Open XML Diff
• Compares the differences between two Open XML files

Class Explorer
• Lets developers navigate the Open XML standard as it relates to the SDK

Document Reflector
• Automatically generates Open XML SDK code based on a document

Open XML Power Tools

A set of 30+ cmdlets that create and modify Open XML documents
• Removing comments, accepting tracked revisions, etc.

Supports the PowerShell piping architecture
• Documents are piped from cmdlet to cmdlet as objects

Built on Open XML SDK

Available on CodePlex
• http://www.codeplex.com

Open XML Power Tools

Zeyad Rajabi, Program Manager, Office

demo


Key Takeaways

• IT and developers can run batch scripts using the Power Tools
• Released as open source, under the Microsoft Public License (Ms-PL)
• Works with PowerShell
• Provides rich functionality to build and modify Open XML documents


VSTO Power Tools

Open and edit Open XML documents directly in Visual Studio

http://www.microsoft.com/downloads/details.aspx?FamilyID=46B6BF86-E35D-4870-B214-4D7B72B02BF9

What’s the Real “User” Value?

• Save them time by automating tasks and reducing repetitive busy-work
• Give them what they need faster
• Reduce the need to context switch between applications or tasks

Demos...We Have More...

• Come talk to us after this presentation
• Bring a USB key to copy source code of demos
• Email us: zeyadr and tristand
• Or, download demos at Eric White’s or Brian Jones’ blog

Links + Resources

Blogs
• Eric White’s blog: http://blogs.msdn.com/ericwhite
• Doug Mahugh’s blog: http://blogs.msdn.com/dmahugh
• Brian Jones’ blog: http://blogs.msdn.com/brian_jones

MSDN
• Contains how-to articles and documentation
• Forums related to SDK
• http://msdn.microsoft.com/office/xml

Connect
• Access to more articles and forums
• Ability to log bugs and vote for features
• http://connect.microsoft.com

Codeplex
• Open source projects related to Open XML solutions
• http://www.codeplex.com

Download site for the SDK:
• Version 1.0: http://go.microsoft.com/fwlink/?LinkId=120908
• Version 2.0: http://go.microsoft.com/fwlink/?LinkId=127912

Q&A
Any questions? Want to share scenarios/solutions?

Remember to fill out your evaluations on MySPC for your chance to win two HD web cams and a designer mouse (3 prizes awarded daily)

Learn More about SharePoint 2010

Information for IT Pros at TechNet
http://MSSharePointITPro.com

Information for Developers at MSDN
http://MSSharePointDeveloper.com

Information for Everyone
http://SharePoint.Microsoft.com

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.