XML-based Web Publishing and Content Management at Seattle University School of Law
James Cooper, Director of Technology & Media Services
Evan Lenz, Content Management Architect
Contents
1. Web site requirements and architecture
2. Web site management with Cocoon
• URI design discussion
3. Redhawk CMS
4. An acronym you should know: XSLT
5. Q&A
1. Web site requirements and architecture
SU Law Web site requirements (summer 2002)
• Must include a Flash-enhanced version
• Must include an HTML-based version that approximates the look-and-feel and navigational structure of the Flash-enhanced version
• Must include a version of the site that is designed for accessibility
• Must employ the separation of presentation and content through the use of XML technologies. Multiple published versions of the same content must originate in an automatic way from the same source.
• The publishing framework must employ a single point of control over navigational structure, e.g. using an XML configuration file.
Web site requirements, cont.
• Must allow an average Web developer to easily author new content, edit existing content, etc.
• Must accommodate the continued use of existing tools for authoring content, e.g. Dreamweaver.
• Particular kinds of content that have predictable, repeating structure should be converted into custom XML vocabularies to increase their flexibility and ease of management.
• The Web site must include search functionality integrated into all versions of the site.
Web content strategy today
• Static pages were converted to and are stored as style-free XHTML (in VSS, with latest versions shadowed on the staging server).
• Apache Ant is invoked on the staging server to incrementally build all versions (Flash, Standard, Text-only, and crawler) of each static page, using the page source, as well as global navigation and sidebar configuration files, as input.
• Cocoon powers the core functionality of the site, including setting the user’s version preferences and serving dynamic content. All static pages and files are served directly by Apache.
• Dynamic content pieces are identified by URI in the Cocoon sitemap, which is configured to assemble corresponding pages on-the-fly. Dynamic content examples include:
– Specialized content in our home-grown CMS called “Redhawk”, which provides end-user WYSIWYG editing of certain kinds of content
– Google search results
– Legacy ASP pages
• Traditional Web content management, e.g. WYSIWYG editing of all pages, is being considered, but not sorely missed at this time.
Benefits of using XML
• Separation of presentation from content
– Ensures consistency of presentation across all pages (eliminates layout errors)
– Enables publication to multiple channels
– Content re-use
• Many commercial and open-source tools available for processing/creating XML
• Integration between disparate systems (including legacy ASP pages, Google, Redhawk, etc.)
• Great for configuration files
Primary tools used in our Web site
Run-time:
• Apache Cocoon (Java-based)
• Apache Web server on Linux
• mod_rewrite (for rewriting incoming URLs, e.g. path?mode=flash, to /flash-html/path.html)
• Google Appliance (for integrated search inside our site template)
• IIS/ASP (legacy database access scripts, e-mail forms, etc.)
• 4Suite, for exporting content from the Redhawk CMS (based on 4Suite)
Build-time:
• MS Visual SourceSafe (for versioning of static content)
• Samba (for mounting a VSS shadow folder on the Linux staging server)
• Dreamweaver MX (includes XHTML support and VSS integration)
• Apache Ant (for building the bulk of the site statically)
• 4Suite, for end-user content management of specialized document types, aka Redhawk
2. Web site management with Cocoon
Introduction to Cocoon
• Cocoon is an open-source, Java-based XML Web publishing framework
• Recently gained status as a top-level Apache project, at http://cocoon.apache.org
• Designed to enable the separation of concerns between content, logic, and style
The Cocoon sitemap
• SAX-based pipeline mechanism allows XML content to go through a series of transformations, configurable by the sitemap, Cocoon's central point of configuration
• Each pipeline consists of:
– Exactly one generator
  • Produces XML content using any number of mechanisms: reading a file, submitting an HTTP request, calling a database, invoking a server page script, etc.
– Followed by zero or more transformers
  • Processes the XML, e.g. XSLT or XInclude, for subsequent handling by either another transformer or the serializer
– Followed by exactly one serializer
  • Serializes into a particular format, e.g. well-formed XML, browser-compatible XHTML, SVG, PDF (via XSL-FO and FOP), rasterized images (via SVG and Batik), etc.
Simplified Cocoon sitemap excerpt
<map:match pattern="accesstojustice/hague/cases">
  <map:generate src="http://redhawk/?xslt=getCases.xsl"/>
  <map:transform src="stylesheets/case2html.xsl"/>
  <map:serialize type="xhtml"/>
</map:match>
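To make the transformer step more concrete, here is a minimal sketch of what a stylesheet in the role of case2html.xsl might look like. It is not the actual stylesheet: the input vocabulary (cases, case, name, summary) and the page title are assumptions made purely for illustration, since the XML returned by Redhawk is not shown in these slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical stand-in for stylesheets/case2html.xsl.
     The input elements (cases/case/name/summary) are assumed,
     not taken from the real Redhawk output. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- Wrap the list of cases in a bare XHTML page -->
  <xsl:template match="/cases">
    <html>
      <head><title>Hague Convention Cases</title></head>
      <body>
        <ul>
          <xsl:apply-templates select="case"/>
        </ul>
      </body>
    </html>
  </xsl:template>

  <!-- One list item per case -->
  <xsl:template match="case">
    <li>
      <strong><xsl:value-of select="name"/></strong>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="summary"/>
    </li>
  </xsl:template>

</xsl:stylesheet>
```

The stylesheet only produces XHTML elements; turning them into bytes for the browser is left to the serializer at the end of the pipeline.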
Another sitemap excerpt
<map:resource name="front-door">
  <map:select type="request-parameter">
    <map:parameter name="parameter-name" value="set-version"/>
    <map:when test="flash">
      <map:call resource="check-flash"/>
    </map:when>
    <map:when test="flash-confirmed">
      <map:call resource="set-preference-to-flash"/>
    </map:when>
    <map:when test="standard">
      <map:call resource="set-preference-to-standard"/>
    </map:when>
    <map:when test="simple">
      <map:call resource="set-preference-to-simple"/>
    </map:when>
    <map:otherwise>
      <!-- more logic -->
    </map:otherwise>
  </map:select>
</map:resource>
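A resource such as front-door does nothing until a pipeline calls it. The fragment below is a sketch of one plausible way to wire it up, assuming the resource should handle requests for the site root; the choice of pattern is our illustration and is not taken from the production sitemap.

```xml
<!-- Sketch only: which URI triggers front-door is an assumption. -->
<map:pipelines>
  <map:pipeline>
    <!-- An empty pattern matches a request for the sitemap's root URI -->
    <map:match pattern="">
      <map:call resource="front-door"/>
    </map:match>
  </map:pipeline>
</map:pipelines>
```

A request with set-version=standard would then fall through to the matching map:when branch of the selector.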
URI design considerations
• The URI design of the SU Law Web site was inspired by Tim Berners-Lee's 1998 essay “Cool URIs don't change” – http://www.w3.org/Provider/Style/URI.html
• Aims to follow two of the essay's suggestions:
– Leave out file extensions
– Leave out topic/classification by subject
Leave out file extensions
• Cocoon makes it easy to map external URIs to internal filenames or other content generators
• In the SU Law Web site, the URLs of all HTML pages do not include any file extensions
• Other types of content use standard file extensions, e.g. JPG, GIF, Flash, Word, etc.
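As a sketch of how such a mapping can be expressed, the sitemap fragment below maps an extension-free URI onto an internal XML file. The content/ directory and page2html.xsl are hypothetical names, and on the production site static pages are pre-built and served by Apache, so this illustrates the Cocoon capability and not the site's actual configuration.

```xml
<!-- Sketch only: content/ and page2html.xsl are hypothetical names. -->
<map:match pattern="*">
  <!-- {1} is the text matched by the wildcard, e.g. "directions"
       for a request to /directions -->
  <map:generate src="content/{1}.xml"/>
  <map:transform src="stylesheets/page2html.xsl"/>
  <map:serialize type="xhtml"/>
</map:match>
```

A single * does not match across slashes, so deeper URIs such as /academics/calendar would need a pattern like */* or **.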
Leave out topic/classification by subject
• Difficult problem
• Design URIs such that they are meaningfully mnemonic and will never change, even though the corresponding pages may be classified into different topics later
• Berners-Lee: "Because the relationships between subjects are web-like rather than tree-like, even...people who agree on a web may pick a different tree representation."
Decouple navigational structure from URI structure
• URI structure is, of necessity, hierarchical
• Site navigation tends to be hierarchical, classifying pages into topics or subjects
• To help in following the original suggestion, we formulated the following mandate:
– Decouple navigational structure from URI structure.
• We met this goal through the use of a custom XML configuration file (navigation.xml) that maps between the two independent hierarchies (navigation and URI structure)
Excerpt from navigation.xml
<navigation xmlns="http://law.seattleu.edu">
  <menu display="Welcome" sectionId="welcome">
    <link href="/" display="SU Law Home"/>
    <link display="Contact Information" href="/contactus"/>
    <link display="Directions" href="/directions"/>
    <link href="/welcome" display="From the Dean"/>
    <link href="/history" display="History"/>
    <link href="/calendar" display="Master Calendar"/>
    <link href="/mission" display="Mission"/>
    <link href="/search" display="Search"/>
    <link href="/sitemap" display="Site Map"/>
    <link href="http://www.seattleu.edu" display="Seattle University Home"/>
    <hidden href="/news" display="News"/>
    <hidden pattern="/news"/>
    <hidden href="/privacy" display="Privacy Statement"/>
  </menu>
  <menu display="Students" sectionId="students">
    <menu display="Academics">
      <link href="/academics" display="Introduction"/>
      <link href="/academics/calendar" display="Academic Calendar"/>
      <link href="/courses" display="Course Descriptions"/>
      <link href="/classassignments" display="Class Assignments"/>
      <hidden pattern="/classassignments"/>
      <!-- more pages -->
    </menu>
    <!-- more submenus -->
  </menu>
  <!-- more menus -->
</navigation>
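To illustrate how a page's place in the navigation can be derived from its URI alone, here is a minimal XSLT 1.0 sketch that runs against navigation.xml and prints the trail of menus leading to a given page. The uri parameter, the nav prefix, and the text output format are our assumptions; the site's real stylesheets are not shown in these slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: parameter name and output format are hypothetical. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:nav="http://law.seattleu.edu">

  <xsl:output method="text"/>

  <!-- URI of the page being looked up -->
  <xsl:param name="uri" select="'/courses'"/>

  <xsl:template match="/">
    <!-- First link or hidden entry whose href equals the requested URI -->
    <xsl:variable name="entry"
        select="(//nav:link | //nav:hidden)[@href = $uri][1]"/>
    <!-- for-each visits the enclosing menus in document order,
         i.e. outermost menu first -->
    <xsl:for-each select="$entry/ancestor::nav:menu">
      <xsl:value-of select="@display"/>
      <xsl:text> &gt; </xsl:text>
    </xsl:for-each>
    <xsl:value-of select="$entry/@display"/>
  </xsl:template>

</xsl:stylesheet>
```

With the excerpt above as input and the default parameter value, the trail reads Students > Academics > Course Descriptions, even though the URI /courses itself says nothing about either menu.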
The benefits of URI-navigation independence
• Pages can be moved from one section of the site to another by simply editing one file (navigation.xml)
• Navigation structure can change without needing to update any links or change any URIs (thereby rendering them uncool)
• Files do not need to be moved around just because corresponding pages “move around” the site
XML-based configuration of the Web site “sidebar”
<sidebar xmlns="http://law.seattleu.edu">
  <allButtons>
    <promotion id="laptop" img="laptoppurchase.gif" alt="Student Laptop Purchase Program (Dell)" href="/technology/purchase"/>
    <profile id="cmhall" alt="Christian Halliburton Video" movie="cmhall.rm"/>
    <quote id="cumbow" img="cumbow.gif" alt="Cumbow Quote"/>
    ...
  </allButtons>
  ...
  <section id="faculty">
    <profile idref="cmhall"/>
    <quote idref="cumbow"/>
    <promotion idref="giving"/>
    <promotion idref="newfaculty"/>
    <promotion idref="laptop"/>
  </section>
  ...
</sidebar>
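The idref attributes in each section point back to button definitions under allButtons. One way such references might be resolved is with an xsl:key, as in the XSLT 1.0 sketch below. The section parameter, the sb prefix, and the list markup it emits are assumptions for illustration, not the site's actual sidebar stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: parameter name and output markup are hypothetical. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:sb="http://law.seattleu.edu"
    exclude-result-prefixes="sb">

  <xsl:output method="xml" indent="yes"/>

  <!-- Which section's sidebar to build -->
  <xsl:param name="section" select="'faculty'"/>

  <!-- Index every button definition under allButtons by its id -->
  <xsl:key name="button" match="sb:allButtons/*" use="@id"/>

  <xsl:template match="/">
    <ul>
      <xsl:for-each select="/sb:sidebar/sb:section[@id = $section]/*">
        <!-- Resolve the reference to its definition -->
        <xsl:variable name="def" select="key('button', @idref)"/>
        <li><xsl:value-of select="$def/@alt"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```

Because each button is defined once and referenced by id, changing a promotion's image or link in allButtons updates every section that uses it.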
3. Redhawk CMS
Redhawk, home-grown CMS
• Redhawk is a specialized XML content management system, based on 4Suite, an open-source platform for XML and RDF processing
• Named after SU mascot
• Basic unit of storage is an XML document
• Supports development of custom Redhawk "document classes", which correspond to XML document types (or schemas)
• Provides basic CRUD (Create, Read, Update, Delete) and role-based workflow functionality
• Two types of users for each document class: Author and Editor
• Any Create, Update, or Delete requests by an Author must be approved by an Editor before taking effect
• Pluggable WYSIWYG editing environments; so far we have developed support for Altova's free browser-based XML editor, Authentic 5
• Future plans to support Microsoft InfoPath and Word 2003
Create New Announcement form
Current Redhawk applications
• Announcements and events for the Docket (migration from custom production application in process)
• Access to Justice Institute’s Hague Project for managing Hague Convention-related case information (in production)
4. An acronym you should know: XSLT
The common denominator: XSLT (Extensible Stylesheet Language Transformations)
• Used in Cocoon to assemble all pages (XSLT is the default type of "Transformer")
• Used in our site build process, via Ant's <xslt> task for collectively applying transformations over multiple files
• Built into 4Suite and used throughout Redhawk to assemble pages, create documents, and implement the core CMS logic (with the help of extensions)
• Used in the Google Appliance to style the output of search results
• Used in Redhawk in the browser to apply supplemental "clean-up" transformations to the XML resulting from Authentic editing
• Growing abundance of conformant XSLT processors, including IE6 and Mozilla support, as well as a growing number of powerful tools
• And… XSLT is reaching mainstream technology status: Microsoft Office 2003 will pervasively employ XSLT for the development of custom XML solutions, particularly in Word, Excel, Access, and InfoPath.
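As an illustration of the build-time use of XSLT mentioned above, here is a minimal sketch of an Ant build file that applies one stylesheet across a directory of page sources with the <xslt> task. All directory names, the stylesheet name, and the version parameter are hypothetical; only the /flash-html/ output location echoes the URL rewriting example earlier in the deck.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: paths, stylesheet, and parameter are hypothetical. -->
<project name="site-build" default="build-flash" basedir=".">

  <target name="build-flash">
    <!-- Transform every page source under src/pages into an .html
         file under build/flash-html, keeping relative paths -->
    <xslt basedir="src/pages"
          destdir="build/flash-html"
          style="stylesheets/page2flash.xsl"
          extension=".html"
          includes="**/*.xml">
      <!-- Passed to the stylesheet as a top-level xsl:param -->
      <param name="version" expression="flash"/>
    </xslt>
  </target>

</project>
```

Unless force="true" is set, the task skips outputs that are newer than both their source file and the stylesheet, which is one way to get an incremental build.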
References
• http://cocoon.apache.org
• http://4suite.org
• http://ant.apache.org
• “Cool URIs don't change” – http://www.w3.org/Provider/Style/URI.html
• “Cocoon and 4Suite for Content Management: The Best of Both Worlds at Seattle University School of Law” - http://www.xmlportfolio.com/xmleurope2003/
Questions?