Handling XML and JSON in the Database

PGDay UK 2013

Mike Fowler, [email protected]

12th July 2013

Overview

XML

  XML Primer

  History of XML in PostgreSQL

  Using PostgreSQL's XML features

JSON

  JSON Primer

  History of JSON in PostgreSQL

  Using PostgreSQL's JSON features

About Me

Been using PostgreSQL for ~10 years

Contributed some XML support

  XMLEXISTS/xpath_exists()

  xml_is_well_formed()

Buildfarm member piapiac

  Amazon EC2 based build for JDBC driver

  Has led to a number of bugfix patches for JDBC

  http://www.pgbuildfarm.org/cgi-bin/show_status.pl?member=piapiac

Reasons to store XML/JSON

Client application uses it

  Configuration

  Serialised objects

Data format/schema is highly complex/variable

You just don't care about the data!

  Audit data

  Application log files

Primer: XML

eXtensible Markup Language

Human readable data interchange & serialisation format

Consists of a root element containing a mix of child elements and text content with any element having optional attributes

text based content
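
The structure described above can also be built inside PostgreSQL. As a minimal sketch, the SQL/XML constructors xmlelement and xmlattributes are used here; the element names (book, title), the lang attribute and the values are illustrative assumptions, not data from the talk.

```sql
-- Build a root element with one attribute, one child element and some text.
-- Assumes a server built with libxml support.
SELECT xmlelement(
         name book,                          -- root element
         xmlattributes('en' AS lang),        -- optional attribute on the root
         xmlelement(name title, 'Manual'),   -- child element with text content
         'text based content'                -- text content directly under the root
       );
-- <book lang="en"><title>Manual</title>text based content</book>
```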

XML Support

ANSI and ISO standards exist

  Introduced in SQL/XML 2003

  Augmented in SQL/XML 2006

Prior to 8.3 XML support was a contrib module

Added to core in 8.3 but remains a compile time option enabled with:

configure --with-libxml

XML Support

xml datatype (internally stored as text)

Keywords from the standards

  DOCUMENT, CONTENT

  XMLPARSE, XMLSERIALIZE ...

Predicates, also from the standards

  IS [NOT] DOCUMENT

  XMLEXISTS (9.1)

A number of support functions, not standard

  xmlconcat(), xpath() ...

XML: The Hard Way

Using standard SQL, XML is inserted:

INSERT INTO demo (data) VALUES (XMLPARSE (DOCUMENT '<book><title>Manual</title><chapter>...</chapter></book>'))

To retrieve using standard SQL:

SELECT XMLSERIALIZE(DOCUMENT data AS text) FROM ...
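
Put together, the standard-SQL round trip might look like the sketch below. The table definition is an assumption: the slides only show that demo has an xml column called data. A temporary table inside a transaction is used so the example cleans up after itself.

```sql
BEGIN;

-- Hypothetical table layout; only the xml column "data" appears in the talk.
CREATE TEMP TABLE demo (
    id   serial PRIMARY KEY,
    data xml
) ON COMMIT DROP;

-- Standard SQL/XML: parse text into an xml value on the way in ...
INSERT INTO demo (data)
VALUES (XMLPARSE(DOCUMENT '<book><title>Manual</title><chapter>...</chapter></book>'));

-- ... and serialise it back to text on the way out.
SELECT id, XMLSERIALIZE(DOCUMENT data AS text) AS data_text
FROM demo;

COMMIT;
```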

XML: The Easy Way

It's a normal datatype, use normal casting!

INSERT (::xml is optional as text will implicitly cast to xml even in 9.3)


pgday=# INSERT INTO demo (data) VALUES ('<book>
pgday-# <title>Manual</title>
pgday-# <chapter>...</chapter></book>'::xml);
INSERT 0 1

XML: The Easy Way

SELECT (::text is optional as far as rendering in the psql client is concerned)

pgday=# SELECT data::text FROM demo WHERE ...
                           data
----------------------------------------------------------
 <book><title>Manual</title><chapter>...</chapter></book>
(1 row)
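
To compare the two styles side by side, the following sketch runs the standard functions and the plain casts on the same literal. The column aliases are illustrative; both columns should show the same text.

```sql
SELECT XMLSERIALIZE(DOCUMENT
         XMLPARSE(DOCUMENT '<book><title>Manual</title></book>')
         AS text)                                        AS standard_way,
       ('<book><title>Manual</title></book>'::xml)::text AS cast_way;
```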

XML: xmloption

When casting without XMLPARSE or XMLSERIALIZE the choice of DOCUMENT or CONTENT is determined by the value of the 'XML option' session variable

SET XML OPTION { DOCUMENT | CONTENT }

SET xmloption TO { DOCUMENT | CONTENT }
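
A short sketch of how the setting changes what a plain cast accepts. The literals are illustrative; CONTENT is PostgreSQL's default.

```sql
-- CONTENT accepts fragments as well as complete documents.
SET xmloption TO CONTENT;
SELECT 'Manual<chapter>...</chapter>'::xml;

-- DOCUMENT requires a single root element.
SET XML OPTION DOCUMENT;
SELECT '<book><title>Manual</title></book>'::xml;

-- Under DOCUMENT the fragment above would be rejected with an error:
-- SELECT 'Manual<chapter>...</chapter>'::xml;

-- Restore the session default.
RESET xmloption;
```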

XML: Predicates

IS DOCUMENT / IS NOT DOCUMENT

  Use to filter between DOCUMENT and CONTENT

Only works with data that is already cast as XML

pgday=# SELECT XMLSERIALIZE(DOCUMENT data AS text) FROM demo
pgday-# WHERE data IS DOCUMENT;

pgday=# SELECT XMLSERIALIZE(CONTENT data AS text) FROM demo
pgday-# WHERE data IS NOT DOCUMENT;
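
The predicate can be tried without a table. In this sketch the literals and column aliases are illustrative; XMLPARSE is used because the predicate needs a value that is already xml.

```sql
SELECT XMLPARSE(DOCUMENT '<book><title>Manual</title></book>')
           IS DOCUMENT     AS doc_is_document,        -- true
       XMLPARSE(CONTENT 'Manual<chapter>...</chapter>')
           IS DOCUMENT     AS fragment_is_document,   -- false
       XMLPARSE(CONTENT 'Manual<chapter>...</chapter>')
           IS NOT DOCUMENT AS fragment_is_content;    -- true
```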

XML: Predicates

xml_is_well_formed()

  Introduced in 9.1 and takes text as a parameter

  Sensitive to XMLOPTION

SET xmloption TO DOCUMENT;

Also xml_is_well_formed_document() and xml_is_well_formed_content()
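
A sketch of how the three functions respond to the same input, using illustrative literals. Only xml_is_well_formed() follows the session's xmloption.

```sql
SET xmloption TO DOCUMENT;
SELECT xml_is_well_formed('<abc/>');        -- true: a complete document
SELECT xml_is_well_formed('abc');           -- false: plain text is not a document

SET xmloption TO CONTENT;
SELECT xml_is_well_formed('abc');           -- true: plain text is valid content

-- These two ignore xmloption.
SELECT xml_is_well_formed_document('abc');  -- false
SELECT xml_is_well_formed_content('abc');   -- true

RESET xmloption;
```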

XML: xpath()

xpath(xpath, xml [, nsarray])

Allows you to extract elements and text

Supports namespaces

Returns an array of XML

Also returns empty arrays when there is no match

SELECT xpath('//title/text()',data) FROM ...
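
The behaviours listed above can be sketched on literal data. The inline demo rows, the author path and the example.com namespace are illustrative assumptions.

```sql
WITH demo(data) AS (
    VALUES ('<book><title>Manual</title><chapter>...</chapter></book>'::xml)
)
SELECT xpath('//title/text()', data)            AS titles,       -- {Manual}
       (xpath('//title/text()', data))[1]::text AS first_title,  -- Manual
       xpath('//author/text()', data)           AS no_match      -- {}
FROM demo;

-- Namespaced documents need the optional third argument:
-- an array of {alias, namespace URI} pairs.
SELECT xpath('/my:a/text()',
             '<my:a xmlns:my="http://example.com">test</my:a>'::xml,
             ARRAY[ARRAY['my', 'http://example.com']]);           -- {test}
```

Because the result is an array of xml, subscript it and cast to text when a single scalar value is wanted.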

XML: XMLEXISTS

XMLEXISTS(text PASSING [BY REF] xml [BY REF])

From the standard, useful for predicates

First parameter is an xpath expression

SELECT xpath('//title/text()',data) FROM demo WHERE XMLEXISTS('//title' PASSING data);

XML: xpath_exists()

xpath_exists(xpath, xml [, nsarray])

Serves the same purpose as XMLEXISTS but:

  Supports namespaces

  Syntax is simpler

SELECT xpath('//title/text()',data) FROM demo WHERE xpath_exists('//title', data);
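
A sketch comparing the two predicates on the same rows. The inline rows and the example.com namespace are illustrative assumptions.

```sql
WITH demo(data) AS (
    VALUES ('<book><title>Manual</title></book>'::xml),
           ('<note>no title here</note>'::xml)
)
SELECT data,
       XMLEXISTS('//title' PASSING BY REF data) AS via_xmlexists,
       xpath_exists('//title', data)            AS via_xpath_exists
FROM demo;

-- Only xpath_exists() accepts a namespace mapping.
SELECT xpath_exists('/my:a/text()',
                    '<my:a xmlns:my="http://example.com">test</my:a>'::xml,
                    ARRAY[ARRAY['my', 'http://example.com']]);   -- true
```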

Primer: JSON

JavaScript Object Notation

Also a human readable data interchange & serialisation format

{ } denote objects containing a comma separated list of name value pairs where values can be nested objects or arrays

{"name":"value","nestedObject":{"array":["value1","value2"]},"numeric":13}

JSON Support

Not a SQL standard

PostgreSQL is currently the only RDBMS with native support

Design borrows heavily from PostgreSQL's existing XML support

  Internal storage format is TEXT

  Basic validation (e.g. well formed)

Does not require special libraries to enable

JSON Support

Basic support introduced in 9.2

  JSON datatype

  2 support functions: row_to_json(), array_to_json()

Operators and additional functions in 9.3

  ->, ->>, #>, #>>

  10 more support functions

JSON Support

INSERT always ensures type validity

pgday=# INSERT INTO demo VALUES
pgday-# ('{"name":"Mike","hungry":true}');
INSERT 0 1

pgday=# INSERT INTO demo VALUES
pgday-# ('{"name":"Mike",hungry:true}');
ERROR: invalid input syntax for type json at character 28

JSON: Operators

Retrieve the value of an attribute

SELECT data->'name' AS name FROM demo;

Use the value of an attribute in a predicate

SELECT data FROM demo WHERE
data->>'name' = 'Mike';
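
All four 9.3 operators can be sketched in one query. The inline row and the nestedObject key are illustrative assumptions; -> and #> return json, while ->> and #>> return text.

```sql
WITH demo(data) AS (
    VALUES ('{"name":"Mike","hungry":true,"nestedObject":{"array":["value1","value2"]}}'::json)
)
SELECT data->'name'                    AS name_json,   -- json: "Mike"
       data->>'name'                   AS name_text,   -- text: Mike
       data#>'{nestedObject,array}'    AS array_json,  -- json: ["value1","value2"]
       data#>>'{nestedObject,array,0}' AS first_item   -- text: value1
FROM demo;
```

The text-returning forms are the ones to use in predicates, since the json type has no equality operator.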

JSON: Functions

row_to_json(record [, pretty_bool])

Returns a result set where each row is a JSON object

pgday=# SELECT * FROM demo;

 username | posts |       email
----------+-------+-------------------
 mlfowler |   121 | [email protected]
 fowlerm  |     9 | [email protected]

pgday=# SELECT row_to_json(demo) FROM demo;

                           row_to_json
-----------------------------------------------------------------
 {"username":"mlfowler","posts":121,"email":"[email protected]"}
 {"username":"fowlerm","posts":9,"email":"[email protected]"}
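
A self-contained sketch of the two 9.2 functions. The temporary table mirrors the demo table above, minus the email column, whose values are not available here. Combining array_agg() with row_to_json() is one common way to get a single JSON array; it is an illustration rather than something shown in the talk.

```sql
BEGIN;

CREATE TEMP TABLE demo (username text, posts integer) ON COMMIT DROP;
INSERT INTO demo VALUES ('mlfowler', 121), ('fowlerm', 9);

-- One JSON object per row; the optional second argument pretty-prints it.
SELECT row_to_json(demo, true) FROM demo;

-- Collapse the whole result set into a single JSON array of objects.
SELECT array_to_json(array_agg(row_to_json(demo))) AS all_rows FROM demo;

-- array_to_json() also works on plain SQL arrays.
SELECT array_to_json(ARRAY[1, 2, 3]);   -- [1,2,3]

COMMIT;
```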

JSON: Functions

json_object_keys(json)
Returns all the keys in the top level

pgday=# SELECT json_object_keys(data) FROM demo;
 json_object_keys
------------------
 name
 hungry
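
json_object_keys() is one of the set-returning functions added in 9.3. The sketch below runs it on a literal and shows two related 9.3 functions, json_each_text() and json_array_length(); the literals are illustrative.

```sql
-- One row per top-level key.
SELECT json_object_keys('{"name":"Mike","hungry":true}'::json);   -- name, hungry

-- Keys together with their values rendered as text.
SELECT key, value
FROM json_each_text('{"name":"Mike","hungry":true}'::json);       -- (name, Mike), (hungry, true)

-- Number of elements in a JSON array.
SELECT json_array_length('["value1","value2"]'::json);            -- 2
```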

Caveats

They are not indexable

  You will need to plan how best to retrieve

Best used programmatically

  Syntax of XML and JSON can be unwieldy

Non-UTF8 encoded databases

Not portable
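
The xml and json columns themselves cannot be indexed directly, but one possible way to plan retrieval is an expression index over an extracted text value. This is a sketch, not something from the talk: the table layout and index name are assumptions, and it presumes the extraction operator is allowed in an index expression on your PostgreSQL version.

```sql
BEGIN;

CREATE TEMP TABLE demo (data json) ON COMMIT DROP;
INSERT INTO demo VALUES ('{"name":"Mike","hungry":true}');

-- Index the extracted text value rather than the json column itself.
CREATE INDEX demo_name_idx ON demo ((data->>'name'));

-- Predicates written with the same expression are candidates for the index.
SELECT data FROM demo WHERE data->>'name' = 'Mike';

COMMIT;
```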

Summary

Using the PostgreSQL XML and JSON datatypes gives you finer control over an otherwise free-format field

PostgreSQL's non-standard XML support is much easier to use than the standard syntax

JSON is still being very actively developed

Thank you!




Mike Fowler
[email protected]
