PGDay UK 2013
Handling XML and JSON in the Database
Mike Fowler, [email protected]
12th July 2013
Overview
XML
  XML Primer
  History of XML in PostgreSQL
  Using PostgreSQL's XML features
JSON
  JSON Primer
  History of JSON in PostgreSQL
  Using PostgreSQL's JSON features
About Me
Been using PostgreSQL for ~10 years
Contributed some XML support
  XMLEXISTS/xpath_exists()
  xml_is_well_formed()
Buildfarm member piapiac
  Amazon EC2 based build for JDBC driver
  Has led to a number of bugfix patches for JDBC
  http://www.pgbuildfarm.org/cgi-bin/show_status.pl?member=piapiac
Reasons to store XML/JSON
Client application uses it
  Configuration
  Serialised objects
Data format/schema is highly complex/variable
You just don't care about the data!
  Audit data
  Application log files
Primer: XML
eXtensible Markup Language
Human readable data interchange & serialisation format
Consists of a root element containing a mix of child elements and text content with any element having optional attributes
<root><child attribute="value">text based content</child></root>
XML Support
ANSI and ISO standards exist
  Introduced in SQL/XML 2003
  Augmented in SQL/XML 2006
Prior to 8.3 XML support was a contrib module
Added to core in 8.3 but remains a compile-time option, enabled with:
configure --with-libxml
XML Support
xml datatype (internally stored as text)
Keywords from the standards
  DOCUMENT, CONTENT
  XMLPARSE, XMLSERIALIZE ...
Predicates, also from the standards
  IS [NOT] DOCUMENT
  XMLEXISTS (9.1)
A number of support functions, not standard
  xmlconcat(), xpath() ...
XML: The Hard Way
Using standard SQL, XML is inserted:
INSERT INTO demo (data) VALUES (XMLPARSE (DOCUMENT
'<book><title>Manual</title><chapter>...</chapter></book>'));
To retrieve using standard SQL:
SELECT XMLSERIALIZE(DOCUMENT data AS text) FROM ...
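A minimal round trip through the standard syntax might look like the sketch below. The table definition is an assumption (the slides only show a table called demo with a data column), and the document content is sample data.

```sql
-- Hypothetical table definition; only the name "demo" and the "data"
-- column appear in the slides. Requires a build with libxml support.
CREATE TEMP TABLE demo (id serial PRIMARY KEY, data xml);

-- Standard SQL/XML: parse text into the xml type on the way in...
INSERT INTO demo (data)
VALUES (XMLPARSE (DOCUMENT '<book><title>Manual</title></book>'));

-- ...and serialise it back to text on the way out.
SELECT XMLSERIALIZE(DOCUMENT data AS text) FROM demo WHERE id = 1;
```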
XML: The Easy Way
It's a normal datatype, use normal casting!
INSERT (::xml is optional as text will implicitly cast to xml
even in 9.3)
pgday=# INSERT INTO demo (data) VALUES ('<book>
pgday-# <title>Manual</title>
pgday-# <chapter>...</chapter></book>'::xml);
INSERT 0 1
XML: The Easy Way
SELECT (::text is optional as far as rendering in the psql client is concerned)
pgday=# SELECT data::text FROM demo WHERE ...
                           data
----------------------------------------------------------
 <book><title>Manual</title><chapter>...</chapter></book>
(1 row)
XML: xmloption
When casting without XMLPARSE or XMLSERIALIZE the choice of
DOCUMENT or CONTENT is determined by the value of the 'XML option'
session variable
SET XML OPTION { DOCUMENT | CONTENT}
SET xmloption TO { DOCUMENT | CONTENT}
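As a rough illustration of how the setting changes a plain cast (the sample strings are made up for this sketch):

```sql
-- CONTENT is the default: fragments and mixed text are accepted.
SET xmloption TO CONTENT;
SELECT 'Manual <b>draft</b>'::xml;

-- DOCUMENT: the value must be a complete document with a single root element.
SET xmloption TO DOCUMENT;
SELECT '<book><title>Manual</title></book>'::xml;

-- Under DOCUMENT this cast raises an error, because the text is only a fragment.
SELECT 'Manual <b>draft</b>'::xml;
```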
XML: Predicates
IS DOCUMENT / IS NOT DOCUMENT
  Use to filter between DOCUMENT and CONTENT
  Only works with data that is already cast as XML
pgday=# SELECT XMLSERIALIZE(DOCUMENT data AS text) FROM demo
pgday-# WHERE data IS DOCUMENT;
pgday=# SELECT XMLSERIALIZE(CONTENT data AS text) FROM demo
pgday-# WHERE data IS NOT DOCUMENT;
XML: Predicates
xml_is_well_formed()
  Introduced in 9.1 and takes text as a parameter
  Sensitive to XMLOPTION
    SET xmloption TO DOCUMENT;
  Also xml_is_well_formed_document() and xml_is_well_formed_content()
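A short sketch of how the three functions differ; the input strings are sample data chosen for illustration.

```sql
SET xmloption TO DOCUMENT;
SELECT xml_is_well_formed('<title>Manual</title>');        -- t
SELECT xml_is_well_formed('Manual <b>draft</b>');          -- f (fragment, not a document)

SET xmloption TO CONTENT;
SELECT xml_is_well_formed('Manual <b>draft</b>');          -- t (valid as content)
SELECT xml_is_well_formed('<title>Manual');                -- f (unclosed tag)

-- These two variants ignore xmloption.
SELECT xml_is_well_formed_document('Manual <b>draft</b>'); -- f
SELECT xml_is_well_formed_content('Manual <b>draft</b>');  -- t
```

Because the functions take text, they can be used to screen input before casting it to xml.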
XML: xpath()
xpath(xpath, xml [, nsarray])
Allows you to extract elements and text
Supports namespaces
Returns an array of XML
Also returns empty arrays when there is no match
SELECT xpath('//title/text()',data) FROM ...
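The namespace array is a two-dimensional text array of (alias, URI) pairs. A small sketch, with an example namespace URI that is not from the slides:

```sql
-- The alias used in the XPath expression ('my') is mapped to the URI
-- through the third argument; it need not match the prefix in the document.
SELECT xpath('/my:a/text()',
             '<my:a xmlns:my="http://example.com">test</my:a>',
             ARRAY[ARRAY['my', 'http://example.com']]);
-- {test}

-- No match gives an empty array rather than NULL.
SELECT xpath('//missing', '<a>test</a>');
-- {}
```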
XML: XMLEXISTS
XMLEXISTS(text PASSING [BY REF] xml [BY REF])
From the standard, useful for predicates
First parameter is an xpath expression
SELECT xpath('//title/text()',data) FROM demo WHERE XMLEXISTS('//title' PASSING data);
XML: xpath_exists()
xpath_exists(xpath, xml [, nsarray])
Serves the same purpose as XMLEXISTS but:
  Supports namespaces
  Syntax is simpler
SELECT xpath('//title/text()',data) FROM demo WHERE
xpath_exists('//title', data);
Primer: JSON
JavaScript Object Notation
Also a human readable data interchange & serialisation format
{ } denote objects containing a comma separated list of name value pairs where values can be nested objects or arrays
{"name":"value","nestedObject":{"array":["value1","value2"]},"numeric":13}
JSON Support
Not a SQL standard
PostgreSQL is currently the only RDBMS with native support
Design borrows heavily from PostgreSQL's existing XML support
  Internal storage format is TEXT
  Basic validation (e.g. well formed)
Does not require special libraries to enable
JSON Support
Basic support introduced in 9.2
  JSON datatype
  2 support functions
    row_to_json()
    array_to_json()
Operators and additional functions in 9.3
  ->, ->>, #>, #>>
  10 more support functions
JSON Support
INSERT always ensures type validity
pgday=# INSERT INTO demo VALUES
pgday-# ('{"name":"Mike","hungry":true}');
INSERT 0 1
pgday=# INSERT INTO demo VALUES
pgday-# ('{"name":"Mike",hungry:true}');
ERROR: invalid input syntax for type json at character 28
JSON: Operators
Retrieve the value of an attribute
SELECT data->'name' AS name FROM demo;
Use the value of an attribute in a predicate
SELECT data FROM demo WHERE
data->>'name' = 'Mike';
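The four 9.3 operators side by side, as a sketch over an inline sample value (the tags and address fields are invented for illustration):

```sql
WITH sample(data) AS (
    VALUES ('{"name":"Mike","tags":["xml","json"],"address":{"city":"London"}}'::json)
)
SELECT data->'name'            AS name_json,  -- "Mike"   (json, quotes kept)
       data->>'name'           AS name_text,  -- Mike     (text)
       data->'tags'->>0        AS first_tag,  -- xml      (array element by index)
       data#>'{address,city}'  AS city_json,  -- "London" (json, by path)
       data#>>'{address,city}' AS city_text   -- London   (text, by path)
FROM sample;
```

The text-returning forms (->> and #>>) are the ones to use in predicates, since the json type has no equality operator.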
JSON: Functions
row_to_json(record [, pretty_bool])
Returns a result set where each row is a JSON object
pgday=# SELECT * FROM demo;
 username | posts |       email
----------+-------+-------------------
 mlfowler |   121 | [email protected]
 fowlerm  |     9 | [email protected]
pgday=# SELECT row_to_json(demo) FROM demo;
                           row_to_json
-----------------------------------------------------------------
 {"username":"mlfowler","posts":121,"email":"[email protected]"}
 {"username":"fowlerm","posts":9,"email":"[email protected]"}
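The optional pretty_bool argument and the companion array_to_json() can be tried without a table. A sketch using an inline subquery with values borrowed from the example above:

```sql
-- With pretty_bool = true, line feeds are added between top-level fields.
SELECT row_to_json(t, true)
FROM (SELECT 'mlfowler'::text AS username, 121 AS posts) AS t;

-- array_to_json() does the same job for arrays.
SELECT array_to_json(ARRAY[1, 2, 3]);
-- [1,2,3]
```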
JSON: Functions
json_object_keys(json)
Returns all the keys in the top level
pgday=# SELECT json_object_keys(data) FROM demo;
 json_object_keys
------------------
 name
 hungry
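Two more of the 9.3 functions, shown here as a sketch on literal values rather than the demo table:

```sql
-- json_each_text() expands the top-level object into key/value rows,
-- with every value returned as text.
SELECT * FROM json_each_text('{"name":"Mike","hungry":true}');
--   key   | value
-- --------+-------
--  name   | Mike
--  hungry | true

-- json_array_length() counts the elements of the outermost array.
SELECT json_array_length('["xml","json"]');
-- 2
```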
Caveats
They are not indexable
  You will need to plan how best to retrieve
Best used programmatically
  Syntax of XML and JSON can be unwieldy
Non-UTF8 encoded databases
Not portable
Summary
Using the PostgreSQL XML and JSON datatypes allows you finer control of an otherwise free format field
PostgreSQL's non-standard XML support is much easier to use than the standard syntax
JSON is still being very actively developed
Thank you!
Mike Fowler
[email protected]