Web Services and Data Integration Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems March 16, 2022 Some slides by Berthier Ribeiro-Neto


Web Services and Data Integration

Zachary G. Ives
University of Pennsylvania

CIS 455 / 555 – Internet and Web Systems

April 19, 2023

Some slides by Berthier Ribeiro-Neto

2

Reminders & Announcements

Assignment 3 now officially released

Midterm next Wednesday, 3/31

3

How Do We Declare Functions?

WSDL is the interface definition language for web services
  Defines notions of protocol bindings, ports, and services
  Generally describes data types using XML Schema

In CORBA, this was called an IDL

In Java, the interface uses the same language as the Java code

4

A WSDL Service

[Figure: a Service exposes several Ports; each Port is associated with a PortType (a group of Operations) and a Binding.]

5

Web Service Terminology

Service: the entire Web Service

Port: maps a set of port types to a transport binding (a protocol, frequently SOAP, COM, CORBA, …)

Port Type: abstract grouping of operations, i.e. a class

Operation: the type of operation – request/response, one-way
  Input message and output message; maybe also fault message

Types: the XML Schema type definitions

6

Example WSDL

<service name="POService">
  <port binding="my:POBinding">
    <soap:address location="http://yyy:9000/POSvc"/>
  </port>
</service>

<binding xmlns:my="…" name="POBinding">
  <soap:binding style="rpc" transport="http://www.w3.org/2001/..." />
  <operation name="POrder">
    <soap:operation soapAction="POService/POBinding" style="rpc" />
    <input name="POrder">
      <soap:body use="literal" … namespace="POService" …/>
    </input>
    <output name="POrderResult">
      <soap:body use="literal" … namespace="POService" …/>
    </output>
  </operation>
</binding>
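To make the service / port / binding / operation relationship concrete, here is a minimal Python sketch that walks a WSDL document and lists, for each port, its binding, address, and operations. It is only an illustration under stated assumptions: the fragment is a simplified stand-in for the slide's example, wrapped in a `<definitions>` root; the WSDL 1.1 and SOAP-binding namespace URIs are filled in because the slide elides them; and the `urn:example:po` namespace and the `POPort` port name are hypothetical.

```python
import xml.etree.ElementTree as ET

# WSDL 1.1 and SOAP-binding namespaces (assumed; the slide elides them).
NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# Simplified stand-in for the slide's fragment. "urn:example:po" and the
# port name "POPort" are hypothetical.
WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
             xmlns:my="urn:example:po">
  <binding name="POBinding">
    <operation name="POrder">
      <soap:operation soapAction="POService/POBinding" style="rpc"/>
    </operation>
  </binding>
  <service name="POService">
    <port name="POPort" binding="my:POBinding">
      <soap:address location="http://yyy:9000/POSvc"/>
    </port>
  </service>
</definitions>
"""

def describe(wsdl_text):
    """Return one (service, binding, address, operations) tuple per port."""
    root = ET.fromstring(wsdl_text)
    # Index the operation names declared under each binding.
    ops = {
        b.get("name"): [o.get("name") for o in b.findall("wsdl:operation", NS)]
        for b in root.findall("wsdl:binding", NS)
    }
    rows = []
    for svc in root.findall("wsdl:service", NS):
        for port in svc.findall("wsdl:port", NS):
            binding = port.get("binding").split(":")[-1]  # drop the prefix
            address = port.find("soap:address", NS).get("location")
            rows.append((svc.get("name"), binding, address, ops.get(binding, [])))
    return rows

print(describe(WSDL))
```

The sketch resolves the port's `binding` attribute by local name only; a real tool would resolve the prefix against its namespace declaration.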

7

JAX-RPC: Java and Web Services

To write a JAX-RPC web service “endpoint”, you need two parts:
  An endpoint interface – this is basically like the IDL statement
  An implementation class – your actual code

public interface BookQuote extends java.rmi.Remote {
  public float getBookPrice(String isbn) throws java.rmi.RemoteException;
}

public class BookQuote_Impl_1 implements BookQuote {
  // Placeholder implementation: always returns the same price
  public float getBookPrice(String isbn) { return 3.22f; }
}

8

Different Options for Calling

The conventional approach is to generate a stub, as in the RPC model described earlier

You can also dynamically generate the call to the remote interface, e.g., by looking up an interesting function to call

Finally, the “DII” (Dynamic Invocation Interface) method allows you to assemble the SOAP call on your own
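To illustrate what "assembling the SOAP call on your own" amounts to, the following Python sketch builds a SOAP 1.1 request envelope by hand for the `getBookPrice` operation. It is not JAX-RPC itself, just the same idea in miniature. The service namespace `urn:example:bookquote` and the ISBN value are hypothetical, and nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:bookquote"  # hypothetical service namespace

def build_soap_request(operation, params):
    """Assemble an RPC-style SOAP 1.1 envelope and return it as a string."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The element named after the operation carries the call.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        # Each parameter becomes an unqualified child element of the call.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")

request = build_soap_request("getBookPrice", {"isbn": "0123456789"})
print(request)
```

A generated stub hides exactly this step: it builds the envelope, posts it to the port's address, and unpacks the response.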

9

Creating a Java Web Service

A compiler called wscompile is used to generate your WSDL file and stubs
  You need to start with a configuration file that says something about the service you’re building and the interfaces that you’re converting into Web Services

10

Example Configuration File

<?xml version="1.0" encoding="UTF-8"?>
<configuration xmlns="http://java.sun.com/xml/ns/jax-rpc/ri/config">
  <service name="StockQuote"
           targetNamespace="http://example.com/stockquote.wsdl"
           typeNamespace="http://example.com/stockquote/types"
           packageName="stockqt">
    <interface name="stockqt.StockQuoteProvider"
               servantName="stockqt.StockQuoteServiceImpl"/>
  </service>
</configuration>

11

Starting a WAR

The Web Service version of a Java JAR file is a Web Archive, WAR

There’s a tool called wsdeploy that generates WAR files

Generally this will automatically be called from a build tool such as Ant

Finally, you may need to add the WAR file to the appropriate location in Apache Tomcat (or WebSphere, etc.) and enable it

See http://java.sun.com/developer/technicalArticles/WebServices/WSPack2/jaxrpc.html for a detailed example

12

Finding a Web Service

UDDI: Universal Description, Discovery, and Integration registry

Think of it as DNS for web services
  It’s a replicated database, hosted by IBM, HP, SAP, MS

UDDI takes SOAP requests to add and query web service interface data

13

What’s in UDDI

White pages:
  Information about business names, contact info, Web site name, etc.

Yellow pages:
  Types of businesses, locations, products
  Includes predefined taxonomies for location, industry, etc.

Green pages – what we probably care the most about:
  How to interact with business services; business process definitions; etc.
  Pointer to WSDL file(s)
  Unique ID for each service

14

Data Types in UDDI

businessEntity: top-level structure describing info about the business

businessService: name and description of a service

bindingTemplate: how to access the service

tModel (t = type/technical): unique identifier for each service-template specification

publisherAssertion: describes relationship between businessEntities (e.g., department, division)
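One way to picture how these structures nest is as plain data classes. The sketch below is a simplified Python model, not the UDDI API: the field names, the `find_access_points` helper, and the `uuid:example-po` key are hypothetical, and publisherAssertion is left out. The sample values reuse names from the slides.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TModel:
    key: str   # unique identifier for a service-template specification
    name: str

@dataclass
class BindingTemplate:
    access_point: str  # how to reach the service
    tmodel_keys: List[str] = field(default_factory=list)

@dataclass
class BusinessService:
    name: str
    description: str
    bindings: List[BindingTemplate] = field(default_factory=list)

@dataclass
class BusinessEntity:
    business_key: str
    name: str
    services: List[BusinessService] = field(default_factory=list)

def find_access_points(entities, tmodel_key):
    """Return (business, service, access point) for bindings citing tmodel_key."""
    return [
        (e.name, s.name, b.access_point)
        for e in entities
        for s in e.services
        for b in s.bindings
        if tmodel_key in b.tmodel_keys
    ]

# Hypothetical registry content.
po_spec = TModel(key="uuid:example-po", name="Purchase order interface")
registry = [
    BusinessEntity("0123", "My Books", [
        BusinessService("POService", "Accepts purchase orders", [
            BindingTemplate("http://yyy:9000/POSvc", [po_spec.key]),
        ]),
    ]),
]
print(find_access_points(registry, po_spec.key))
```

The lookup goes from a tModel key to the services that implement it, which is the kind of "who supports this specification" query the registry is meant to answer.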

15

Relationships between UDDI Structures

[Diagram of cardinalities:
  publisherAssertion n : 2 businessEntity
  businessEntity 1 : n businessService
  businessService 1 : n bindingTemplate
  bindingTemplate m : n tModel]

16

Example UDDI businessEntity

<businessEntity businessKey="0123…" xmlns="urn:uddi-org:api_v2">
  <discoveryURLs>
    <discoveryURL useType="businessEntity">
      http://uddi.ibm.com/registery/uddiget?businessKey=0123 ...
    </discoveryURL>
  </discoveryURLs>
  <name>My Books</name>
  <description>Technical Book Wholesaler</description>
  …
  <businessServices>
    …
  </businessServices>
  <identifierBag>
    <!-- keyedReferences to tModels -->
  </identifierBag>
  <categoryBag> … </categoryBag>
</businessEntity>

17

UDDI in Perspective

Original idea was that it would just organize itself in a way that people could find anything they wanted

Today UDDI is basically a very simple catalog of services, which can be queried with standard APIs
  It’s not clear that it does what people really want: they want to find services “like Y” or “that do Z”

18

The Problem

There’s no universal, unambiguous way of describing “what I mean”
  Relational database idea of “normalization” doesn’t convert concepts into some normal form – it just helps us cluster our concepts in meaningful ways

“Knowledge representation” tries to encode definitions clearly – but even then, much is up to interpretation

The best we can do: describe how things relate

19

This Brings Us to XQuery, Whose Main Role Is to Relate XML

Suppose we define an XML schema for our target data and our source data

XQuery allows us to define mappings from input XPath matches to output trees

Can directly translate between XML schemas or structures
  Describes a relationship between two items
    Transform 2 into 6 by “add 4” operation
    Convert from S1 to S2 by applying the query described by view V

Often, we don’t need to transfer all data – instead, we want to use the data at one source to help answer a query over another source…

20

Let’s Look at Some Simple Mappings

Beginning with examples of using XQuery to convert from one schema to another, e.g., to import data

First: let’s review what our XQuery mappings need to accomplish…

21

Challenges of Mapping Schemas

In a perfect world, it would be easy to match up items from one schema with another
  Each element would have a simple correspondence to an element in the other schema
  Every value would clearly map to a value in the other schema

Real world: as with human languages, things don’t map clearly!
  Different decompositions into elements
  Different structures
    Tag name vs. value
  Values may not exactly correspond
  It may be unclear whether a value is the same

It’s a tough job, but often things can be mapped

22

Example Schemas

Bob’s Movie Database
<movie>
  <title>…</title>
  <year>…</year>
  <director>…</director>
  <editor>…</editor>
  <star>…</star>*
</movie>*

Mary’s Art List
<workOfArt>
  <id>…</id>
  <type>…</type>
  <artist>…</artist>
  <subject>…</subject>
  <title>…</title>
</workOfArt>*

Want to map data from one schema to the other

23

Mapping Bob’s Movies → Mary’s Art

Start with the schema of the output as a template:
<workOfArt>
  <id>$i</id>
  <type>$y</type>
  <artist>$a</artist>
  <subject>$s</subject>
  <title>$t</title>
</workOfArt>

Then figure out where to find the values in the source, and create XPaths

24

The Final Schema Mapping

Mary’s Art ← Bob’s Movies

for $m in doc("movie.xml")//movie,
    $a in $m/director/text(),
    $i in $m/title/text(),
    $t in $m/title/text()
return <workOfArt>
         <id>{ $i }</id>
         <type>movie</type>
         <artist>{ $a }</artist>
         <title>{ $t }</title>
       </workOfArt>

Note the absence of subject… We had no reasonable source, so we are leaving it out.
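For readers without an XQuery engine at hand, the same mapping can be sketched in Python with the standard library. This is an illustration rather than a substitute for the query: the sample movie data and the `artList` wrapper element are made up, and the sketch assumes each movie has at most one title and one director (the XQuery would instead produce one result per combination).

```python
import xml.etree.ElementTree as ET

# Made-up sample data in the shape of Bob's movie schema.
MOVIES = """
<movies>
  <movie>
    <title>Example Movie</title>
    <year>2001</year>
    <director>A. Director</director>
    <editor>An Editor</editor>
    <star>First Star</star>
    <star>Second Star</star>
  </movie>
  <movie>
    <title>Untitled Without Director</title>
  </movie>
</movies>
"""

def movies_to_art(movies_xml):
    """Map each <movie> to a <workOfArt>, mirroring the for/return query."""
    out = ET.Element("artList")  # hypothetical wrapper for the results
    for movie in ET.fromstring(movies_xml).iter("movie"):
        title = movie.findtext("title")
        director = movie.findtext("director")
        if not title or not director:
            # Like the XQuery 'for', emit nothing when a binding is empty.
            continue
        art = ET.SubElement(out, "workOfArt")
        ET.SubElement(art, "id").text = title       # title doubles as the id
        ET.SubElement(art, "type").text = "movie"   # constant, as in the query
        ET.SubElement(art, "artist").text = director
        ET.SubElement(art, "title").text = title
        # No <subject>: the source schema has nothing to map it from.
    return out

art_list = movies_to_art(MOVIES)
print(ET.tostring(art_list, encoding="unicode"))
```

The second sample movie is skipped because it has no director, which matches the behavior of the query's bindings.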

25

Mapping Values

Sometimes two schemas use different representations for the same thing
  ID vs. SSN
  English vs. Hungarian

We typically use an intermediate table defining correspondences – a “concordance table”
  It can be generated automatically, and then corrected by hand (since there will often be exceptions)

26

An Example Value Mapping Problem

Penn student enrollment DB:
…
<student>
  <pennid>12346</pennid>
  <name>Mary McDonald</name>
  <taking>
    <sem>F03</sem>
    <class>cse330</class>
  </taking>
</student>
<student>
  <pennid>12345</pennid>
  <name>Jon Doh</name>
</student>

Penn dental plan:
<patient>
  <ssn>323-468-1212</ssn>
  <treatment>Dental sealant</treatment>
</patient>

Want to output student names + treatments…


30

Translating Values with a Concordance Table

for $p in doc("student.xml")/db/student,
    $pid in $p/pennid/text(),
    $n in $p/name/text(),
    $d in doc("dental.xml")/db/patient,
    $s in $d/ssn/text(),
    $tr in $d/treatment/text(),
    $m in doc("concord.xml")/db/mapping,
    $f in $m/from/text(),
    $t in $m/to/text()
where ____________________
return <student>
         <name>{ $n }</name>
         <treatment>{ $tr }</treatment>
       </student>

student.xml:
<student>
  <pennid>12346</pennid>
  <name>Mary McDonald</name>
  <taking>
    <sem>F03</sem>
    <class>cse330</class>
  </taking>
</student>

dental.xml:
<patient>
  <ssn>323-468-1212</ssn>
  <treatment>Dental sealant</treatment>
</patient>

concord.xml:
<mapping>
  <from>12346</from>
  <to>323-468-1212</to>
</mapping>

Variable bindings:
$pid: PennID
$n: name
$s: ssn
$tr: treatment
$f: PennID
$t: ssn
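The slide leaves the where clause blank as an exercise. One plausible reading is that the concordance table joins the two sources: a student matches a patient when the student's PennID equals a mapping's from value and that mapping's to value equals the patient's SSN. The Python sketch below implements that reading as an assumption, not as the slide's official answer. It reuses the slide's sample data, wrapped in a `db` root to match the paths in the query, and assumes each PennID maps to at most one SSN.

```python
import xml.etree.ElementTree as ET

STUDENTS = """<db>
  <student><pennid>12346</pennid><name>Mary McDonald</name></student>
  <student><pennid>12345</pennid><name>Jon Doh</name></student>
</db>"""

DENTAL = """<db>
  <patient><ssn>323-468-1212</ssn><treatment>Dental sealant</treatment></patient>
</db>"""

CONCORD = """<db>
  <mapping><from>12346</from><to>323-468-1212</to></mapping>
</db>"""

def names_and_treatments(students_xml, concord_xml, dental_xml):
    """Join students to treatments through the PennID -> SSN concordance."""
    # Concordance table: PennID -> SSN (assumes one SSN per PennID).
    pid_to_ssn = {
        m.findtext("from"): m.findtext("to")
        for m in ET.fromstring(concord_xml).iter("mapping")
    }
    # Group treatments by SSN so each student lookup is a dictionary access.
    treatments = {}
    for patient in ET.fromstring(dental_xml).iter("patient"):
        treatments.setdefault(patient.findtext("ssn"), []).append(
            patient.findtext("treatment"))
    results = []
    for student in ET.fromstring(students_xml).iter("student"):
        ssn = pid_to_ssn.get(student.findtext("pennid"))
        if ssn is None:
            continue  # no concordance entry, so no join result
        for treatment in treatments.get(ssn, []):
            results.append((student.findtext("name"), treatment))
    return results

print(names_and_treatments(STUDENTS, CONCORD, DENTAL))
```

Jon Doh produces no output because the concordance table has no entry for his PennID, which is the same result the three-way join in the query would give.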


32

Summary: Mapping, Integrating, and Sharing Data

Mappings based on XQuery rather than XSLT

Can do point-to-point mappings to exchange data

UDDI versus this approach?

What about search and its relationship to integration?
  In particular, search over Amazon, Google Maps, Google, Yahoo, …