Automating the Use of Web APIs through Lightweight Semantics


ICWE 2011 Tutorial: Automating the Use of Web APIs through Lightweight Semantics, by Maria Maleshkova, Carlos Pedrinaci and Dong Liu


Automating the Use of Web APIs through Lightweight Semantics

The Open University

ICWE 2011, Paphos, Cyprus

Presenters

• Carlos Pedrinaci c.pedrinaci@open.ac.uk

• Maria Maleshkova m.maleshkova@open.ac.uk

• Dong Liu d.liu@open.ac.uk

Acknowledgements

• Guillermo Alvaro (iSOCO)

• Ning Li, Jacek Kopecky, Dave Lambert, John Domingue (OU)

• Reto Krummenacher, University of Innsbruck

• SOA4All Project

• 1 slide by Tom Gruber (Siri/Apple)

Structure of the Tutorial

Morning (Theory)
• Intro
• Background
• Describing Web APIs
• Discovering Web APIs
• Invoking Web APIs

Afternoon (Practice)
• Hands-On Session

Preparation for Hands-On

• The material shown in this session will be the basis for the hands-on session afterwards

• You need an up-to-date version of Firefox

• Tabulator extension for Firefox: http://dig.csail.mit.edu/2007/tab/

Interrupt!

Web Services or Services on the Web?

Web Services

• Large number of standards and implementations
  – WSDL, BPEL, WS-Coordination, WS-Transaction, WS-AtomicTransaction, WS-★

“Despite their name, Web Services have nothing to do with the Web”

Frank Leymann, SSAIE 2009

Web Services on the Web

• The Web currently contains 30 billion Web pages
  – Nearly 100M active sites
  – 10 million new pages added each day

• The Web contains only 28,000 WSDL Web services (Seekda.com)

• Verizon have around 4,000

Geek and Poke

The Ecosystem of APIs and Online Data

Over 3,500 APIs and 5,100 mashups, growing at an accelerated rate...

© Siri (slightly modified)

Web APIs Technologies

• Web APIs are based on a light technology stack ≅ URIs, HTTP, XML/JSON
• Very much aligned with Web technologies
• Some are based on REST principles
  – Resource identification through URIs
  – Uniform interface
  – Self-descriptive messages
  – Stateful interactions through hyperlinks

Challenges with Web APIs

• There is not a widely used IDL
  – Locating services is hard
  – Their use requires human interpretation of semi-structured descriptions
• The semantics of the services are not described in a machine-processable manner
• This prevents automating discovery, invocation, and composition of Web APIs

Tutorial Coverage

• In this tutorial we shall cover existing approaches to Web APIs
  – Description
  – Discovery
  – Invocation

• We shall present an integrated approach based on the use of semantic technologies

Background

Semantic Web Principles

• Lift data available on the Web to a level where machines can manipulate it in “meaningful ways”

• Adding machine “understandable” annotations about Web resources

• All resources are identified by URIs
• Use of Web oriented modeling languages
  – RDF & RDFS
  – OWL (Lite, DL, Full) & OWL2 (EL, QL, RL)

• Use of conceptual models (ontologies, vocabularies)

RDF

• Resource Description Framework (RDF) is the HTML of the Semantic Web
  – Simple way to describe resources on the Web
  – Based on triples <subject, predicate, object>
  – Defines graphs

RDF Example


RDF Representation

• Several serializations are available; the only standard one is RDF/XML

@prefix ex: <http://www.example.org/rdf-example#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:Person a rdfs:Class .
ex:Carlos a ex:Person .
ex:Doc a rdfs:Class .
ex:Slide rdfs:subClassOf ex:Doc .
ex:this a ex:Slide .
ex:this dc:creator ex:Carlos .
...
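As a rough illustration of the triple model, the statements in this example can be held as plain (subject, predicate, object) tuples. The sketch below is not an RDF library: the helper names `superclasses` and `types_of` are hypothetical, and the code only follows rdfs:subClassOf links to show how a schema lets a machine derive that ex:this is also an ex:Doc.

```python
# Triples from the slide's example, as plain tuples (no RDF library involved).
TRIPLES = {
    ("ex:Person", "rdf:type", "rdfs:Class"),
    ("ex:Carlos", "rdf:type", "ex:Person"),
    ("ex:Doc", "rdf:type", "rdfs:Class"),
    ("ex:Slide", "rdfs:subClassOf", "ex:Doc"),
    ("ex:this", "rdf:type", "ex:Slide"),
}


def superclasses(cls, triples):
    """Return cls plus every class reachable via rdfs:subClassOf."""
    found, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                todo.append(o)
    return found


def types_of(resource, triples):
    """Hypothetical helper: asserted types plus those implied by subclassing."""
    result = set()
    for s, p, o in triples:
        if s == resource and p == "rdf:type":
            result |= superclasses(o, triples)
    return result


print(sorted(types_of("ex:this", TRIPLES)))
```

A real deployment would delegate this to an RDFS reasoner; the point here is only that the graph plus the schema carries enough information for the inference.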

Ontology and Rule languages

• RDF Schema (RDFS)
  – A simple ontology language on RDF
• Web Ontology Language (OWL) is a more expressive ontology language than RDFS
  – Layered language based on Description Logics
    • OWL Lite, OWL DL, OWL Full
    • OWL2 (EL, QL, RL)
• Rule languages
  – Rule Interchange Format (RIF)
  – Extend ontology languages with axioms

SPARQL

• Query language for RDF

• Can be used to express queries across diverse data sources

• SPARQL contains capabilities for querying required and optional graph patterns along with their conjunctions and disjunctions

• The results of SPARQL queries can be result sets or RDF graphs

SPARQL Select Example

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name1 ?name2
WHERE {
  ?x foaf:name ?name1 ;
     foaf:mbox ?mbox1 .
  ?y foaf:name ?name2 ;
     foaf:mbox ?mbox2 .
  FILTER (sameTerm(?mbox1, ?mbox2) && !sameTerm(?name1, ?name2))
}

SPARQL Construct Example

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>

CONSTRUCT {
  ?x vcard:N _:v .
  _:v vcard:givenName ?gname .
  _:v vcard:familyName ?fname
}
WHERE {
  { ?x foaf:firstname ?gname } UNION { ?x foaf:givenname ?gname } .
  { ?x foaf:surname ?fname } UNION { ?x foaf:family_name ?fname } .
}

Describing Web APIs

Components Descriptions

• Software Engineering has traditionally aimed at reaching further abstraction

• Led to the definition of objects and components for abstraction and reuse

• Remote components
  – RPC, RMI, CORBA, DCOM, etc.

• They all rely on an IDL for describing the interface of the component

WSDL

Wikipedia

Semantic Web Services

• Web Service descriptions do not capture the semantics of services (data, functionality, and NFP)

• Limitations for Discovery, Invocation, Composition, etc

• Semantic Web Service technologies were proposed to circumvent these issues
  – OWL-S, WSMO, SAWSDL

Semantic Web Services

OWL-S WSMO

Semantic Web Services

• Existing approaches like OWL-S and WSMO are (perceived as) complex in terms of
  – Modeling and computational complexity
• SAWSDL is purposely underspecified
• Web APIs are preferred over WSDL Web Services, yet SWS have up to now been built on top of WSDL

Web APIs Description

• There is no standard
  – WADL is a W3C Submission but not widely used

• Different conceptual styles from RESTful to RPC

Service Nature    Percentage of APIs
RPC-Style         47.8
RESTful           32.4
Hybrid            19.8

Describing Web APIs

State of Web APIs Descriptions

• Survey based on 222 Web APIs from ProgrammableWeb from 21 categories

• 40% of Web APIs do not state the HTTP method used!
• Authentication
  – 80% require authentication (37% use API Key, 14% HTTP Basic, 6% OAuth, etc.)
• Input and Output information
  – 72% do not state the data type of the input parameters
  – 61% use optional parameters, 45% use default values
  – 90% have XML as output, 42% JSON
  – 84% provide an example request and 75% an example response

Kinds of Service Semantics

F: Functional · N: Non-functional · B: Behavioral · I: Information

Functional Semantics

• For service discovery, composition

• Category
  – Functionality categorization
  – E.g. eCl@ss
  – Or tagging, folksonomies
• Capability
  – Precondition, Effect
  – Needs using some rule language (WSML, RIF, etc.)


Category Example

[Figure: wl:FunctionalClassificationRoot, ex:eCommerceService, ex:TravelReservationService and ex:AccommodationReservationService, connected by "type" and "subclasses" edges]

Capability Example

ex:RomaHotelReservationPrecondition rdf:type wl:Condition ;
  rdf:value """
    ?request [ numberOfGuests hasValue ?guests and city hasValue ?city ]
      memberOf ReservationData and
    ?guests <= 10 and ?city = 'Roma'
  """^^wsml:AxiomLiteral .

Non-functional Semantics

• For ranking and selection
• Not constrained, any ontologies
• Example:

ex:PriceSpecification rdfs:subClassOf wl:NonFunctionalParameter .
ex:ReservationFee rdf:type ex:PriceSpecification ;
  rdf:value "15"^^ex:euroAmount .


Behavioral Semantics

• For invocation, composition, process mediation
• Functionalities on operations
  – Capabilities, categories
• Client selects operation to invoke next
  – Instead of being strictly guided by an explicit process
• Example functional category for operations:
  – Web Architecture: interaction safety


Information Semantics

• For invocation, composition, data mediation

• Not constrained, any ontologies

• Marked as wl:Ontology


Conceptual Model

Semantics for RESTful and WSDL

hRESTS

• "There's usually an HTML page"

• Identifying machine-readable parts
  – Service, its operations
  – Resource address, HTTP method
  – Input/output data format
• hRESTS microformat
  – Technically, a poshformat

hRESTS

• HTML for RESTful Service Description
• Introduces the service model structure
  – service (+ label)
  – operations (+ address, method)
  – input, output
• Can also be in RDFa
• Basis for extensions:
  – MicroWSMO adds semantic annotations

MicroWSMO

• Extends hRESTS
  – model for model references
  – lifting, lowering

• Applies WSMO-Lite semantics presented earlier

Annotation Example

[Figure: annotation screenshots marking the Service, Operation, Input and Parameter elements]

Annotation Example

<div class="service" id="service1">
<h1 class="header"><span class="label" id="label2">Last.fm Web Services</span>
[...]
<div class="operation" id="operation1">
<h1><span class="label" id="label1">artist.getInfo</span></h1>
<div class="wsdescription">Get the metadata for an artist on Last.fm. Includes biography.</div>
[...]
<div class="input" id="input1"><span class="param">artist</span>
[...]

Annotation Example

<div class="service" id="service1">
<h1 class="header"><span class="label" id="label2">Last.fm Web Services</span>
<a rel="model" href="http://www.service-finder.eu/ontologies/ServiceCategories#Music"></a>
<div class="operation" id="operation1">
<h1><span class="label" id="label1">artist.getInfo</span></h1>
<div class="wsdescription">Get the metadata for an artist on Last.fm. Includes biography.</div>
[...]

Resulting RDF Model

<rdf:Description rdf:about="http://iserve…">
  <rdf:type rdf:resource="msm:Service"/>
  <sawsdl:modelReference rdf:resource="http://www.service-finder.eu/ontologies/ServiceCategories#Music"/>
</rdf:Description>

<rdf:Description rdf:about="http://iserve…">
  <rdf:type rdf:resource="msm:Operation"/>
</rdf:Description>

<rdf:Description rdf:about="http://iserve…">
  <rdf:type rdf:resource="msm#MessageContent"/>
</rdf:Description>

[…]

Web APIs Discovery

Outline

• Matchmaking and Ranking
• Traditional Web Service Discovery and UDDI
• Semantic Matchmaking
  – Logic-based Methods
  – Non-Logic-based Methods
  – Hybrid Methods
• Discovery of Web APIs
  – ProgrammableWeb
  – Google API Discovery Service
  – iServe Approach

Matchmaking

• Find candidate Web Services that can provide the desired functionality using
  – Information Retrieval (IR) techniques
    • Match natural language keywords in resource descriptions, similarity analysis
  – Semantic Matchmaking
    • Reasoning applied over descriptions, i.e., Inputs/Outputs, Preconditions, Effects, Classifications, QoS, Tags, …
  – Structural analysis
  – Hybrid IR/semantic solutions perform better

Ranking

• Rank and eventually select the best service given certain criteria

• Often applied after service matchmaking using
  – Different degrees of match for ranking (exact > plugin, etc.)
  – Non-functional properties (QoS, response time, user location, ratings, etc.)
• Techniques
  – Weighted combination, skyline, fuzzy reasoning

Traditional Web Service Discovery

UDDI

• Universal Description, Discovery and Integration

• A Web service registry API specification

–Business-oriented service publication, discovery (initially limited search capabilities)

–Itself has Web service interface

–Useful for intranet registries

• A failed public service

–Not particularly useful discovery support

–Discontinued in 2006

Semantic Matchmaking

• Logic-based Methods
  – Logical Unfolding
  – Matching Degree
  – Input and Output
  – Functional Classification
• Non-logic-based Methods
  – Text Similarity
  – Structural Analysis
  – Collaborative Filtering
• Hybrid Methods
  – Logic-based + Non-logic-based

Logical Unfolding

• Name Symbols vs. Base Symbols
• Example:

Matching Degree

• Exact
  – R and S are equivalent concepts
• Plug in
  – R is a sub-concept of S
• Subsumes
  – R is a super-concept of S
• Intersection
  – If the intersection of S and R is satisfiable
• Disjoint (Mismatch, Fail)
  – If the intersection of S and R is not satisfiable

Matching Degree

Input and Output

• Input
  – Exact > Plug in > Subsumes > Intersection > Disjoint
• Output
  – Exact > Subsumes > Plug in > Intersection > Disjoint
• Example
  – Request
    • Input: Fruit, Output: TaxedPrice
  – Service A
    • Input: Food, Output: UKTaxedPrice
    • Input: Plug in; Output: Subsumes
  – Service B
    • Input: Apple, Output: Price
    • Input: Subsumes; Output: Plug in
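The Fruit/Food example can be replayed with a small sketch. It assumes a toy single-parent taxonomy (Apple under Fruit under Food, UKTaxedPrice under TaxedPrice under Price), which is inferred from the degrees stated on the slide. The function names and the idea of summing rank positions are illustrative assumptions, and the "Intersection" degree is left out because it cannot arise in a plain tree.

```python
# Toy taxonomy: child -> parent. Concept names come from the slide's example.
PARENT = {
    "Apple": "Fruit",
    "Fruit": "Food",
    "UKTaxedPrice": "TaxedPrice",
    "TaxedPrice": "Price",
}


def ancestors(concept):
    """All strict super-concepts of `concept` in the toy taxonomy."""
    result = []
    while concept in PARENT:
        concept = PARENT[concept]
        result.append(concept)
    return result


def degree(r, s):
    """Degree of match between request concept r and service concept s."""
    if r == s:
        return "exact"
    if s in ancestors(r):
        return "plugin"    # R is a sub-concept of S
    if r in ancestors(s):
        return "subsumes"  # R is a super-concept of S
    return "disjoint"      # simplification: a tree has no partial overlap


# Preference orders from the slide (without "Intersection").
INPUT_ORDER = ["exact", "plugin", "subsumes", "disjoint"]
OUTPUT_ORDER = ["exact", "subsumes", "plugin", "disjoint"]


def score(request, service):
    """Lower is better: sum of the rank positions of input and output degrees."""
    d_in = degree(request["input"], service["input"])
    d_out = degree(request["output"], service["output"])
    return INPUT_ORDER.index(d_in) + OUTPUT_ORDER.index(d_out), d_in, d_out


request = {"input": "Fruit", "output": "TaxedPrice"}
service_a = {"input": "Food", "output": "UKTaxedPrice"}
service_b = {"input": "Apple", "output": "Price"}
print(score(request, service_a))
print(score(request, service_b))
```

Under these assumptions Service A ranks ahead of Service B, since it gets the preferred degree on both input and output.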

Functional Classification

• Find “Exact” or “Plug in” of Service Category
• Example
  – Taxonomy
    • Service Finder: http://www.service-finder.eu/ontologies/ServiceCategories
  – Service
    <rdf:Description rdf:about="http://.../Service">
      <sawsdl:modelReference rdf:resource="http://www.service-finder.eu/ontologies/ServiceCategories#Content"/>
    </rdf:Description>
  – Query
    SELECT ?service WHERE {
      ?service sawsdl:modelReference ?c .
      ?c rdfs:subClassOf sf:Content .
    }

Non-Logic-based

• Text Similarity
  – Method
    • Levenshtein, TF-IDF, Jaro, Averaged String Matching, etc.
  – Library
    • SecondString, SimMetrics, SimPack, etc.
• Structural Analysis
  – XML schema of input/output messages
• Collaborative Filtering
  – Text similarity of tags

Example

• Example for Levenshtein Distance
  – Levenshtein("CarBicyclePrice service", "Car Bicycle Taxed Price Service") = 9
  – Play online: http://www.functions-online.com/levenshtein.html
• Ranking services by the Levenshtein Distance between labels of request and candidate services
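For readers who want to reproduce the figure of 9, here is a minimal sketch of the standard dynamic-programming edit distance, followed by a ranking of candidate labels by distance to the request label. The second candidate label is a made-up example, and the comparison is case-sensitive (which is what yields 9 for the slide's pair).

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions and substitutions cost 1."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


request_label = "CarBicyclePrice service"
candidates = [
    "Car Bicycle Taxed Price Service",  # label from the slide
    "Weather Forecast Service",         # hypothetical extra candidate
]

# Smaller distance means a closer label, so sort ascending.
ranked = sorted(candidates, key=lambda label: levenshtein(request_label, label))
print(levenshtein(request_label, candidates[0]))
print(ranked)
```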

Hybrid

• Logic-based + Non-logic-based
• Related Techniques
  – Vector
    • Euclidean, Dice, Cosine, Jaccard, Manhattan, Overlap, Pearson, etc.
  – Tree
    • Bottom-up/Top-down Maximum Common Subtree, Tree Edit Distance
  – Graph
    • Bipartite graph-matching, Conceptual Similarity, Graph Isomorphism, Subgraph Isomorphism, Maximum Common Subgraph Isomorphism, Graph Isomorphism Covering, Shortest Path
  – Set
    • Jaccard, Loss of Information, Resemblance

Reflections on Different Techniques

• Logic-based vs. Non-logic-based methods
  – “Integration of logic-based reasoning with text similarity may significantly improve precision at the cost of higher avg query response time.”
  – “Hybrid semantic matching can be less precise than mere logic-based matching in case of syntactic pre-filtering of services (two-phase vs. integrative hybrid).”

• Cache mechanism for ontologies helps improve performance

Discovery of Web APIs

• ProgrammableWeb: Keyword, Category, Company...

Discovery of Web APIs

• Google: Label, Name, Preferred

Discovery of Web APIs

• iServe approach
  – iServe architecture
  – Implemented Discovery Mechanisms
    • Simple SPARQL-based
    • Inputs/Outputs logic-based using RDFS reasoning
    • Functional classifications with RDFS reasoning
    • Similarity analysis based on iMatcher
    • Support to re-use most of existing discovery mechanisms
    • Atom-based discovery
      – Discovery mechanisms return an Atom feed with the results
      – Provides Atom feed combinators: Union, Intersection, Subtract
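The feed combinators can be pictured as set operations over the entries that each discovery mechanism returns. The sketch below is an assumption about their behavior, not iServe's implementation: feeds are reduced to ordered lists of entry identifiers, and the `svc:` identifiers are hypothetical.

```python
def union(a, b):
    """Entries found in either feed, keeping first-seen order."""
    return list(dict.fromkeys(a + b))


def intersection(a, b):
    """Entries found in both feeds, in the order of the first feed."""
    keep = set(b)
    return [entry for entry in a if entry in keep]


def subtract(a, b):
    """Entries of the first feed that do not appear in the second."""
    drop = set(b)
    return [entry for entry in a if entry not in drop]


# Hypothetical results of two discovery mechanisms.
by_category = ["svc:1", "svc:2", "svc:3"]
by_label = ["svc:3", "svc:4"]

print(union(by_category, by_label))
print(intersection(by_category, by_label))
print(subtract(by_category, by_label))
```

Composing results this way lets a client combine, say, a functional-classification match with a label-similarity match without either mechanism knowing about the other.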

iServe Architecture


Minimal Service Model

SPARQL based

• Find services by executing SPARQL query
• Example

SELECT ?s WHERE {
  ?s <http://purl.org/dc/elements/1.1/creator> <http://iserve.kmi.open.ac.uk/foaf.rdf#iServe>
} LIMIT 10

Functional Classification

• Functional Classification Root
  – Defined by WSMO-Lite
  – Subclasses of instances of FCR are functional categories
  – Assigned to services through sawsdl:modelReferences
  – Available at
    • http://iserve.kmi.open.ac.uk/data/disco/func-rdfs?{classes}
  – Example
    • http://iserve.kmi.open.ac.uk/data/disco/func-rdfs?class=http://www.service-finder.eu/ontologies/ServiceCategories%23SMS
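Note that the "#" of the category URI has to be percent-encoded as %23, otherwise it would be read as a fragment of the discovery URL itself. A small sketch of building such a URL follows; the function name is hypothetical, and keeping ":" and "/" unescaped is a choice made only to mirror the example URL.

```python
from urllib.parse import quote

# Endpoint taken from the slide; it may no longer be online.
FUNC_RDFS_ENDPOINT = "http://iserve.kmi.open.ac.uk/data/disco/func-rdfs"


def func_rdfs_url(category_uri: str) -> str:
    """Build a functional-classification discovery URL for one category."""
    # quote() escapes '#' as %23; ':' and '/' are left as in the slide's URL.
    return f"{FUNC_RDFS_ENDPOINT}?class={quote(category_uri, safe=':/')}"


print(func_rdfs_url("http://www.service-finder.eu/ontologies/ServiceCategories#SMS"))
```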

Similarity analysis

• Implemented based on iMatcher
  – iMatcher provides a number of similarity-based approximate matchmaking strategies
  – Two of them (Levenshtein, TF-IDF) are ported to iServe and available at
    • http://iserve.kmi.open.ac.uk/data/disco/imatch?strategy=levenshtein&label=L
    • http://iserve.kmi.open.ac.uk/data/disco/imatch?strategy=tfidfd&comment=C
• Example
  – http://iserve.kmi.open.ac.uk/data/disco/imatch?strategy=levenshtein&label=ribbit
  – http://iserve.kmi.open.ac.uk/data/disco/imatch?strategy=tfidfd&comment=service

Web API Invocation

Web Service Invocation

• Find and retrieve the Web service entry in UDDI
• Extract the WSDL URL
• Retrieve the WSDL file and use it to generate stub/hub code
• Get the input data or prompt the user for operation and operands
• Invoke the Web service with the user input
• Display the result

WSDL-based Invocation

• Stubs provide a local interface to the remote Web service

Application:

ticketBooker_Appl {
  // GUI code
  wsProxy = new tixServiceStub();
  wsProxy.book(…);
}

Stub (talks to the Web service via SOAP):

tixServiceStub {
  book(…) {
    // SOAP call
  }
}

Web Service:

tixService {
  book(…) {
    // Actual business logic
  }
}

Web API Invocation

• Manual discovery
  – Keyword search in search engines
  – Search in API repositories
  – Word of mouth
• Interpretation of the documentation
  – Completing missing information
• Custom implementation solutions
  – Low level of reusability

Current Invocation Support

• WSDL and WADL-based invocation
• Implementation support
  – Jersey, Apache Axis, JOpera
• Pipe-based solutions
  – Yahoo Pipes, DERI Pipes, IBM Mashup Center
• Google ‘meta’ API
• Semantic approaches:
  – SPICES
  – Lambert et al.

Challenges

• Lack of widely accepted IDL
  – Web APIs commonly described in HTML
• Underspecification
  – Only 27.5% of APIs state the data type of the parameters
  – 60.4% state the HTTP method
• Heterogeneity
  – 47.8% RPC-Style, 32.4% RESTful, 19.8% Hybrid
  – 61.3% use optional parameters
  – 51.3% use alternative values for a parameter
  – 44.6% use default values for parameters
  – 24.8% use coded values for a parameter

Why Lightweight Semantics?

• Non-invasive approach
  – Enriching the existing HTML documentation, NOT requiring new description files
  – Syntactic structuring via microformats inserted in the HTML
  – Semantic enhancement via model references
• Based on a common Web API grounding model
  – Declarative specification of what the API does
  – Abstraction layer over current heterogeneity
  – Semantics as basis for task automation support

Invocation Step-by-Step

• Construct HTTP request:
  – Identify the HTTP Method
  – Construct invocation URI
  – Construct HTTP body and header
  – Prepare the input data
• Actual invocation
• Process the HTTP response
  – Response handling
  – Process the output data
  – Present the output
  – Error handling
• http://ws.audioscrobbler.com/2.0/?method=artist.getinfo&artist=Cher&api_key=***
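The request-construction steps can be sketched with Python's standard library. This is only an illustration under stated assumptions: `build_request` is a hypothetical helper, `YOUR_API_KEY` is a placeholder for the elided key, and the request is built but deliberately not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(endpoint, params, method="GET", headers=None):
    """Assemble the HTTP method, invocation URI and headers into a Request."""
    uri = f"{endpoint}?{urlencode(params)}"  # parameters grounded in the URI
    return Request(uri, method=method, headers=headers or {})


req = build_request(
    "http://ws.audioscrobbler.com/2.0/",
    {"method": "artist.getinfo", "artist": "Cher", "api_key": "YOUR_API_KEY"},
    headers={"Accept": "*/*"},
)
print(req.get_method(), req.full_url)

# The actual invocation would be urllib.request.urlopen(req); it is left out
# here so that the sketch runs without network access or a valid key.
```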

Example Request

GET /2.0/?method=artist.getinfo HTTP/1.0        (request line: GET, POST, HEAD commands)
User-agent: curl/7.19.7 Mozilla/4.0             (header lines)
Host: ws.audioscrobbler.com
Accept: */* text/html, image/gif, image/jpeg
Accept-language: fr

Example Response

HTTP/1.0 200 OK                                 (status line: protocol, status code, status phrase)
Connection: close                               (header lines)
Date: Thu, 06 Aug 1998 12:00:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Mon, 22 Jun 1998 …...
Content-Length: 6821
Content-Type: text/html

data data data data data ...                    (data, e.g., requested HTML file)

HTTP Response Status Codes

• 200 OK
  – request succeeded
• 301 Moved Permanently
  – requested object moved
• 400 Bad Request
  – request message not understood by server
• 404 Not Found
  – requested document not found on this server
• 505 HTTP Version Not Supported
• Custom errors in about 50% of Web APIs
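Response handling typically starts by classifying the status code and then falls back to API-specific error tables where they exist. A minimal sketch follows; the function names and the shape of the `custom_errors` table are assumptions, and the sample custom message is invented.

```python
from http import HTTPStatus


def classify(status_code: int) -> str:
    """Map a status code to its broad class."""
    if 200 <= status_code < 300:
        return "success"
    if 300 <= status_code < 400:
        return "redirect"
    if 400 <= status_code < 500:
        return "client error"
    if 500 <= status_code < 600:
        return "server error"
    return "unknown"


def describe(status_code: int, custom_errors=None) -> str:
    """Prefer an API-specific message, else use the standard reason phrase."""
    custom_errors = custom_errors or {}
    if status_code in custom_errors:
        return custom_errors[status_code]
    try:
        return HTTPStatus(status_code).phrase
    except ValueError:  # not a registered HTTP status code
        return "Unknown status"


for code in (200, 301, 400, 404, 505):
    print(code, classify(code), describe(code))

# Hypothetical API-specific override for one code.
print(describe(400, {400: "Missing required parameter"}))
```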

Requirements on Web API Descriptions

• Capture HTTP method
• Operation definition
  – If necessary, mapping of resource to operation definition
• Parameterized URI definition
• Distinction between service and operation address
• Input grounding
• Input lowering / Output lifting
• Distinction between the inputs and outputs as a whole and their parts
• Invocation relevant input (optional and output format parameters)
• Custom errors support

Current State

Web API Grounding Model

Data Grounding

• Which part of the input goes where?
  – Parameters belonging in the URI
  – Parameters transmitted in the HTTP body
  – HTTP header parameters
• isGroundedIn property
• SchemaMapping expected transformation type
  – acceptsContentType and producesContentType
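To make the "which part goes where" question concrete, the sketch below routes each message part to the URI, the headers or the body according to a per-part grounding. The dictionary keys (`name`, `value`, `grounded_in`) and the function name are illustrative assumptions, not the vocabulary of the grounding model itself.

```python
from urllib.parse import urlencode


def ground(endpoint, http_method, parts):
    """Route each message part to the URI, the headers or the body."""
    buckets = {"uri": {}, "header": {}, "body": {}}
    for part in parts:
        buckets[part["grounded_in"]][part["name"]] = part["value"]

    uri = endpoint
    if buckets["uri"]:
        uri += "?" + urlencode(buckets["uri"])
    body = urlencode(buckets["body"]) if buckets["body"] else None
    return {"method": http_method, "uri": uri,
            "headers": buckets["header"], "body": body}


# Hypothetical grounded parts for the Last.fm artist.getinfo call.
parts = [
    {"name": "method", "value": "artist.getinfo", "grounded_in": "uri"},
    {"name": "artist", "value": "Cher", "grounded_in": "uri"},
    {"name": "Accept", "value": "application/xml", "grounded_in": "header"},
]
request = ground("http://ws.audioscrobbler.com/2.0/", "GET", parts)
print(request["uri"])
print(request["headers"])
```

Keeping the grounding per part, rather than per message, is what allows one description to cover APIs that mix URI, header and body parameters.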

Granularity of the Input and Output

• MessageContent and MessageParts
  – Individual grounding of MessageParts
  – Individual lifting and lowering schemas
  – Optional and mandatory parts
  – Parts relevant for invocation such as output format and authentication parameters
• Important for Invocation
• But also for discovery and composition

Invocation Engine

Data Transformations

Invocation Steps

LastFM API Invocation

Lowering

• Artist

declare namespace foaf = "http://xmlns.com/foaf/0.1";
declare namespace mo = "http://purl.org/ontology/mo/";
{ for $artist_name $artist from <file:StaticInputFile>
  where { $artist a mo:MusicArtist; foaf:name $artist_name. }
  return {$artist_name} }

• API Key

declare namespace waa = "http://purl.oclc.org/NET/WebApiAuthentication#";
declare namespace sioc = "http://rdfs.org/sioc/ns#";
{ for $apikey $user from <file:StaticInputFile>
  where { $user a sioc:UserAccount; waa:API_Key $apikey. }
  return {$apikey} }

Lifting

declare namespace foaf="http://xmlns.com/foaf/0.1";
declare namespace mo="http://purl.org/ontology/mo/";

let $doc := doc("OriginalOutputFile")
for $listing in $doc//artist
let $name := $listing/name
let $id := $listing/mbid
let $url := $listing/url
let $image := $listing/image[@size='medium']
construct { _:p a mo:Artist;
  foaf:name {data($name)};
  mo:musicbrainz_guid {data($id)};
  mo:homepage {data($url)};
  mo:image {data($image)}; }
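The same lifting idea can be approximated in Python for readers without an XSPARQL engine. This is a stand-in, not the tutorial's tooling: it walks the XML with the standard library and emits plain tuples instead of RDF. The sample response is hypothetical and contains only the elements the query above touches.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed response with made-up values.
SAMPLE_RESPONSE = """<lfm status="ok">
  <artist>
    <name>Cher</name>
    <mbid>hypothetical-mbid</mbid>
    <url>http://www.last.fm/music/Cher</url>
    <image size="small">http://example.org/small.png</image>
    <image size="medium">http://example.org/medium.png</image>
  </artist>
</lfm>"""


def lift(xml_text):
    """Turn each <artist> element into triples about one blank node."""
    triples = []
    root = ET.fromstring(xml_text)
    for index, artist in enumerate(root.iter("artist")):
        node = f"_:p{index}"
        triples.append((node, "rdf:type", "mo:Artist"))
        triples.append((node, "foaf:name", artist.findtext("name")))
        triples.append((node, "mo:musicbrainz_guid", artist.findtext("mbid")))
        triples.append((node, "mo:homepage", artist.findtext("url")))
        # Same selection as image[@size='medium'] in the lifting schema.
        triples.append((node, "mo:image",
                        artist.findtext("image[@size='medium']")))
    return triples


for triple in lift(SAMPLE_RESPONSE):
    print(triple)
```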

Invocation Example

• http://iserve.kmi.open.ac.uk/rest-invoke/service/{ServiceUID}/operation/{OperationName}
  – http://iserve-dev.kmi.open.ac.uk:8080/RestInvoke/service/db4b646a-4665-4337-9626-4669cc8bce56/operation/ArtistGetInfo/invoke

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:waa="http://purl.oclc.org/NET/WebApiAuthentication#"
    xmlns:mo="http://purl.org/ontology/mo/"
    xmlns:foaf="http://xmlns.com/foaf/0.1"
    xmlns:sioc="http://rdfs.org/sioc/ns#">
  <mo:MusicArtist rdf:about="#artist1">
    <foaf:name>Cher</foaf:name>
  </mo:MusicArtist>
  <sioc:UserAccount rdf:about="#usr0">
    <waa:API_Key>b25b959554ed76058ac220b7b2e0a026</waa:API_Key>
  </sioc:UserAccount>
</rdf:RDF>
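Lowering turns this RDF input into the plain parameter values the API expects. As a rough stand-in for the XSPARQL lowering schemas, the sketch below reads the RDF/XML structurally with the standard library; a real implementation would query the RDF graph rather than the XML tree. The function name is hypothetical and the API key is replaced by a placeholder.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the RDF input shown on the slide.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "waa": "http://purl.oclc.org/NET/WebApiAuthentication#",
    "mo": "http://purl.org/ontology/mo/",
    "foaf": "http://xmlns.com/foaf/0.1",
    "sioc": "http://rdfs.org/sioc/ns#",
}

RDF_INPUT = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:waa="http://purl.oclc.org/NET/WebApiAuthentication#"
    xmlns:mo="http://purl.org/ontology/mo/"
    xmlns:foaf="http://xmlns.com/foaf/0.1"
    xmlns:sioc="http://rdfs.org/sioc/ns#">
  <mo:MusicArtist rdf:about="#artist1">
    <foaf:name>Cher</foaf:name>
  </mo:MusicArtist>
  <sioc:UserAccount rdf:about="#usr0">
    <waa:API_Key>YOUR_API_KEY</waa:API_Key>
  </sioc:UserAccount>
</rdf:RDF>"""


def lower(rdf_xml):
    """Extract the artist name and API key as plain invocation parameters."""
    root = ET.fromstring(rdf_xml)
    return {
        "artist": root.findtext("mo:MusicArtist/foaf:name", namespaces=NS),
        "api_key": root.findtext("sioc:UserAccount/waa:API_Key", namespaces=NS),
    }


print(lower(RDF_INPUT))
```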

Conclusions

• The current world of Services on the Web is:
  – Very heterogeneous
  – Not conforming to standards or guidelines
  – Suffering from underspecification
• Need for a unifying model capable of supporting invocation

• Web API Grounding model

• Invocation Engine

Wrap-up

• Current world of Web APIs is “messy”
  – Heterogeneity and underspecification
• Implications on discovery, composition and invocation
  – A lot of manual effort
  – Low level of reuse of the implementations

Wrap-up

• Solution: lightweight semantics
  – Non-invasive approach
  – Common abstraction layer overcoming heterogeneity
  – Basis for task automation
• Web API Semantic Descriptions
• Web API discovery
• Web API invocation
  – Web API grounding model
  – Invocation Engine

Hands-on Session

• Web API Annotation
  – Web API annotation with SWEET
  – Semantic description publishing in iServe
• Web API Discovery
  – Without lightweight semantics
  – Service search in iServe
• Web API Invocation
  – Invocation with the Invocation Engine

Thank You!

http://sweet.kmi.open.ac.uk/ http://iserve.kmi.open.ac.uk/

http://www.soa4all.eu/

References

• SAWSDL: http://www.w3.org/2002/ws/sawsdl/
• WSMO-Lite: http://cms-wg.sti2.org/TR/d11/v0.2/
• hRESTS & MicroWSMO: http://cms-wg.sti2.org/TR/d12
• REST & RESTful Web services: http://en.wikipedia.org/wiki/REST
• Microformats: http://microformats.org/

References

• SPARQL: http://www.w3.org/TR/rdf-sparql-query/
• WSMO: http://www.wsmo.org and http://cms-wg.sti2.org
• OWL-S: http://www.daml.org/services/owl-s/
• SWSF & FLOWS: http://www.w3.org/Submission/SWSF/
• WSDL-S: http://www.w3.org/Submission/WSDL-S/
