Automating the Use of Web APIs through Lightweight Semantics
The Open University
ICWE 2011, Paphos, Cyprus
Presenters
• Carlos Pedrinaci
• Maria Maleshkova
• Dong Liu
Acknowledgements
• Guillermo Alvaro (iSOCO)
• Ning Li, Jacek Kopecky, Dave Lambert, John Domingue (OU)
• Reto Krummenacher, University of Innsbruck
• SOA4All Project
• 1 slide by Tom Gruber (Siri/Apple)
Structure of the Tutorial
Morning (Theory)
• Intro
• Background
• Describing Web APIs
• Discovering Web APIs
• Invoking Web APIs
Afternoon (Practice)
• Hands-On Session
Preparation for Hands-On
• The material shown in this session will be the basis for the hands-on session afterwards
• You need an up-to-date version of Firefox
• Tabulator extension for Firefox
  – http://dig.csail.mit.edu/2007/tab/
Interrupt!
Web Services or Services on the Web?
Web Services
• Large number of standards and implementations
  – WSDL, BPEL, WS-Coordination, WS-Transaction, WS-AtomicTransaction, WS-★
“Despite their name, Web Services have nothing to do with the Web”
Frank Leymann, SSAIE 2009
Web Services on the Web
• The Web currently contains 30 billion Web pages
  – Nearly 100M active sites
  – 10 million new pages added each day
• The Web contains only 28,000 WSDL Web services (Seekda.com)
• Verizon have around 4,000
Geek and Poke
The Ecosystem of APIs and Online Data
Over 3500 APIs and 5100 mashups, growing at an accelerated rate...
© Siri (slightly modified)
Web APIs Technologies
• Web APIs are based on a light technology stack ≅ URIs, HTTP, XML/JSON
• Very much aligned with Web technologies
• Some are based on REST principles
  – Resource identification through URIs
  – Uniform interface
  – Self-descriptive messages
  – Stateful interactions through hyperlinks
Challenges with Web APIs
• There is no widely used IDL
  – Locating services is hard
  – Their use requires human interpretation of semi-structured descriptions
• The semantics of the services are not described in a machine-processable manner
• This prevents automating the discovery, invocation, and composition of Web APIs
Tutorial Coverage
• In this tutorial we shall cover existing approaches to Web APIs
  – Description
  – Discovery
  – Invocation
• We shall present an integrated approach based on the use of semantic technologies
Background
Semantic Web Principles
• Lift data available on the Web to a level where machines can manipulate it in “meaningful ways”
• Adding machine “understandable” annotations about Web resources
• All resources are identified by URIs
• Use of Web-oriented modeling languages
  – RDF & RDFS
  – OWL (Lite, DL, Full) & OWL2 (EL, QL, RL)
• Use of conceptual models (ontologies, vocabularies)
RDF
• Resource Description Framework (RDF) is the HTML of the Semantic Web
  – Simple way to describe resources on the Web
  – Based on triples <subject, predicate, object>
  – Defines graphs
RDF Example
RDF Representation
• Several serializations are available; only RDF/XML is a standard
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://www.example.org/rdf-example#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

ex:Person a rdfs:Class .
ex:Carlos a ex:Person .
ex:Doc a rdfs:Class .
ex:Slide rdfs:subClassOf ex:Doc .
ex:this a ex:Slide .
ex:this dc:creator ex:Carlos .
...
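To make the triple model concrete, here is a minimal Python sketch that stores the class and type statements of the example as plain tuples and queries them with wildcard patterns. The function names `find_triples` and `instances_of` are hypothetical helpers, not part of any RDF library, and the subclass handling covers only one direct step rather than full RDFS reasoning.

```python
# Triples from the example, kept as (subject, predicate, object) tuples.
# Prefixed names are treated as opaque strings in this sketch.
TRIPLES = {
    ("ex:Person", "rdf:type", "rdfs:Class"),
    ("ex:Carlos", "rdf:type", "ex:Person"),
    ("ex:Doc", "rdf:type", "rdfs:Class"),
    ("ex:Slide", "rdfs:subClassOf", "ex:Doc"),
    ("ex:this", "rdf:type", "ex:Slide"),
}


def find_triples(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def instances_of(triples, cls):
    """Instances of cls, also counting direct subclasses of cls."""
    classes = {cls} | {
        s for s, _, _ in find_triples(triples, p="rdfs:subClassOf", o=cls)
    }
    return sorted(
        s for s, _, o in find_triples(triples, p="rdf:type") if o in classes
    )
```

Because ex:Slide is declared a subclass of ex:Doc, asking for instances of ex:Doc also returns ex:this, which is the kind of inference RDFS adds on top of the raw graph.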
Ontology and Rule languages
• RDF Schema (RDFS)
  – A simple ontology language on RDF
• Web Ontology Language (OWL) is a more expressive ontology language than RDFS
  – Layered language based on Description Logics
    • OWL Lite, OWL DL, OWL Full
    • OWL2 (EL, QL, RL)
• Rule languages
  – Rule Interchange Format (RIF)
  – Extend ontology languages with axioms
SPARQL
• Query language for RDF
• Can be used to express queries across diverse data sources
• SPARQL contains capabilities for querying required and optional graph patterns along with their conjunctions and disjunctions
• The results of SPARQL queries can be result sets or RDF graphs
SPARQL Select Example
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name1 ?name2
WHERE {
  ?x foaf:name ?name1 ;
     foaf:mbox ?mbox1 .
  ?y foaf:name ?name2 ;
     foaf:mbox ?mbox2 .
  FILTER (sameTerm(?mbox1, ?mbox2) && !sameTerm(?name1, ?name2))
}
SPARQL Construct Example
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
CONSTRUCT {
  ?x vcard:N _:v .
  _:v vcard:givenName ?gname .
  _:v vcard:familyName ?fname
}
WHERE {
  { ?x foaf:firstname ?gname } UNION { ?x foaf:givenname ?gname } .
  { ?x foaf:surname ?fname } UNION { ?x foaf:family_name ?fname } .
}
Describing Web APIs
Components Descriptions
• Software Engineering has traditionally aimed at reaching further abstraction
• This led to the definition of objects and components for abstraction and reuse
• Remote components
  – RPC, RMI, CORBA, DCOM, etc.
• They all rely on an IDL for describing the interface of the component
WSDL
[Figure omitted; source: Wikipedia]
Semantic Web Services
• Web Service descriptions do not capture the semantics of services (data, functionality, and NFP)
• Limitations for Discovery, Invocation, Composition, etc
• Semantic Web Service technologies were proposed to circumvent these issues
  – OWL-S, WSMO, SAWSDL
Semantic Web Services
[Figure: OWL-S and WSMO]
Semantic Web Services
• Existing approaches like OWL-S and WSMO are (perceived as) complex in terms of
  – Modeling and computational complexity
• SAWSDL is purposely underspecified
• Web APIs are preferred over WSDL Web Services, yet SWS have so far been built on top of WSDL
Web APIs Description
• There is no standard
  – WADL is a W3C Submission but not widely used
• Different conceptual styles from RESTful to RPC
Service Nature    Percentage of APIs
RPC-Style         47.8
RESTful           32.4
Hybrid            19.8
Describing Web APIs
State of Web APIs Descriptions
• Survey based on 222 Web APIs from ProgrammableWeb from 21 categories
• 40% of Web APIs do not state the HTTP method used!
• Authentication
  – 80% require authentication (37% use API Key, 14% HTTP Basic, 6% OAuth, etc.)
• Input and Output information
  – 72% do not state the data type of the input parameters
  – 61% use optional parameters, 45% use default values
  – 90% have XML as output, 42% JSON
  – 84% provide an example request and 75% an example response
Kinds of Service Semantics
• Functional (F)
• Non-functional (N)
• Behavioral (B)
• Information (I)
Functional Semantics
• For service discovery, composition
• Category
  – Functionality categorization
  – E.g. eCl@ss
  – Or tagging, folksonomies
• Capability
  – Precondition, Effect
  – Needs using some rule language (WSML, RIF, etc.)
Category Example
[Figure: category hierarchy. ex:eCommerceService has type wl:FunctionalClassificationRoot; ex:TravelReservationService and ex:AccommodationReservationService appear as subclasses.]
Capability Example
ex:RomaHotelReservationPrecondition rdf:type wl:Condition ;
  rdf:value """
    ?request [ numberOfGuests hasValue ?guests and
               city hasValue ?city ] memberOf ReservationData
    and ?guests <= 10 and ?city = 'Roma'
  """^^wsml:AxiomLiteral .
Non-functional Semantics
• For ranking and selection
• Not constrained, any ontologies
• Example:
ex:PriceSpecification rdfs:subClassOf wl:NonFunctionalParameter .
ex:ReservationFee rdf:type ex:PriceSpecification ;
  rdf:value "15"^^ex:euroAmount .
Behavioral Semantics
• For invocation, composition, process mediation
• Functionalities on operations
  – Capabilities, categories
• Client selects operation to invoke next
  – Instead of being strictly guided by an explicit process
• Example functional category for operations:
  – Web Architecture: interaction safety
Information Semantics
• For invocation, composition, data mediation
• Not constrained, any ontologies
• Marked as wl:Ontology
Conceptual Model
Semantics for RESTful and WSDL
hRESTS
• "There's usually an HTML page"
• Identifying machine-readable parts
  – Service, its operations
  – Resource address, HTTP method
  – Input/output data format
• hRESTS microformat
  – Technically, a poshformat
hRESTS
• HTML for RESTful Service Description
• Introduces the service model structure
  – service (+ label)
  – operations (+ address, method)
  – input, output
• Can also be in RDFa
• Basis for extensions:
  – MicroWSMO adds semantic annotations
MicroWSMO
• Extends hRESTS
  – model for model references
  – lifting, lowering
• Applies WSMO-Lite semantics presented earlier
Annotation Example
[Figure: annotated Web API documentation page with the Service, Operation, Input and Parameter elements highlighted]
Annotation Example
<div class="service" id="service1">
<h1 class="header"><span class="label" id="label2">Last.fm Web Services</span>
[...]
<div class="operation" id="operation1">
<h1><span class="label" id="label1">artist.getInfo</span></h1>
<div class="wsdescription">Get the metadata for an artist on Last.fm. Includes biography.</div>
[...]
<div class="input" id="input1"><span class="param">artist</span>
[...]
Annotation Example
<div class="service" id="service1">
<h1 class="header"><span class="label" id="label2">Last.fm Web Services</span>
<a rel="model" href="http://www.service-finder.eu/ontologies/ServiceCategories#Music"></a>
<div class="operation" id="operation1">
<h1><span class="label" id="label1">artist.getInfo</span></h1>
<div class="wsdescription">Get the metadata for an artist on Last.fm. Includes biography.</div>
[...]
Resulting RDF Model
<rdf:Description rdf:about="http://iserve…">
  <rdf:type rdf:resource="msm:Service"/>
  <sawsdl:modelReference rdf:resource="http://www.service-finder.eu/ontologies/ServiceCategories#Music"/>
</rdf:Description>
<rdf:Description rdf:about="http://iserve…">
  <rdf:type rdf:resource="msm:Operation"/>
</rdf:Description>
<rdf:Description rdf:about="http://iserve…">
  <rdf:type rdf:resource="msm#MessageContent"/>
</rdf:Description>
[…]
Web APIs Discovery
Outline
• Matchmaking and Ranking
• Traditional Web Service Discovery and UDDI
• Semantic Matchmaking
  – Logic-based Methods
  – Non-Logic-based Methods
  – Hybrid Methods
• Discovery of Web APIs
  – ProgrammableWeb
  – Google API Discovery Service
  – iServe Approach
Matchmaking
• Find candidate Web Services that can provide the desired functionality using
  – Information Retrieval (IR) techniques
    • Match natural language keywords in resource descriptions, similarity analysis
  – Semantic Matchmaking
    • Reasoning applied over descriptions, i.e., Inputs/Outputs, Preconditions, Effects, Classifications, QoS, Tags, …
  – Structural analysis
  – Hybrid IR/semantic solutions perform better
Ranking
• Rank and eventually select the best service given certain criteria
• Often applied after service matchmaking using
  – Different degrees of match for ranking (exact > plugin, etc.)
  – Non-functional properties (QoS, response time, user location, ratings, etc.)
• Techniques
  – Weighted combination, skyline, fuzzy reasoning
Traditional Web Service Discovery
UDDI
• Universal Description, Discovery and Integration
• A Web service registry API specification
–Business-oriented service publication, discovery (initially limited search capabilities)
–Itself has a Web service interface
–Useful for intranet registries
• A failed public service
–Not particularly useful discovery support
–Discontinued in 2006
Semantic Matchmaking
• Logic-based Methods
  – Logical Unfolding
  – Matching Degree
  – Input and Output
  – Functional Classification
• Non-logic-based Methods
  – Text Similarity
  – Structural Analysis
  – Collaborative Filtering
• Hybrid Methods
  – Logic-based + Non-logic-based
Logical Unfolding
• Name Symbols vs. Base Symbols
• Example: [figure omitted]
Matching Degree
• Exact
  – R and S are equivalent concepts
• Plug in
  – R is a sub-concept of S
• Subsumes
  – R is a super-concept of S
• Intersection
  – If the intersection of S and R is satisfiable
• Disjoint (Mismatch, Fail)
  – If the intersection of S and R is not satisfiable
Matching Degree
Input and Output
• Input
  – Exact > Plug in > Subsumes > Intersection > Disjoint
• Output
  – Exact > Subsumes > Plug in > Intersection > Disjoint
• Example
  – Request
    • Input: Fruit, Output: TaxedPrice
  – Service A
    • Input: Food, Output: UKTaxedPrice
    • Input: Plug in; Output: Subsumes
  – Service B
    • Input: Apple, Output: Price
    • Input: Subsumes; Output: Plug in
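The degrees in the example can be reproduced with a small Python sketch. It assumes a toy single-inheritance taxonomy (the `PARENT` table is made up to fit the Fruit/Price example) and treats every unrelated pair as Disjoint, so the Intersection degree, which needs a real reasoner, is not modelled.

```python
# Hypothetical toy taxonomy for the example: child concept -> parent concept.
PARENT = {
    "Apple": "Fruit",
    "Fruit": "Food",
    "UKTaxedPrice": "TaxedPrice",
    "TaxedPrice": "Price",
}


def ancestors(concept):
    """All strict super-concepts of a concept in the toy taxonomy."""
    result = []
    while concept in PARENT:
        concept = PARENT[concept]
        result.append(concept)
    return result


def degree(r, s):
    """Degree of match between a request concept r and a service concept s."""
    if r == s:
        return "Exact"
    if s in ancestors(r):
        return "Plug in"   # r is a sub-concept of s
    if r in ancestors(s):
        return "Subsumes"  # r is a super-concept of s
    return "Disjoint"      # simplification: Intersection is not modelled


# Request: Input Fruit, Output TaxedPrice
service_a = (degree("Fruit", "Food"), degree("TaxedPrice", "UKTaxedPrice"))
service_b = (degree("Fruit", "Apple"), degree("TaxedPrice", "Price"))
```

Here `service_a` and `service_b` hold the (input, output) degrees for Service A and Service B from the example.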
Functional Classification
• Find “Exact” or “Plug in” of Service Category
• Example
  – Taxonomy
    • Service Finder: http://www.service-finder.eu/ontologies/ServiceCategories
  – Service
    <rdf:Description rdf:about="http://.../Service">
      <sawsdl:modelReference rdf:resource="http://www.service-finder.eu/ontologies/ServiceCategories#Content"/>
    </rdf:Description>
  – Query
    SELECT ?service WHERE {
      ?service sawsdl:modelReference ?c .
      ?c rdfs:subClassOf sf:Content .
    }
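What the query relies on is that rdfs:subClassOf is reflexive and transitive under RDFS reasoning. The following Python sketch imitates that behaviour over in-memory dictionaries; the sub-category `sf:News` and the service identifiers are hypothetical and serve only to illustrate the lookup.

```python
# Hypothetical category taxonomy: sub-category -> parent category.
SUBCLASS_OF = {
    "sf:News": "sf:Content",
    "sf:SMS": "sf:Communication",
}

# Hypothetical services and the category each one references
# (the role played by sawsdl:modelReference in the descriptions).
MODEL_REFERENCE = {
    "svc:A": "sf:Content",
    "svc:B": "sf:News",
    "svc:C": "sf:SMS",
}


def is_subclass_of(cls, target):
    """Reflexive, transitive subclass check over the toy taxonomy."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def find_services(category):
    """Services annotated with the category itself (Exact) or a sub-category (Plug in)."""
    return sorted(
        service
        for service, referenced in MODEL_REFERENCE.items()
        if is_subclass_of(referenced, category)
    )
```

Calling `find_services("sf:Content")` returns both the service annotated with the category itself and the one annotated with its sub-category.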
Non-Logic-based
• Text Similarity
  – Method
    • Levenshtein, TF-IDF, Jaro, Averaged String Matching, etc.
  – Library
    • SecondString, SimMetrics, SimPack, etc.
• Structural Analysis
  – XML schema of input/output messages
• Collaborative Filtering
  – Text similarity of tags
Example
• Example for Levenshtein Distance
  – Levenshtein("CarBicyclePrice service", "Car Bicycle Taxed Price Service") = 9
  – Play online: http://www.functions-online.com/levenshtein.html
• Ranking services by the Levenshtein Distance between labels of request and candidate services
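The distance in the example can be checked with a short Python sketch of the standard dynamic-programming formulation. The `rank_by_label` helper is a hypothetical illustration of ranking candidates by label distance and is not taken from any of the libraries listed earlier.

```python
def levenshtein(a, b):
    """Edit distance (insert, delete, substitute; unit costs) between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute or keep
            ))
        previous = current
    return previous[-1]


def rank_by_label(request_label, candidate_labels):
    """Order candidate labels by increasing distance to the request label."""
    return sorted(candidate_labels, key=lambda label: levenshtein(request_label, label))


distance = levenshtein("CarBicyclePrice service", "Car Bicycle Taxed Price Service")
```

The comparison is case-sensitive, which is why the capital "S" in "Service" counts as one substitution on top of the eight inserted characters.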
Hybrid
• Logic-based + Non-logic-based
• Related Techniques
  – Vector
    • Euclidean, Dice, Cosine, Jaccard, Manhattan, Overlap, Pearson, etc.
  – Tree
    • Bottom-up/Top-down Maximum Common Subtree, Tree Edit Distance
  – Graph
    • Bipartite graph-matching, Conceptual Similarity, Graph Isomorphism, Subgraph Isomorphism, Maximum Common Subgraph Isomorphism, Graph Isomorphism Covering, Shortest Path
  – Set
    • Jaccard, Loss of Information, Resemblance
Reflections on Different Techniques
• Logic-based vs. Non-logic-based methods
  – “Integration of logic-based reasoning with text similarity may significantly improve precision at the cost of higher avg query response time.”
  – “Hybrid semantic matching can be less precise than mere logic-based matching in case of syntactic pre-filtering of services (two-phase vs. integrative hybrid).”
• Cache mechanism for ontologies helps improve performance
Discovery of Web APIs
• ProgrammableWeb: Keyword, Category, Company...
Discovery of Web APIs
• Google: Label, Name, Preferred
Discovery of Web APIs
• iServe approach
  – iServe architecture
  – Implemented Discovery Mechanisms
    • Simple SPARQL-based
    • Inputs/Outputs logic-based using RDFS reasoning
    • Functional classifications with RDFS reasoning
    • Similarity analysis based on iMatcher
    • Support for re-using most existing discovery mechanisms
    • Atom-based discovery
      – Discovery mechanisms return an Atom feed with the results
      – Provides Atom feed combinators: Union, Intersection, Subtract
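The three combinators behave like set operations over feed entries. The Python sketch below illustrates that idea only: it represents each feed as an ordered list of entry identifiers, which is an assumption made for illustration and says nothing about how iServe implements the combinators.

```python
def union(feed_a, feed_b):
    """Entries present in either feed, first occurrence order preserved."""
    return list(dict.fromkeys(feed_a + feed_b))


def intersection(feed_a, feed_b):
    """Entries present in both feeds, in the order of feed_a."""
    in_b = set(feed_b)
    return [entry for entry in feed_a if entry in in_b]


def subtract(feed_a, feed_b):
    """Entries of feed_a that do not appear in feed_b."""
    in_b = set(feed_b)
    return [entry for entry in feed_a if entry not in in_b]


# Hypothetical result lists from two discovery mechanisms.
by_category = ["svc:A", "svc:B", "svc:C"]
by_inputs = ["svc:B", "svc:D"]
```

Combining the results of a classification-based and an input-based discovery this way lets a client narrow or widen a result list without re-running either query.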
iServe Architecture
Minimal Service Model
SPARQL based
• Find services by executing a SPARQL query
• Example
  SELECT ?s WHERE {
    ?s <http://purl.org/dc/elements/1.1/creator> <http://iserve.kmi.open.ac.uk/foaf.rdf#iServe>
  } LIMIT 10
I/O based
• Use ontological annotations of inputs and outputs
• Do RDFS reasoning
• Available at
  – http://iserve.kmi.open.ac.uk/data/disco/io-rdfs?f={and|or}&i=I1&i=I2&o=O1&...
• Example
  – http://iserve.kmi.open.ac.uk/data/disco/io-rdfs?f=and&i=http://purl.org/iserve/ontology/owlstc/SUMO.owl%23Vehicle&o=http://purl.org/iserve/ontology/owlstc/concept.owl%23Price
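A query URL of this shape can be assembled with the Python standard library. The sketch below is only illustrative: the helper name `io_discovery_url` is hypothetical, and keeping `:` and `/` unescaped is a choice made here so that the result matches the example; a client could percent-encode them as well.

```python
from urllib.parse import urlencode

# Endpoint taken from the slide.
BASE = "http://iserve.kmi.open.ac.uk/data/disco/io-rdfs"


def io_discovery_url(inputs, outputs, combine="and"):
    """Build an I/O discovery URL; '#' in concept URIs becomes %23."""
    params = [("f", combine)]
    params += [("i", concept) for concept in inputs]
    params += [("o", concept) for concept in outputs]
    return BASE + "?" + urlencode(params, safe=":/")


url = io_discovery_url(
    inputs=["http://purl.org/iserve/ontology/owlstc/SUMO.owl#Vehicle"],
    outputs=["http://purl.org/iserve/ontology/owlstc/concept.owl#Price"],
)
```

The `#` must be escaped because an unescaped fragment separator would cut the concept URI short before it reaches the server.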
Functional Classification
• Functional Classification Root
  – Defined by WSMO-Lite
  – Subclasses of instances of FCR are functional categories
  – Assigned to services through sawsdl:modelReferences
  – Available at
    • http://iserve.kmi.open.ac.uk/data/disco/func-rdfs?{classes}
  – Example
    • http://iserve.kmi.open.ac.uk/data/disco/func-rdfs?class=http://www.service-finder.eu/ontologies/ServiceCategories%23SMS
Similarity analysis
• Implemented based on iMatcher
  – iMatcher provides a number of similarity-based approximate matchmaking strategies
  – Two of them (Levenshtein, TF-IDF) are ported to iServe and available at
    • http://iserve.kmi.open.ac.uk/data/disco/imatch?strategy=levenshtein&label=L
    • http://iserve.kmi.open.ac.uk/data/disco/imatch?strategy=tfidfd&comment=C
• Example
  – http://iserve.kmi.open.ac.uk/data/disco/imatch?strategy=levenshtein&label=ribbit
  – http://iserve.kmi.open.ac.uk/data/disco/imatch?strategy=tfidfd&comment=service
Web API Invocation
Web Service Invocation
• Find and retrieve the Web service entry in UDDI
• Extract the WSDL URL
• Retrieve the WSDL file and use it to generate stub/hub code
• Get input data or prompt the user for operation and operands
• Invoke the Web service with the user input
• Display the result
WSDL-based Invocation
• Stubs provide a local interface to the remote Web service
Application:
ticketBooker_Appl {
  // GUI code
  wsProxy = new tixServiceStub();
  wsProxy.book(…);
}

Stub (talks to the Web service via SOAP):
tixServiceStub {
  book(…) {
    // SOAP call
  }
}

Web Service:
tixService {
  book(…) {
    // Actual business logic
  }
}
Web API Invocation
• Manual discovery
  – Keyword search in search engines
  – Search in API repositories
  – Word of mouth
• Interpreting the documentation
  – Completing missing information
• Custom implementation solutions
  – Low level of reusability
Current Invocation Support
• WSDL and WADL-based invocation
• Implementation support
  – Jersey, Apache Axis, JOpera
• Pipe-based solutions
  – Yahoo Pipes, DERI Pipes, IBM Mashup Center
• Google ‘meta’ API
• Semantic approaches:
  – SPICES
  – Lambert et al.
Challenges
• Lack of widely accepted IDL
  – Web APIs commonly described in HTML
• Underspecification
  – Only 27.5% of APIs state the data type of the parameters
  – Only 60.4% state the HTTP method
• Heterogeneity
  – 47.8% RPC-Style, 32.4% RESTful, 19.8% Hybrid
  – 61.3% use optional parameters
  – 51.3% use alternative values for a parameter
  – 44.6% use default values for parameters
  – 24.8% use coded values for a parameter
Why Lightweight Semantics?
• Non-invasive approach
  – Enriching the existing HTML documentation, NOT requiring new description files
  – Syntactic structuring via microformats inserted in the HTML
  – Semantic enhancement via model references
• Based on a common Web API grounding model
  – Declarative specification of what the API does
  – Abstraction layer over current heterogeneity
  – Semantics as basis for task automation support
Invocation Step-by-Step
• Construct HTTP request:
  – Identify the HTTP Method
  – Construct invocation URI
  – Construct HTTP body and header
  – Prepare the input data
• Actual invocation
• Process the HTTP response
  – Response handling
  – Process the output data
  – Present the output
  – Error handling
• http://ws.audioscrobbler.com/2.0/?method=artist.getinfo&artist=Cher&api_key=***
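The request-construction steps can be sketched with the Python standard library. This is a minimal illustration under stated assumptions: `build_request` is a hypothetical helper, `YOUR_API_KEY` is a placeholder for the elided key, and the request object is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(endpoint, method_name, artist, api_key, http_method="GET"):
    """Assemble the HTTP request for an RPC-style API call without sending it."""
    # Construct the invocation URI from the prepared input data.
    query = urlencode({"method": method_name, "artist": artist, "api_key": api_key})
    uri = endpoint + "?" + query
    # Set the HTTP method and headers; a GET request carries no body.
    return Request(uri, method=http_method, headers={"Accept": "*/*"})


request = build_request(
    "http://ws.audioscrobbler.com/2.0/",
    "artist.getinfo",
    "Cher",
    "YOUR_API_KEY",  # placeholder, not a real key
)
```

Passing `request` to `urllib.request.urlopen` would perform the actual invocation; response and error handling would follow as separate steps.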
Example Request
Request line (GET, POST, HEAD commands):
GET /2.0/?method=artist.getinfo HTTP/1.0
Header lines:
User-agent: curl/7.19.7 Mozilla/4.0
Host: ws.audioscrobbler.com
Accept: */*, text/html, image/gif, image/jpeg
Accept-language: fr
Example Response
Status line (protocol, status code, status phrase):
HTTP/1.0 200 OK
Header lines:
Connection: close
Date: Thu, 06 Aug 1998 12:00:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Mon, 22 Jun 1998 …...
Content-Length: 6821
Content-Type: text/html
Data, e.g., requested HTML file:
data data data data data ...
HTTP Response Status Codes
• 200 OK
  – request succeeded
• 301 Moved Permanently
  – requested object moved
• 400 Bad Request
  – request message not understood by server
• 404 Not Found
  – requested document not found on this server
• 505 HTTP Version Not Supported
• Custom errors in about 50% of the APIs
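For the standard codes, a generic response handler can be sketched in Python with `http.HTTPStatus`. The `classify` helper and its outcome labels are hypothetical; custom, API-specific errors would need additional handling on top of this.

```python
from http import HTTPStatus


def classify(status_code):
    """Map a status code to a coarse outcome and its standard reason phrase."""
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        phrase = "Unknown"  # not a registered standard code
    if 200 <= status_code < 300:
        outcome = "success"
    elif 300 <= status_code < 400:
        outcome = "redirect"
    elif 400 <= status_code < 500:
        outcome = "client error"
    elif 500 <= status_code < 600:
        outcome = "server error"
    else:
        outcome = "other"
    return outcome, phrase
```

An invocation engine could retry or follow redirects based on the outcome and surface the phrase to the user.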
Requirements on Web API Descriptions
• Capture HTTP method
• Operation definition
  – If necessary, mapping of resource to operation definition
• Parameterized URI definition
• Distinction between service and operation address
• Input grounding
• Input lowering / Output lifting
• Distinction between the inputs and outputs as a whole and their parts
• Invocation relevant input (optional and output format parameters)
• Custom errors support
Current State
Web API Grounding Model
Data Grounding
• Which part of the input goes where?
  – Parameters belonging in the URI
  – Parameters transmitted in the HTTP body
  – HTTP header parameters
• isGroundedIn property
• SchemaMapping expected transformation type
  – acceptsContentType and producesContentType
Granularity of the Input and Output
• MessageContent and MessageParts
  – Individual grounding of MessageParts
  – Individual lifting and lowering schemas
  – Optional and mandatory parts
  – Parts relevant for invocation such as output format and authentication parameters
• Important for Invocation
• But also for discovery and composition
Invocation Engine
Data Transformations
Invocation Steps
LastFM API Invocation
Lowering
• Artist
• API Key
declare namespace foaf = "http://xmlns.com/foaf/0.1";
declare namespace mo = "http://purl.org/ontology/mo/";
{
  for $artist_name $artist from <file:StaticInputFile>
  where { $artist a mo:MusicArtist; foaf:name $artist_name. }
  return {$artist_name}
}

declare namespace waa = "http://purl.oclc.org/NET/WebApiAuthentication#";
declare namespace sioc = "http://rdfs.org/sioc/ns#";
{
  for $apikey $user from <file:StaticInputFile>
  where { $user a sioc:UserAccount; waa:API_Key $apikey. }
  return {$apikey}
}
Lifting
declare namespace foaf="http://xmlns.com/foaf/0.1";
declare namespace mo="http://purl.org/ontology/mo/";
let $doc := doc("OriginalOutputFile")
for $listing in $doc//artist
let $name := $listing/name
let $id := $listing/mbid
let $url := $listing/url
let $image := $listing/image[@size='medium']
construct {
  _:p a mo:Artist;
      foaf:name {data($name)};
      mo:musicbrainz_guid {data($id)};
      mo:homepage {data($url)};
      mo:image {data($image)};
}
Invocation Example
• http://iserve.kmi.open.ac.uk/rest-invoke/service/{ServiceUID}/operation/{OperationName}
  – http://iserve-dev.kmi.open.ac.uk:8080/RestInvoke/service/db4b646a-4665-4337-9626-4669cc8bce56/operation/ArtistGetInfo/invoke
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:waa="http://purl.oclc.org/NET/WebApiAuthentication#"
    xmlns:mo="http://purl.org/ontology/mo/"
    xmlns:foaf="http://xmlns.com/foaf/0.1"
    xmlns:sioc="http://rdfs.org/sioc/ns#">
  <mo:MusicArtist rdf:about="#artist1">
    <foaf:name>Cher</foaf:name>
  </mo:MusicArtist>
  <sioc:UserAccount rdf:about="#usr0">
    <waa:API_Key>b25b959554ed76058ac220b7b2e0a026</waa:API_Key>
  </sioc:UserAccount>
</rdf:RDF>
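As a rough illustration of what lowering does with an input of this shape, the Python sketch below pulls the artist name and API key out of the RDF/XML and turns them into plain request parameters. It is not the engine's method: it treats RDF/XML as ordinary XML, which only works for this exact serialization, the `lower` helper is hypothetical, the namespace URIs are copied as they appear on the slide, and `EXAMPLE_KEY` is a placeholder.

```python
import xml.etree.ElementTree as ET

# Input in the same shape as the slide's example; the key is a placeholder.
RDF_INPUT = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:waa="http://purl.oclc.org/NET/WebApiAuthentication#"
    xmlns:mo="http://purl.org/ontology/mo/"
    xmlns:foaf="http://xmlns.com/foaf/0.1"
    xmlns:sioc="http://rdfs.org/sioc/ns#">
  <mo:MusicArtist rdf:about="#artist1">
    <foaf:name>Cher</foaf:name>
  </mo:MusicArtist>
  <sioc:UserAccount rdf:about="#usr0">
    <waa:API_Key>EXAMPLE_KEY</waa:API_Key>
  </sioc:UserAccount>
</rdf:RDF>"""

# Prefix map for ElementTree path lookups (URIs as written on the slide).
NS = {
    "waa": "http://purl.oclc.org/NET/WebApiAuthentication#",
    "mo": "http://purl.org/ontology/mo/",
    "foaf": "http://xmlns.com/foaf/0.1",
    "sioc": "http://rdfs.org/sioc/ns#",
}


def lower(rdf_xml):
    """Extract the plain parameter values the Web API expects."""
    root = ET.fromstring(rdf_xml)
    artist = root.find("mo:MusicArtist/foaf:name", NS).text
    api_key = root.find("sioc:UserAccount/waa:API_Key", NS).text
    return {"artist": artist, "api_key": api_key}


params = lower(RDF_INPUT)
```

The resulting dictionary corresponds to the `artist` and `api_key` query parameters of the Last.fm request; a real implementation would query the RDF graph rather than the XML tree.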
Conclusions
• The current world of Services on the Web is very:
  – Heterogeneous
  – Not conforming to standards or guidelines
  – Suffering from underspecification
• Need for a unifying model capable of supporting invocation
• Web API Grounding model
• Invocation Engine
Wrap-up
• Current world of Web APIs is “messy”
  – Heterogeneity and underspecification
• Implications on discovery, composition and invocation
  – A lot of manual effort
  – Low level of reuse of the implementations
Wrap-up
• Solution: lightweight semantics
  – Non-invasive approach
  – Common abstraction layer overcoming heterogeneity
  – Basis for task automation
• Web API Semantic Descriptions
• Web API discovery
• Web API invocation
  – Web API grounding model
  – Invocation Engine
Hands-on Session
• Web API Annotation
  – Web API annotation with SWEET
  – Semantic description publishing in iServe
• Web API Discovery
  – Without lightweight semantics
  – Service search in iServe
• Web API Invocation
  – Invocation with the Invocation Engine
Thank You!
http://sweet.kmi.open.ac.uk/ http://iserve.kmi.open.ac.uk/
http://www.soa4all.eu/
References
• SAWSDL
  – http://www.w3.org/2002/ws/sawsdl/
• WSMO-Lite
  – http://cms-wg.sti2.org/TR/d11/v0.2/
• hRESTS & MicroWSMO
  – http://cms-wg.sti2.org/TR/d12
• REST & RESTful Web services
  – http://en.wikipedia.org/wiki/REST
• Microformats
  – http://microformats.org/
• SPARQL
  – http://www.w3.org/TR/rdf-sparql-query/
• WSMO
  – http://www.wsmo.org and http://cms-wg.sti2.org
• OWL-S
  – http://www.daml.org/services/owl-s/
• SWSF & FLOWS
  – http://www.w3.org/Submission/SWSF/
• WSDL-S
  – http://www.w3.org/Submission/WSDL-S/