Many enterprise IT folk seem to believe that REST is only suitable for lightweight integration or for relatively simple data manipulation (CRUD). On the contrary, by applying well-understood design patterns, REST can provide capabilities that only traditional enterprise integration tools have been able to provide - high performance, asynchronous messaging, reliability, etc.
Implementing SOA* in the Enterprise using a REST** Approach
Ganesh Prasad
* SOA = Service-Oriented Architecture ** REST = REpresentational State Transfer
Intended Audience
This presentation is addressed to the Enterprise Architect responsible for setting out a target and roadmap for the middle tier of an enterprise, what is commonly known as the “SOA direction”.
You, the architect, understand:
- Organisational structure, business drivers, strategies, politics
- Product systems, applications, functionality overlaps and gaps
- Interface and integration complexity
- Qualities of service demanded and currently provided
- Scale, volume and performance
- Costs and risks
You are knowledgeable in a variety of disciplines. Yet you are also intensely pragmatic. You seek solutions to your problems without preconceptions and other ideological baggage.
This presentation is about a simple, cost-effective and extremely practical way to achieve your key deliverable, i.e., an implementable vision of SOA.
Recap: Fundamental Principles of SOA
SOA is an architecture for business and technology components wherein implicit dependencies are eliminated and all legitimate dependencies between components are stated as explicit contracts.
SOA can deliver greater connectivity, flexibility and reusability, with the ultimate business benefits being agility, lower time-to-market and sustainably lower costs.
The fundamental principles behind SOA are:
1. Explicit Boundaries
2. Shared Contract and Schema, not Class
3. Policy-Driven
4. Autonomous
5. Wire Formats, not Programming Language APIs
6. Document-Oriented
7. Loosely-Coupled
8. Standards-Compliant
9. Vendor-Independent
10. Metadata-Driven
- Ten Principles of SOA, Stefan Tilkov (InfoQ) http://www.infoq.com/articles/tilkov-10-soa-principles
Observation: Web Technology Satisfies SOA Principles
Fact: The Web is a simple, flexible, scalable, low-cost platform for application development. (Proof: The many millions of Internet/Intranet applications developed since 1994, when the Web officially began.)
Observation: The Web is not restricted to the transfer of visual HTML markup for consumption by human users with browsers. It can be used to transfer non-visual data (XML documents) between computer systems as well.
Insight: SOA objectives are achievable very simply using Web technology. There is no need to define specialised protocols like SOAP or use specialised adapters, brokers, registries and other new infrastructure.
Web Technology measured against SOA principles
1. Explicit Boundaries: URLs define endpoints, abstract away implementations
2. Shared Contract & Schema: the contract is URLs, HTTP verbs and XML document payloads
3. Policy-Driven: client and server can negotiate capabilities*
4. Autonomous: dependencies on interfaces alone, not implementations
5. Wire Formats, not APIs: only needs a wire format (HTTP protocol + XML payload)
6. Document-Oriented: the (XML) document is the HTTP payload
7. Loosely-Coupled: satisfies many dimensions of loose coupling**
8. Standards-Compliant: no dependence on proprietary/non-standard technology
9. Vendor-Independent: no dependence on vendor-specific features
10. Metadata-Driven: easy to describe, easy to consume based on description
* E.g., “Accept” and “Content-Type” headers
** Location-transparent, interface dependency only, proxyable, can support asynchronous models (polling and callback)
Yes, but is REST truly “Enterprise Class”?
This is really a question of confidence. A new concept (even a sudden return to basics) does not inspire confidence because it challenges established “truths”.
As enterprise architects, we are familiar with web technology. Many important product systems and applications have been built as web apps, and they work quite well.
But...
“HTTP is a synchronous request/response protocol. We have many enterprise requirements for asynchronous communication. A synchronous constraint would be too limiting in our context.”
“HTTP is not a reliable protocol. We already have far more reliable communication infrastructure in our enterprise, e.g., message queues.”
“I'm frankly skeptical. I don't believe everything can be reduced to a URI.”
“I also think that a CRUD interface is too simplistic to cover all possible use cases when dealing with resources.”
“A lot of very smart people from large organisations that understand enterprise issues have worked on SOAP, WSDL and all the WS-* standards. Are you telling me these REST people know something that they don't?”
Reframing situations leads to new insights
[Figure: ambiguous image. Old woman or young?]
Practically, asynchronous behaviour refers to either fire-and-forget, polling or callbacks. Can't these be implemented with HTTP? Think Design Patterns.
Often, when we say “at-most-once delivery”, we really mean “at-most-once processing”. That's idempotence, not reliable messaging – a different problem. Also, transactional integrity is different from uncertainty over transaction status. What is the real problem we are trying to solve?
The URI abstraction is analogous to the file abstraction in Unix. Unix treats everything as a file, even processes (/proc/43437) and hardware devices (/dev/mouse). That abstraction more than just works. It makes the design of Unix elegant.
A CRUD interface isn't necessarily simplistic. It's polymorphic. Each resource responds differently to the same request. We know that polymorphism, more than inheritance, gives OO its power.
The smartness of committees: Everybody but Copernicus knew the Earth was the centre of the universe. Everybody but Columbus knew the Earth was flat. Who were these upstarts anyway?
What REST is not
Despite sharing the basic technology, REST is more than just traditional web application development. There are principles that an application must adhere to, to be considered RESTian. Many, perhaps most, web applications that have been built so far consciously or unconsciously violate REST principles. Indeed, complacency around REST (“We've been doing this for years”) can prevent exploitation of its benefits.
REST is not a set of hard-and-fast rules or a set of DOs and DON'Ts. It is not a methodology. It is considered an architectural style, which makes it too abstract for some. Nevertheless, it has a discipline that must be understood and applied before its benefits can be realised.
REST is not a product one can purchase from a vendor. It is not even an Open Source product. However, RESTian applications can be implemented using commodity products from any source. The only software component required is typically a programmable (dynamic) web server, ideally with XML processing capability.
High Level Overview of a RESTful System
[Figure: resources, each identified by a URI, hyperlinked into a resource graph. A Service Consumer acts on the graph through Create, Retrieve, Update and Delete, and the graph is mapped to the Domain Model in the Implementation.]
1. The System is modelled as a set of uniquely identifiable resources and collections thereof, based on how it needs to appear to external parties.
2. URIs are used to uniquely identify resources.
3. URIs are used to hyperlink resources together into a 'resource graph'.
4. Regardless of the nature of the application domain (Banking, Insurance, Airlines, etc.), consumers of the system's services do not use specialised verbs to interact with it. Generally speaking, four standard verbs corresponding to Create, Retrieve, Update and Delete are sufficient, although a limited superset is also possible. If this seems too fine-grained, remember that the resources on which they act can be defined to be arbitrarily coarse-grained.
5. The resource graph is only a logical representation of the actual domain model that is exposed to external parties (service consumers). This representation is mapped back to domain objects as part of the SOA implementation. This is how loose coupling is achieved. The actual domain model is never exposed to external parties, only its representation as a resource graph supporting a few standard operations. That is the RESTful service contract.
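A minimal Python sketch of point 5, assuming two hypothetical domain classes (`Customer`, `Account`) and the `http://www.xyz.com` URI layout used in this deck's examples. The class names, function names and XML shape are illustrative assumptions, not part of any particular framework.

```python
from dataclasses import dataclass, field
from xml.etree import ElementTree as ET

BASE = "http://www.xyz.com"  # assumed base URI, borrowed from the deck's examples


@dataclass
class Account:  # hypothetical domain object, never exposed directly
    number: str
    balance: float


@dataclass
class Customer:  # hypothetical domain object
    customer_id: str
    accounts: list = field(default_factory=list)


def account_uri(account: Account) -> str:
    """Give the domain object an identity in the resource graph."""
    return f"{BASE}/accounts/{account.number}"


def account_list_representation(customer: Customer) -> str:
    """Map domain objects to the representation that consumers see."""
    root = ET.Element("account-list")
    for acct in customer.accounts:
        ET.SubElement(root, "account", {
            "href": account_uri(acct),  # hyperlink into the resource graph
            "account-number": acct.number,
            "account-balance": f"{acct.balance:.2f}",
        })
    return ET.tostring(root, encoding="unicode")


customer = Customer("76772374", [Account("675653-767973", 5042.74)])
xml_text = account_list_representation(customer)
```

Consumers depend only on the URIs and the XML shape, so the domain classes behind them can change without breaking the contract.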
Semantics of HTTP Verbs in REST
GET means Retrieve Details when applied to a specific resource
http://www.xyz.com/customers/76772374 <= retrieve details of this customer
GET means List when applied to a collection
http://www.xyz.com/customers <= retrieve list of customers
GET can also mean Search (selective retrieval) when appropriate parameters are passed
http://www.xyz.com/customers/?postcode=2345 <= search for customers in this area
DELETE is applied to a specific resource, not to a collection
http://www.xyz.com/customers/76772374 <= delete this customer
DELETE only means the resource will no longer be accessible through this URI after this operation. Further attempts to access this URI will result in a 404 error (resource not found). The actual domain entity may still exist (say in an archived fashion) and may be accessible by back-end applications that don't use the REST interface.
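The GET and DELETE semantics above can be sketched with a small in-memory store. This is an illustration under stated assumptions: the `customers` dictionary, the `get` and `delete` functions and the choice of 204 for a successful DELETE and 405 for a DELETE on the collection are all hypothetical, not prescribed by the slides.

```python
from typing import Optional

# Hypothetical in-memory store standing in for the real domain model.
customers = {
    "76772374": {"name": "A. Customer", "postcode": "2345"},
    "76772375": {"name": "B. Customer", "postcode": "2121"},
}


def get(path: str, query: Optional[dict] = None):
    """Return (status, body) for GET on /customers or /customers/<id>."""
    parts = [p for p in path.split("/") if p]
    if parts == ["customers"]:
        matches = {
            cid: c for cid, c in customers.items()
            if all(c.get(k) == v for k, v in (query or {}).items())
        }
        return 200, matches  # List, or Search when a query is given
    if len(parts) == 2 and parts[0] == "customers" and parts[1] in customers:
        return 200, customers[parts[1]]  # Retrieve Details
    return 404, None


def delete(path: str):
    """DELETE applies to a specific resource, never to the collection."""
    parts = [p for p in path.split("/") if p]
    if parts == ["customers"]:
        return 405, None  # assumed response for DELETE on a collection
    if len(parts) == 2 and parts[0] == "customers" and parts[1] in customers:
        del customers[parts[1]]  # a real system might archive instead
        return 204, None
    return 404, None
```

After `delete("/customers/76772374")`, a further `get` on the same path returns 404, matching the behaviour described above.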
POST and PUT have more nuanced semantics.
PUT means “Create or Update” and always refers to a particular resource, not to a collection. If the resource does not already exist, it is created and the given URI is associated with it from then on, provided the data payload satisfies the correctness and completeness requirements for resource creation. If the resource already exists, the associated data payload determines which attributes of the resource must be updated.
http://www.xyz.com/customers/76772374 <= change billing address
<billing-address>
... <= payload specifying attribute for update
</billing-address>
POST means Create (Insert) and is normally applied to collections. A URI is generated by the system, associated with the newly-created resource and returned as part of the response to this request. POST is also a catch-all verb and may be used to handle odd cases that don't neatly fit the semantics of the other verbs.
http://www.xyz.com/customers/ <= create a new customer
<customer-details>
... <= payload with full details of customer
</customer-details>
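The difference between PUT and POST can be sketched as follows. The store, the ID generator and the status codes returned (201 for creation, 200 for update) are assumptions made for illustration; the completeness check that the slide requires before PUT may create a resource is omitted for brevity.

```python
import itertools

customers = {}             # hypothetical store: URI path -> attributes
_ids = itertools.count(1)  # assumed server-side ID generator


def put(path: str, payload: dict):
    """Create-or-update a specific resource at a URI chosen by the consumer."""
    if path in customers:
        customers[path].update(payload)  # update only the attributes supplied
        return 200, path
    customers[path] = dict(payload)      # validation of completeness omitted
    return 201, path


def post(collection: str, payload: dict):
    """Create a resource in a collection; the system generates the URI."""
    path = f"{collection.rstrip('/')}/{next(_ids)}"
    customers[path] = dict(payload)
    return 201, path  # the new URI is returned to the consumer
```

Calling `put` twice with the same payload leaves one resource in the same state, whereas calling `post` twice creates two resources with different URIs.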
Idempotence and Safety of HTTP Verbs in REST
[Figure: Venn diagram showing the inter-relationship between Idempotence* and Safety**, and where the HTTP verbs used by REST lie within this area.]
- Safe (hence idempotent) operations: GET
- Idempotent (but unsafe) operations: PUT, DELETE
- Non-idempotent (hence unsafe) operations: POST
* An idempotent operation has the same effect when it is performed multiple times as when it is performed exactly once.
** A safe operation has no side-effects. Queries/reads/retrievals are the canonical safe operations.
All safe operations are idempotent, but the reverse is not necessarily true.
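Because the classification depends only on the verb, an intermediary can decide what to cache or retry without knowing anything about the application. A small sketch, covering only the four verbs discussed in this deck (the function names are hypothetical):

```python
SAFE = {"GET"}                         # no side-effects
IDEMPOTENT = SAFE | {"PUT", "DELETE"}  # every safe verb is also idempotent


def may_cache(method: str) -> bool:
    """Only safe operations are candidates for caching."""
    return method.upper() in SAFE


def may_retry_blindly(method: str) -> bool:
    """Idempotent operations can be repeated when the outcome is unknown."""
    return method.upper() in IDEMPOTENT
```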
HTTP Status Codes and their Meanings
REST is not just about 4 verbs to formulate requests. The status codes of responses convey many nuances of meaning.
1xx – Informational: 100 Continue, 101 Switching Protocols
2xx – Successful: 200 OK, 201 Created, 202 Accepted, 203 Non-Authoritative Information, 204 No Content, 205 Reset Content, 206 Partial Content
3xx – Redirection: 300 Multiple Choices, 301 Moved Permanently, 302 Found, 303 See Other, 304 Not Modified, 305 Use Proxy, 306 (unused), 307 Temporary Redirect
4xx – Client Error: 400 Bad Request, 401 Unauthorized, 402 Payment Required, 403 Forbidden, 404 Not Found, 405 Method Not Allowed, 406 Not Acceptable, 407 Proxy Authentication Required, 408 Request Timeout, 409 Conflict, 410 Gone, 411 Length Required, 412 Precondition Failed, 413 Request Entity Too Large, 414 Request-URI Too Long, 415 Unsupported Media Type, 416 Requested Range Not Satisfiable, 417 Expectation Failed
5xx – Server Error: 500 Internal Server Error, 501 Not Implemented, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout, 505 HTTP Version Not Supported
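Most HTTP libraries ship these codes as constants, so consumers rarely need to hard-code them. As one example, Python's standard `http.HTTPStatus` enum maps a numeric code to its registered reason phrase (which may be worded slightly differently from a given slide or server); the `describe` helper below is a hypothetical convenience wrapper.

```python
from http import HTTPStatus

CLASSES = {1: "Informational", 2: "Successful", 3: "Redirection",
           4: "Client Error", 5: "Server Error"}


def describe(code: int) -> str:
    """Return the code, its standard reason phrase and its class."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase} ({CLASSES[code // 100]})"
```

For instance, `describe(202)` yields `202 Accepted (Successful)`.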
A REST Example
What would RESTful Internet Banking look like?
The Customer resource:
http://www.xyz.com/customers/76772374 <= A particular customer, uniquely identified
http://www.xyz.com/customers <= The set of all customers
The Account resource:
http://www.xyz.com/customers/76772374/accounts <= This customer's accounts
http://www.xyz.com/accounts/675653-767973 <= A particular account (may be jointly held)
Account List Query:
GET => http://www.xyz.com/customers/76772374/accounts
may return many records in this form (note the hyperlink):
<account-list>
  <account href="http://www.xyz.com/accounts/675653-767973"
    account-number="675653-767973" account-balance="5042.74"/>
  ...
</account-list>
This data may be processed or displayed, and also used for further queries (below):
Account Statement Query:
GET => http://www.xyz.com/accounts/675653-767973 <= use hyperlink from previous query
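A consumer does not construct the account URI itself; it follows the hyperlink returned by the previous query. A minimal sketch using Python's standard XML parser, with a sample response body modelled on the account list in this example (the `account_links` helper is hypothetical):

```python
from xml.etree import ElementTree as ET

# Sample response body modelled on the account list in this example.
response_body = """
<account-list>
  <account href="http://www.xyz.com/accounts/675653-767973"
           account-number="675653-767973" account-balance="5042.74"/>
</account-list>
"""


def account_links(xml_text: str) -> list:
    """Return the href of every account, ready for a follow-up GET."""
    root = ET.fromstring(xml_text.strip())
    return [acct.get("href") for acct in root.findall("account")]


links = account_links(response_body)
```

Each entry in `links` is the target of the Account Statement Query.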
Updates:
Change of Address (Idempotent but not Safe):
PUT => http://www.xyz.com/customers/76772374 <= update a particular customer's details
<address street-number="23/56" street-name="Rest Mews" suburb="Epping" post-code="2121" />
Funds Transfer (Inherently neither Idempotent nor Safe):
GET => http://www.xyz.com/transfers/?new
Returns a confirmation URI (one-time access only):
http://www.xyz.com/transfers/R5YU780A32JK9Y
Perform the transfer using this one-time URI:
POST => http://www.xyz.com/transfers/R5YU780A32JK9Y <= cannot be re-accessed, ensuring idempotence
<transfer-request from-account="657653-767973" to-account="876456-676786" amount="1000.00"/>
Returns 202 Accepted (HTTP Status) the first time if successful, or an error status if not.
Accidentally repeating the transfer request will return a 405 (Method Not Allowed) response because the URI, being one-time, is no longer accessible after the first successful POST. This is REST's simple way of ensuring idempotent operations.
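On the provider side, the one-time URI scheme needs little more than a record of which URIs have been issued and which have been used. The sketch below is an in-memory illustration; the `TransferService` class, its starting balances and the 404 returned for a URI that was never issued are assumptions, not part of the slides.

```python
import secrets


class TransferService:
    """Hypothetical in-memory stand-in for the /transfers resource."""

    def __init__(self):
        self.pending = set()  # one-time URIs issued but not yet used
        self.used = set()     # one-time URIs already consumed
        self.balances = {"657653-767973": 5000.0, "876456-676786": 0.0}

    def new_transfer_uri(self) -> str:
        """GET /transfers/?new : issue a fresh one-time URI."""
        uri = f"/transfers/{secrets.token_hex(8).upper()}"
        self.pending.add(uri)
        return uri

    def post(self, uri: str, from_account: str, to_account: str,
             amount: float) -> int:
        """POST to a one-time URI; returns an HTTP status code."""
        if uri in self.used:
            return 405  # already consumed: the transfer is not repeated
        if uri not in self.pending:
            return 404  # never issued (assumed behaviour)
        self.pending.discard(uri)
        self.used.add(uri)  # mark as used before applying the transfer
        self.balances[from_account] -= amount
        self.balances[to_account] += amount
        return 202
```

A production implementation would mark the URI as used and apply the transfer in a single database transaction; that detail is outside this sketch.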
Implementing “Enterprise-Class” features with REST
- Performance
- Asynchronous Interactions
- Reliability
- Security
Caching and Performance Optimisation in REST
Because GET is safe (and idempotent), it can be cached, improving performance.
Caveats:
1. GET must never be implemented with side-effects:
GET http://www.xyz.com/accounts/675653-767973?action=delete <= This is not RESTful usage
2. Time-sensitive data must support negotiation around expiry:
Send an “If-Modified-Since” HTTP header along with the GET request.
If content remains unchanged, the server responds simply with a “304 Not Modified” header and does not re-send the data.
If content has changed, the server responds with a “200 OK” response, a “Last-Modified” header and the new data.
ETags are another way to determine if content has changed. An ETag is like a hash or digest of the content that indicates whether content has changed or not. A caching proxy can send a lightweight HEAD request to the origin server. If the ETag in the header has not changed, the proxy can safely serve up the response from its own cache without hitting the origin server with the full request.
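A sketch of how an origin server might answer a conditional request using ETags. It uses the standard `If-None-Match` request header, which is HTTP's ETag-based counterpart to `If-Modified-Since`; deriving the tag from a truncated SHA-256 digest is an assumed choice, and both function names are hypothetical.

```python
import hashlib


def etag_for(content: bytes) -> str:
    """Derive an ETag from the content (SHA-256 is an assumed choice)."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


def conditional_get(request_headers: dict, content: bytes):
    """Return (status, headers, body), honouring If-None-Match."""
    tag = etag_for(content)
    if request_headers.get("If-None-Match") == tag:
        return 304, {"ETag": tag}, b""  # unchanged: headers only
    return 200, {"ETag": tag}, content  # changed, or first request
```

The cache stores the `ETag` from the first 200 response and sends it back on later requests; as long as the content is unchanged it receives only a 304 with no body.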
Comparison with SOAP: Note that because SOAP-based service operations are arbitrarily named, there is no automated way to determine whether they are safe and idempotent. Hence it is not possible for infrastructural components like caching proxies to seamlessly improve performance without special (read: application-aware) configuration.
Enterprise Queueing vs REST – which is more performant?
Enterprise Message Queueing products are highly optimised for performance and support both vertical and horizontal scalability. At high levels of scalability though, such infrastructure can be quite expensive.
A standard web server is moderately scalable, but a load-balanced web server farm is much more scalable, especially for stateless interactions. Since REST is a stateless architecture, adequate performance for most applications is achievable with inexpensive commodity hardware.
The justification for enterprise queueing products is therefore much lower with a REST model.
Asynchronous Communications using REST
REST uses HTTP, which is a synchronous request/response protocol.
How can we use RESTful techniques to support asynchronous interactions, e.g., long-running processes where the service consumer cannot afford to “block” on the response?
Three standard patterns (independent of REST):
- Fire-and-forget (service consumer does not wait for a response)- Polling (service consumer periodically polls the status)- Callback (service provider calls consumer back when done)
How REST implements Asynchronous Interactions – 1
Fire-and-Forget (Reliable one-way messaging)
[Figure: a Service Consumer POSTs <data/> to a Load Balancer in front of a Web Server Farm. The Load Balancer sends heartbeats to each web server; active servers respond, while a failed server times out. The Load Balancer chooses a physical server based on the LB algorithm and forwards the POST to it. The consumer receives 202 Accepted, which is merely an acknowledgement of the request, not the response.]
A single web server is less reliable and available than enterprise queueing infrastructure. But a load-balanced web server farm approaches queueing infrastructure in availability. Reliable one-way messaging (fire-and-forget) can thus be implemented simply and inexpensively.
How REST implements Asynchronous Interactions – 2
Polling
[Figure: the Service Consumer POSTs <data/> to the Service Provider URI and receives 202 Accepted with a Status Query URI; this is merely an acknowledgement of the request, not the response. The consumer then polls repeatedly with GET on the Status Query URI, receiving 404 Not found until the result is ready, and finally 200 OK with <data> ...</data>.]
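The polling exchange on this slide can be sketched in a few lines. Everything here is a hypothetical in-memory stand-in: the `PollingProvider` class, the `/status/<n>` URI layout and the `on_wait` hook (which takes the place of a sleep between polls).

```python
import itertools


class PollingProvider:
    """Hypothetical provider: accepts work, exposes a status-query URI."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._results = {}  # status URI -> result, or None while pending

    def post(self, data: str):
        status_uri = f"/status/{next(self._ids)}"
        self._results[status_uri] = None
        return 202, status_uri  # acknowledgement only, not the response

    def complete(self, status_uri: str, result: str):
        """Called by the back end when the long-running work finishes."""
        self._results[status_uri] = result

    def get(self, status_uri: str):
        result = self._results.get(status_uri)
        if result is None:
            return 404, None  # not ready yet, as on the slide
        return 200, result


def poll(provider, status_uri, attempts, on_wait=lambda: None):
    """Poll up to `attempts` times; `on_wait` stands in for a sleep."""
    for _ in range(attempts):
        status, body = provider.get(status_uri)
        if status == 200:
            return body
        on_wait()
    return None
```

Because GET is safe, the consumer can poll as often as it likes without affecting the outcome.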
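How REST implements Asynchronous Interactions – 3
Callback
[Figure: participants are the Service Consumer, the Service Provider URI, a Callback Subscription URI, a Notification Service and the consumer's Callback URI. The consumer POSTs <data/> to the Service Provider URI and receives 202 Accepted with a Callback Subscription URI; this is merely an acknowledgement of the request, not the response. The consumer POSTs its Callback URI to the Callback Subscription URI and receives 202 Accepted. When processing completes, the Notification Service POSTs <data> ...</data> to the Callback URI.]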
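A simplified sketch of the callback pattern. To stay short it folds the subscription step into the original request (the consumer supplies its callback URI along with the data), and it injects a fake `http_post` function in place of a real HTTP client. The class and function names and the `consumer.example` URI are hypothetical.

```python
class CallbackProvider:
    """Hypothetical provider that notifies consumers when work completes."""

    def __init__(self, http_post):
        self._http_post = http_post  # injected function: (uri, body) -> status
        self._subscriptions = {}     # request id -> callback URI
        self._next_id = 1

    def post(self, data: str, callback_uri: str):
        request_id = self._next_id
        self._next_id += 1
        self._subscriptions[request_id] = callback_uri
        return 202, request_id  # acknowledgement only, not the response

    def finish(self, request_id: int, result: str):
        """When processing ends, POST the result to the callback URI."""
        callback_uri = self._subscriptions.pop(request_id)
        return self._http_post(callback_uri, result)


received = []


def fake_http_post(uri, body):
    """Stand-in for a real HTTP client; records what would be sent."""
    received.append((uri, body))
    return 202
```

The consumer must itself expose an HTTP endpoint for the callback, which is the main practical difference from polling.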
Reliable Messaging/Guaranteed Delivery using REST
In many cases, even if we say we want guaranteed delivery, we are really concerned about the uncertainty surrounding transaction status.
1. “Guaranteed Delivery” is a chimera when we also want acknowledgements of the delivery. This reduces to the “Two-Army Problem” of networking theory which is proven to be unsolvable.
2. “At-Most-Once Delivery” is in fact a requirement for “At-Most-Once Processing” (i.e., Idempotence).
3. Uncertainty is less of a problem if idempotence can be guaranteed. If in doubt, retry the request!
Enterprise Queueing vs REST – which is more “reliable”?
[Figure: a Service Consumer and a Service Provider connected by a Request Queue and a Response Queue.]
1. The Service Consumer places a request message on the Request Queue.
2. The Service Provider pulls the request message off the Request Queue.
3. The Request Queue confirms to the Service Consumer that the message has been delivered.
4. The Service Provider processes the message (e.g., updates a resource in a non-idempotent way).
5. The Service Provider places either a Success or Failure message on the Response Queue.
6. The Service Consumer pulls the status message (Success/Failure) off the Response Queue.
7. The Response Queue confirms to the Service Provider that the message has been delivered.
The two queues guarantee message delivery and even confirm delivery to the application at the other end. However, the possibility of a fatal error between steps 2 and 4, or between steps 4 and 5, means that end-to-end guarantees of the business transaction are not possible.
In the first case (fatal error between steps 2 and 4), the message is not acted upon and the resource is not updated. In the second (fatal error between steps 4 and 5), the message is acted upon, the resource updated, but no status message is placed on the Response Queue. In either case, the Service Consumer fails to receive a response, hence is uncertain whether the transaction was processed or not. Since the operation is non-idempotent, it cannot be safely retried.
Hence guaranteed message delivery alone is not a solution to the critical requirement of an Exactly Once business transaction.
Enterprise Message Queueing infrastructure is expensive, but ironically does not address this business requirement. REST addresses the business requirement through the POST-Once-Exactly (POE) pattern, which guarantees idempotence and makes uncertainty a non-issue. It is also inexpensive.
Even with reliable queues, round-trip reliability is impossible to achieve.
The Service Consumer wants to be certain about the status of their service request, whether success or failure. Is this possible using reliable queueing?
How REST implements Idempotent Operations
[Figure: sequence between the Service Consumer, the Service Provider URI and a One-Time URI.]
1. GET on the Service Provider URI returns 200 OK with a one-time URI. The consumer is asking for a one-time URI. This is an idempotent operation and can safely be done any number of times, as long as only one of the returned URIs is POSTed to.
2. POST of <data>...</data> to the one-time URI returns 202 Accepted. The first time the POSTed data is received and processed, the URI is marked “used” and will no longer be valid.
3. A repeated POST to the same URI returns 405 Method Not Allowed. The URI is no longer accessible. This indicates that a previous POST was successful.
Even if we assume that the consumer does not receive the acknowledgement of the transaction, there is no danger of a duplicate update if the POST is retried. Idempotence is guaranteed by the one-time URI.
The POST-Once-Exactly (POE) Pattern
Many enterprise requirements for “guaranteed delivery” (where the concern is not timeliness but the impact of erroneous retries due to uncertainty) can be satisfied by ensuring idempotence. A focus on the real underlying issue makes REST an attractive and less expensive solution compared to strong queueing infrastructure. Besides, the point-to-point guaranteed delivery of queueing systems does not provide round-trip guarantees, which idempotence addresses (next slide).
The one-time URI may be persisted in case a crash occurs between the POST and the receipt of its response.
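From the consumer's side, the pattern reduces to retrying the same one-time URI until the outcome is known. The sketch below simulates a lost acknowledgement with a hypothetical `FlakyProvider`; the `post_once_exactly` helper and the use of `TimeoutError` to model the lost response are assumptions made for illustration.

```python
class FlakyProvider:
    """Hypothetical provider whose first acknowledgement is lost in transit."""

    def __init__(self):
        self.used = set()
        self.processed = []  # payloads actually applied
        self._drop_next_ack = True

    def post(self, one_time_uri: str, payload: str) -> int:
        if one_time_uri in self.used:
            return 405  # URI already consumed by an earlier POST
        self.used.add(one_time_uri)
        self.processed.append(payload)
        if self._drop_next_ack:
            self._drop_next_ack = False
            raise TimeoutError("acknowledgement lost")
        return 202


def post_once_exactly(provider, one_time_uri: str, payload: str,
                      max_tries: int = 3) -> bool:
    """Retry the same one-time URI until the outcome is known."""
    for _ in range(max_tries):
        try:
            status = provider.post(one_time_uri, payload)
        except TimeoutError:
            continue  # outcome unknown: safe to retry the same URI
        # 405 on a retry means an earlier attempt was already processed.
        return status in (202, 405)
    return False
```

The retry only works if the consumer still has the URI it used the first time, which is why persisting the one-time URI before POSTing matters.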
REST and the Presentation/Service Divide
Traditionally, this is the way we have viewed Presentation and the Service Tier:
[Figure: two tiers. Presentation Support (web server) exposes the visual interface; Business Logic exposes the non-visual interface.]
But this is the REST model:
[Figure: a Browser sends GET (Accept: text/html) to a RESTful Resource and receives an HTML response; another Application sends GET (Accept: application/xml) to the same resource and receives an XML response.]
The same resource can return various representations of itself, both visual and non-visual. In effect, a web app can itself provide a service interface.
The resource responds with a representation of itself that suits what the service consumer says it wants through the “Accept” HTTP header.
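A deliberately simplified sketch of this content negotiation. It only checks whether a media type appears in the `Accept` header and ignores quality values and wildcards, which a real server would handle; the `represent` function and the customer data are hypothetical.

```python
from xml.sax.saxutils import escape


def represent(customer: dict, accept: str):
    """Return (content_type, body) for the representation requested."""
    name = escape(customer["name"])
    if "text/html" in accept:
        return "text/html", f"<html><body><h1>{name}</h1></body></html>"
    if "application/xml" in accept:
        return "application/xml", f"<customer><name>{name}</name></customer>"
    return None, None  # a real server would answer 406 Not Acceptable
```

The same URI serves both browsers and applications; only the `Accept` header differs.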
How does REST compare in terms of infrastructure cost and development effort?
SOAP/WS-*:
● Application Server
● SOAP Engine
● ESB/Broker
● Message Queues
● Legacy/Infrastructure Adapters
● Registry/Repository
● Specialised Dev Tools
● Specialised Management Tools
$$$ + development effort
REST:
● Any programmable Web Server, e.g., Tomcat or Apache/PHP (Full-fledged JEE App Servers are overkill)
● Legacy/Infrastructure Adapters
● DNS Server
● Standard Web + XML Dev Tools
● Standard Web Management Tools
Most infrastructure already exists.
Much less development effort
Infrastructural components required for REST
[Figure: layered architecture.
- Clients: a Browser (native HTTP) and Client Apps (HTTP client library).
- REST Service Interface: resource collections (GET, POST) and individual resources (GET, PUT, DELETE), exposed through Servlet, Restlet or JSR 311 annotations. The service export is a loosely-coupled mapping.
- Service Implementation: Java Domain Objects (DO) hosted in a web container on a programmable web server/servlet engine.
- Legacy Resources, reached over native protocols: Mainframe CICS (via CICS Transaction Gateway), IMS (via IMS Resource Adapter), Queue (via JMS) and Database (via Hibernate/JDBC).]
If using Java, a web container is sufficient to host domain objects. There is no need for an EJB container. The domain model represented by these domain objects may be translated to the REST service interface using servlets, restlets or the newer JSR 311 annotations. Client applications need an HTTP Client library to consume these services.
Industry support for REST
New Java standard to expose REST services through annotations: JSR 311
REST implementations:
- IBM – Project Zero (PHP based REST server)
- Microsoft – Astoria (.NET implementation)
- Sun – Jersey (JSR 311 implementation)
- WSO2 – Mashup Server (JavaScript-based server)
REST APIs:
- Amazon eCommerce API
- eBay Developer API
- Yahoo! Web Service API
But why is REST so much simpler than SOAP/WS-*?
From one angle, any SOA implementation is just a means of moving XML documents around, because XML documents formalise the contract between components in a technology-neutral way.
SOAP/WS-* defines one kind of “plumbing” to move XML document payloads around.
REST defines another kind of plumbing. The only known implementation of REST is based on HTTP. This has proven to be sufficient for virtually every enterprise use case.
SOAP places unnecessary emphasis on transport-neutrality. Transport-neutrality is a feature with no practical benefit. The downsides of transport neutrality are (1) a failure to exploit the many useful features of the HTTP protocol and (2) a necessity to reinvent the same features at a higher level of the stack.
New infrastructural components are required to understand and speak the SOAP protocol. Such components already exist for the REST protocol (HTTP), i.e., web servers.
“Things should be as simple as possible, but no simpler” - Albert Einstein
Salesman: “This machine will cut your work in half.”
Customer: “Fine, I'll take two!”
REST is not for the intellectually lazy. It demands rigour in design.
The XML documents corresponding to the various service contracts must be carefully designed. The data modelling effort remains significant.
But fortunately, that is the only major component in designing and building RESTian systems.
REST provides much simpler plumbing, so the complexity of the infrastructure and the related configuration effort are dramatically reduced.
In other words, REST makes SOA simpler by eliminating needless complexity.
What REST will not do for you
Tasks that still need to be done:
- Need for domain data modelling does not go away
- Need for service contract does not go away, must decide what resource abstractions to expose
- No built-in security model, need to leverage SSL or IPSec at wire protocol level or implement bespoke end-to-end security model at payload level
- No built-in reliability model, must rely on design patterns to achieve same outcome (e.g., idempotence)
- Governance tasks still remain, although REST, being “web style”, is inherently more federation-friendly
Conclusion
SOA design need not be hard. There are a few simple and basic principles that need to be applied consistently (fundamentally, it's about loose coupling between systems).
These principles are harder to apply with the SOAP/WS-* model.
The REST style involves an order of magnitude less complexity than SOAP/WS-*. Everything is simpler – the conceptual model, the infrastructural components, the tooling, the metadata required, the level of governance, etc.
The biggest impediment to the adoption of REST: “Fear of the unknown”