Web API Design
Ngo Nguyen Chinh
Hanoi 2016
(*) Here are my notes after completing a course of study offered by Pluralsight.
1. Introducing Web API Design
Web API Design
This course includes pragmatic advice on what are Web APIs, some basic Archetypes
for API Designs, how to Version Web APIs, how to Secure Web APIs, and finally we're
going to talk about Hypermedia, and what people mean when they talk about it.
This course isn't meant to teach you how to build APIs. This course is meant to help you
design APIs. The focus of this course is to help you Design an API, and the course looks
at the API from the view of the developer that is using your API. This is not specific to
Microsoft's ASP.NET Web API technology, it's about APIs on the web in general. This
includes REST APIs, we're going to touch a little bit on RPC, and Hypermedia or
HATEOAS APIs. These APIs can be written in a variety of languages; this course is not
about any specific language.
Let's get started.
What are Web APIs?
We're going to talk about the introduction of what we mean by Web API and the different
technologies that are involved. We're going to start by simply talking about What are
Web APIs? We will then move on to Why Design is Important to designing your APIs,
and then we'll discuss the different kinds of APIs, the nature of REST, The Role of
HTTP, and finally what is Hypermedia. All of these are going to be related to building our
APIs.
So the central theme of this course is going to be the nature of what APIs really are. We
want to be able to take a look at how people are going to be consuming services or data
that you're going to be exposing. This is the essential interface to your application, to
your system, to your architecture. So these APIs represent a way for the consumers of
your API to be able to access those services and data in the simplest way possible.
Now back in the day we would design these APIs, and then we would publish them with
some printed materials or a help document, or things like that, that would help them
understand exactly how to use them. The APIs we're going to talk about should be much
more self-describing. We're going to be using technologies like REST and Hypermedia
to help us understand the nature of what we're building without being bogged down in
the dogma of some of these keywords that have caused arguments throughout the
community; we're going to try to cut through the dogma and really talk about the
pragmatic way to design these APIs.
This design is important because the developers that are going to be consuming these
APIs are going to want to be able to see how to use them in a fairly simple way. They
should be not only self-describing but self-documenting as much as possible. A developer
should be able to look at what calls are available, and it should feel natural to take the next
step and get more data or use more services. For example, if
you're exposing something like customers from an API, the orders for those customers and
the line items for those orders should be a natural progression in the API; I shouldn't
have to go back and look up each call in the system to get at interrelated data. In that
same way, when I want to deal with the different operations on that data, it should be
fairly clear what they are, whether that's by using Hypermedia to describe those operations or by
using HTTP verbs to hint at which other operations are allowed.
Let's talk about the API ecosystem on the web today.
The API Ecosystem
When we say Web API, what do we really mean?
In the beginning of web development, when we needed APIs we typically relied on
something called a Remote Procedure Call. This really harkened back to an
earlier time when we were building systems that needed to talk to each other by using
things like COM+ and DCOM or CORBA in the Java world, in order to create systems
that could talk to each other over a network connection. When we came into the web, we
decided, hey we already know how to do this idea of Remote Procedure Call, let's just
do it over the HTTP layer that we're using to communicate and deliver our websites as
well. Remote Procedure Call is typically identified by a couple of ideas. One is that
they're going to use URI Endpoints or addresses to get at certain pieces of functionality,
services or data, as we talked about before. But unlike some of the technologies we'll
look at, the verbs are typically included in these APIs. For example, we could look at
an API that said Get Customers as part of the URI. So, if we had /API/GetCustomers,
that is more of the type of operation that we would see in typical Remote Procedure Call
systems.
Back in 2000, this idea of REST sort of blossomed, and there was some discussion
about REST versus things like SOAP, but REST has become a common pattern for
building these systems. REST is different from RPC: it also uses URI Endpoints,
but it typically dictates that the URI should be resource-based. So, instead of Get
Customer it would just be Customers and Orders and Invoices, and the other types of
objects in your system, instead of including the verbs in the API name in the URI
Endpoint, it does this with HTTP verbs. So, if you want to be able to get at the customer
list, you would simply issue an HTTP GET to that customers endpoint; if you
wanted to create a new one you would POST to it, etc. And REST also dictates that the
server be stateless, so that as we make additional calls into the system, the server isn't
holding on to some state that it needs in order to remember us, and this is something that
happens an awful lot in RPC where we have some token or some session state that
knows about us every time we call. A REST-ful API is indicated by trying to be stateless. And
then finally, the kind of data that you're pulling back or the type of result from those
services typically isn't tied to a single type of format. There's a Content Negotiation,
which we'll talk about in a few minutes, to help the services figure out how the client
needs the data. If the client is a webpage, something like JSON or JSONP is
appropriate, whereas if they're dealing with something like a rich client they might be
more comfortable or easier to manage with something like XML. The server should care
less about what that content is and allow a negotiation to happen to determine how to
return that data.
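The contrast between the two URI styles can be sketched as a small lookup table. The endpoint names below are hypothetical illustrations, not taken from any real API: each RPC-style URI carries its verb in the name, while the REST-style equivalent pairs an HTTP verb with a resource URI.

```python
# Hypothetical RPC-style URIs mapped to (HTTP verb, resource-based URI) pairs.
# All endpoint names are made up for illustration.
RPC_TO_REST = {
    "/api/GetCustomers": ("GET", "/api/customers"),
    "/api/CreateCustomer": ("POST", "/api/customers"),
    "/api/UpdateCustomer?id=123": ("PUT", "/api/customers/123"),
    "/api/DeleteCustomer?id=123": ("DELETE", "/api/customers/123"),
}


def distinct_rest_uris(mapping):
    """Return the sorted set of resource URIs the REST style needs."""
    return sorted({uri for _verb, uri in mapping.values()})
```

In this sketch, four RPC endpoints collapse into two resource URIs, with the HTTP verb carrying the operation.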
And finally, somewhat more recently, this idea of something called HATEOAS, not an
acronym I'm particularly fond of, but it stands for Hypermedia As The Engine of
Application State. Essentially this adds onto the idea of REST-fulness or REST
interfaces, and includes in the payload links to do other operations. So, there will be
links inside of the payloads of the data from these services that will indicate to
the user of the API which other operations are available. You might have the
idea of submitting a new invoice, and getting an old invoice might give you the URL
you would use to submit an updated version of that invoice. This is the idea behind
Hypermedia; we'll talk a little bit more about that soon.
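As a rough sketch of what such a payload might look like, the function below builds an invoice representation that carries links to related operations. The field names (`links`, `rel`, `href`, `method`) and the `/api/invoices` path are assumptions for illustration; the notes do not prescribe a particular link format.

```python
def invoice_with_links(invoice_id, total):
    """Build an invoice payload that advertises what can be done next.

    The link structure here is a hypothetical convention, not a standard.
    """
    base = f"/api/invoices/{invoice_id}"
    return {
        "id": invoice_id,
        "total": total,
        "links": [
            {"rel": "self", "href": base, "method": "GET"},
            {"rel": "update", "href": base, "method": "PUT"},
            {"rel": "payments", "href": f"{base}/payments", "method": "GET"},
        ],
    }
```

A client that retrieves invoice 42 can read the `update` link to learn where an updated version should be sent, rather than having that URL hard-coded.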
Resource-based Architecture
Before we dive into the actual design, let's start with some of the foundations, and one of
these foundations is the idea behind a Resource Based Architecture.
Resources are simply put, Representations of Real World Objects or Entities. We
can think about these as People, Invoices, Payments, other things in systems you're
building; you're probably already doing this in your existing software development
career: you've created classes, or structures, or databases to store this sort of information
and consume this sort of information. We're simply saying that what we're talking about
is going to start with this idea of resources. In these resources, relationships are typically
nested down a path of those resources. So, if we have the idea of a customer, the
customer may have a relationship to its own orders, and those orders might have a
relationship to their own order items, and those order items might have a further
relationship to the products that are being purchased in each of those line items. You
should think of these as Hierarchies or Webs of information, not necessarily Relational
Models, because the kind of data that you're going to be dealing with and you're going to
be producing in these APIs, and consuming, is going to be typically Hierarchies and
Webs, not Relational Models in the sense of tables and related tables in the strict sense
of relational databases.
In these Architectures, these Resources are normally Represented as URIs. These
URIs are Paths to those Resources, so when you want to create your APIs, you're going
to want to use URIs to get at those Resources. Query Strings are often used in these
URIs as well, but for non-data elements. So you don't want them to represent verbs, you
know operations against those resources, and you don't want them to represent the
actual data. Often they're used for different purposes, like sorting, maybe filtering, and
sometimes what format you're getting the data back in.
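To make the split between path and query string concrete, here is a minimal sketch that takes a URI apart with Python's standard `urllib.parse`. The example URI and the `sort` and `format` parameter names are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit, parse_qs


def describe(uri):
    """Split a URI into resource path segments and non-data options."""
    parts = urlsplit(uri)
    # The path identifies the resource (nouns and keys).
    segments = [s for s in parts.path.split("/") if s]
    # The query string carries options such as sorting or format.
    options = {name: values[0] for name, values in parse_qs(parts.query).items()}
    return segments, options
```

For `/api/customers/123/orders?sort=date&format=json`, the path walks from the customer down to its orders, while the query string only adjusts how the result is sorted and formatted.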
Let's see what this looks like in REST.
Introducing REST
So when I started to do the research for this course, I reached out to some other authors
of courses out there to talk about what does REST mean to them, and unfortunately, this
has become a fairly contentious discussion.
I found that a lot of people are either on the one side of saying that REST as a
philosophy isn't super useful because it's so dogmatic; the constraints it puts on a
system to be blessed as being REST-based becomes overwhelming and not useful to
the day-to-day developer.
On the other side I heard arguments that REST is really important because it helps us
dictate how we want to design those APIs, and that the specific constraints of the
original discussions about what REST is and what REST isn't aren't as useful.
So my goal here is to talk about REST in a very pragmatic sense; how REST can help
you design good APIs, APIs that are going to stand the test of time, that won't have to
change often in order to deal with new constraints, but also leverage what is good about
developing APIs on the web.
So what is REST?
The term REST simply means Representational State Transfer. When we talk about
Representational, we typically mean the resources that we want to transfer across the
wire. These Representational States are typically resources: customers, orders, details,
products, etc. But in order to be considered REST-ful, there are some concepts that Roy
Fielding included in his original papers on this, and we want to really understand what at
the end of the day is useful in REST, and take what will help us build great APIs from
REST, and leave the dogmatic strictness of REST on the table. The
concepts that come clear from REST that I think are important are the Separation of
Client and Server, that the clients are going to call into the server based on URIs, and
the server is going to try to meet those URI requests, whether that's returning data,
whether that's adding data to the server, whether that's changing or deleting data, and
that each of these Requests should be Stateless. The server holds no notion of who the client
is between requests, so that the servers can be scaled out more seamlessly. And, where possible,
as many of these requests as possible should be Cached. When we talk about Cacheability,
we're typically talking about caching of data results, so typically GETs into a system. You
aren't really going to be able to cache inserts of new items or deletions of items, but you can
cache Requests for getting data, and as long as the data hasn't
changed, you can be pretty aggressive about your caching. And we also want to make
sure that we're talking about these Uniform Interfaces, that when someone comes up
to our API, we're really saying that if you were able to get a customer object through the API,
and you know that there are order objects out there, you can probably get
them with the same pattern that you used for the customer object. That
really means that you may be walking down the URI by saying customer/1 for the first
customer, then /orders to get the orders for that customer; I should also be able to say
orders/1 to get the first order in the system as well. These URIs are going to look like
each other, and that's what we mean by Uniformity or Uniform Interfaces.
All of this is really good when we think about what is useful for defining what REST is.
Some problems come in because the specific constraints in Roy Fielding's work for qualifying
your interface as strictly REST-ful or not REST-ful tend to add a lot of
constraints to the system. In my experience, trying to make your API strictly REST-ful,
or adhere to the REST principles, means you're spending a lot more time trying to follow
the letter of the law instead of the spirit of the law, and at the end of the day, worrying
about whether you're in this walled garden of what it is to be REST-ful or not REST-ful
isn't getting your job done. So I like to take what is good about REST, bring it in
pragmatically into what I'm building, but not worry so much about strict adherence to
those principles in a black and white way. A lot of this comes from the fact that when we
talk to different experts in the community or even developers in the community,
there's a split about how important the idea of REST is, because REST can
become very dogmatic. REST can be worried about strict adherence to a defined set
of rules instead of getting our job done as developers. We can learn a lot and pattern a
lot of what we should be doing from REST, but worrying about never straying from that
walled garden can get us in trouble in my experience. So, I'm going to teach this course
really from a pragmatic sense.
I'm going to try to take what is best from REST and apply it to your API designs.
Hypermedia
So the last big piece of the puzzle is this notion of Hypermedia, but let's step back a
minute and talk about Hyperlinks. The web is really driven by this notion of Hyperlinks,
and it was a core concept in the creation of the web initially, the ability to have
documents and websites that are linked to each other within themselves, to create really
a web of information, to have each of these different pieces around the internet really
tied to each other.
Hypermedia is a little bit like this. When we talk about Hypermedia, it's really a way for
the results of our API calls to be as Self-Describing as possible. Hypermedia is simply
a way to have links to resources that describe how to process the data or how to
get at the data in special ways. These are Hyperlinks for Resources, so you can
imagine that the links may include ways to get the cover for an album, or a way
to add new items to a collection; it's a way that the messages that we're getting from our
APIs are going to tell us more about how to use the service itself.
Is this important?
Hypermedia is HATEOAS. This Hypermedia As The Engine Of Application State, is a
design pattern that you're seeing in more and more APIs. Using it doesn't make your API
better or worse. Depending on what you're trying to accomplish, this can be very useful
or just additional overhead. If you are creating APIs that have special ways of describing
what they need to do, or maybe needing to be implemented by machine systems, this
can become very important. The idea of HATEOAS or Hypermedia is there so that you
can create these self-documenting APIs that can be very dynamic. But you can have
great APIs without HATEOAS. In fact, there are many, many APIs out there that are
considered solid and dependable and well-documented, that don't have any
Hypermedia. Again, don't get caught up in that dogma of your API isn't good enough
unless you're using every part of that REST stack, including Hypermedia.
What kind of API to use?
With all this information about how REST works, you still have to tackle the question of
should you be using REST, and should you be using it in the strict sense of what REST
mavens out there expect your API to be.
The Archetypes or the types of APIs that are out there do vary, and so you have to look
at what you're trying to accomplish and see what makes the most sense for your specific
project. REST or REST-ful APIs are easy to use and maintain, but if your API doesn't fit
this Resource-based Model, using something like Remote Procedure Call style or some
custom APIs, is acceptable. You may find that REST is too limiting for what you're trying
to do; maybe it doesn't fit into a resource model, or maybe you really need something
that is more driven by procedures, so something like Remote Procedure Call or a
custom API may be the right solution. Trying to take what you need to accomplish
and force it into the REST or REST-ful model can be counterproductive, so don't get
caught up in the idea that your API has to be REST or REST-ful to be a good and valid
API, but remember that REST matches many, many use cases, so make sure that
you're not avoiding REST just to avoid REST. I generally say to clients that if you're
building a new API, designing with a lot of the REST semantics we've
looked at here is the way to start. If you find that it becomes too limiting or
you're trying too hard to make it REST-ful, then you can start breaking out of that box,
but starting with REST as the natural starting point for your API design is usually what I
suggest.
Summary
Now that you're through the first module, you should see that API design is really
important.
Designing APIs is as important to developers as designing UI layers is to users.
Making your APIs easy to use, obvious, and simple is going to increase the
adoption of your API; you're going to get more and more developers using that API,
which is often the goal of any API.
Looking at the requirements of your API and then deciding whether your API should
follow a strict REST pattern or something more flexible is going to be a key to whether
you're successful or not. There are no right and wrong answers here. You're going to
have to make clear decisions about how you want to design these APIs, and know that
you may make mistakes. There are better and worse suggestions here, but there are no
black and white right or wrongs.
And I know I've been saying it a lot in this module, because I think it's really important:
Avoid the dogma of what everyone thinks is the perfect and great API and understand
that as developers we need to be pragmatic about these designs. If your decisions about
how to create your APIs continue to be pragmatic, and what I mean by that is that it's
going to serve the final use cases and serve the developers who want to use your API,
then it is probably the right decision. If you're making design decisions based on whether it
will be classified as a true REST-ful API, you're probably making the wrong decision.
2. Designing The API
Introduction
In this module we're going to be talking about designing the actual API itself. We're
going to start by talking about how to Design for the URI, Understanding the role of
Verbs, dealing with Status Codes, Associations, Designing the actual Results that are
returned, ETags, Paging and Partials, and finally Non-Resource based APIs.
Let's get started.
URI Design
So to begin our URI design, we're going to want to look at what the URIs actually look
like, and one of the ideas I want you to get a sense of is that the URIs should contain
Nouns not Verbs.
The problem with verbs is that we start very innocently with something like
getCustomers and saveCustomers, but very quickly this starts to balloon into a bunch of
different sorts of things we want to do with the data. Now instead of having just one
endpoint that we're maintaining, we're having to maintain a number of different endpoints
as different parts of our URI.
The solution is to use Nouns, or Resources, as REST likes to call them. The idea here is
that you want to have endpoints that are described as Plural versions of whatever nouns
you're going to expose. For example, this API has an endpoint called Customers; you
may have Games, Invoices, so you're going to use a Noun to indicate the endpoint that
is going to allow you to manipulate a type of object that you have on the server and that
you want the users of your API to be able to use.
Against those endpoints you're going to use identifiers to point to individual items in the
collection. This does not have to be the key that you use naturally. It does not have to be
the primary key that is contained in the database or some magic key, it can be one that
is generated. For example, being able to get a customer by its key, the API is simply
saying look at that noun, the Customers, and then give me the item that is identified by
the number 123. This may be the ID or something else. An example of something else is
something like Games where you have some unique identifier to find the individual
game, like halo-3; or in the case of Invoices, a date for that invoice. The important idea
here about that key is that it does not have to be the internal key that you're used to
using. It can be something generated, but it has to point at one and only one item. It
can't be generated every time someone comes to the API, because there's this notion of
idempotency, and what that means here is that when someone retrieves data or pushes data
using this key inside of your API, it always refers to the same object. Very commonly
this is going to be some primary key that you have in your storage mechanism, but it
could be something that is unique, like the full title of a story in a blog, or it could be the
concatenation of data as it's related; maybe customernumber-invoicenumber for
invoices. It's up to you to determine what that key is, but it has to remain tied to that
single entity.
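Two of the key styles mentioned above can be sketched as follows. Both helper names are hypothetical; the point is only that the key is derived from stable data, so the same item always yields the same identifier.

```python
import re


def slugify(title):
    """Derive a URL-friendly key from a title, e.g. for /api/games/<key>."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def invoice_key(customer_number, invoice_number):
    """Concatenate related values into one key, e.g. for /api/invoices/<key>."""
    return f"{customer_number}-{invoice_number}"
```

Because neither function depends on time or on a random value, calling it again for the same item produces the same key. A key derived from a title is only suitable if titles are unique and do not change.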
Understanding Verbs
So, if your URI design is supposed to be nouns, where do Verbs come in?
In the case of verbs, we're talking about HTTP verbs, and these verbs can be easily
matched to the create, read, update, and delete that we're probably used to when
dealing with things like database data. So, if we take our resource endpoint like
customers, we know that if we do a GET it's going to return a list of those
customers. If we POST a new customer to that endpoint, it's going to create a New
Customer. If we PUT a collection of customers to this endpoint, it will do a Batch
Update of those customers, and if we try to issue a DELETE against that resource,
it's going to give us an error, because we cannot delete the entire list of customers.
Conversely, if we issue these same verbs against an individual item in the collection, it's going to
do something a little different. So, if we do a GET it's going to get that individual item for
us. If we do a POST it's going to give us an Error, because we can't POST a brand new
item to an existing item. If we do a PUT it will Update that Item, and if we do a DELETE it
will Delete that Item.
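The verb rules described above fit in a small table. This is a sketch that assumes a hypothetical `customers` resource, where a path ending in `customers` is the collection and anything with a further segment is an individual item.

```python
# What each HTTP verb means against the collection and against one item,
# as described in the text above.
ACTIONS = {
    ("collection", "GET"): "list items",
    ("collection", "POST"): "create item",
    ("collection", "PUT"): "batch update",
    ("collection", "DELETE"): "error",
    ("item", "GET"): "get item",
    ("item", "POST"): "error",
    ("item", "PUT"): "update item",
    ("item", "DELETE"): "delete item",
}


def action_for(path, verb):
    """Look up the action for a verb on a hypothetical customers resource."""
    segments = [s for s in path.split("/") if s]
    # Assumption: ".../customers" is the collection, ".../customers/<key>" an item.
    kind = "collection" if segments[-1] == "customers" else "item"
    return ACTIONS[(kind, verb.upper())]
```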
But what should you return from those verbs as you encounter them?
o In these cases we're going to look at those same verbs, but figure out what we
have to return back to the client so the client knows what to do with them.
o So in the case of GET it should be pretty obvious; the customers endpoint
returns a List of those customers. A GET to the Item endpoint is going to return
just the individual Item that the user has pointed at.
o If we POST, like we talked about before, to the customers endpoint with the data
represents what is a customer, it will create a New Item, assuming it's all correct.
And we should return from that POST a new version of the item that was
inserted; not only that the creation happened, but a formatted object that
represents that New Item. And the reason we do that is sometimes part of that
creation process is setting things like default properties and generating the key
that they're going to need and things like that. So returning them that brand new
object is very useful for them to be able to consume what is really the last version
of the object as it existed on the server, which is just a moment ago when we
created it. If we attempt to POST to an individual item, because we can't POST to
an item like we talked about a minute ago, we should return a Status Code, and
that Status Code should be an Error Status Code, probably a 400, in that the
user of the API has done something incorrect.
o In the case of PUT, if we take a collection of updatable objects and PUT them to
the customers endpoint, it's going to attempt to update them all and then return a
Status Code to say whether it succeeded or not. But if we do a PUT to the item
resource, it will return the updated item because it makes sense to return an
individual item that was updated, and it may have been updated with more than
just the data that was sent to the server; it could include things like the last
updated date or update-related keys, and so you always want to return that
updated item.
o In the case of DELETE, because we can't delete the entire list of customers, we
should return a Status Code that is an Error Status Code when a DELETE is
done on the resource of customers. But if we delete an individual item, we should
return a Status Code that indicates whether you were able to delete that individual item.
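The return values described in this list can be sketched with a small in-memory store. The class and method names are hypothetical, the `active` default stands in for whatever defaults a real server would set, and the 201 and 404 codes come from the common status codes covered elsewhere in these notes. Each method returns a `(status_code, body)` pair.

```python
import itertools


class CustomerStore:
    """In-memory sketch of what each verb returns; not a real web framework."""

    def __init__(self):
        self._items = {}
        self._ids = itertools.count(1)

    def post_collection(self, data):
        # Creation sets defaults and generates the key, so return the new item.
        key = next(self._ids)
        item = {"active": True, **data, "id": key}
        self._items[key] = item
        return 201, item

    def post_item(self, key, data):
        # POSTing to an existing item is a client error.
        return 400, {"error": "cannot POST to an individual item"}

    def put_item(self, key, data):
        if key not in self._items:
            return 404, None
        # Keep the server-generated key; return the item as it now exists.
        self._items[key].update({**data, "id": key})
        return 200, self._items[key]

    def delete_collection(self):
        # Deleting the entire list is not allowed.
        return 400, {"error": "cannot DELETE the whole collection"}

    def delete_item(self, key):
        if self._items.pop(key, None) is None:
            return 404, None
        return 200, None
```

Returning the stored item from POST and PUT lets the caller see server-set values, such as the generated `id`, without issuing a second GET.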
Status Codes
When we talk about returning Status Codes, we're talking about HTTP Status Codes.
So, the HTTP Spec defines certain Status Codes, and there's quite a number of them.
A few are pretty common: 200 OK is the most common, meaning that a request has succeeded, and of
course everyone knows about things like 404 Not Found and 500 Internal Server Error.
Now our services can use all of these
Status Codes if we want, but we find that in a well-defined API, the number of
different Status Codes that can be returned from your particular service can be
simplified to maybe 8 to 10 Status Codes. There are exceptions to this, and you
may need more than 8 or 10, but getting a sense of what different Status Codes
each type of call can return, and simplifying that set, is going to make it easier for users of your
API.
Pragmatically, you want to use these Status Codes in returning from your API.
o At a minimum you should really support 200 for everything worked well, 400 for
you did something wrong, you made a bad request, or 500 for the server doesn't
know what it's doing and something has gone bad.
o Most likely you're going to also include these Status Codes: 201 for Created for
when you're doing a POST, 304 for Not Modified for when you're returning a cached
object, 404 for Not Found, and then 401 and 403 for when you're dealing with
Authentication and Authorization. Again, you may use more Status Codes than
just these, but this is a good simple set to start with.
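The two tiers of status codes listed above can be written down using the standard library's `http.HTTPStatus` enum. The names `MINIMUM` and `COMMON` are illustrative choices, not part of any API.

```python
from http import HTTPStatus

# The bare minimum: it worked, the caller got it wrong, the server got it wrong.
MINIMUM = {
    HTTPStatus.OK,
    HTTPStatus.BAD_REQUEST,
    HTTPStatus.INTERNAL_SERVER_ERROR,
}

# The slightly larger set most APIs end up supporting.
COMMON = MINIMUM | {
    HTTPStatus.CREATED,
    HTTPStatus.NOT_MODIFIED,
    HTTPStatus.NOT_FOUND,
    HTTPStatus.UNAUTHORIZED,
    HTTPStatus.FORBIDDEN,
}


def status_line(status):
    """Format a status as '<code> <reason phrase>'."""
    return f"{status.value} {status.phrase}"
```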
Associations
So far we've talked about simple collections in your URIs or the list of customers, and
then individual items in that collection using a key. Associations are that next level of
object.
Associations are about sub-objects of other objects
o And we want to use the URI Navigation path to imply that there's a relationship
between them. So, for example, to get all of the invoices for a particular
customer, you could add another part of the path that says get the invoices for
this particular customer. You could see getting the Ratings for a game, or getting
the Payments for an Invoice. So, we're talking about getting information that is
contextual to the parent object in the path.
o These Associations should return a List of those Related Objects or a single
Object if that's the kind of relationship it has. So that if we look at the API for
getting the Invoices of customer 123, the shape of that result should be the
same as if we just went and got a list of invoices. That way the user of your API
can deal with it in that same way. The path only tells the server that,
instead of the caller issuing a query against Invoices or walking through them to find the
Invoices for a customer, it should return only the ones that are related to
that item in the collection.
There may be multiple Associations for the same object. So, while we've looked at
Customers having Invoices, Customers also may have Payments, and they also may
have Shipments, so you can have multiple Associations for each type of object, you just
have to make sure that you're dealing with each of those in the correct way.
If you have more complex needs, you should just use query string parameters to
deal with them. For example, instead of having states that have Customers and those
Customers have Invoices, it might be just simpler to allow you to do something like a
query where you can say Customers?state=GA, or Customers?state=GA plus a second parameter for an
individual salesperson. So, instead of trying to fit everything into simple entities or simple
Associations, use Associations where it makes sense, but understand you can go to
query strings in order to get very specific data as necessary. Associations are an
important part of what you're going to design for your API, but don't try to rely on
Associations to solve every related entity problem you have in your API.
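A minimal sketch of both ideas follows. One function serves the invoice list, whether it is reached directly or through a customer association, so the result shape is identical; a second helper builds a filtered customers URI with `urllib.parse.urlencode`. The sample data, the `/api` prefix, and the `salesperson` parameter name are assumptions for illustration.

```python
from urllib.parse import urlencode

# Made-up sample data.
INVOICES = [
    {"id": 1, "customer_id": 123, "total": 50},
    {"id": 2, "customer_id": 456, "total": 75},
    {"id": 3, "customer_id": 123, "total": 20},
]


def list_invoices(customer_id=None):
    """Serve both /invoices and /customers/<id>/invoices with one list shape."""
    return [
        invoice
        for invoice in INVOICES
        if customer_id is None or invoice["customer_id"] == customer_id
    ]


def customers_query(**filters):
    """Build a filtered customers URI instead of inventing deeper associations."""
    return "/api/customers?" + urlencode(filters)
```

Here `list_invoices(123)` returns the same kind of list as `list_invoices()`, just narrowed to one customer, and `customers_query(state="GA", salesperson="jsmith")` covers a case that would be awkward to express as a path.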
Formatting Results
So what about the formatting of the results that are returned? How do you know what
format they should be in?
The best practice is really to use Content Negotiation
o And what that means is to use an Accept header in the request to the server
to determine what formats are supported. The idea behind the Accept header
is simply to tell the server what kind of data you can accept; I can accept HTML,
text, RSS, whatever it may be. Picture a simple request going after
our endpoint of games, and looking for that second game like we saw in our
example earlier, with an Accept header to hint at what kinds of
data we can accept. Now the Accept header takes a comma-delimited list of
types of data that you can accept. Ordinarily when a user is calling your API,
they're only going to list one type here, and that's the type they expect to get
back. When there's more than one, the server looks at the list and finds the first
one that it can match. So, if the list names JSON first and XML second, and the server
supports JSON, it will always return JSON. But, if for some reason the service didn't
support JSON but did support XML, it would fall back to XML.
o You do not have to support all of the types of data that the Accept header
lists; it may list quite a few different formats, many of which
you're not going to support, so you look through the list and find the first entry that
matches a kind of data that you can format. It's also a good idea to have a
sane default, so that when the Accept header isn't included you can
fall back to a simple, well-known format; XML or JSON is pretty common,
usually JSON.
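The first-match rule and the default described above can be sketched like this. The supported list and the default are assumptions, and the sketch deliberately ignores the `q=` quality weights that the HTTP specification allows in an Accept header, so it illustrates the simple rule in the text rather than full negotiation.

```python
SUPPORTED = ["application/json", "text/xml"]  # assumed formats for this sketch
DEFAULT = "application/json"


def negotiate(accept_header):
    """Pick a response format: first supported entry wins, else the default."""
    if not accept_header:
        return DEFAULT
    for entry in accept_header.split(","):
        # Drop parameters such as ";q=0.8"; this sketch ignores their weights.
        media_type = entry.split(";")[0].strip().lower()
        if media_type in SUPPORTED:
            return media_type
        if media_type == "*/*":
            return DEFAULT
    # Nothing matched: fall back to the default. A stricter server could
    # answer 406 Not Acceptable here instead.
    return DEFAULT
```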
So the MIME types for Content Types are probably useful for you to know and
understand. JSON is application/json, XML is text/xml, JSONP is
application/javascript. JSONP is a JSON message wrapped in a JavaScript function.
This is often used for cross domain calls, so if you want to be able to support JSONP,
which is a pretty good idea, you're going to want your API to be looking for
application/javascript as something that's different from application/JSON. It's important
to know that when you're using JSONP, your API is going to require something like a
callback query parameter, because that's going to be the name of the method that's
going to be called with the JSON data when it returns. So, in this case you can see our
API here is going to need a parameter, usually called callback, and then the name of
some function to call, which the user of your API would specify. RSS (commonly
application/rss+xml) and Atom (application/atom+xml) are two other fairly common formats.
Most of the APIs I've written in the last few years are really focused on the first few
types in that list, but I wanted to show you that there are different sorts of Content Types.
You might even find APIs that return non-textual data, so you might support things like
image/jpeg and image/png.
So let's see how this Content Type matters.
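To make the JSONP case described above concrete, here is a minimal sketch of wrapping a JSON payload in the function named by the callback parameter. The helper name to_jsonp and the validation pattern are assumptions for this example, not part of any particular framework.

```python
import json
import re

# Only allow plain JavaScript identifiers (optionally dotted) as callbacks,
# so a caller cannot inject arbitrary script through the query string.
_CALLBACK_NAME = re.compile(r"^[A-Za-z_$][A-Za-z0-9_$.]*$")


def to_jsonp(data, callback):
    """Wrap JSON data in a call to the client-supplied callback function."""
    if not _CALLBACK_NAME.match(callback):
        raise ValueError("invalid callback name")
    return f"{callback}({json.dumps(data)});"
```

A response built this way would be served as application/javascript, not application/json.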
There is another approach that in some cases can be helpful, but I would really lean on
Content Negotiation when you can. This other approach is to use URI
components to select the format. I certainly don't consider this a best practice, but
sometimes it is easier to do this when you have specific requirements for your data.
These are often cases where the consumer of your API can't modify things like Accept
headers, so if you are running an API that you also want to be able to get at from let's
say, Excel or something like that, being able to add a URI component to do that
formatting can be helpful. So an example here would be to include a query string
parameter that names the format the client wants back. I've seen some other
cases where APIs use a file-style extension instead; I'm not a big fan of this style, I'd
rather have the query string, but that is one approach. And in the case of JSONP, the
query string can carry not only the format but also any other query parameters you may
need, like the callback parameter.
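As a rough sketch of the query string approach, the function below reads a format parameter and maps it to a content type, falling back to whatever content negotiation already chose. The parameter name format, the FORMATS table, and the function name are assumptions made for this example.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical mapping from short format names to content types.
FORMATS = {
    "json": "application/json",
    "xml": "text/xml",
    "jsonp": "application/javascript",
}


def format_from_uri(uri, negotiated_type="application/json"):
    """Prefer an explicit ?format= value; otherwise keep the negotiated type."""
    query = parse_qs(urlparse(uri).query)
    requested = query.get("format", [None])[0]
    if requested is None:
        return negotiated_type
    # Unknown format names also fall back instead of raising.
    return FORMATS.get(requested.lower(), negotiated_type)
```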
Result Design
So in designing the actual results you're going to send back, there are simple rules of
thumb I like to talk about.
When your API returns single results, those single results or individual items, should
be just simple objects, whether they be XML or JSON objects. So, if I'm going after
Customer/123, I should expect that the format of the data coming back should be just an
object that represents the data in that object. Now it may contain related or complex
types in it, like we can see with the address here, but it is simply an object that
represents that item.
When defining these objects, I suggest that Member Names shouldn't expose what
technology the server was written in. I see this a lot: if Ruby on Rails is used there are a
lot of underscores, if Node.js is used it's camel-cased, and if .NET is used sometimes it is
even Pascal-cased. I'd rather not do that; I like to pick one format that all my APIs are
going to use regardless of what the background is. I tend to prefer, because most of the
clients that I'm writing APIs for are JavaScript, to just use camelCasing. CamelCasing
ends up being the one that most developers are used to and camelCasing is the way
objects are going to look most natural when you're consuming them from JavaScript. If
you don't like using camelCasing, if you want to choose another way of defining your
Member Names, if you are using Ruby on Rails, and you want to use sort of the
underscore approach, that's fine really, the only thing I would ask is to at least be
consistent.
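One way to stay consistent regardless of the server's own conventions is to convert member names at the serialization boundary. The sketch below assumes the server uses underscore names internally and wants camelCase on the wire; snake_to_camel and camelize_keys are illustrative names.

```python
def snake_to_camel(name):
    """Convert a name such as 'first_name' to 'firstName'."""
    first, *rest = name.split("_")
    return first + "".join(part.capitalize() for part in rest)


def camelize_keys(value):
    """Recursively rename dictionary keys, leaving the values untouched."""
    if isinstance(value, dict):
        return {snake_to_camel(k): camelize_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [camelize_keys(item) for item in value]
    return value
```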
When Designing Collections it's a little different: don't just return the bare collection. We
actually saw this in one of our earlier examples, and my suggestion is to wrap the
collection in a simple object, that way you can send additional information alongside
the collection in the body. So, if we simply return an object that contains both data about the
result that we're returning, as well as the actual result, it can be very useful. So if we
want to be able to return certain kinds of data, like in this case the number of results that
were found on the server as well as the actual results themselves, we can do that, so
that we have a container for information about the collection not about items in the
collection.
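A wrapped collection can be as small as the sketch below. The member names totalResults and results are assumptions for this example; what matters is that information about the collection lives next to the items, not inside them.

```python
def wrap_collection(items, total_results):
    """Return a container with data about the collection plus the items."""
    return {
        "totalResults": total_results,  # how many matches exist on the server
        "results": list(items),         # the items actually being returned
    }
```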
ETags
Another important part of designing your API is to work with Entity Tags.
The idea behind Entity Tags is to help clients and servers cache better, and so when
you're developing your API you're going to want to support Entity Tag (ETag)
headers. Entity Tags come in both Strong and Weak forms, and they're returned
as headers in the response.
For example, when we make a request, the return of the response can include this
ETag. This is an identifier from the server that is basically a version number for the entity
that was returned.
ETags also support the notion of a Weak Tag. A Weak Tag starts with W/, which is the
server's way of telling you that this is a weak validator. The difference between the
Strong and the Weak type of ETag is that the Weak Tag says the two objects are
semantically the same, whereas a Strong tag indicates that they are byte-by-byte
identical, and so depending on the type of data you're dealing with, you may want to
issue weak ETags or strong ETags.
Now, what would you do with this ETag?
That's really where the important part of the story comes in. The ETag is returned with
a response, and the user of the API is expected to send it back on later requests to ask
whether a new version is available, instead of downloading and processing a brand new
copy of the object even when the data hasn't changed. This is typically done with an
If-None-Match header, so if I go and request this game object that I did earlier, I would
take the value from the ETag and put it in the If-None-Match header, and if the server
finds that this is still the ETag for this object, it will simply return a 304 Not Modified
status instead of a 200, and the body of the response would be empty.
This is the same notion as cached images in typical web development, except that it
allows you to do it at the level of the entity itself. You use the If-None-Match header
with the same ETag value that was sent to you, and if it does indeed match, the server
returns a 304, which is the Not Modified status; the response doesn't carry the body of
the entity or individual item, it simply says it hasn't been modified. In other words, if
the last version you fetched had this ETag, the server doesn't send a new copy, which
would just be a literal copy of what the client already has.
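The server side of that exchange can be sketched as follows. Deriving the ETag from a hash of the serialized entity is only one possible choice and is an assumption here, as are the names make_etag and conditional_get; the sketch also handles a single ETag value rather than a list.

```python
import hashlib
import json


def make_etag(entity):
    """Build a strong ETag from the serialized entity (one possible scheme)."""
    body = json.dumps(entity, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(entity, if_none_match=None):
    """Return (status, headers, body) for a GET with optional If-None-Match."""
    etag = make_etag(entity)
    headers = {"ETag": etag}
    if if_none_match == etag:
        return 304, headers, None  # client copy is current: no body
    return 200, headers, entity
```

A client would store the ETag from the first 200 response and send it back as If-None-Match on the next request for the same item.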
If I switch over to Fiddler, we can see that I can make a request to the server to get an
object, so I'm going to go ahead and Execute this, and this returned our object as
JSON, but in the headers of that response is this ETag.
So, as a client, and as a smart client I'm going to try to use this as much as possible,
I can use this to test whether the object has changed. So, if I go to the Raw view and
copy that ETag value, then in the Composer I can add an If-None-Match header with
that value.
This essentially says: if the object I'm about to request doesn't have the same ETag, go
ahead and return it to me; otherwise I'm going to get a 304 status, 304
meaning Not Modified.
And the result, if we look at the Raw version, contains the same ETag and no
body, because the server didn't need to send back a copy of the object; it knew that I
already had the same copy that was on the server.
ETags are often used for optimistic concurrency as well.
The client can use the ETag to check whether there's a newer version on the server
when it's doing something like a PUT. So, for a PUT I can use the If-Match header: if the
object on the server matches the ETag I have here, then go ahead and update it with this
new data. If it doesn't match, then I should go back to the server, get the new version
(without the If-None-Match), present it to the user, have them make their modifications
again, and then do the PUT again. This allows me to test in the header of my PUT that I'm
dealing with the same object version on the server as I had before.
And if I issue a PUT with the If-Match and it fails, it's not going to return a 200 or a 404,
or any of those, it's going to return a Status Code of 412, Precondition Failed; this
means one of the preconditions in the header, in this case, If-Match, failed, so I know
that the update did not really happen, because the If-Match didn't match the ETag of the
object that was on the server.
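Here is a small sketch of the If-Match check on a PUT. It assumes an in-memory store and a simple version counter used as the ETag; both are stand-ins chosen for this example, and conditional_put is an illustrative name.

```python
def conditional_put(store, key, new_data, if_match):
    """Update store[key] only if the caller's ETag is still current.

    store maps key -> {"version": int, "data": dict}.
    Returns (status, current_etag).
    """
    record = store[key]
    current_etag = f'"{record["version"]}"'
    if if_match != current_etag:
        # Someone else changed the object first: refuse the update.
        return 412, current_etag
    record["version"] += 1
    record["data"] = new_data
    return 200, f'"{record["version"]}"'
```

On a 412 the client would fetch the object again, reapply the user's changes, and retry with the new ETag.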
Paging
In your APIs whenever you're going to deal with returning collections, these lists should
always support paging.
Now, you can support paging in a number of ways, but let's talk about the importance of
paging. The idea behind paging is to prevent your server from returning voluminous
amounts of data that the client can't really deal with anyway. If you returned 1,000
records, the user probably isn't going to look through all 1,000 unless the client
wants to do the paging itself. You also don't want to have to deal with the load of
building up those large result sets when your server is busy, and trying to return them to
a number of clients. And so it's not about just supporting paging but really requiring
paging.
So, you can use Query String parameters to accept the paging information, but one
of the important aspects of this is making sure that a request for the list with no
paging information returns only the first page of that data.
It's often common to use the Object Wrapper that we talked about earlier for lists,
indicating the next and previous links so that it's easy for a client to walk through the
pages by just using these additional links.
So here's an example of a result that's going to tell us in the body, oh this is how many
objects there are for us to get, and by using simple properties we can see the next page
and the previous page as URIs back to our service, so we can very easily do this sort of
paging.
So let's see what that looks like. I have our sample API here. In fact, I'll go ahead and
issue a request just to GET the entire games collection, which happens to be more than
1,000 results. When I Execute it, what it's actually going to show me is a smaller number
of results; this number of results is actually 25 by default, so I'm not overwhelming the
client with the amount of data. I'm telling the client how many are available on the server
with this total result count, but I'm just supplying the first 25 results, and then providing a
link to the next page. So here, the next page, href, is just ?page=2. Now, I could document
this in my API, but it's really useful to be able to put it in the actual package that's being
returned back.
So this means if I go back to Composer and simply add page=2, what it's now going to
return is the next set of 25 elements. Notice that now I'm not on the first page, the result
can include a previous page link, which is pretty common. Obviously, the first page doesn't
have a previous page, so in this example API we don't even compute that link there; we
simply say what the next page is. Here we can see previous is page 1 and next is page 3.
Page 1 is the default, so in fact getting just games is going to give that first page. And this
allows people to build clients that use your APIs in a much simpler fashion.
When you're doing paging, even though you might have a default page size, like our
example a minute ago had a page size of 25, you might also want to support different
page sizes; you might want to support them getting a different amount than the default
by maybe supplying a parameter. You should limit this page size to a reasonable
amount so as to not incur extra server load. We saw in the API example a moment
ago that we could indicate the page here, but we could also indicate the pageSize.
Now the names you choose for page and page size aren't actually terribly important;
different APIs use different semantics here. Some use take and skip, so that there isn't an
implicit page size and the client can request whatever window it wants. Many OData REST
feeds lean on this because it is a common strategy in things like LINQ in .NET.
But whether you call it page number, page size, result size, or something else, the
name isn't as important as the actual functionality.
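Putting the paging ideas together, a minimal sketch might look like this. The member names (totalResults, results, prevPage, nextPage), the pageSize parameter, and the cap of 100 are assumptions for illustration; the default of 25 follows the example API above.

```python
DEFAULT_PAGE_SIZE = 25
MAX_PAGE_SIZE = 100  # hypothetical cap to protect the server


def page_of(items, base_uri, page=1, page_size=DEFAULT_PAGE_SIZE):
    """Return one page of items wrapped with a total and navigation links."""
    page = max(1, page)
    page_size = max(1, min(page_size, MAX_PAGE_SIZE))
    start = (page - 1) * page_size
    body = {
        "totalResults": len(items),
        "results": items[start:start + page_size],
    }
    # Carry a non-default page size through the links so paging stays stable.
    size_part = "" if page_size == DEFAULT_PAGE_SIZE else f"&pageSize={page_size}"
    if page > 1:
        body["prevPage"] = f"{base_uri}?page={page - 1}{size_part}"
    if start + page_size < len(items):
        body["nextPage"] = f"{base_uri}?page={page + 1}{size_part}"
    return body
```

The first page omits prevPage and the last page omits nextPage, so a client can walk the collection just by following links until none is left.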
Partial Items
The last consideration for designing your data-driven API, like most of the examples
we've looked at so far, is to deal with partial items. Now it's a pretty typical request to
request partial items from the service. Query string parameters are a common pattern
for this, and you can see a lot of examples on the web that do this. The idea behind
partial is to allow a user of your API to pick what fields it needs for a particular request,
so that the payloads can be smaller instead of you always returning these very verbose
objects that the clients themselves aren't really even using.
A good example of this would be using the ?fields Query parameter where you simply
list the fields that you want the result to include. This pattern of including the names here
could also include the names of fields in sub-objects or associations as well; that's really
up to you, but the idea would be to allow the user of your API to decide what fields are
important. Now this is sort of an optional part of your design, but doing this will really
allow users to consume only the parts of the data that they need, as well as reducing the
footprint of your service, because you're going to be producing smaller serialized objects
and the clients are going to be consuming smaller objects, therefore the roundtrip should
be quicker.
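A sketch of the ?fields idea for top-level members is shown below. The function name select_fields and the choice to silently skip unknown field names are assumptions for this example; sub-objects are not handled.

```python
def select_fields(entity, fields_param):
    """Return only the members named in a comma-separated fields value."""
    if not fields_param:
        return dict(entity)  # no ?fields given: return the full object
    wanted = [name.strip() for name in fields_param.split(",") if name.strip()]
    # Unknown names are skipped here; an API could reject them instead.
    return {name: entity[name] for name in wanted if name in entity}
```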
You can also support Updating of those Partial Items as well, and there's a special
verb that's often used for this that's called PATCH. The idea behind PATCH is to be
able to send in a partial object or a subset of the original object with just the fields
that are updated, and check for concurrency based on the ETag that we talked
about earlier.
So here's an example.
We're using PATCH against an individual item, and we're using the If-Match header to
make sure that our ETag still matches the object as we originally requested it. And you
can see here that we're sending only a small subset of the full set of fields that this
service can return.
This resource normally has about 10 or 12 different fields, but we are only really updating
a couple of them here, and so we're only going to send this partial object; it's going to
be the responsibility of your API, on a PATCH, to look at this partial object and map
it onto the full object in order to do the update.
Using the ETag will allow you to do real concurrency checking here without having to rely
on field by field checking or whatever other semantics you use for doing that. The server
will know that the version of the object it holds is the same version that was originally
requested, which should simplify the partial item update story.
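The mapping of a partial object onto the full object, guarded by If-Match, can be sketched like this. The record layout, the 400 response for unknown fields, and the name apply_patch are all assumptions made for the example.

```python
def apply_patch(record, partial, if_match):
    """Merge a partial object into the stored one if the ETag still matches.

    record is {"etag": str, "data": dict}. Returns (status, data).
    """
    if if_match != record["etag"]:
        return 412, record["data"]  # stale copy: precondition failed
    unknown = set(partial) - set(record["data"])
    if unknown:
        return 400, record["data"]  # reject fields the resource doesn't have
    # Fields not mentioned in the partial object keep their current values.
    merged = {**record["data"], **partial}
    return 200, merged
```

A real service would then store the merged object and issue a new ETag for it.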
Non-Resource APIs
So what about parts of your API that aren't really dealing with entities or domain models
in the same sense that you may be used to?
What if you really need to have some Functional Part of your API.
Now this normally breaks the rigorousness of a REST-based API, but in a pragmatic
sense, you should be able to add these elements as necessary, because we're trying at
the end of the day to solve business problems, solve technical problems, provide the
sorts of functionality that our users of our API are going to need.
These functional parts of your API should be clearly documented as functional parts
of your API and not resource APIs: you're not necessarily going to be able to
do things like PUT, POST, and DELETE against these elements; they're
really about calling GET and performing some function.
It's important that you make sure that these parts of the API continue to be
completely functional, not resource-based.
The problem is that you can very quickly get into a case where you start to build
functional parts of your API that really should be resource parts of the API. You start to
do things like map something like a stored procedure onto a REST-based API,
and you're going to very quickly fall into sort of the morass of a badly designed API.
So here's an example of one, calculateTax, where you're sending in with Query
parameters the specific data needed to do the calculation. Now, instead of
Query parameters you could also send form-encoded data, JSON data, or XML data
in the body to do this sort of operation, but it is doing some sort of functional piece of work
here; we're not asking it to add a new invoice, we're not asking it to create a new
customer, we're really doing a non-resource based part of our API. You could even see
things like restartServer or beginWorldDomination, things that are functionally part of
what you really need your APIs to be able to accomplish.
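As a rough illustration of a functional endpoint like calculateTax, the sketch below reads its inputs from the query string and returns a computed result without touching any resource. The parameter names amount and rate, and the shape of the result, are invented for this example.

```python
from urllib.parse import urlparse, parse_qs


def calculate_tax(uri):
    """Compute tax from ?amount= and ?rate= (hypothetical parameter names)."""
    params = parse_qs(urlparse(uri).query)
    amount = float(params["amount"][0])
    rate = float(params["rate"][0])
    # Nothing is created or stored; the call only computes and returns.
    return {"amount": amount, "rate": rate, "tax": round(amount * rate, 2)}
```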
Summary
Let's wrap up this basic API design part of the story.
Remember, you can design a great API, but you need to be careful not to surprise your
users. By following some basic tenets of the way REST works, you can create APIs that
should be familiar to people that have used other APIs, especially other REST-based
APIs. You can certainly invent something yourself in creating APIs, and sometimes even
create something very functional. But by taking some of the lessons learned in this
module, I really hope you follow the patterns of other APIs that are out there.
At the end of the day, part of your job as a developer is to protect the server from the
user and protect the user from the server, and so getting a good balance in the middle of
being very useful for the user but not allowing a single user to do something bad to the
server, like make really large requests, is really what you're after. By making sure you're
using aggressive caching and ETags, you can really allow the user to be a
good citizen to your server without you having to go do the work every time someone
hits your server for the same data.
At the end of the day you need users to make your API a successful API, so making it
easy to use and fulfilling the needs of those users are what's most important, not making
conference speakers happy or fitting into what I consider maybe a too rigorous
definition of what we would normally call a REST-based interface.
3. Versioning
Introduction
In Module 3 we're going to talk all about Versioning your APIs. This is going to include
why Versioning is important, we're going to show some examples from public APIs and
how they're doing Versioning, and we're then going to talk about patterns for Versioning,
including URI Path Versioning, URI Parameter Versioning, Content Type Versioning,
Custom Header Versioning, and which one to choose and when. We're also going to
touch on the topic of Versioning your Resources themselves.
Let's get started.
Why Version your API
So the first thing is we want to talk about why Versioning is important.
Once you publish an API it's set in stone, and it's set in stone because
publishing isn't a trivial move. You're telling Users and Customers that your API is
out there, that they can start to write code against it, and that as you make changes to
the API you're not going to break their code; it's an implicit contract between you and your
customers and users. But requirements for your API are likely to change, and so
you're going to need a way to keep the users and customers happy so that their
code doesn't break, but also support new requirements or changes to your API. You
need to have a way to evolve this API without breaking those existing clients. And one
thing to keep in your head is that API Versioning isn't the same as your Product
Versioning. Releasing a new API version every time you release a product isn't really
useful; only version your API when the semantics, the signatures, or the shapes of the
data you're dealing with are changing. So resist the temptation to change your API
version with every product release, and do your best not to tie the two together.
And so at the end of the day you have one and only one commandment when dealing
with releasing your API, and that is you will not break existing clients, so your API
changes themselves aren't going to cause your clients to have to write new code, unless
they want the new features, new shapes, new support that your API provides. This
doesn't mean that you can't get rid of old versions of your API, but you will need to get
rid of those old versions of the API with some care, with a lot of communication, so that
when you eventually do stop supporting those APIs, your customers and users have
plenty of notice that they're going to have to upgrade or move to a new version of the
API.
Is there a right way?
So many of you may be viewing this module to find out the one and best way to version
an API, and unfortunately there isn't one.
When you look across the web at the different types of APIs out there, they're versioned
in sometimes very different ways, and the methods that are used to version APIs can be
pretty different; they have different pros, they have different cons, so you have to really
find the versioning method that works best for you, and we're going to present a few
options for doing that Versioning, but the important idea here is that you're going to
Version your API.
So there isn't one way to Version your API. We can look at existing APIs out there and
see some of the options that were chosen for Versioning them, but many of those
public APIs chose their approach very specifically to meet internal requirements. The
choice may not have been made at the behest of the API's users; it may really have been
driven by the way the developers of the API needed to do the Versioning. There are some
external requirements as well, chiefly how difficult the API is to use; you may
decide to Version using one method or another specifically because of that difficulty.
I would love to be able to give you the single option that I would recommend, but I
can't. There simply is no one right way to Version your API.
Examples of Versioning
So let's look across the web at some public APIs out there that do Versioning in different
ways.
Let's start with Tumblr, a pretty popular API out there. The Tumblr API uses a URI
path to do the Versioning. They essentially have a version embedded in the path of
the URI, which we can see as the v2 here, so that everything after the v2 is subject to
change as the versions of the API change. There is no guarantee that in v3 of the
Tumblr API there's going to be a user object at all; they may make small changes or they
may make large changes to the API. This pattern of using the URI path is really common,
you've probably run into APIs using this method.
Another pattern you can see here is from Netflix, and that is using a Query parameter.
So, instead of embedding it in the URI path, it's dictating with a Query string parameter what
version of the API to go after.
Another style is the Content Negotiation type, seen in the GitHub API. Here, instead of
using anything in the URI to indicate the version, the content type that is requested in the
Accept header is used, and so this is a custom MIME type that includes the version
information. The 1 here indicates the version of the object being requested.
And the last type of Versioning we'll talk about is a Request Header. With Azure, when
you're going against their API, they're using a special Request Header called
x-ms-version that says this is the version of the API that the request is written against,
and the version in this case is just the date when that version of the API was released.
Instead of using simple version numbers, they're using release dates to do this Versioning,
so you're not tied into having specific version numbers for your entire API, for individual
objects, or for individual types of resources.
Let's walk through some more details and talk about the pros and cons of each of these
four Versioning patterns.
Versioning in the URI Path
So using the URI Path to do your Versioning, the Version becomes Part of the Path
to your API. This allows you to make big drastic changes to your API in later
versions. Everything below that version number is open to change, though the
amount of change you make will really be dictated by how much pain your users and
customers can take in their client code.
Here's an example where the v1 in the API is dictating what is available in that version of
the API. This is a very common pattern; it's probably the most common of all the
Versioning I've seen out there in public APIs. And in this case, you can see that with
version 2 of the API they might decide, instead of including CurrentCustomers as a
customer type, to just expose it as a different kind of resource, so that the two
versions don't have to be that similar, though, of course, what they're doing at the end of
the day is ultimately similar.
So the pro of this pattern is that it's very simple to segregate these old APIs, and
what that means is that you can really change the patterns of your APIs as time goes
along, and so you may decide to support the old APIs and implement a brand new API.
The problem here is that this pattern requires a lot of client changes whenever you
change the version. So, even if the whole API hasn't changed and you're just adding some
additional pieces, all your users and customers are going to have to go into their code
and change that v1 to v2, unless they only want to support what is in the old APIs. This
also increases the size of the URI surface area that you have to support, so that
when you release the v2 version, you may have a whole new set of code that is
supporting that version while you still have to maintain and fix bugs in the v1. So this
type of Versioning is often the easier decision if you want that sort of broad reach,
but at the end of the day you may decide against it because it can be a larger amount of
technical debt.
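To show what routing on a path version involves, here is a small sketch that splits the version out of a URI path. The /api/v<N>/ layout and the name split_version are assumptions for this example; real APIs place the version segment in different positions.

```python
import re

# Assumed layout for this sketch: /api/v<number>/<rest of the path>
_VERSIONED_PATH = re.compile(r"^/api/v(\d+)/(.+)$")


def split_version(path):
    """Return (version, remaining_path) for a versioned URI path."""
    match = _VERSIONED_PATH.match(path)
    if match is None:
        raise ValueError("path has no version segment")
    return int(match.group(1)), match.group(2)
```

Everything after the version segment can then be dispatched to whichever code base supports that version.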
Versioning with a URI Parameter
The next pattern is using a Query String Parameter. One of the interesting parts of
using this is that the version can be an Optional Parameter, which means that you can
make sure that a request without the parameter is always tied to a specific version,
usually the latest version.
So here's an example of using a simple API. There's no version in the URI right now, but
if I decide I wanted to go get the Customers of a very specific version of the API, I could
then include some Query Parameter that defined what version I was going after.
The pros here are that without a version the users are always going to get the
latest version of the API. It's going to encourage users and customers to use the
leading-edge version of your API, even in some cases when they don't necessarily need to.
Clients need few changes as the versions mature; this assumes that you're not going to
make great big changes from one version to the next.
The problem here is that because you have the optional version included as a Query
String, you can surprise developers with changes that they don't expect, and at the
end of the day, you may be breaking client code because they didn't include the specific
version they were going after. Now some of this can be mitigated by not making the
version number optional, but by not making it optional you're making it part of the URI
syntax, and it's in a lot of ways semantically the same as using the URI path we saw in
the previous section.
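The optional version parameter can be sketched as below. The parameter name v, the value of LATEST_VERSION, and the function name are assumptions made for illustration.

```python
from urllib.parse import urlparse, parse_qs

LATEST_VERSION = "2.0"  # hypothetical current version of the API


def requested_version(uri):
    """Read the optional ?v= parameter, defaulting to the latest version."""
    params = parse_qs(urlparse(uri).query)
    return params.get("v", [LATEST_VERSION])[0]
```

This default is exactly where the surprise comes from: a client that never sent ?v= starts receiving the new version as soon as LATEST_VERSION changes.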
Versioning with Content Negotiation
The next type of Versioning we'll talk about is with Content Negotiation.
And Content Negotiation here simply means using a Custom Content Type in the Accept
Header of the request.
Instead of using standard MIME types for the Accept types, application/json,
text/xml, etc., you're going to use custom MIME types.
Here's an example of a GET where the Accept type includes a custom content type.
Here's myapp with a version, and then the kind of object I'm looking at; this is a pretty
common pattern.
You can include formatting information in this Accept Header as well. So you can
see here that putting a .json or a .xml in the content type also tells the server what
kind of content the client wants back, which is normally what the Accept header is being used for
anyway. This type of Versioning is becoming increasingly popular. It's becoming
increasingly popular because the version itself is separated from the surface area of the
API itself.
When defining your own MIME type, there is a standard for this. The standard indicates
that the "vnd." or vendor prefix can be used as a starting point and usually is. This is a
reserved beginning of the MIME type, and this indicates that this is a vendor-specific
content type. For example, here, we're doing the same sort of request we saw on the
previous slide. The Accept header could begin with vnd, and that's more typical of what
you're going to want to do in your own API content types.
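A server using this style has to pull the version back out of the custom type. The sketch below assumes a layout of application/vnd.<app>.v<N>.<resource> with an optional .json or .xml ending, which is one plausible reading of the example above and not a fixed standard; parse_vendor_type is an illustrative name.

```python
import re

# Assumed layout: application/vnd.<app>.v<N>.<resource>[.json|.xml]
_VENDOR_TYPE = re.compile(
    r"^application/vnd\.(?P<app>[\w-]+)\.v(?P<version>\d+)"
    r"\.(?P<resource>[\w-]+)(?:\.(?P<format>json|xml))?$"
)


def parse_vendor_type(accept):
    """Split a vendor content type into app, version, resource, and format."""
    match = _VENDOR_TYPE.match(accept.strip().lower())
    if match is None:
        return None  # not one of our custom types
    info = match.groupdict()
    info["version"] = int(info["version"])
    info["format"] = info["format"] or "json"  # default format for this sketch
    return info
```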
Let's also look at the pros and cons.
The pro here is that the API and Resource Version are all in one. So, when we're
looking at the version of what our API looks like, but also the resource that we're
returning, we're getting a version that's really tying the two together. It takes that version
out of the API surface area or the URI so that clients don't have to change except when
it comes to including that Accept Header.
The con here is that it adds complexity. Understanding how headers work and
adding headers isn't easy on all platforms, and isn't easy for all levels of
developers. This type of Versioning could also encourage more versions
throughout your code, so you might have specific versions of a number of your
different kinds of resources. This is good in one sense, in that you can have
finer-grained versioning, but it also means you're going to have to support and understand
the complex nature of Versioning across your API. This can encourage your developers to
create more versions for different small parts of your API, instead of understanding that
making no change to your API version is often better so that clients don't have to make
their changes.
Versioning with Request Headers
And finally, the last type of Versioning we'll look at is using Custom Headers inside the
request.
This should be a header that is only of value to your API, so it is specific to your
API. You're going to use an x- type of header; that's a name that most routers and
other inspectors of traffic are going to ignore.
So here's an example of a header that includes a name that your application is going to
look for. Here's MyApp-Version, and then some text after it that's going to indicate that
version.
Now, it's pretty common for these sorts of custom headers to use dates or numbers, so
what you include as the actual App-Version is completely up to you. It does not have to
be a numbering scheme like the ones we may be used to as developers, with product
versions, or assembly versions, or jar versions; we should get away from that and just
think of something that is semantically important to what this specific call should be
pointed to.
The pro here is that it separates the Versioning from the API call signatures much
like the Content Negotiation Versioning does, and in this case it's not tied to the
resource versioning, so you're really talking about the version of the API itself, not just
the version of the resource.
The con here is that it adds complexity; much like the Content Negotiation, adding
headers isn't easy on all platforms or for all developers.
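Reading a version from a custom request header can be sketched as follows. The header name X-MyApp-Version, the date-style default, and the function name are assumptions for this example.

```python
DEFAULT_API_VERSION = "2016-01-01"  # hypothetical release date used as version


def version_from_headers(headers):
    """Return the value of X-MyApp-Version, or the default if it is absent."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lowercase.
        if name.lower() == "x-myapp-version":
            return value.strip()
    return DEFAULT_API_VERSION
```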
Which to Choose?
So ultimately you're going to be asking yourself which one of these patterns should I
choose?
And there isn't an easy answer for you. Versioning with Content Negotiation and
Custom Headers is very popular right now, it's sort of the trend of where Versioning is
going, but it does add that complexity.
Versioning with URI components is more common because there are more APIs out
there that have chosen that pattern. Versioning with URI components tends to be easier
to implement but can add technical debt to the backend of your project.
Ultimately you're going to need to make a decision based on the kind of
requirements you have. In many cases I would probably start with URI Component
Versioning to see whether the technical debt is a hindrance to your project, and switch to
something like Content Negotiation if you need something finer grained, or if you
find out that the sophistication of your users is high enough that headers aren't a big deal.
An important part of your decision here is how you're going to do Versioning, but
understand it's incredibly important that you version your API from the very first release,
so that makes it easier for your users to move from version to version as your API
matures.
Versioning Resources
So we've talked about Versioning of the API itself, but what about versions of your
Resources.
In most cases, unless the nature of your Resources is very strict or set in stone by other
standards, your Resources Should Be Versioned as well. So the Versioning of the
API calls usually isn't enough.
The structures and constraints of the kinds of objects you're dealing with and
returning via your API and accepting from your API tend to change, and so
Versioning your Resources becomes important.
If you're already using Versioning with Content Negotiation or Custom Content
Types, this is pretty easy because the Accept header or the Content-Type header
already states the version of the object that you're expecting or sending, but this does add
complexity as we've talked about. Including a version number in the entity body is
another option, but it does pollute the data; it adds a piece of data that is about the API
and not about the nature of the data, so I don't tend to recommend this approach. If you
need Resource Versioning separate from your API, you should probably be doing
Content Negotiation Versioning.
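To make the Content Negotiation option a little more concrete, here is a minimal sketch of pulling a resource version out of an Accept header. The media type shape application/vnd.myapp.<resource>.v<N>+json is an assumption invented for this example, as is the function name; the sketch also ignores q-values and simply takes the first matching type.

```python
import re

# Hypothetical vendor media type: application/vnd.myapp.<resource>.v<N>+json
_MEDIA_TYPE = re.compile(
    r"application/vnd\.myapp\.(?P<resource>[a-z]+)\.v(?P<version>\d+)\+json"
)


def resource_version_from_accept(accept_header, default_version=1):
    """Return (resource, version) for the first vendor media type in Accept.

    Falls back to (None, default_version) when no vendor type is present.
    """
    for part in accept_header.split(","):
        # Drop parameters such as ";q=0.9" before matching.
        media_type = part.split(";")[0].strip().lower()
        match = _MEDIA_TYPE.fullmatch(media_type)
        if match:
            return match.group("resource"), int(match.group("version"))
    return None, default_version
```

The same parsing could be applied to the Content-Type header of a request body, so the version of what is sent and what is expected back can differ.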
Summary
So to wrap up this module, you must version your API; that's sort of the mantra I'm
trying to push towards the viewers of this video. Version your API whether you like it or
not; it will help with the maturation of your API as time goes on. If your API is public, it
has to be versioned, period.
There is no one way to do this API Versioning, so starting with something simple
and moving to something more complex is a good approach, but if you feel like
you're going to have a lot of version churn, choosing one of the more complex
approaches like Content Negotiation or Custom Headers is probably the place to start.
You're going to want to pick one that matches the maturity level of your users as
well as your internal team. If your internal team is not well versed in dealing with a large
set of code, you may decide on one of the approaches that leans on less
technical debt, but if your team is not as comfortable with routing
based on things like Content Headers, then choosing one of the simpler approaches, like
the URI Path approach, may be better, so understanding that maturity level is going to
help you pick the right one. Using complex versioning isn't evil in itself, but it can
increase friction with developers. So, if you decide on using a versioning scheme that
is more complex to implement, you're going to have a tougher time reaching out and
getting more developers to work with your API.
Ultimately, you have to be pragmatic about these decisions. I usually start new API
projects with just enough Versioning, which then allows us to
make changes as the API matures. Remember that as long as you have a resilient
community around your APIs, you can sunset APIs at a certain point and choose a
whole new scheme.
If we look at the way that GitHub went from the API versioning scheme they had
several years ago to the Content Negotiation type they're using now, they knew that the
API wasn't the one thing holding their customers to their product. So you have to be
pragmatic about how much Versioning you're going to take on to protect your users,
weighed against the extra effort it incurs on their part to use your API.
There's a balance there that you're going to have to make, and understanding who your
users really are is going to be part of that.
4. Securing Web APIs
Introduction
In this module we're going to be talking about how to secure your Web APIs, which
Threats are coming after your APIs, how to Protect Your API, Cross Domain Security,
Who Should You actually Authenticate with, Working with API Keys, understanding User
Authentication, and finally making sense of OAuth.
Understanding the Threats
Before we can look at how to secure your API, we need to really understand the nature
of security as it relates to developing Web APIs. Who are the people that are going to
come after your secrets or your work, and who are the people that are going to come
purely to create disruption to your business?
To begin with, do you even need to secure your API? You may be thinking I'm
creating an API, I'm going to use it within my own enterprise for my own applications,
who is going to care about these APIs?
Ask yourself some questions and we can talk about whether you should secure them or
not. Are you using any private or personalized data, data that represents individual
people that could be at risk? This could be social security numbers; this could be
information about your users or employees. If you are, then you should secure it.
Are you sending any of this sensitive data across the wire to your applications? If
you are, then you need to secure it.
If you're using credentials of any kind in order to do authentication, you need to
secure it.
And finally, are you trying to stop people from getting to your servers and
overwhelming them, maybe even to the point of stopping you from being able to serve
your real customers? If so, you're going to need to secure it.
So, securing a Web API typically becomes a 1st class citizen of your design. Security
isn't something you can just throw on top of your existing design and hope that it will
work. You have to think about security through the entire process.
Who's coming after your API?
Well we have users and their browsers that are coming across the internet to get at these
APIs, and we are going to have threats from different places here.
We have the typical man in the middle attacks where we have Eavesdroppers that are
looking at the traffic as it goes back and forth and seeing whether there is interesting
data there. So, if you're trading any sensitive sort of data across that wire, you're going
to have to protect against these eavesdroppers.
In addition, you can have Hackers or even your own Personnel that are going after that
personal data directly at the servers themselves; this is often behind the API some
place. This includes intrusion into your systems through your firewalls or even physical
security of your server locations.
And finally, you have the Users and Hackers themselves, which are working on the
other side of the internet, that are taking the code that you may be publishing, or maybe
looking at the website that is using those APIs in order to access your servers through
your API.
These different kinds of threats are the ones you're going to need to make decisions
about how you're going to protect against.
So at the end of the day you're going to want to protect your API in almost every case.
Securing your server infrastructure itself, protecting your data centers with firewalls, and
protecting it against physical intrusion, is outside the scope of protecting your API. We'll
assume that you're working in an organization that knows that the data center needs to
be protected.
When you're communicating with your API, you need to have Security In-Transit, so as
the clients are calling into your servers, how can you protect that data while it's traveling
across the wire?
o And this is usually where SSL (or its successor, TLS) is used to protect the actual
payloads of the API calls, so that they can't be modified, or even inspected, as they
cross over the internet.
o SSL does have a cost to it, but is usually worth the expense. There is overhead in
actually doing the encryption and decryption on both sides, and even in the handshake
between the browser and your server to set up the SSL encryption. But, in terms of
protection from people interrogating your traffic, you're going to want to do this as
much as possible.
And finally, and what we're going to mostly talk about in this module, is Securing the
API itself.
o And part of the security is to protect yourself from Cross Origin calls, so
knowing what domains are using your API and allowing them to make those calls
where appropriate.
o Additionally, you're going to want to have methods for dealing with Authorization
and Authentication, so determining who is coming into the system and what
rights to that system they have.
Cross Domain Security
So, the first piece we'll talk about is Cross Domain Security.
The question you have is whether you should allow your API to be called from different
domains. You may be creating your API directly for your public website, and then
maybe this isn't something you want to deal with; you want to only allow your actual
website to go after it. The way the browsers work is that when a page makes a call,
an ajax call, into a Web API, if the page itself is hosted in the same domain as the call
that's being made, the browser simply allows it. If it's in a different domain, if you're
crossing domains, let's say going from foo.com to rd.com, the browser itself is not going
to allow it, unless there are some special circumstances that will allow it to happen.
Making the decision about whether to allow your API from different domains really
depends on whether it's a public or a private API. If it's an API simply for use by your
application or your web property, then you probably don't need to worry about it. But if
it's a public API, because it's going to be called from different parts of the web, you're
going to want this to be supported. Now this whole notion of Cross Domain Security only
matters when it's being called from a piece of client script on someone else's web
property. If someone is writing an app, like an iOS, an Android, or a Windows Phone
app, the API is going to work for them in either case; this really is about Cross
Domain access from within the browser.
There are Two Approaches to solve this.
o The first is to support a different format called JSONP as the type of data
coming back.
o The other is to allow something called Cross-origin Resource Sharing, which is
a standard out there for doing sort of a handshaking to see whether a domain is
allowed to make those calls.
So let's talk about each of these individually.
What is this format we're talking about with JSONP?
It's simply JSON with Padding, JSON being JavaScript Object Notation. JSONP is
actually JavaScript. It is a small snippet of JavaScript that's returned, instead of a
JSON-formatted body. It typically contains a JSON-formatted body, but it's surrounded
with a small function call. The expectation is that when it comes back from the server it
will be executed, and so the browsers deal with it in a different way, because we very
commonly go get JavaScript that we're going to execute in the browser from different
domains. If you're getting jQuery from a CDN, or using other sources of CSS or
JavaScript, the browser expects to get those from a variety of different domains, so it
allows this call to go across to that domain, if the return type is JavaScript.
When the data comes back, this JSONP package, which again is just a small piece of
JavaScript, is evaluated, which ends up calling a function with all the data that
you are looking for.
So, let's see how this works. I've created a function ahead of time in my client code
called updateUser, and this is going to accept some data that I want from a cross
domain server. I can then issue a GET to some API, and here I'm calling an API called
games, and the host is going to be some different host than I'm actually hosted on. I
might be hosted at foo.com, but I'm going after some cross domain host. Notice that
part of the API call is passing a callback into the API: the name of the function I
want to call, and so this updateUser matches the function that I already have existing
on my page. The Accept header here also includes the information about what kind
of data I want to come back, and this is application/javascript; it's not application/json,
which is the way we would get normal JSON. The content type of the response will
actually be JavaScript, because when this GET is executed,
what is returned is a small snippet of JavaScript. The server has used that callback
mechanism to wrap the results in a call to a local function, in this case updateUser, and
inside is a JSON-formatted object that will be passed in as the data to my
updateUser call; that's the core of what JSONP does.
Let's look at this in a live API. If we go over to Fiddler, I can make a call here to get an
object from an API. We've been doing this throughout the course here or there, and if I
tell it that I want JSON as the data, when I Execute this, the result is going to be a JSON
object that is returned, and in fact, if you look at the JSON result, we'll see we're getting
this object from the server, and it's formatted as JSON. But let's go back to the
Composer and change this to JavaScript. It's important to put on this the parameter of
callback; this is the parameter that APIs usually use to define the name of
the callback to use when returning JSONP. What am I going to wrap my JSON result
in? Now, you may decide to make this different, but the convention is actually to call it
callback. So, I'll pass foo in this case, and Execute this. Again, I'm including the
callback, and specifying that I want JavaScript not JSON. And when I execute this, the
Raw body that is returned is wrapped in a function called foo. This assumes that when
this is evaluated, I actually do have a function called foo that will accept that
data as the callback. Now when you're not calling cross domain, this little extra bit of
code and bit of ceremony of using a callback may seem kind of odd and unnecessary,
and in that case it is unnecessary. But, if we were going to do the same thing cross
domain, calling from a separate domain, the browsers would allow us to do this, whereas
if we tried to do this with plain JSON, it would fail because it is a cross domain call.
When you're designing your data for JSONP, remember that the payload of JSONP is just
JSON; it's just the same sort of results you're going to return to the clients, but they're
going to be wrapped by the single function. So, the data parses just the same as it does
with JSON; it's just packaged as this JavaScript callback.
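As a small illustration of the server side of this, the sketch below wraps a JSON payload in the callback name supplied by the caller. The function name to_jsonp and the sample data are assumptions for the example. It also checks that the callback looks like a plain JavaScript identifier, since the response will be executed as script by the browser; that check is an extra precaution added here, not something the course describes.

```python
import json
import re

# Only allow plain JavaScript identifiers as callback names.
_SAFE_CALLBACK = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def to_jsonp(data, callback):
    """Wrap JSON-serializable data in a call to the named callback function."""
    if not _SAFE_CALLBACK.fullmatch(callback):
        raise ValueError("Invalid callback name")
    # The body is ordinary JSON; only the surrounding function call is new.
    return f"{callback}({json.dumps(data)});"
```

The result would be served with a Content-Type of application/javascript, so a call such as to_jsonp({"name": "Bob"}, "updateUser") produces a script that invokes the page's existing updateUser function with the data.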
The other approach is to use something called CORS.
CORS allows Cross Site support from any of the browsers, but it involves a little
handshaking to make it actually work. Now, the different platforms implement CORS in
different ways if you want to add it. So, we're not going to talk about how to actually
write the CORS support, but I want you to understand what's going on in order for this
to actually work.
There is some handshaking that goes on between the browser and your service
before the browser is allowed to make the cross domain call to your service.
Implementing this yourself is possible, but usually if you look at your platform, there
are plugins to help you implement this, because it is not a matter of
changing the way your services work, but of implementing the handshaking that's
going to happen before your service is executed.
So let's talk about how it works so you can get your head around what the browser is
actually doing.
This is a little difficult to see because the browsers hide the handshaking part, and even
something like Fiddler hides the handshake, so that if it doesn't work you can sort
of see what's going on, but if it does work, you aren't going to see the handshake at all.
So, CORS starts by making a Cross-Origin Request as it's called. I'm on foo.com,
and I'm calling Ebay.com to make a request.
The server is asked if this Cross Domain request is allowed, and this is done by
issuing a command from the browser. This isn't something you write, it's something the
browser does automatically, because CORS is a standard. What it does is issue
an OPTIONS call to the server, with an Access-Control-Request-Method header that names
the type of method it was attempting to use. In this example, the original Cross-Origin
Request was a POST request. This would
say GET if it were a GET request, etc. And the Origin is the name of my site, the site that
I'm coming from, whereas the Host is pointing at where I'm going to.
The Server Responds with what the Rules are. We're going to allow these methods
and we're going to allow these methods from this Origin, and as long as the calls on the
page after this adhere to these rules, it will continue to work.
So, then the browser actually makes my request, in this case the POST of some data
to the Games API. It includes the Origin so the server knows where this is actually coming
from, and can still check to see that this is allowed. This handshaking of
getting the OPTIONS call and then returning the rules, which the browser caches, is the
part that needs to be implemented on the server for CORS to be allowed.
Typically this handshake is done at a pretty coarse level. You're not necessarily
going to allow it just for individual API calls or methods, but you may decide to do things
like allow Cross Domain only for GET but not allow things like POST or PUT or DELETE.
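To give a feel for the server's half of the handshake, here is a minimal sketch of answering the OPTIONS call. The allowed origin and method sets, the 600 second cache time, and the choice to answer a disallowed request with a 403 are all assumptions for illustration; in practice you would normally let your platform's CORS plugin do this. The header names themselves are the standard CORS ones.

```python
# Hypothetical rules: which sites may call us, and with which methods.
ALLOWED_ORIGINS = {"http://foo.com"}
ALLOWED_METHODS = {"GET", "POST"}


def preflight_response(request_headers):
    """Return (status, headers) for a CORS preflight (OPTIONS) request.

    request_headers is assumed to be a dict keyed by canonical header names.
    """
    origin = request_headers.get("Origin")
    method = request_headers.get("Access-Control-Request-Method")
    if origin not in ALLOWED_ORIGINS or method not in ALLOWED_METHODS:
        # No CORS headers are returned, so the browser will block the call.
        return 403, {}
    return 204, {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(sorted(ALLOWED_METHODS)),
        # Lets the browser cache these rules instead of asking every time.
        "Access-Control-Max-Age": "600",
        "Vary": "Origin",
    }
```

Narrowing ALLOWED_METHODS to just GET is how the coarse-grained "read only across domains" policy mentioned above could be expressed.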
Who Should You Authenticate?
So in many cases you're also going to want to guarantee who the caller of your API is.
You need to figure out what kind of caller it is in order to figure out who you're really
authenticating.
You might be doing Server-to-Server, or you might think of it as Service-to-Service
Authentication, and in this case, it's most common to design it to work with API Keys
and Shared Secrets, and we'll talk a bit about how that works in a minute.
There's also this thing called User Proxy Authentication. So I've written some piece of
code and I want to work with some 3rd party API, but I don't want to have to collect and
be responsible for storing the user information, so I want to simply have the right to go
over to this 3rd party API and use that API, and that may be your API. And so in this
case you're going to use something like OAuth, something that allows you to proxy the
actual Authentication back to the API that owns the users.
And finally, there is Direct User Authentication as well. And this is where you're going
to simply piggyback on existing systems. So your API may use cookies or tokens that
you use as part of normal Authentication with your website, so, if you're using something
like ASP.NET, you may be using forms authentication here and also use that same cookie
for your API authentication. This Direct User Authentication is almost always used when
you're writing an API for your same property; it's not a public API, it's more of a private
API that you're using to communicate for your own single page applications or your own
apps.
There are some important definitions for us to get our head around before we dive in
here. First, what is a Credential?
We talk about Credentials an awful lot, but I want to make sure that you, the viewers,
have a sense of what that word really means. And a Credential is a fact that can
describe an entity. Most commonly this fact is something like an identifier, or like an
email address or a user name, and another fact may be something like a
password. So, a set of Credentials is really a list of those facts that helps the server
determine that you are who you say you are.
Authentication is the way the server will validate a set of credentials to figure out
who you actually are. Now this who you actually are is a curious one, because it's not
necessarily a user of the system, it also may be a developer API Key, so it may be
validating that when you signed up for an API Key that it is you, the developer, that
created that relationship. So, this authentication idea of credentials is true whether you're
calling server-to-server, or app-to-server, or whether you're actually authenticating with
user credentials.
And Authorization. Authorization is the verification that some known entity, an entity
that has been validated with authentication, has rights to access a certain
resource or a certain action, so that I can say that Bob is logged into the system, and
Authentication has validated that it is in fact Bob on the other side of the wire. Now Bob
wants to delete a customer. Does Bob have the right to delete that customer or not? And
that's where Authorization comes in. Is that entity allowed to do these certain things? Can
it read this, can it delete this, can it insert this, can it modify this?
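The difference between the two can be sketched in a few lines. Everything named here (the in-memory USERS table, the fixed salt, the right names, and the two function names) is a made-up assumption for illustration; a real system would use a proper user store and per-user salts.

```python
import hashlib
import hmac

_SALT = b"demo-salt"  # hypothetical fixed salt, for illustration only


def _hash_password(password):
    # Store a slow, salted hash of the password rather than the password itself.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), _SALT, 100_000)


# Hypothetical user table: credentials plus the rights each entity holds.
USERS = {
    "bob": {
        "password_hash": _hash_password("correct horse"),
        "rights": {"read_customer", "modify_customer"},
    },
}


def authenticate(username, password):
    """Authentication: validate the credentials and return the entity, or None."""
    user = USERS.get(username)
    if user is None:
        return None
    if hmac.compare_digest(_hash_password(password), user["password_hash"]):
        return username
    return None


def authorize(entity, action):
    """Authorization: does this already-authenticated entity hold the right?"""
    return action in USERS.get(entity, {}).get("rights", set())
```

In this sketch Bob authenticates successfully, and is authorized to read a customer, but a check for a delete_customer right returns False because that right was never granted.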
Working with API Keys
So let's talk about API Keys.
A very common method when developing Web APIs is to issue developers a set of
credentials to identify who the developer is instead of the user, and those are normally
thought of as these API Keys. There are even a number of services that APIs can
register with that will do this management of the API Keys for you. So whether you
implement it yourself or use some service, understanding how API Keys work is a pretty
important part of it.
API Keys are for non-user specific API usage.
o For example, if you're writing some code to go after Amazon's Web Services, or
to look at the Amazon catalog, you're not representing a user that wants to look
at what orders they've made, you're simply using the API to get at some data,
and those APIs, instead of being truly open and public, still require a relationship
with those APIs so that it knows who the person calling the API is. And this is
primarily so that when someone uses the API, then it can monitor their usage. If I
see someone is looking at the catalog of my products and they're just walking
through and reading them all, I should be able to look at logs and see who is just
dumping data out, or maybe calling it so often that it's slowing down the service
for others, and identify who the developer that's causing that problem is, and then
mitigate it in one of a number of ways.
o These API Keys are just to verify who the developer making the call is, so that
I can make some of those decisions.
So typically, and you're going to see this from lots and lots of public APIs, and you can
implement this yourselves, there's this notion of having an API Key and Signing your
requests.
So, to start out, the developer will go to the API's website and sign up for the API.
They're going to give it some personal information, so the API can figure out who the
person is, and then they will be returned two pieces of information; they'll be returned a
magic string that contains an API Key, and then a Shared Secret. The Shared
Secret is normally used for signing requests, and we'll see how that signing works in just a
moment.
So using these two bits of information, when I make a call to one of these services, I'm
going to need to use my API Key to make a request and to sign my request so that they
can guarantee it was me in fact making the request.
So the developer is going to create a request, maybe a call just to a REST-based
service, and this is going to include what do I want to do, what my API Key is, and
what the Timestamp is. So, the API key is being used here to say who I am, but it's not
being secured yet. This API key itself is going to be transmitted across the wire so that
the API itself can determine who I am. And then the developer is going to sign the
request with the Shared Secret. The Shared Secret that was passed to the developer
when they registered is not actually passed in as part of the request; it's only used to sign
that request. Now what signing the request means is to take the complete request itself
and use the Shared Secret to run it through a one way encryption (a keyed hash such as
an HMAC), to get a signature for this request. The developer then takes the request that
they generated, plus this signature, which is the result of this one way encryption,
and sends that whole thing to the service. The service then looks up
who's making the call through the API Key, oh, there's Bob the developer, I know
who Bob is, and I can also get that Shared Secret that I had given them before, because
I know who Bob is. We're using the Shared Secret on both sides of the wire, but we're
never transmitting it. The service then takes the request that was given and signs
the request with the Shared Secret just like the developer did. And it does this so it
can then look at the two signatures and make sure they are the same. So it's doing
one way encryption on its side, the developer did theirs, and then it compares its result
with what the developer sent in as the signature to the request and verifies that the
signatures are the same. It does this to verify that the Shared Secret is the same one that
the developer is using, so that it knows, oh, this is actually the developer, because the
developer wouldn't just give out their Shared Secret. So the developer knows something
that only they and the API know, and the signature is used to verify that. It also looks at
the timestamp of this request and verifies that the request is within the allotted time.
When the developer created the request it included a timestamp that described the time of
the request so that the service could make this check to see that the signed request isn't
old, so that someone couldn't steal the request that was signed and try to issue it an hour,
or two hours, or two weeks later, and pretend that they are actually the developer. If it's
valid, it goes ahead and executes the request and returns the data. If it's not valid it then
returns an error.
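The steps above can be sketched with Python's standard hmac module. The layout of the signed message (method, path, API Key and timestamp joined by newlines), the five minute window, the key and secret values, and the function names are all assumptions for this example; real APIs each define their own canonical form of the request to sign.

```python
import hashlib
import hmac
import time

# Hypothetical registry on the service side: API Key -> Shared Secret.
SHARED_SECRETS = {"bob-api-key": b"bob-shared-secret"}
MAX_AGE_SECONDS = 300  # assumed window in which a signed request stays valid


def sign_request(method, path, api_key, timestamp, shared_secret):
    """Developer side: sign the request with the Shared Secret (HMAC-SHA256)."""
    message = f"{method}\n{path}\n{api_key}\n{timestamp}".encode("utf-8")
    return hmac.new(shared_secret, message, hashlib.sha256).hexdigest()


def verify_request(method, path, api_key, timestamp, signature, now=None):
    """Service side: look up the secret, re-sign, compare, then check the age."""
    secret = SHARED_SECRETS.get(api_key)
    if secret is None:
        return False  # unknown developer
    expected = sign_request(method, path, api_key, timestamp, secret)
    # compare_digest avoids leaking information through comparison timing.
    if not hmac.compare_digest(expected, signature):
        return False
    now = time.time() if now is None else now
    return abs(now - timestamp) <= MAX_AGE_SECONDS
```

Note that the Shared Secret never appears in the request itself; only the API Key, the timestamp and the resulting signature travel over the wire, and changing any signed field makes the signatures disagree.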
The API signing is a way to verify that the developer is who the developer says it is.
When you're designing your APIs, and you're designing APIs that are going to be used
outside the scope of individual users, simply using an API Key and a Shared Secret will
allow you to validate that the developer calling into your service is actually them, so that
you can have a way to register developers, and have them use the service, and be able
to monitor what developer is using your service, without the need for going down and
creating user authentication for each user of the system.
User Security
Identifying individual users is a little different.
So if you have the notion of users, how do you verify that the API is being called as them? So the
developers themselves might be identifying themselves with API Keys, but you also are
asking those developers to act in the role of the user they're trying to serve, and how do
you verify that that user is actually them?
If you're building an API for only use on your website, don't worry too much about it,
because you can piggyback on the existing website security. Again, you can take
the forms authentication, in the case of ASP.NET, or any sort of authentication scheme
that you're using on the website and apply it to the API, because if they're logged into
the system that means that they can then use the API in the same way. And if you're
building clients for these 1st party APIs, those clients might be able to collect those
credentials and send them in as header information when it calls into your API. If you're
developing Apps against these 1st party APIs, it tends to be a little bit more
painful because your Apps will need to collect user credentials and secure them.
Securing them is often the harder one, so you end up making decisions like, oh we're
going to keep the user name but force the user into typing the password every time, which
isn't necessarily a clean and easy way to do it, or maybe storing the password in hopefully
a secure way, depending on the platform you're on; either choice can cause additional
problems.
If you're expecting 3rd party developers to use your API, you're not going to want
them to identify individual users themselves. You don't want to ask them to collect
those credentials, because you don't know how good they are at protecting those
credentials, and since they are a window into getting into your system, you don't want
them to know user names or passwords at any point, and that's where you would use
something like OAuth. OAuth will allow these 3rd party API developers to have access
to your system while maintaining that only your code is actually accepting those user
credentials and mapping them back to something that the developer can use as that
individual user; let's see how that works.
OAuth
So, in order to protect the user, we need a way to allow the developer to act as the
user in the system, but allow you to maintain control over accepting those actual
user credentials. Once you accept the user credentials, you can then trust that 3rd
party with some magic token that represents the developer acting for that user, and
whenever the developer calls into you with this magic token, you'll know that this 3rd
party developer is acting as some real user in your system. The developer themselves
won't ever receive these user credentials, and more importantly won't be responsible
for storing them and securing them.
So this is how it works. Let's talk about what the Developer will do, what the API will do,
and ultimately what the User will do; they all have a role in how OAuth works.
o The developer is going to request an API Key from the API, much like we saw
earlier with pure API Key authentication.
o And the API is going to supply an API Key and a Shared Secret, again just like it
was before, because you still need to have a way for the developer and the API
to know who is who.
o Using this API Key and Shared Secret, the developer requests a token called a
Request Token. This Request Token is the magic string that the API is going to
return to start the handshake: it forces the identification of a user
in the API's system and has that user give the developer permission to act as them.
o The API checks the request signed with the API Key and Shared Secret, and returns that
token.
o That token is then used to redirect the user to a specific page in the API to allow
them to give their credentials.
o So the developer redirects to the API's authentication URI, and the API is going to
display a UI for the user.
o The user is going to supply their credentials if they're not logged in, and once they're
logged in they're going to confirm that they want to give the developer the rights
to call as them. If you've done anything like Facebook or Twitter integration, or
even, as a user, allowed a Facebook App to be installed or allowed an application to
use Twitter, you have done this before. It forwards you over to the
Twitter.com page; it will say Bob's development check wants you to give access
to your Twitter account, you say yes because you want whatever Bob is going to
give you.
o Once the user has confirmed this authorization, the API itself redirects back to
the developer. So the developer can then request an Access Token.
o This is a separate token that the developer is going to keep, sometimes for quite
a while, in order to make requests to the API.
o When the developer requests this Access Token, the Access Token is going to
come back with a Timeout. Here's a token to make calls into my API, and this is
for how long you can use it.
o From that point the developer can use the API with the Access Token until that
timeout occurs.
They can make multiple calls as the user, as far as the API is concerned, until that
timeout happens, and sometimes this Access Token is good for quite a while; sometimes
it's 20 minutes, sometimes it's two weeks, it depends on the nature of your API. If you're
developing a banking system, it should be good for a couple of minutes. If you're
developing something like Twitter you might want to make it a sliding expiration so that
the timeout is good for quite a while. Using that Access Token allows them to use the
API, and when you're designing your API, you have to look at this Access Token and be
able to determine which user they're calling the API as.
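The timeout behavior described above can be sketched in a few lines. This is only an illustration under assumptions: the AccessToken record, its field names, and the check_token function are hypothetical, and times are plain numbers of seconds so the example stays deterministic.

```python
from dataclasses import dataclass


@dataclass
class AccessToken:
    """Hypothetical server-side record for an issued access token."""
    value: str
    user_id: str
    expires_at: float      # seconds, on whatever clock the caller uses
    lifetime: float        # how long each issue or renewal lasts, in seconds
    sliding: bool = False  # True: each successful call pushes the expiry forward


def check_token(token: AccessToken, now: float) -> bool:
    """Return True if the token is still usable at time `now`."""
    if now >= token.expires_at:
        return False
    if token.sliding:
        # Sliding expiration: activity keeps the token alive.
        token.expires_at = now + token.lifetime
    return True


# A banking-style token that lasts two minutes, and a social-style sliding token.
bank = AccessToken("t1", "bob", expires_at=120.0, lifetime=120.0)
social = AccessToken("t2", "bob", expires_at=1200.0, lifetime=1200.0, sliding=True)
```

With a fixed timeout the token simply stops working at expires_at; with the sliding variant, every accepted call moves the expiry forward by one lifetime.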
When you're developing your API, you shouldn't expect that the user credentials are
going to be part of the header, or the user credentials especially are not going to be part
of the URI. You're not going to develop something like GetAllMessagesFromUser?user=Bob, right?
You're going to assume that the Access Token that's going to be sent in is going to be
mapped to Bob before you determine what data that resource is going to return. And so
developing your API as it relates to individual users is going to be very clear and obvious
when you start to work with something like OAuth, because it's going to assume that
you're going to identify the user without the need for identifying the user by name using
something like a query string or a path variable.
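As a rough sketch of that idea, the handler below works out who the caller is from the token alone. Everything here is an assumption made for illustration: the TOKENS and MESSAGES tables, the function names, and the use of an "Authorization: Bearer" header (an OAuth 2.0 convention, which may differ from the flow your platform uses).

```python
from typing import Optional

# Hypothetical token store: access token -> user it was issued for.
TOKENS = {"abc123": "bob", "def456": "alice"}

# Hypothetical message data keyed by user.
MESSAGES = {"bob": ["Hi Bob"], "alice": ["Hi Alice", "Lunch?"]}


def user_from_headers(headers: dict) -> Optional[str]:
    """Map the request's access token to a user, or None if it is unknown."""
    auth = headers.get("Authorization", "")
    scheme, _, token = auth.partition(" ")
    if scheme != "Bearer":
        return None
    return TOKENS.get(token.strip())


def get_messages(headers: dict) -> tuple:
    """Handler for GET /messages: note there is no user name in the URI."""
    user = user_from_headers(headers)
    if user is None:
        return 401, []
    return 200, MESSAGES.get(user, [])
```

The resource URI stays the same for every caller; only the token decides whose messages come back.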
So how do you design for OAuth?
First of all, I want to make it clear that you should probably not implement OAuth directly.
Most platforms are going to have a way to implement the OAuth for you. Understanding
the flow is going to be useful, but depending on your platform, you're going to want to
allow a library or service to implement the OAuth for you, because there are a lot of
little moving pieces. Most of the time when I'm developing an API, the last thing I want to
do is build a lot of plumbing code. Because it's complex and there are a lot of moving
pieces, getting it wrong means you're likely going to have an insecure API, so rely
on more mature code to make your OAuth implementation as secure as
OAuth can be.
You might also decide to integrate your OAuth using 3rd party identities, so you may
use Facebook, Google, or Microsoft ID to determine who the logged in person is.
Even if they are an individual person in your system, you may be using these 3rd party
identities. Using 3rd party identities can be very helpful when you don't want to store
your own identities; you just want to be able to individually ID users. Users don't want
their own IDs with your system anyway in most cases, they don't want necessarily to
have to remember a username and password for your system and your system alone,
unless it's a big part of your environment, like if you're building Enterprise Apps. Users
will do it if there's a big payoff. So if you're providing them a service, especially if it's a
free service that has a lot of benefit, they will want their own IDs; it's just not that
common.
Summary
So let's wrap up some of these ideas.
When you're securing your API, you should make it part of your original design.
Don't hope that you can tack on security later, or that some higher level piece will just
make it secure on its own.
Don't try and just drop security on top of your API and hope it works well, think
about securing it from the very first step to the last step of developing your API.
You want to make sure that the default behavior is secure, so that developers don't have
to go an extra step to make it secure. That means never returning results that may be
insecure. A common case for this, I've seen quite a lot, is if you have an existing system
that already has data resources that it wants to return, let's say employees as an
example. Even though you probably wouldn't ever write code through the API that used
something like a social security number or a spouse's name, you may end up leaking
some of that information through the API because you simply just want to return the
same entity objects you're using throughout the system. So be sure that your APIs are
secured by default by making sure that the data you're returning is pruned, so that
data that may be fine inside your organization or behind the firewall does not get out
over the internet.
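One way to prune by default is to copy only an allow-list of fields into whatever the API returns. The sketch below assumes a hypothetical Employee entity and field names; it is not tied to any particular framework.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    """Hypothetical internal entity, as used behind the firewall."""
    id: int
    name: str
    title: str
    social_security_number: str
    spouse_name: str


# Only these fields may ever leave through the API.
PUBLIC_FIELDS = ("id", "name", "title")


def to_api_result(employee: Employee) -> dict:
    """Copy the allow-listed fields into a plain dict for serialization."""
    return {field: getattr(employee, field) for field in PUBLIC_FIELDS}
```

Because the list names what is allowed rather than what is forbidden, a sensitive field added to the entity later stays private unless someone deliberately exposes it.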
5. Hypermedia
Introduction
In this module we're going to be talking about Hypermedia, and what that means to you
as the designer of a Web API. This module is going to cover the notions of Hypermedia,
and this is going to include explaining what exactly we mean by Hypermedia, how this
relates to REST and HATEOAS, what are Links, looking at some standard formats for
HATEOAS, including what is HAL, and what is Collection+JSON?
Let's get started.
REST and HATEOAS
So what exactly is Hypermedia?
When the web was being envisioned, part of the magic that makes the web work so well
is the idea that pages can have hyperlinks over to different parts of the web, so that the
web becomes interconnected. And so if we look at something like a standard HTML
page, we can see that using typical anchors allows us to link over to other parts of the
web just by using an href.
We can also use an attribute of the anchor tag called rel to describe the kind of link we're
talking about.
There's actually a formalized link tag in HTML that many of you probably already use.
The link tag is often used to link over to the style sheet for your page, but this link syntax
is actually used for a variety of reasons to link this document over with other documents,
so I can say that there's an alternate version of this page and, using the hreflang
attribute, that there's a version of this for Arabic, and this is the URL for that, and so we're linking
this document to another version of the same document but that is for a different type of
reader. We can also do the same thing for an alternate link to a print version of this
page.
There's also the notion of a type of link that can indicate what the next or previous
page is in a sequence of pages.
If you're doing something like an article where there's a page 1, and a page 2, and a
page 3, you can use these links to indicate to the browser that there is the notion that
this is a page and knowing how to go to the next or back to the previous page. So these
are different ways that HTML allows us to link a single document or a single URL to
other URLs by what is returned back, and hypermedia is meant to do this at the API
level.
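To make the HTML side concrete, here is a small sketch that reads the rel and href of each link out of a page using Python's standard html.parser module. The page markup and every URL in it are made up for illustration; only the rel, href, and hreflang attributes themselves come from the discussion above.

```python
from html.parser import HTMLParser

# Hypothetical page; every URL here is made up for illustration.
PAGE = """
<html><head>
  <link rel="stylesheet" href="/site.css">
  <link rel="alternate" hreflang="ar" href="/ar/article">
  <link rel="alternate" media="print" href="/article/print">
  <link rel="prev" href="/article?page=1">
  <link rel="next" href="/article?page=3">
</head><body><a href="/about" rel="author">About</a></body></html>
"""


class LinkCollector(HTMLParser):
    """Collect (rel, href) pairs from <link> and <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag in ("link", "a"):
            found = dict(attrs)
            if "href" in found:
                self.links.append((found.get("rel"), found["href"]))


collector = LinkCollector()
collector.feed(PAGE)
links = collector.links
```

A client that understands rel values such as next and prev can move through the pages without knowing how their URLs are built, which is the same property hypermedia aims for in an API.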
So, essentially hypermedia are just links for an API. These links are essentially
documentation to the developer so that they can know how to use your API. In
many ways this will help achieve the goal of having the API being as self-describing as
possible. In most cases, you can't have the API becoming completely self-describing, but
this can really help inform the users of the API how to do different things with your API.
These links become the State of your Application, and become the model for how to
take the data that you may be returning as the core of what your API does and also
indicate verbs, so that you know how to insert a new invoice or delete an invoice. This
allows you to include state that's not just the state of the data on the server, but also
ways to take that data and do something with it. This is where the notion of HATEOAS
comes in, or Hypermedia As The Engine of Application State. This is something that
the REST thesis talks about as an important idea of creating these APIs that are
interrelated. Unfortunately, this is a really awful acronym, and you're going to hear me
saying it a lot in this course, mostly with my teeth clenched. A long acronym like this can
make it a little more confusing than it needs to be. Essentially, you should think of
Hypermedia as simply a way to link results with other results or operations in your API.
So let's go back to a slide that you may have seen in the first module to really
understand where this Hypermedia or HATEOAS fits in.
We talked about how simple HTTP and remote procedure calls allow us to create verbs
in the API, have URI endpoints, and that the REST-ful nature of APIs allows us to sort of
layer on top of that, so that we can have resource-based URIs, verbs, and
statelessness, and even caching in our APIs, much like up to this point you've probably
learned. This last piece is allowing you to have these relationships between parts of your
API using these things called links, and that's really what HATEOAS is adding to the
picture. So taking what we've learned about creating these REST-ful but pragmatic APIs,
and adding the ability for you to indicate information about the results you're giving back,
and how to do things with that data.
What are Links?
So in designing your API, if you want to include links, what are we really talking about?
Any links you include should be about helping the developer use the API, so that
they don't have to go craft URLs and go look up documentation, when what they may be
doing may be a next logical step.
Some common scenarios for this are things like Paging, Creating New Items,
Retrieving Associations, or other sorts of actions like updating an invoice, submitting
a new work order, those sorts of things, are common scenarios that may be
communicated as these links.
Let's look at a simple example using JSON.
Now these links aren't limited to only JSON, you could also use them with XML APIs. I'm
going to use examples that are going to use JSON, but you can apply the same idea to
XML as well. So here we start with just a simple begin and end of a JSON object, and
we might have some data that you might be familiar with returning. So, we might have
some data about the result, like totalResults or Success, and then a list of results that
I've abbreviated here with an ellipsis. In here we may also include a set of links that
indicate things that can be done with the results. Links typically are going to have at a
minimum two pieces: an href, which is typically a URI to a specific operation,
and a rel, which is going to indicate what the link is for. So in this case we can see
that we're including an href to the previous page of results. You could certainly document
what this URI looks like and force people to build it themselves, but because going to
the previous page or going to the next page is such a common occurrence, including it
as links here can self-document your API, and make it easier for your developers to use.
Instead of having to write code in, let's say, the JavaScript of a webpage to determine and
to craft this URI, they can just take it as part of the results and go, oh, when someone
wants the previous or the next page, I already have a fully constructed URI that is valid.
You can also have it for other sorts of operations like insert, and in this case the URI
here is going to be related to a POST instead of a GET.
We don't have indications here for the method, but it's also fairly common for these links
to include another parameter called method so that they know what the URI is, and what
HTTP verb to use.
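A minimal sketch of a paged result carrying such links might look like the following. The rel names (prevPage, nextPage, insert), the /api/games path, and the page query parameter are assumptions chosen for illustration, not a standard.

```python
def paged_result(base: str, page: int, page_size: int, total: int, results: list) -> dict:
    """Wrap one page of results with prev/next/insert links."""
    last_page = max(1, -(-total // page_size))  # ceiling division
    links = []
    if page > 1:
        links.append({"rel": "prevPage", "href": f"{base}?page={page - 1}", "method": "GET"})
    if page < last_page:
        links.append({"rel": "nextPage", "href": f"{base}?page={page + 1}", "method": "GET"})
    # The insert link reuses the collection URI but is meant for a POST.
    links.append({"rel": "insert", "href": base, "method": "POST"})
    return {"totalResults": total, "links": links, "results": results}


# Page 2 of 60 results at 25 per page: both neighbours exist.
body = paged_result("/api/games", page=2, page_size=25, total=60, results=[])
```

The server leaves out links that do not apply, so a client can treat a missing nextPage link as "this is the last page" instead of doing the arithmetic itself.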
Here's another example, but instead of being with a collection that is returned, this is
going to be with an individual object that may be returned. And in this case I'm showing
some data that is being returned, and then we have that same sort of idea of links. And
here, we can indicate a self link, and this is very common where the item or the object
that is returned will include a link to its own resource URI, so that if you needed to do an
update, or insert, or delete, you'd have the URI that represents this object that you
returned. You may also have other links that relate to associations, so in order to look at
this game's rating, you may have a URI that indicates that this is the rating link, and then
use this to go ahead and get that additional data if necessary.
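On the consuming side, following a link comes down to looking it up by its rel. The sketch below assumes a hypothetical game object and URIs; href_for is an illustrative helper, not part of any library.

```python
from typing import Optional

# Hypothetical single-game result; field names and URIs are assumptions.
game = {
    "id": 42,
    "name": "Halo",
    "links": [
        {"rel": "self", "href": "/api/games/42"},
        {"rel": "rating", "href": "/api/games/42/rating"},
    ],
}


def href_for(resource: dict, rel: str) -> Optional[str]:
    """Return the href of the first link with the given rel, if any."""
    for link in resource.get("links", []):
        if link.get("rel") == rel:
            return link.get("href")
    return None
```

For example, href_for(game, "self") gives the URI to use for an update or delete, and href_for(game, "rating") gives the URI of the associated rating.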
Standard HATEOAS Formats
I want to introduce an idea here that is related to the Versioning story, but ends up being
important when we're talking about standard ways of looking at Hypermedia data, and
that is something called Profile Media Types.
The idea here is that profiles are simply the descriptions of what the data is that
you're returning. This is an alternative to using the custom MIME type that we saw in
the versioning section, and it's usually used in coordination with a MIME type. Profile
Media Types are typically appended as a parameter to an existing MIME type in an
Accept header, and servers can return this type as the content type of the response,
so that your client code can automatically recognize these Profile Media Types.
So as an example, if we were going to get this order, we could use an Accept header to
say, what is the format we're looking for?
In this case it's JSON we're looking for, but at the end of the JSON content type we're
going to include the profile of the schema we want. This is often used to separate the
idea of the type of format we want versus the object identifier or version that we want.
And when we look at the standard formats, they use these pretty commonly to define the
type of data that is returned, or more interestingly, the version of the data type that is
returned.
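To show how the format and the profile stay separate, here is a small sketch that splits a single media type value into its type and its parameters. The header value and the profile URI are made up for illustration, and the parser is deliberately simplified: it handles one media type (not a comma-separated Accept list) and does not cope with semicolons inside quoted values.

```python
def parse_media_type(value: str) -> tuple:
    """Split 'type/subtype; key=value; ...' into (media_type, params)."""
    media_type, *raw_params = [part.strip() for part in value.split(";")]
    params = {}
    for raw in raw_params:
        if not raw:
            continue  # tolerate a trailing semicolon
        key, _, val = raw.partition("=")
        params[key.strip().lower()] = val.strip().strip('"')
    return media_type.lower(), params


# Hypothetical header value; the profile URI is made up for illustration.
accept = 'application/json; profile="https://example.com/profiles/order"'
media_type, params = parse_media_type(accept)
```

The media type answers "which format?" while the profile parameter answers "which shape or version of the data?", so a server can vary one without touching the other.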
So there are a few people out there that are creating new standards for how to return
Hypermedia data. There are a handful of them out there, but I've chosen to focus on just
a couple. These standards are emerging, so they aren't final or set in stone; there
isn't a single standard for you to go after, and in some ways by developing APIs now,
you're hitching your wagon to a standard that may or may not become the prevailing
standard. The two out there we're going to talk about are HAL, or Hypertext
Application Language, and Collection+JSON. These are the two that right now have the
most community support, and they do things in a fairly different way and for different
reasons, so we're going to explore both and talk about why you may want to choose one
over the other.
These standards are based on Custom Content Types, and then using the Profile
Media Type to define the structure of the data that is returned or accepted. The Content
Type defines the data formatting, and in the case of something like HAL or
Collection+JSON, it's going to define which of those standards to use, and then the
Profile Media Type is going to define the structure of that data. In this way it's going
to keep the format and the versioning of the structure of the data
separate. We saw in the Versioning chapter that using Content Types we could version
what we were getting back, but invariably that was really mixing the two metaphors, we
were mixing both the formatting and the versioning into a single sort of entity; this
separates the two.
Let's look at these two format types.
HAL
So HAL stands for Hypertext Application Language.
This language is meant to be a lean Hypermedia type. It wants to be more brief than
some of the other proposed standards out there, and in fact it's the one that I'm leaning most
heavily towards because I like the brevity of what it's trying to do; it's trying to do just
enough Hypermedia to be helpful without inundating you with a lot of structure or forcing
you to restructure your code with a lot of ceremony.
It supports formats you're already used to like JSON and XML to include
resources and links together. The Content Type is called application/hal+json when
you're expecting or sending back JSON, or hal+xml for obviously XML. More typically,
you're going to use a Profile Media Type to define the kind of data you're actually looking
for; in this case looking for the order type from the Wilder Minds site.
Here's a useful picture that I got actually from the HAL specification itself; you can see
the link on the bottom there, to stateless.co.
This defines pretty much what HAL is going to look like. It is resources that have links
associated with them, and then embedded resources that may have the same structure,
and this can continue down the chain. So, the top level object may have embedded
resources, and each of those objects may themselves have links as well as other
embedded resources.
Let's look at an actual example.
So here, much like the examples we looked at earlier, we're just looking at a simple
JSON object. And the first thing you'll notice is that _links is the standard name they want to
give to any of the sort of links that are being passed down as this result. Now one of the
first things you'll see is the self link, and this is going to indicate that this is the URL that
was retrieved, and the object that is returned represents this self URL. Now this href
could be relative or absolute; in this example they are relative, and then it can include
other links. Much like we saw in our earlier examples, this is more formalized, and one of
the things you'll notice is that instead of having a separate property called rel, they're
simply making an object that you can look up by name. This makes it a little easier to
consume from JavaScript, which is one of the more common languages that you're
going to consume this from. And it can even have what are called template links. This is
part of the HAL Spec that actually points to a different specification for defining how to
define URLs that have optional or templatable elements, and we'll look at what the
templates look like in a minute, but being able to say that this is a templated URI is
useful to tell the user of this API that here is a URI that's useful to you, and then you're
going to be able to template it with your own data. You're still going to include data that
is simply part of that return, whether it's things like total count or it's the error code, or
time to execute, those sorts of things, you're still going to include a simple property, but
they have the special _embedded property that's going to include the actual results, the
embedded results to the sort of top level object that's being returned, and so we may
have something like games or results that is the actual data that you actually asked for.
And in this case, the game object itself just has basic data that you might be used to, in
this case price, currency, and name of the game, but like the top level object, each of
these individual items may also have links. Here we can see just a single self link, but
you may have other links for things like deletion and insertion, or updating, and those
sorts of links, or even links to associated data, much like we saw in the earlier example
before we introduced HAL.
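The structure just described can be sketched as a pair of functions that build a HAL-style body. The _links, _embedded, href, and templated names come from HAL; the /games URIs, the find and next link names, the totalCount property, and the sample game are assumptions for illustration.

```python
import json


def game_resource(game: dict) -> dict:
    """Wrap one game's data with its own HAL _links."""
    return {**game, "_links": {"self": {"href": f"/games/{game['id']}"}}}


def games_collection(games: list, page: int, total: int) -> dict:
    """Build a HAL-style body: _links, plain properties, then _embedded."""
    return {
        "_links": {
            # Links are keyed by name rather than carrying a rel property.
            "self": {"href": f"/games?page={page}"},
            "next": {"href": f"/games?page={page + 1}"},
            "find": {"href": "/games{?query}", "templated": True},
        },
        "totalCount": total,
        "_embedded": {"games": [game_resource(g) for g in games]},
    }


doc = games_collection(
    [{"id": 1, "name": "Halo", "price": 59.99, "currency": "USD"}], page=1, total=40
)
body = json.dumps(doc, indent=2)  # what would be sent as application/hal+json
```

Because the links form an object keyed by name, a client can read doc["_links"]["next"]["href"] directly instead of searching a list for a matching rel.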
So template links are something I want to kind of bring together and help you
understand what they're all about. Defining a link as being templated is really pointing at
template URIs, and here you can see the URI if you want to see the full Spec of how
they're defined. HAL doesn't define what templates look like, HAL simply points at
another standard that is out there for template links. But essentially, you're taking what is
inside the curly braces and saying, this is where you're going to put data; in this case,
the games URI is going to become games?query= followed by whatever data is supplied
as the query parameter. So, if I was searching for games that had the name halo
in them, I could indicate that here. And there's a full syntax for adding multiple parts of
the query, the query may be part of the URI, it may be part of the query string, it may
have multiple parts, all that is discussed in RFC 6570, so I don't want to duplicate that
effort a lot here, but understand that the power of what HAL is doing with templated links
here is allowing you to not only have simple links over to discrete operations, but also
having links that may include variable information like a search link would have here.
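As a rough illustration of what a client does with a templated link, the sketch below expands only the form-style query expression of RFC 6570 (the {?name,name} form). It is a tiny subset written for this example, not a full implementation of the RFC, and the function name is hypothetical.

```python
import re
from urllib.parse import quote


def expand_query_template(template: str, values: dict) -> str:
    """Expand only the form-style query operator, e.g. '/games{?query,page}'.

    Names without a value are skipped; if none are defined, the whole
    expression expands to an empty string.
    """
    def replace(match):
        names = match.group(1).split(",")
        pairs = [
            f"{name}={quote(str(values[name]), safe='')}"
            for name in names
            if values.get(name) is not None
        ]
        return "?" + "&".join(pairs) if pairs else ""

    return re.sub(r"\{\?([^}]+)\}", replace, template)
```

So a search link of /games{?query} with query set to halo expands to /games?query=halo, and with no value supplied it falls back to plain /games. For anything beyond this simple case, an existing URI Template library is the safer choice.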
Collection+JSON
The next standard we'll look at for communicating Hypermedia in your results is
Collection+JSON.
And this standard is really for allowing the standard reading and writing of collections; it's
meant to be very self-describing. It defines a standard way to communicate lists and
individual items, as well as includes UI information, so that if you're trying to fill in an
object, some object that's inside of a list, you're able to know what sort of labels for input
controls you should use in order to gather that information. And so, it contains a lot more
metadata than HAL does. It uses that MIME type of application/vnd.collection+json to
indicate that, and you can use the Profile Media Type as well, that's very common in
Collection+JSON. The Collection+JSON doesn't have a corollary of Collection+XML.
Some people have sort of talked about creating that standard, but Collection+JSON is
fairly tied to the way that JSON works.
Well let's take a look at an example.
In Collection+JSON, it always starts with that collection element as the top object, so it
sort of creates one nested object below it, that's including information like what the
version is and what the href of this collection is. It also has a set of links similar to the
way that HAL does, but they're defining them in the typical rel and href properties, and
then items in that collection are defined in another property, and each item itself has its
own href, and then the data for the items in that collection. So in this case it's just simple
JSON describing the objects of the data. And each of those items can still include their
own links. In this case we're saying the link is to a blog, and notice the last piece of this
where it says prompt:Blog; this is an indicator to you of how you could create a link and
what to show to the user. When they say prompt, they mean what is visible to a user that
wants to know where this link goes.
In addition to what we've seen, they also have the notion of queries, and so similar to
what you saw in the templated links in HAL, queries allow you to define the kinds of
different search semantics you have. Again, they're including the prompt property here,
so that you know what to show to the user when they're using this link. It also includes a
data section so you know what query string parameters to include. And the last one is
the notion of what's called template, and the idea behind template is really to give you
information about how to build a UI. So this is saying for the property that you want to
create, called full-name, the prompt should be full-name, and the value should be, in this
case, a string. Now, it's giving you empty values because this is the same data structure
once filled in with the values you'll actually post up to the server to make those changes.
Again, Collection+JSON is focused on how to create and maintain certain collections up
on the web.
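Putting those pieces together, a Collection+JSON body can be sketched as below. The collection, version, href, links, items, data, queries, template, and prompt names follow the format as described above; the /friends/ URIs, the sample person, and both function names are assumptions for illustration.

```python
def collection_document(href: str, people: list) -> dict:
    """Build a Collection+JSON-style body for a list of people."""
    items = [
        {
            "href": f"{href}{p['id']}",
            "data": [{"name": "full-name", "value": p["full-name"], "prompt": "Full Name"}],
            "links": [{"rel": "blog", "href": p["blog"], "prompt": "Blog"}],
        }
        for p in people
    ]
    return {
        "collection": {
            "version": "1.0",
            "href": href,
            "links": [{"rel": "feed", "href": f"{href}rss"}],
            "items": items,
            "queries": [
                {
                    "rel": "search",
                    "href": f"{href}search",
                    "prompt": "Search",
                    "data": [{"name": "search", "value": ""}],
                }
            ],
            # Empty values: the client fills these in and posts them back.
            "template": {
                "data": [{"name": "full-name", "value": "", "prompt": "Full Name"}]
            },
        }
    }


def fill_template(template: dict, values: dict) -> dict:
    """Return the body to POST: the template with its empty values filled in."""
    data = [{**field, "value": values.get(field["name"], "")} for field in template["data"]]
    return {"template": {"data": data}}
```

Even for a single item with one field, the body carries prompts, queries, and a template, which gives a feel for how much more metadata this format includes than HAL.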
The sweet spot for Collection+JSON is for simple machine-driven lists; this is really
where it excels. So, if you want to be able to point it at an arbitrary Collection+JSON
data source, you should be able to build a UI based on what they're giving you, simply
from what the API is returning. I find Collection+JSON a bit too verbose for real
data-driven REST APIs, so as I'm creating most REST services, if I want to include Hypermedia,
I'm really leaning on HAL instead of Collection+JSON, because of the additional
verbosity. The idea of including these UI elements really allows you to create automated
code, but I don't find that this works all that well in practice unless you're dealing with
maybe SharePoint lists or the kind of information you're dealing with is fairly fixed. In
practice, in most APIs I've developed, using something like Collection+JSON just
bloats the payloads I'm returning to the developers using my API, without being any
more informative to them.
Summary
So when designing an API, you can include Hypermedia to create APIs that are more
self-describing and that help your users build systems based on your APIs more
simply. There's a lot going on in the thinking
around Hypermedia and HATEOAS right now, and so these ideas are emerging. Tying
yourself too much to one concept in Hypermedia or another concept in Hypermedia may
make you a bit of a star in your company today, but it might end up biting you later when
the thinking around HATEOAS might change. Although APIs are going to be longer lived
and are going to develop, I'm not sure I would hang my hat on making sure everything
adhered to a strict interpretation of what Hypermedia or HATEOAS is.
If you're going to do Hypermedia, I really like HAL as being the middle ground: a brief,
lean Hypermedia format that is small and
consistent. I really think that HAL is the right choice when you're doing HATEOAS today.
But at the end of the day when it comes down to it, if your API is an internal one or you
have a small number of users that maybe can deal with reading documentation or even
you already have documentation written, I find that using Hypermedia sometimes may
be more about the ceremony of making sure that I have this fully REST-ful API than
purely useful, and I really want to focus on what is pragmatic when I build and design an
API, not just what people say about whether it is valid, or good, or true; I'm trying to stay
away from what the community thinks about my API, as long as developers are willing
and able to use those APIs.
Now in many cases, using Hypermedia to decorate the result of my APIs can make them
easier to work with, and developers are going to be happy about this, but I would limit
the amount of use of Hypermedia to the things that are only truly useful; things like
paging, and insert and deletes, or associations, are really sweet spots there, but if you
start thinking that every result should return a full list of every operation that could be
done on some entity or item that you're returning, you're probably working way too hard
to make this happen.
6. References
https://app.pluralsight.com/library/courses/web-api-design