Web API Design-Notes

Web API Design

Ngo Nguyen Chinh

Hanoi 2016

(*) Here are my notes after completing a course of study offered by Pluralsight.

1. Introducing Web API Design

Web API Design

This course includes pragmatic advice on what are Web APIs, some basic Archetypes

for API Designs, how to Version Web APIs, how to Secure Web APIs, and finally we're

going to talk about Hypermedia, and what people mean when they talk about it.

This course isn't meant to teach you how to build APIs. This course is meant to help you

design APIs. The focus of this course is to help you Design an API, and the course looks

at the API from the view of the developer that is using your API. This is not specific to

Microsoft's ASP.NET Web API technology, it's about APIs on the web in general. This

includes REST APIs, we're going to touch a little bit on RPC, and Hypermedia or

HATEOAS APIs. These APIs can be written in a variety of languages; this course is not

about any specific language.

Let's get started.

What are Web APIs?

We're going to talk about the introduction of what we mean by Web API and the different

technologies that are involved. We're going to start by simply talking about What are

Web APIs? We will then move on to Why Design is Important to designing your APIs,

and then we'll discuss the different kinds of APIs, the nature of REST, The Role of

HTTP, and finally what is Hypermedia. All of these are going to be related to building our

APIs.

So the central theme of this course is going to be the nature of what APIs really are. We

want to be able to take a look at how people are going to be consuming services or data

that you're going to be exposing. This is the essential interface to your application, to

your system, to your architecture. So these APIs represent a way for the consumers of

your API to be able to access those services and data in the simplest way possible.

Now back in the day we would design these APIs, and then we would publish them with

some printed materials or a help document, or things like that, that would help them

understand exactly how to use them. The APIs we're going to talk about should be much

more self-describing. We're going to be using technologies like REST and Hypermedia

to help us understand the nature of what we're building without being bogged down in

the dogma of some of these key words that have caused arguments throughout the

community; we're going to try to cut through the dogma and really talk about the

pragmatic way to design these APIs.

This design is important because the developers that are going to be consuming these

APIs are going to want to be able to see how to use them in a fairly simple way. They

should be not only self-describing but self-documenting as much as possible. A developer

should be able to look at which calls are available and find it natural to take the next

step and get more data or use more services in their applications. For example, if

you're exposing something like customers from an API, the orders for the customers and

the line items for those orders should be a natural progression in the API; I shouldn't

have to go back and look for each call in my system to look at interrelated data. In that

same way, when I want to deal with the different operations on that data, it should be

fairly clear whether that's using Hypermedia to describe those operations, or whether

using HTTP verbs to really hint at what are the other operations that are allowed.

Let's talk about the API ecosystem on the web today.

The API Ecosystem

When we say Web API, what do we really mean?

In the beginning of web development, when we needed APIs we typically relied on

something called a Remote Procedure Call. This really harked back to an

earlier time when we were building systems that needed to talk to each other by using

things like COM+ and DCOM or CORBA in the Java world, in order to create systems

that could talk to each other over a network connection. When we came into the web, we

decided, hey we already know how to do this idea of Remote Procedure Call, let's just

do it over the HTTP layer that we're using to communicate and deliver our websites as

well. Remote Procedure Call is typically identified by a couple of ideas. One is that

they're going to use URI Endpoints or addresses to get at certain pieces of functionality,

services or data, as we talked about before. But unlike some of the technologies we'll

look at, the verbs are typically included in these APIs. For example, we could look at

an API that said Get Customers as part of the URI. So, if we had /API/GetCustomers,

that is more of the type of operation that we would see in typical Remote Procedure Call

systems.

Back in 2000, this idea of REST sort of blossomed, and there was some discussion

about REST versus things like SOAP, but REST has become a common pattern for

building these systems, and REST is different from RPC, it also uses URI Endpoints,

but it typically dictates that the URI should be resource-based. So, instead of Get

Customer it would just be Customers and Orders and Invoices, and the other types of

objects in your system, instead of including the verbs in the API name in the URI

Endpoint, it does this with HTTP verbs. So, if you want to be able to get at the customer

list, you would simply issue an HTTP GET request to the customers endpoint; if you

wanted to create a new one you would POST to it, etc. And REST also dictates that the

server be stateless, so that as we make additional calls into the system, the server isn't

holding on to some state that we need to be remembered by, and this is something that

happens an awful lot in RPC where we have some token or some session state that

knows about us every time we call. A REST-ful API tries to be stateless. And

then finally, the kind of data that you're pulling back or the type of result from those

services typically isn't tied to a single type of format. There's a Content Negotiation,

which we'll talk about in a few minutes, to help the services figure out how the client

needs the data. If the client is a webpage, something like JSON or JSONP is

appropriate, whereas a rich client might be

more comfortable with, or find it easier to manage, something like XML. The server should care

less about what that content is and allow a negotiation to happen to determine how to

return that data.
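
To make the contrast between the two styles concrete, here is a small Python sketch. The endpoint paths and the helper name are made up for illustration; they are not from any real API. It only shows that the RPC style tends to grow one endpoint name per operation, while the REST style keeps one resource name and lets the HTTP verb carry the operation.

```python
# Hypothetical route tables, for illustration only.
RPC_STYLE = [
    ("GET", "/api/GetCustomers"),
    ("POST", "/api/SaveCustomer"),
    ("POST", "/api/DeleteCustomer"),
]

REST_STYLE = [
    ("GET", "/api/customers"),         # read the list
    ("POST", "/api/customers"),        # create a new customer
    ("DELETE", "/api/customers/123"),  # delete one customer
]


def endpoint_names(routes):
    """Return the distinct names that follow /api/ in each path."""
    return {path.split("/")[2] for _method, path in routes}


print(sorted(endpoint_names(RPC_STYLE)))   # three verb-based names
print(sorted(endpoint_names(REST_STYLE)))  # a single resource name
```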

And finally, somewhat more recently, this idea of something called HATEOAS, not an

acronym I'm particularly fond of, but it stands for Hypermedia As The Engine of

Application State. Essentially this adds onto the idea of REST-fulness or REST

interfaces, and includes in the payload links to do other operations. So, there will be

links inside of the payloads of the data from these services that will indicate to the

user of the API other operations that are available. For example, getting an existing

invoice might also give you the URL

to which you would submit an updated version of that invoice. This is the idea behind

Hypermedia; we'll talk a little bit more about that soon.
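
As a rough illustration of links in a payload, here is a Python sketch. The field names ("links", "rel", "href", "method") and the /api/invoices path are my own assumptions; there are several competing conventions for this, and this is not any one of them.

```python
import json


def invoice_representation(invoice_id, total):
    """Build an invoice payload that carries links to related operations."""
    base = f"/api/invoices/{invoice_id}"  # hypothetical URI layout
    return {
        "id": invoice_id,
        "total": total,
        "links": [
            {"rel": "self", "href": base, "method": "GET"},
            {"rel": "update", "href": base, "method": "PUT"},
            {"rel": "payments", "href": f"{base}/payments", "method": "GET"},
        ],
    }


def find_link(representation, rel):
    """Return the first link with the given relation, or None."""
    for link in representation["links"]:
        if link["rel"] == rel:
            return link
    return None


print(json.dumps(invoice_representation(42, 99.5), indent=2))
```

A client that got this invoice would call find_link(invoice, "update") to learn where to send the updated version, instead of building that URL itself.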

Resource-based Architecture

Before we dive into the actual design, let's start with some of the foundations, and one of

these foundations is the idea behind a Resource Based Architecture.

Resources are, simply put, Representations of Real World Objects or Entities. We

can think about these as People, Invoices, Payments, other things in systems you're

building; you're probably already doing this in your existing software development

career; you've created classes, or structures, or databases to store this sort of information

and consume this sort of information. We're simply saying that what we're talking about

is going to start with this idea of resources. In these resources, relationships are typically

nested down a path of those resources. So, if we have the idea of a customer, the

customer may have a relationship to its own orders, and those orders might have a

relationship to their own order items, and those order items might have a further

relationship to the products that are being purchased in each of those line items. You

should think of these as Hierarchies or Webs of information, not necessarily Relational

Models, because the kind of data that you're going to be dealing with and you're going to

be producing in these APIs, and consuming, is going to be typically Hierarchies and

Webs, not Relational Models in the sense of tables and related tables in the strict sense

of relational databases.

In these Architectures, these Resources are normally Represented as URIs. These

URIs are Paths to those Resources, so when you want to create your APIs, you're going

to want to use URIs to get at those Resources. Query Strings are often used in these

URIs as well, but for non-data elements. So you don't want them to represent verbs, you

know operations against those resources, and you don't want them to represent the

actual data. Often they're used for different purposes, like sorting, maybe filtering, and

sometimes which format you're getting the data back in.
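
Here is a small Python sketch of that split between the path (which names the resource) and the query string (which carries options like sorting or format). The URI and the option names "sort" and "format" are assumptions for illustration.

```python
from urllib.parse import urlsplit, parse_qs


def split_request(uri):
    """Separate the resource path from the non-data options in the query string."""
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]
    # parse_qs returns a list per key; this sketch keeps only the first value.
    options = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return segments, options


segments, options = split_request("/api/customers/123/orders?sort=date&format=json")
print(segments)  # the path identifies the resource
print(options)   # the query string only tunes how it is returned
```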

Let's see what this looks like in REST.

Introducing REST

So when I started to do the research for this course, I reached out to some other authors

of courses out there to talk about what does REST mean to them, and unfortunately, this

has become a fairly contentious discussion.

I found that a lot of people are either on the one side of saying that REST as a

philosophy isn't super useful because it's so dogmatic; the constraints it puts on a

system to be blessed as being REST-based becomes overwhelming and not useful to

the day-to-day developer.

On the other side I heard arguments that REST is really important because it helps us

dictate how we want to design those APIs, and that the specific constraints of the

original discussions about what REST is and what REST isn't aren't as useful.

So my goal here is to talk about REST in a very pragmatic sense; how REST can help

you design good APIs, APIs that are going to stand the test of time, that won't have to

change often in order to deal with new constraints, but also leverage what is good about

developing APIs on the web.

So what is REST?

The term REST simply means Representational State Transfer. When we talk about

Representational, we typically mean the resources that we want to transfer across the

wire. These Representational States are typically resources: customers, orders, details,

products, etc. But in order to be considered REST-ful, there are some concepts that Roy

Fielding included in his original papers on this, and we want to really understand what at

the end of the day is useful in REST, and take what will help us build great APIs from

REST, and leave the dogmatic strictness of REST on the table. The

concepts from REST that I think are important start with the Separation of

Client and Server, that the clients are going to call into the server based on URIs, and

the server is going to try to meet those URI requests, whether that's returning data,

whether that's adding data to the server, whether that's changing or deleting data, and

that each of these Requests should be Stateless. There's no notion of who the client

is so that the servers can be scaled out more seamlessly. And, where possible,

these requests should be Cacheable. When we talk about Cacheability,

we're typically talking about caching data results, so typically GETs into a system; you

aren't really going to be able to cache inserts of new items or deletions of items, but you can

cache the Requests that get data, and as long as the data hasn't

changed, you can be pretty aggressive about your caching. And we also want to make

sure that we're talking about these Uniform Interfaces, that when someone comes up

to our API that we're really saying that if you were able to get through an API a customer

object, and you know that there are order objects out there, that you can probably get

them with the same pattern that you got them from the customer object, and so that

really means that you may be walking down the URI by saying customer/1 for the first

customer, /orders to get the orders for those customers; I should also be able to say

orders/1 to get the first order in the system as well. These URIs are going to look like

each other, and that's what we mean by Uniformity or Uniform Interfaces.
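
One way to picture this uniformity is that every URI in the API can be produced by the same three small rules. This is only a sketch; the function names and the plural resource names are my own assumptions.

```python
def collection_uri(resource):
    """URI for a whole collection, e.g. /customers."""
    return f"/{resource}"


def item_uri(resource, key):
    """URI for one item in a collection, e.g. /customers/1."""
    return f"{collection_uri(resource)}/{key}"


def association_uri(resource, key, related):
    """URI for the objects related to one item, e.g. /customers/1/orders."""
    return f"{item_uri(resource, key)}/{related}"


# The same pattern works for any resource, which is the point.
print(item_uri("customers", 1))
print(association_uri("customers", 1, "orders"))
print(item_uri("orders", 1))
```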

All of this is really good when we think about what is useful for defining what REST is.

Some problems come in that the specific constraints in Roy Fielding's work to qualify

your interface as being strictly REST-ful or not REST-ful tend to add a lot of

constraints to the system. In my experience, trying to make your API strictly REST-ful,

or adhere to the REST principles, means you're spending a lot more time trying to follow

the letter of the law instead of the spirit of the law, and at the end of the day, worrying

about whether you're in this walled garden of what it is to be REST-ful or not REST-ful

isn't getting your job done. So I like to take what is good about REST, bring it in

pragmatically into what I'm building, but not worry so much about strict adherence to

those principles in a black and white way. And a lot of this comes about that when we

talked to different experts in the community or even developers in the community,

there's a split about how important the idea of REST is, because REST can

become very dogmatic. REST can be worried about strict adherence to a defined set

of rules instead of getting our job done as developers. We can learn a lot and pattern a

lot of what we should be doing from REST, but worrying about never straying from that

walled garden can get us in trouble in my experience. So, I'm going to teach this course

really from a pragmatic sense.

I'm going to try to take what is best from REST and apply it to your API designs.

Hypermedia

So the last big piece of the puzzle is this notion of Hypermedia, but let's step back a

minute and talk about Hyperlinks. The web is really driven by this notion of Hyperlinks,

and it was a core concept in the creation of the web initially, the ability to have

documents and websites that are linked to each other within themselves, to create really

a web of information, to have each of these different pieces around the internet really

tied to each other.

Hypermedia is a little bit like this. When we talk about Hypermedia, it's really a way for

the results of our API calls to be as Self-Describing as possible. Hypermedia is simply

a way to have links of resources that describe how to process the data or how to

get at the data in special ways. These are Hyperlinks for Resources, so you can

imagine that the links may include ways to get the cover for an album, it might be a way

to add new items to a collection; it's a way that the messages that we're getting from our

APIs are going to tell us more about how to use the service itself.

Is this important?

Hypermedia is HATEOAS. This Hypermedia As The Engine Of Application State, is a

design pattern that you're seeing in more and more APIs. Using it doesn't make your API

better or worse. Depending on what you're trying to accomplish, this can be very useful

or just additional overhead. If you are creating APIs that have special ways of describing

what they need to do, or maybe needing to be implemented by machine systems, this

can become very important. The idea of HATEOAS or Hypermedia is there so that you

can create these self-documenting APIs that can be very dynamic. But you can have

great APIs without HATEOAS. In fact, there are many, many APIs out there that are

considered solid and dependable and well-documented, that don't have any

Hypermedia. Again, don't get caught up in that dogma of your API isn't good enough

unless you're using every part of that REST stack, including Hypermedia.

What kind of API to use?

With all this information about how REST works, you still have to tackle the question of

should you be using REST, and should you be using it in the strict sense of what REST

mavens out there expect your API to be.

The Archetypes or the types of APIs that are out there do vary, and so you have to look

at what you're trying to accomplish and see what makes the most sense for your specific

project. REST or REST-ful APIs are easy to use and maintain, but if your API doesn't fit

this Resource-based Model, using something like Remote Procedure Call style or some

custom APIs, is acceptable. You may find that REST is too limiting for what you're trying

to do; maybe it doesn't fit into a resource model, or maybe you really need something

that is more driven by procedures, so something like Remote Procedure Call or a

custom API may be the right solution. Trying to force what you need to accomplish

into the REST or REST-ful model can be counterproductive, so don't get

caught up in the idea that your API has to be REST or REST-ful to be a good and valid

API, but remember that REST matches many, many use cases, so make sure that

you're not avoiding REST just to avoid REST. I generally say to clients that if you're

building a new API, starting by designing and using a lot of the REST semantics we've

looked at here, is the way to start the approach. If you find that it becomes too limiting or

you're trying too hard to make it REST-ful, then you can start breaking out of that box,

but starting with REST as the natural starting point for your API design is usually what I

suggest.

Summary

Now that you're through the first module, you should see that API design is really

important.

Designing APIs is as important to developers as designing UI layers is to users.

Making your APIs easy to use, obvious, and simple is going to increase the

adoption of your API; you're going to get more and more developers using that API,

which is often the goal of any API.

Looking at the requirements of your API and then deciding whether your API should

follow a strict REST pattern or something more flexible is going to be a key to whether

you're successful or not. There are no right and wrong answers here. You're going to

have to make clear decisions about how you want to design these APIs, and know that

you may make mistakes. There are better and worse suggestions here, but there are no

black-and-white rights or wrongs.

And I know I've been saying it a lot in this module, because I think it's really important:

Avoid the dogma of what everyone thinks is the perfect and great API and understand

that as developers we need to be pragmatic about these designs. If your decisions about

how to create your APIs continue to be pragmatic, and what I mean by that is that it's

going to serve the final use cases and serve the developers who want to use your API,

then it is probably the right decision. If you're making design decisions about whether it

will be classified as a true REST-ful API, you're probably making the wrong decision.

2. Designing The API

Introduction

In this module we're going to be talking about designing the actual API itself. We're

going to start by talking about how to Design for the URI, Understanding the role of

Verbs, dealing with Status Codes, Associations, Designing the actual Results that are

returned, ETags, Paging and Partials, and finally Non-Resource based APIs.

Let's get started.

URI Design

So to begin our URI design, we're going to want to look at what the URIs actually look

like, and one of the ideas I want you to get a sense of is that the URIs should contain

Nouns not Verbs.

The problem with verbs is that we start very innocently with something like

getCustomers and saveCustomers, but very quickly this starts to balloon into a bunch of

different sorts of things we want to do with the data. Now instead of having just one

endpoint that we're maintaining, we're having to maintain a number of different endpoints

as different parts of our URI.

The solution is to use Nouns, or Resources, as REST likes to call them. The idea here is

that you want to have endpoints that are described as Plural versions of whatever nouns

you're going to expose. For example, this API has an endpoint called Customers; you

may have Games, Invoices, so you're going to use a Noun to indicate the endpoint that

is going to allow you to manipulate a type of object that you have on the server and that

you want the users of your API to be able to use.

Against those endpoints you're going to use identifiers to point to individual items in the

collection. This does not have to be the key that you use naturally. It does not have to be

the primary key that is contained in the database or some magic key, it can be one that

is generated. For example, being able to get a customer by its key, the API is simply

saying look at that noun, the Customers, and then give me the item that is identified by

the number 123. This may be the ID or something else. An example of something else is

something like Games where you have some unique identifier to find the individual

game, like halo-3; or in the case of Invoices, a date for that invoice. The important idea

here about that key is that it does not have to be the internal key that you're used to

using. It can be something generated, but it has to point at one and only one item. It

can't be generated every time someone comes to the API, because there's this notion of

idempotency, and what that means here is that when someone retrieves data or pushes data

using this key inside of your API, that it always refers to the same object. Very commonly

this is going to be some primary key that you have in your storage mechanism, but it

could be something that is unique, like the full title of a story in a blog, or it could be the

concatenation of related data; maybe customernumber-invoicenumber for

invoices. It's up to you to determine what that key is, but it has to remain tied to that

single entity.
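
As a sketch of what a generated-but-stable key can look like, here are two small Python helpers. The function names and the exact slug rules are assumptions of mine; the only property that matters is that the same entity always produces the same key.

```python
import re


def slugify(title):
    """Derive a URL-safe key from a title; the same title always gives the same key."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def invoice_key(customer_number, invoice_number):
    """Build a key by concatenating related data, as in customernumber-invoicenumber."""
    return f"{customer_number}-{invoice_number}"


print(slugify("Halo 3"))     # key for /games/<key>
print(invoice_key(123, 7))   # key for /invoices/<key>
```

Note that a slug built from a title is only safe as a key if titles are unique and do not change; otherwise you would store the key once at creation time rather than recompute it.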

Understanding Verbs

So, if your URI design is supposed to be nouns, where do Verbs come in?

In the case of verbs, we're talking about HTTP verbs, and these verbs can be easily

matched to the create, read, update, and delete that we're probably used to when

dealing with things like database data. So, if we take our resource endpoint like

customers, we know that if we do a GET it's going to return a list of those

customers. If we POST a new customer to that endpoint, it's going to create a New

Customer. If we PUT a collection of customers to this endpoint, it will do a Batch

Update of those customers, and if we try to issue a DELETE against that resource,

it's going to give us an error, because we cannot delete the entire list of customers.

Conversely, if we issue these same verbs against an individual item in the collection, it's going to

do something a little different. So, if we do a GET it's going to get that individual item for

us. If we do a POST it's going to give us an Error, because we can't POST a brand new

item to an existing item. If we do a PUT it will Update that Item, and if we do a DELETE it

will Delete that Item.
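
The two paragraphs above amount to a small lookup table, which can be sketched in Python like this. The action strings and helper names are my own; the collection-versus-item rule assumes paths shaped like /customers and /customers/123.

```python
# (HTTP verb, kind of target) -> what the API does, as described above.
ACTIONS = {
    ("GET", "collection"): "list items",
    ("POST", "collection"): "create item",
    ("PUT", "collection"): "batch update",
    ("DELETE", "collection"): "error",
    ("GET", "item"): "get item",
    ("POST", "item"): "error",
    ("PUT", "item"): "update item",
    ("DELETE", "item"): "delete item",
}


def target_kind(path):
    """Treat '/customers' as a collection and '/customers/123' as an item."""
    segments = [s for s in path.split("/") if s]
    return "collection" if len(segments) == 1 else "item"


def action_for(method, path):
    """Look up what a verb means for the given path."""
    return ACTIONS[(method.upper(), target_kind(path))]


print(action_for("GET", "/customers"))
print(action_for("PUT", "/customers/123"))
```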

But what should you return from those verbs as you encounter them?

o In these cases we're going to look at those same verbs, but figure out what we

have to return back to the client so the client knows what to do with them.

o So in the case of GET it should be pretty obvious; the customers endpoint

returns a List of those customers. A GET to the Item endpoint is going to return

just the individual Item that the user has pointed at.

o If we POST, like we talked about before, to the customers with the data that

represents what is a customer, it will create a New Item, assuming it's all correct.

And we should return from that POST a new version of the item that was

inserted; not only that the creation happened, but a formatted object that

represents that New Item. And the reason we do that is sometimes part of that

creation process is setting things like default properties and generating the key

that they're going to need and things like that. So returning them that brand new

object is very useful for them to be able to consume what is really the last version

of the object as it existed on the server, which is just a moment ago when we

created it. If we attempt to POST to an individual item, because we can't POST to

an item like we talked about a minute ago, we should return a Status Code, and

that Status Code should be an Error Status Code, probably a 400, in that the

user of the API has done something incorrect.

o In the case of PUT, if we take a collection of updatable objects and PUT them to

the customers endpoint, it's going to attempt to update them all and then return a

Status Code to say whether it succeeded or not. But if we do a PUT to the item

resource, it will return the updated item because it makes sense to return an

individual item that was updated, and it may have been updated with more than

just the data that was sent to the server; it could include things like the last

updated date or update-related keys, and so you always want to return that

updated item.

o In the case of DELETE, because we can't delete the entire list of customers, we

should return a Status Code that is an Error Status Code when a DELETE is

done on the resource of customers. But if we delete an individual item, we should

return a Status Code of whether you were able to delete that individual item.
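
Here is a minimal in-memory sketch of those return rules in Python. The class and method names are hypothetical, the "active" default is invented to show server-side defaults, and returning 404 for a missing item is my assumption; the notes above only say that an error status code comes back.

```python
from http import HTTPStatus


class CustomerStore:
    """Toy stand-in for the customers endpoint; each method returns (status, body)."""

    def __init__(self):
        self._items = {}
        self._next_id = 1

    def post_collection(self, data):
        # Return the item as stored, including the defaults and the generated key.
        item = {"active": True, **data, "id": self._next_id}
        self._items[item["id"]] = item
        self._next_id += 1
        return HTTPStatus.CREATED, dict(item)

    def post_item(self, item_id, data):
        # POSTing to an existing item is a client mistake.
        return HTTPStatus.BAD_REQUEST, None

    def put_item(self, item_id, data):
        if item_id not in self._items:
            return HTTPStatus.NOT_FOUND, None
        self._items[item_id].update({**data, "id": item_id})
        return HTTPStatus.OK, dict(self._items[item_id])  # the updated item

    def delete_item(self, item_id):
        if self._items.pop(item_id, None) is None:
            return HTTPStatus.NOT_FOUND, None
        return HTTPStatus.OK, None  # status only, no body

    def delete_collection(self):
        # Deleting the whole list is not allowed.
        return HTTPStatus.BAD_REQUEST, None
```

A caller would write status, body = store.post_collection({"name": "Ada"}) and read the generated id from body rather than guessing it.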

Status Codes

When we talk about returning Status Codes, we're talking about HTTP Status Codes.

So, the HTTP Spec Defines certain Status Codes, and there's quite a number of them;

not all of them are listed here, but a number of them that are pretty common that you're

going to see, 200 OK being the most common in that a request has succeeded, and of

course everyone knows about things like 404 Not Found and 500 Internal Server Error, and

you can see some of the other Status Codes here. Now our services can use all of these

Status Codes if we want, but we find that in a well-defined API, that the number of

different sorts of Status Codes that can be returned from your particular service can be

simplified into maybe 8 to 10 sort of Status Codes. There are exceptions to this, and you

may need more than 8 or 10, but trying to get a sense of what different Status Codes

each type of call can return, and simplifying that, is going to make it easier for users of your

API.

Pragmatically, you want to use these Status Codes in returning from your API.

o At a minimum you should really support 200 for everything worked well, 400 for

you did something wrong, you made a bad request, or 500 for the server doesn't

know what it's doing and something has gone bad.

o Most likely you're going to also include these Status Codes: 201 for Created for

when you're doing a POST, 304 Not Modified for when you're returning a cached

object, 404 for Not Found, and then 401 and 403 for when you're dealing with

Authentication and Authorization. Again, you may use more Status Codes than

just these, but this is a good simple set to start with.
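
The starter set above can be written down with Python's standard http.HTTPStatus enum. The names MINIMUM, COMMON, and describe are my own; the codes are the ones listed in the two bullets.

```python
from http import HTTPStatus

# The bare minimum: it worked, you did something wrong, we did something wrong.
MINIMUM = {
    HTTPStatus.OK,                     # 200
    HTTPStatus.BAD_REQUEST,            # 400
    HTTPStatus.INTERNAL_SERVER_ERROR,  # 500
}

# The slightly larger set most APIs end up using.
COMMON = MINIMUM | {
    HTTPStatus.CREATED,       # 201, after a POST
    HTTPStatus.NOT_MODIFIED,  # 304, cached object is still valid
    HTTPStatus.NOT_FOUND,     # 404
    HTTPStatus.UNAUTHORIZED,  # 401, authentication
    HTTPStatus.FORBIDDEN,     # 403, authorization
}


def describe(code):
    """Return the numeric code with its standard reason phrase."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


for status in sorted(COMMON):
    print(describe(status))
```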

Associations

So far we've talked about simple collections in your URIs or the list of customers, and

then individual items in that collection using a key. Associations are that next level of

object.

Associations are about sub-objects of other objects

o And we want to use the URI Navigation path to imply that there's a relationship

between them. So, for example, to get all of the invoices for a particular

customer, you could add another part of the path that says get the invoices for

this particular customer. You could see getting the Ratings for a game, or getting

the Payments for an Invoice. So, we're talking about getting information that is

contextual to the object that it's behind.

o These Associations should return a List of those Related Objects or a single

Object if that's the kind of relationship it has. So that if we look at the API for

getting the Invoices of Customer 123, the shape of that result should be the

same as if we just went and got a list of invoices. That way the user of your API

can really deal with it in that same way; you're really only telling it that by using

this path, you don't have to issue a query against Invoices or walk through to find

Invoices for a customer, you're going to return only the ones that are related to

that item in the collection.

There may be multiple Associations for the same object. So, while we've looked at

Customers having Invoices, Customers also may have Payments, and they also may

have Shipments, so you can have multiple Associations for each type of object, you just

have to make sure that you're dealing with each of those in the correct way.

If you have more complex needs, you should just use query string parameters to

deal with them. For example, instead of having states that have Customers and those

Customers have Invoices, it might be just simpler to allow you to do something like a

query where you can say Customers?state=GA, or Customers?state=GA plus a second parameter

for an individual salesperson. So, instead of trying to fit everything into simple entities or simple

Associations, use Associations where it makes sense, but understand you can go to

query strings in order to get very specific data as necessary. Associations are an

important part of what you're going to design for your API, but don't try to rely on

Associations to solve every related entity problem you have in your API.
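
Here is a small Python sketch of both ideas: an association that returns the same shape as the plain list, and a query-string style filter for the more complex cases. The sample data, field names, and the "salesperson" parameter are all made up for illustration.

```python
# Made-up sample data.
INVOICES = [
    {"id": 1, "customer_id": 123, "total": 50.0},
    {"id": 2, "customer_id": 456, "total": 75.0},
    {"id": 3, "customer_id": 123, "total": 20.0},
]

CUSTOMERS = [
    {"id": 123, "state": "GA", "salesperson": "jones"},
    {"id": 456, "state": "GA", "salesperson": "smith"},
    {"id": 789, "state": "FL", "salesperson": "jones"},
]


def list_invoices(customer_id=None):
    """Serve /invoices and /customers/{id}/invoices with the same result shape."""
    if customer_id is None:
        return list(INVOICES)
    return [inv for inv in INVOICES if inv["customer_id"] == customer_id]


def list_customers(**filters):
    """Serve /customers?state=GA&salesperson=... by matching every given filter."""
    return [c for c in CUSTOMERS
            if all(c.get(key) == value for key, value in filters.items())]


print(list_invoices(customer_id=123))
print(list_customers(state="GA", salesperson="jones"))
```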

Formatting Results

So what about the formatting of the results that are returned? How do you know what

format they should be in?

The best practice is really to use Content Negotiation

o And what that means is to use an Accept header in the request to the server

to determine what formats are supported. The idea behind the Accept header

is simply to tell the server what kind of data you can accept; I can accept HTML,

text, RSS, whatever it may be. So here you can see a simple request going after

our endpoint of games, and looking for that second game like we saw in our

example earlier. But here we're sending an Accept header to hint at what kinds of

data we can accept. Now the Accept header takes a comma-delimited list of

types of data that you can accept. Ordinarily when a user is calling your API,

they're only going to list one type here, and that's the type they expect to get

back. When there's more than one, the server looks at the list and finds the first

one that it can match. So, in this case, if the server supports JSON, it will always

return JSON. But, if for some reason the service didn't support JSON but did

support XML, it would fall back to XML.

o You do not have to support all of the types of data that the Accept header

lists; it may list quite a few different formats, many of which

you're not going to support, so you look at the list and find the first entry that

matches a kind of data that you can format. It's also a good idea to have a

sane default; XML or JSON as the default is pretty common, so that

when the Accept header isn't included you can fall back to a simple, well-known format,

usually JSON.
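
The matching rule described above can be sketched in a few lines of Python. This is a simplification under stated assumptions: it takes the first listed type the server supports, in order, and ignores the q-value weights that the HTTP specification also allows in an Accept header. The names SUPPORTED, DEFAULT, and choose_format are my own.

```python
SUPPORTED = ["application/json", "text/xml"]
DEFAULT = "application/json"


def choose_format(accept_header, supported=SUPPORTED, default=DEFAULT):
    """Pick the first listed media type we support; None means no match."""
    if not accept_header:
        return default  # no Accept header: fall back to the sane default
    for entry in accept_header.split(","):
        # Drop parameters such as ';q=0.9' and normalise the type name.
        media_type = entry.split(";")[0].strip().lower()
        if media_type in supported:
            return media_type
        if media_type == "*/*":
            return default  # the client accepts anything
    return None


print(choose_format("application/json, text/xml"))
print(choose_format("application/yaml, text/xml"))
print(choose_format(None))
```

Returning None when nothing matches leaves it to the caller to decide between an error status and the default format.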

So the MIME types for Content Types are probably useful for you to know and

understand. JSON is application/json, XML is text/xml, JSONP is

application/javascript. JSONP is a JSON message wrapped in a JavaScript function.

This is often used for cross domain calls, so if you want to be able to support JSONP,

which is a pretty good idea, you're going to want your API to be looking for

application/javascript as something that's different from application/json. It's important

to know that when you're using JSONP, your API is going to require something like a

callback query parameter, because that's going to be the name of the method that's

going to be called with the JSON data when it returns. So, in this case you can see our

API here is going to need a parameter, usually called callback, and then the name of

some function to call, which the user of your API would specify. RSS is application/rss+xml

and Atom is application/atom+xml, two other fairly common formats; I find that most

of the APIs I've written in the last few years are really focused on the top there; I wanted

to show you that there are different sorts of Content Types. You might even find APIs

that return non-textual data, so you might support things like image/jpeg and image/png.

So let's see how this Content Type matters.

There is another approach that in some cases can be helpful, but I would really lean on

Content Negotiation when you can, and this other approach is to use URI

components to do this formatting. I certainly don't consider this a best practice, but

sometimes it is easier to do this when you have specific requirements for your data.

These are often cases where the consumer of your API can't modify things like Accept

headers, so if you are running an API that you also want to be able to get at from let's

say, Excel or something like that, being able to add a URI component to do that

formatting can be helpful. So an example here would be to include a query string

parameter that defined what format you were going to support. I've seen some other

cases where some APIs also used an extension; I'm not a big fan of this style, I'd rather

have the query string, but that is one approach. And in the case of JSONP, you're going

to be able to not only do the format but also include other query parameters you may

need like the callback parameter.
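
Here is a rough Python sketch of formatting driven by the query string, including the JSONP case. The parameter names format and callback, the example URI, and the render function are assumptions for illustration only; the callback name is checked before use because it ends up inside executable JavaScript.

```python
import json
from urllib.parse import parse_qs, urlparse


def render(uri, data):
    """Return (content_type, body) based on the ?format= and ?callback= parameters."""
    query = parse_qs(urlparse(uri).query)
    fmt = query.get("format", ["json"])[0].lower()
    body = json.dumps(data)
    if fmt == "jsonp":
        # JSONP wraps the JSON payload in a call to a function named by the client.
        callback = query.get("callback", ["callback"])[0]
        if not callback.isidentifier():
            raise ValueError("callback must be a plain function name")
        return "application/javascript", f"{callback}({body});"
    # Anything else falls back to plain JSON in this sketch.
    return "application/json", body
```

Requesting /api/games/2?format=jsonp&callback=showGame would return a script body that calls showGame with the JSON data, which is what makes the cross domain call work.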

Result Design

So in designing the actual results you're going to send back, there are simple rules of

thumb I like to talk about.

When your API returns single results, those single results or individual items, should

be just simple objects, whether they be XML or JSON objects. So, if I'm going after

Customer/123, I should expect that the format of the data coming back should be just an

object that represents the data in that object. Now it may contain related or complex

types in it, like we can see with the address here, but it is simply an object that

represents that item.

When defining these objects, I suggest that Member Names shouldn't expose who

wrote the server. I see this a lot: if Ruby on Rails is used there are a

lot of underscores, if Node.js is used it's camel-cased, and if .NET is used sometimes it is

even Pascal-cased, and I hate to do that; I like to pick one format that all my APIs are

going to use regardless of what the background is. I tend to prefer, because most of the

clients that I'm writing APIs for are JavaScript, to just use camelCasing. CamelCasing

ends up being the one that most developers are used to and camelCasing is the way

objects are going to look most natural when you're consuming them from JavaScript. If

you don't like using camelCasing, if you want to choose another way of defining your

Member Names, if you are using Ruby on Rails, and you want to use sort of the

underscore approach, that's fine really, the only thing I would ask is to at least be

consistent.

When Designing Collections it's a little different than just returning the collection. We

actually saw this in one of our earlier examples, and my suggestion is to wrap the

collection in a simple object; that way you can send additional information in the

body of the collection. So, if we simply return an object that contains both data about the

result that we're returning, as well as the actual result, it can be very useful. So if we

want to be able to return certain kinds of data, like in this case the number of results that

were found on the server as well as the actual results themselves, we can do that, so

that we have a container for information about the collection not about items in the

collection.
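
As a small illustration of both rules of thumb (consistent member names and a wrapper object for collections), here is a Python sketch. The property names totalResults and results and the helper names are assumptions, not something the course prescribes.

```python
def to_camel_case(name):
    """Convert a snake_case member name to camelCase."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def camelize(value):
    """Recursively rename dictionary keys so every member name is camelCased."""
    if isinstance(value, dict):
        return {to_camel_case(key): camelize(item) for key, item in value.items()}
    if isinstance(value, list):
        return [camelize(item) for item in value]
    return value


def wrap_collection(items, total_results):
    """Wrap a list in an object that also carries data about the collection."""
    return {
        "totalResults": total_results,  # information about the collection itself
        "results": [camelize(item) for item in items],  # the items in the collection
    }
```

The wrapper leaves room to add more collection-level members later without changing the shape of the individual items.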

ETags

Another important part of designing your API is to work with Entity Tags.

The idea behind Entity Tags is to help the server cache better, and so when you're

developing your API you're going to want to support this notion called Entity Tag

Headers. Entity Tags support both Strong and Weak Caching, and they're returned

as headers in the response.

For example, when we make a request, the return of the response can include this

ETag. This is an identifier from the server that is basically a version number for the entity

that was returned.

ETags also support the notion of a Weak Tag. A Weak Tag starts with W/, which is how

the server tells you that this is a weak tag.

The difference between the Strong and the Weak type of ETag is that the Weak Tag says

the two objects are semantically the same, whereas a Strong tag indicates that they are

byte-for-byte identical, and so depending on the type of data you're dealing with, you

may want to use a weak ETag or a strong ETag.

Now, what would you do with this ETag?

That's really where the important part of the story comes in. The ETag is returned with

a response, and the user of the API is expected to send it back on later requests, so the

client should be sending this ETag back to see if a new version is available instead of

downloading and processing a brand new copy of the object even when the data hasn't

changed. This is typically done with an If-None-Match header, so if I go and request

the game object that I did earlier, I would take the value from the ETag and put it in the

If-None-Match header, and if the server says this is still the ETag for this object, it will

simply return a 304 or not modified status, instead of a 200, and the body of the

response would be empty.

This is the same notion as cached images when you're dealing with them in typical web

development; ETags allow you to do it at the entity level itself. You use the If-None-Match

header with the same value of the ETag that was sent to you, and if it does indeed match,

the server returns a 304, which is the not modified status; the response doesn't carry the

body of the entity or the individual item, it simply says the item hasn't been modified. In

other words, if the last version you had used this ETag, the server doesn't send a new

copy, which would just be a literal copy of what the client should already be dealing with.

If I switch over to Fiddler, we can see that I can make a request to the server to get an

object, so I'm going to go ahead and Execute this, and this returned our object as a

JSON call, but in the headers of that object is this ETag.

So, as a client this is going to allow me to, if I choose to, as a smart client, I'm going to

try and use this as much as possible, I can use this to test whether this object has

changed. So, if I go to the Raw view and let's copy that ETag value, and then in the

Composer I'm going to use an If-None-Match;

This essentially says if the object I'm about to request doesn't have the same ETag, go

ahead and return it to me, otherwise I'm going to get a 304 status as shown here, 304

meaning not modified.

And the response, if we look at the Raw version, has the same ETag header and no

body, because the server didn't need to send back a copy of the object; it knew that I

already had a copy that was the same as the one on the server.
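
The server side of that exchange might look roughly like the Python sketch below. It assumes a strong ETag computed as a hash of the serialized entity, which is only one possible way to generate the tag; the function names are hypothetical and no particular web framework is implied.

```python
import hashlib
import json


def serialize(entity):
    """Serialize deterministically so the same entity always yields the same bytes."""
    return json.dumps(entity, sort_keys=True).encode("utf-8")


def make_etag(body):
    """Strong ETag: a quoted hash of the exact bytes that would be sent."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(entity, if_none_match=None):
    """Return (status, headers, body) for a GET that may carry If-None-Match."""
    body = serialize(entity)
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None and if_none_match == etag:
        # The client already holds this version: 304 Not Modified, empty body.
        return 304, headers, b""
    return 200, headers, body
```

The first call returns 200 with the ETag header; replaying that ETag as If-None-Match returns 304 with no body until the entity actually changes.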

This is often used for optimistic concurrency as well.

So the ETag can be used to check whether there's a new version when the client is doing something like a PUT.

So, for a PUT I can use the If-Match. If the object on the server matches the ETag I have

here, then go ahead and update it with this new data. If it doesn't match this, then I

should probably go back to the server, get the new version without the If-None-Match,

present it to the user, have them make their modifications again, and then do the PUT

again. This allows me to test in the header of my PUT that I'm dealing with the same

object version on the server as I had before.

And if I issue a PUT with the If-Match and it fails, it's not going to return a 200 or a 404,

or any of those, it's going to return a Status Code of 412, Precondition Failed; this

means one of the preconditions in the header, in this case, If-Match, failed, so I know

that the update did not really happen, because the If-Match didn't match the ETag of the

object that was on the server.
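
To make the optimistic concurrency flow concrete, here is a minimal in-memory Python sketch. The VersionedStore class and its use of a simple counter as the ETag are assumptions for illustration; a real API would sit on top of its own storage.

```python
class VersionedStore:
    """Toy store where each item carries a version number used as its ETag."""

    def __init__(self):
        self._items = {}  # item id -> (version, data)

    def create(self, item_id, data):
        self._items[item_id] = (1, dict(data))
        return '"1"'

    def get(self, item_id):
        version, data = self._items[item_id]
        return f'"{version}"', dict(data)

    def put(self, item_id, data, if_match):
        """Update only if the caller's ETag still matches the stored version."""
        version, _ = self._items[item_id]
        if if_match != f'"{version}"':
            # Someone else changed the item first: 412 Precondition Failed.
            return 412, None
        self._items[item_id] = (version + 1, dict(data))
        return 200, f'"{version + 1}"'
```

A client that receives 412 would GET the item again, show the newer data to the user, and retry the PUT with the fresh ETag.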

Paging

In your APIs whenever you're going to deal with returning collections, these lists should

always support paging.

Now, you can support paging in a number of ways, but let's talk about the importance of

paging. The idea behind paging is to prevent your server from returning voluminous

amounts of data that the client can't really deal with anyway. If you returned 1,000

records, the user probably isn't going to look through all 1,000 unless the client really

wanted to deal with the paging. You also don't want to have to deal with the load of

building up those large result sets when your server is busy, and trying to return them to

a number of clients. And so it's not about just supporting paging but really requiring

paging.

So, you can use Query String parameters to accept the paging information, but one

of the important aspects of this is making sure the first list that you return is

only the first page of that data.

It's often common to use the Object Wrapper that we talked about earlier for lists,

indicating the next and previous links so that it's easy for a client to walk through the

pages by just using these additional links.

So here's an example of a result that's going to tell us in the body, oh this is how many

objects there are for us to get, and by using simple properties we can see the next page

and the previous page as URIs back to our service, so we can very easily do this sort of

paging.

So let's see what that looks like. I have our sample API here. In fact, I'll go ahead and

issue a request just to GET the entire games collection, which happens to be more than

1,000 results. When I Execute it, what it's actually going to show me is a smaller number

of results; this number of results is actually 25 by default, so I'm not overwhelming the

client with the amount of data, I'm telling him how many are available in the server with

this total result, but I'm just supplying the first 25 results, and then providing a link or two

to the next result. So here, the next page, href, is just ?page=2. Now, I could document

this in my API, but it's really useful to be able to put it in the actual package that's being

returned back.

So this means if I go back to Composer and I simply say page=2, what it's now going to

return is the next set of 25 elements, and notice now that I'm not on the first page I can

include a previous page, which is pretty common. Obviously, the first result doesn't have

a previous page, so in this example API we're not even computing that, we're simply

saying, hey, the next page is this, so we can see previous is page 1 and next is page 3,

but 1 is the default, so in fact getting just games is going to give that first page. And this

allows people to build clients that use their APIs in a much simpler fashion.

When you're doing paging, even though you might have a default page size, like our

example a minute ago had a page size of 25, you might also want to support different

page sizes; you might want to support them getting a different amount than the default

by maybe supplying a parameter. You should limit this page size to a reasonable

amount so as to not incur extra server load. We saw in the API example a moment

ago that we could indicate the page here, but we could also indicate the pageSize.

Now the terms of your page and page size aren't actually terribly important; different

APIs use different semantics here, some use Take-And-Skip so that there isn't an

implicit page size, you can just sort of do what you will. Many OData REST feeds really

lean on this because this is a common strategy in things like LINQ in .NET.

But using the page number, page size, or result size, or whatever you want to call it, the

name isn't as important as the actual functionality.
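
Putting the paging ideas together, a sketch in Python could look like this. The default of 25 comes from the example above; the cap of 100, the property names (totalResults, results, prevPage, nextPage) and the page and pageSize parameter names are assumptions for illustration.

```python
import math

DEFAULT_PAGE_SIZE = 25
MAX_PAGE_SIZE = 100  # assumed cap to protect the server


def page_of(items, base_uri, page=1, page_size=DEFAULT_PAGE_SIZE):
    """Return one page of items wrapped with the total and next/previous links."""
    page_size = max(1, min(page_size, MAX_PAGE_SIZE))
    total = len(items)
    last_page = max(1, math.ceil(total / page_size))
    page = max(1, min(page, last_page))

    def link(number):
        # Only mention pageSize in the link when it differs from the default.
        suffix = "" if page_size == DEFAULT_PAGE_SIZE else f"&pageSize={page_size}"
        return f"{base_uri}?page={number}{suffix}"

    start = (page - 1) * page_size
    result = {"totalResults": total, "results": items[start:start + page_size]}
    if page > 1:
        result["prevPage"] = link(page - 1)
    if page < last_page:
        result["nextPage"] = link(page + 1)
    return result
```

Because the links are part of the payload, a client can walk the whole collection by following nextPage until it is no longer present, without knowing how the query string is built.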

Partial Items

The last consideration for designing your data-driven API, like most of the examples

we've looked at so far, is to deal with partial items. Now it's a pretty typical request to

request partial items from the service. Query string parameters are a common pattern

for this, and you can see a lot of examples on the web that do this. The idea behind

partial is to allow a user of your API to pick what fields it needs for a particular request,

so that the payloads can be smaller instead of you always returning these very verbose

objects that the clients themselves aren't really even using.

A good example of this would be using the ?fields Query parameter where you simply

list the fields that you want the result to include. This pattern of including the names here

could also include the names of fields in sub-objects or associations as well; that's really

up to you, but the idea would be to allow the user of your API to decide what fields are

important. Now this is sort of an optional part of your design, but doing this will really

allow users to consume only the parts of the data that they need, as well as reducing the

footprint of your service, because you're going to be producing smaller serialized objects

and the clients are going to be consuming smaller objects, therefore the roundtrip should

be quicker.
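
A minimal Python sketch of the ?fields idea follows. It assumes a flat, comma-separated list of top-level field names and silently skips names the item doesn't have; both choices, and the function name, are assumptions rather than a fixed convention.

```python
def select_fields(item, fields_param):
    """Return only the members named in a comma-separated ?fields= value."""
    if not fields_param:
        # No fields parameter: return the full object.
        return dict(item)
    wanted = [name.strip() for name in fields_param.split(",") if name.strip()]
    # Unknown names are ignored in this sketch rather than treated as errors.
    return {name: item[name] for name in wanted if name in item}
```

For a request such as /api/games/2?fields=id,name the serialized result would contain just those two members.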

You can also support Updating of those Partial Items as well, and there's a special

verb that's often used for this that's called PATCH. The idea behind PATCH is to be

able to send in a partial object or a subset of the original object with just the fields

that are updated, and check for concurrency based on the ETag that we talked

about a few videos ago.

So here's an example.

We're using PATCH against an individual item, and we're using the If-Match header to

make sure that our ETag is going to match the original requested object. And you can

see here that we're sending back a small subset of the full set of fields that this service

can return.

This resource normally has about 10 or 12 different fields, but we are only really updating a

couple of them here, and so we're only going to send this partial object back; it's going to

be the responsibility of your API based on a PATCH to look at this partial object and map

it to the full object in order to do that updating.

Using the ETag will allow you to do actual concurrency here without having to rely on

field by field checking or whatever other semantics you use for doing that. It will know

that the version of the object on the server is the same version as was originally

requested, so that it should simplify the partial item update story.
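
The PATCH handling described above could be sketched in Python as follows. The function name is hypothetical, and rejecting unknown fields with a 400 is an assumption of this sketch; the notes only require mapping the partial object onto the full one after the If-Match check.

```python
def apply_patch(stored, stored_etag, partial, if_match):
    """Merge a partial object into the stored one if the ETag still matches."""
    if if_match != stored_etag:
        # The stored version changed since the client fetched it.
        return 412, stored
    unknown = set(partial) - set(stored)
    if unknown:
        # Assumption: fields the resource doesn't have are a client error.
        return 400, stored
    # Copy the stored object and overwrite only the fields that were sent.
    return 200, {**stored, **partial}
```

Only the members present in the request body change; everything else keeps the value that is already on the server.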

Non-Resource APIs

So what about parts of your API that aren't really dealing with entities or domain models

in the same sense that you may be used to?

What if you really need to have some Functional Part of your API.

Now this normally breaks the rigorousness of a REST-based API, but in a pragmatic

sense, you should be able to add these elements as necessary, because we're trying at

the end of the day to solve business problems, solve technical problems, provide the

sorts of functionality that our users of our API are going to need.

These functional parts of your API should be clearly documented as functional parts

of your API and not resource APIs; you're not necessarily going to be able to

do things like PUT, POST and DELETE on these elements, because they're

really about calling GET and performing some operation.

It's important that you make sure that these parts of the API continue to be

completely functional not resource-based.

The problem is that you can very quickly get into a case where you start to build

functional parts of your API that really should be resource parts of the API. You start to

do things like match the idea of something like a stored procedure to a REST-based API,

and you're going to very quickly fall into sort of the morass of a badly designed API.

So here's an example of one, calculateTax, where you're sending in with Query

parameters some data that will help you do the calculation. Now, instead of

Query parameters you could also send form-encoded data, JSON data, or XML data in

the body to do this sort of operation, but it is doing some sort of functional piece of work

here; we're not asking it to add a new invoice, we're not asking it to create a new

customer, we're really doing a non-resource based part of our API. You could even see

things like restartServer or beginWorldDomination, things that are functionally part of

what you really need your APIs to be able to accomplish.
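
For illustration, a functional endpoint like calculateTax might be handled along these lines in Python. The parameter names region and total, the region codes and the rates are all made up for this sketch; the point is only that the call computes something instead of reading or changing a resource.

```python
from decimal import Decimal
from urllib.parse import parse_qs, urlparse

# Entirely hypothetical regions and rates, for illustration only.
TAX_RATES = {"R1": Decimal("0.07"), "R2": Decimal("0.00")}


def calculate_tax(uri):
    """Compute tax from query parameters without touching any stored resource."""
    query = parse_qs(urlparse(uri).query)
    region = query["region"][0].upper()
    total = Decimal(query["total"][0])
    # Round to two decimal places using Decimal's default rounding mode.
    tax = (total * TAX_RATES[region]).quantize(Decimal("0.01"))
    return {"region": region, "total": str(total), "tax": str(tax)}
```

Nothing is created or stored by this call, which is why it only makes sense as a GET and why PUT or DELETE would have no meaning for it.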

Summary

Let's wrap up this basic API design part of the story.

Remember, you can design a great API, but you need to be careful not to surprise your

users. By following some basic tenets of the way REST works, you can create APIs that

should be familiar to people that have used other APIs, especially other REST-based

APIs. You can certainly invent something yourself in creating APIs, and sometimes even

create something very functional. But by taking some of the lessons learned in this

module, I really hope you follow the patterns of other APIs that are out there.

At the end of the day, part of your job as a developer is to protect the server from the

user and protect the user from the server, and so getting a good balance in the middle of

being very useful for the user but not allowing a single user to do something bad to the

server, like make really large requests, is really what you're after. By making sure you're

using aggressive caching and the use of ETags, you can really allow the user to be a

good citizen to your server without you having to go do the work every time someone

hits your server for the same data.

At the end of the day you need users to make your API a successful API, so making it

easy to use and fulfilling the needs of those users are what's most important, not making

conference speakers happy or not fitting into what I consider maybe a too rigorous

definition of what we would normally call a REST-based interface.

3. Versioning

Introduction

In Module 3 we're going to talk all about Versioning your APIs. This is going to include

why Versioning is important, we're going to show some examples from public APIs and

how they're doing Versioning, and we're then going to talk about patterns for Versioning,

including URI Path Versioning, URI Parameter Versioning, Content Type Versioning,

Custom Header Versioning, and which one to choose and when. We're also going to

touch on the topic of Versioning your Resources themselves.

Let's get started.

Why Version your API

So the first thing is we want to talk about why Versioning is important.

Once you publish an API it's set in stone, and it's set in stone because this

publishing isn't a trivial move. You're telling Users and Customers that your API is

out there and they can start to write code against it, and that as you make changes to the API

you're not going to break their code; it's an implicit contract between you and your

customers and users. But requirements for your API are likely to change, and so

you're going to need a way to keep the users and customers happy so that their

code doesn't break, but also support new requirements or changes to your API. You

need to have a way to evolve this API without breaking those existing clients. And one

thing to keep your head around is that API Versioning isn't the same as your Product

Versioning. Releasing a new API version every time you release a product isn't really

useful; only version your API when the semantics, the signatures, and the shapes of the

data you're dealing with are changing. Resist the temptation to change your API version

with every product release; do your best not to tie the two together.

And so at the end of the day you have one and only one commandment when dealing

with releasing your API, and that is you will not break existing clients, so your API

changes themselves aren't going to cause your clients to have to write new code, unless

they want the new features, new shapes, new support that your API provides. This

doesn't mean that you can't get rid of old versions of your API, but you will need to get

rid of those old versions of the API with some care, with a lot of communication, so that

when you eventually do stop supporting those APIs, your customers and users have

plenty of notice that they're going to have to upgrade or move to a new version of the

API.

Is there a right way?

So many of you may be viewing this module to find out the one and best way to version

an API, and unfortunately there isn't one.

When you look across the web at the different types of APIs out there, they're versioned

in sometimes very different ways, and the methods that are used to version APIs can be

pretty different; they have different pros, they have different cons, so you have to really

find the way of versioning your API that works best for you, and we're going to present a few

options for doing that Versioning, but the important idea here is that you're going to

Version your API.

So there isn't one way to Version your API. We can see existing APIs out there and

see some of the options that are chosen for Versioning of those APIs, but many of those

public APIs have done it very specifically to meet internal requirements, so it may not be

at the behest of the users of an API why Versioning happens, it may be really driving the

way that the developers of the API needed to do the Versioning. There are some

external requirements as well, and that is how difficult it is to use the API. You may

decide to Version using one method or another method specifically about the difficulty in

that. I would love to be able to give you the single option that I would recommend, but I

can't. There simply is no one right way to Version your API.

Examples of Versioning

So let's look across the web at some public APIs out there that do Versioning in different

ways.

Let's start with Tumblr, a pretty popular API out there. The Tumblr API uses a URI

path to do the Versioning. They essentially have a version embedded in the path of

the URI that we can see as the v2 here, so that everything after the v2 is subject to

change as the versions of the APIs change, so there is no guarantee that in v3 of the

Tumblr API there's going to be a user object at all; they may make small changes or they may

make large changes to the API. This pattern of using the URI path is really common,

you've probably run into APIs using this method.

Another pattern you can see here is from Netflix, and that is using a Query parameter.

So, instead of embedding it in the path, it's dictating with a Query string parameter what

version of the API to go after.

Another style is the Content Negotiation type, and this is where instead of using

anything in the URI to indicate the version, the content type that is requested in an

Accept header is used, and so this is a custom MIME type that includes the version

information. We can see the 1 here indicates the version of that object that is contained

in the GitHub API.

And the last type of Versioning we'll talk about is a Request Header. With Azure, when

you're going against their API, they're using a special Request Header called

x-ms-version, that is saying this is the version of the API that this is written against, and the

version in this case is just a date from when this API was released. Instead of using

simple version numbers, they're using release dates to do this Versioning, so you're not

tied into having specific version numbers for your entire API, for individual objects, or for

individual types of resources.

Let's walk through some more details and talk about the pros and cons of each of these

four Versioning patterns.

Versioning in the URI Path

So using the URI Path to do your Versioning, the Version becomes Part of the Path

to your API. This allows you to make big drastic changes to your API in later

versions. Everything below that version number is open to change, though the

amount of change you make will really be dictated by how much pain your users and

customers can take in their client code.

Here's an example where the v1 in the API is dictating what is available in that version of

the API. This is a very common pattern; it's probably the most common of all the

Versioning I've seen out there in public APIs. And in this case, you can see instead with

their version 2 of their API they might decide instead of including CurrentCustomers as a

customer type that they now just expose it as a different kind of resource, so that the two

APIs don't have to be that similar, though, of course, what they're doing at the end of day

is ultimately similar.

So the pro of this pattern is that it's very simple to segregate these old APIs, and

what that means is that you can really change the patterns of your APIs as time goes

along, and so you may decide to support the old APIs and implement a brand new API.

The problem here is that this pattern requires a lot of client changes whenever you

change the version. So, even if the whole API doesn't change and you're just adding some

additional pieces, all your users and customers are going to have to go into their code

and change that v1 to v2, unless they only want to support what is in the old APIs. This

also increases the size of the URI surface area that you have to support, so that

when you release the v2 version, you may have a whole new set of code that is

supporting that version, and still having to maintain and fix bugs in the v1, and so often

it's an easier decision to use this type of Versioning if you want that sort of broad reach,

but at the end of the day you may decide against it because it can be a larger amount of

technical debt.
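
On the server, path versioning usually comes down to pulling the version out of the URI before routing. A small Python sketch, assuming a hypothetical /api/v<number>/ prefix and function name:

```python
import re

# Assumed URI layout for this sketch: /api/v<number>/<rest of the path>
_VERSIONED_PATH = re.compile(r"^/api/v(?P<version>\d+)/(?P<rest>.+)$")


def split_version(path):
    """Split a versioned path into (version number, remaining path)."""
    match = _VERSIONED_PATH.match(path)
    if match is None:
        raise ValueError(f"no API version in path: {path}")
    return int(match.group("version")), match.group("rest")
```

Everything after the version segment can then be routed to a completely separate set of handlers per version, which is exactly where the extra code to maintain comes from.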

Versioning with a URI Parameter

The next pattern is using a Query String Parameter. One of the interesting parts of

using this is that the version can be an Optional Parameter, which means that you can

make sure that your API always without the Parameter is tied to a specific version,

usually the latest version.

So here's an example of using a simple API. There's no version in the URI right now, but

if I decide I wanted to go get the Customers of a very specific version of the API, I could

then include some Query Parameter that defined what version I was going after.

The pros here are that without a version the users are always going to get the

latest version of the API. It's going to encourage users and customers to use the edge

version of your API, even in some cases when they don't necessarily need to. Clients need

few changes as the versions mature, though this assumes that you're not going to make

great big changes from one version to the next.

The problem here is that because you have the optional version included as a Query

String, you can surprise developers with changes that they don't expect, and at the

end of the day, you may be breaking client code because they didn't include the specific

version they were going after. Now some of this can be mitigated by not making the

version number optional, and by not making it optional you're making it part of the URI

syntax, and it's in a lot of ways semantically the same as using the URI path we saw in

the last video.
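
The optional-parameter behaviour can be sketched in Python like this. The parameter name v and the supported version numbers are assumptions; the default-to-latest rule is the one described above, including its risk of surprising clients that never pin a version.

```python
from urllib.parse import parse_qs, urlparse

SUPPORTED_VERSIONS = (1, 2)  # assumed for this sketch
LATEST_VERSION = max(SUPPORTED_VERSIONS)


def requested_version(uri, param="v"):
    """Return the API version named in the query string, or the latest one."""
    values = parse_qs(urlparse(uri).query).get(param)
    if not values:
        # No version given: the caller silently gets the latest version.
        return LATEST_VERSION
    version = int(values[0])  # raises ValueError for non-numeric input
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported API version: {version}")
    return version
```

Making the parameter mandatory would amount to raising an error in the first branch instead of returning the latest version.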

Versioning with Content Negotiation

The next type of Versioning we'll talk about is with Content Negotiation.

And Content Negotiation simply means using a Custom Content Type and Accept

Header in the request.

Instead of using standard MIME types for the Accept types, application/json,

text/xml, etc., you're going to use custom MIME types.

Here's an example of a GET where the Accept type includes a custom content type.

Here's myapp with a version, and then the kind of object I'm looking at; this is a pretty

common pattern.

You can include formatting information in this Accept Header as well. So you can

see here putting a .json or a .xml in the content type could also tell the server what

kind of content it wants back, which is normally what the Accept header is being used for

anyway. This type of Versioning is becoming increasingly popular. It's becoming

increasingly popular because the version itself is separated from the surface area of the

API itself.

When defining your own MIME type, there is a standard for this. The standard indicates

that the "vnd." or vendor prefix can be used as a starting point and usually is. This is a

reserved beginning of the MIME type, and this indicates that this is a vendor-specific

content type. For example, here, we're doing the same sort of request we saw on the

previous slide. The Accept header could begin with vnd, and that's more typical of what

you're going to want to do in your own API content types.
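
To show what the server has to do with such a content type, here is a Python sketch that takes one apart. The exact layout, application/vnd.<app>.v<version>.<resource> with an optional .json/.xml or +json/+xml ending, is an assumption for illustration; your own vendor types may be shaped differently.

```python
import re

# Assumed layout: application/vnd.<app>.v<version>.<resource>[.json|.xml|+json|+xml]
_VENDOR_TYPE = re.compile(
    r"^application/vnd\.(?P<app>[\w-]+)\.v(?P<version>\d+)\.(?P<resource>[\w-]+)"
    r"(?:[.+](?P<format>json|xml))?$"
)


def parse_vendor_type(media_type):
    """Return the parts of a versioned vendor MIME type, or None if it isn't one."""
    match = _VENDOR_TYPE.match(media_type.strip().lower())
    if match is None:
        return None
    return {
        "app": match.group("app"),
        "version": int(match.group("version")),
        "resource": match.group("resource"),
        # Assumption: default to JSON when no format is given.
        "format": match.group("format") or "json",
    }
```

A request carrying a standard type such as application/json simply yields None here, and the API can then decide which version that implies.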

Let's also look at the pros and cons.

The pro here is that the API and Resource Version are all in one. So, when we're

looking at the version of what our API looks like, but also the resource that we're

returning, we're getting a version that's really tying the two together. It takes that version

out of the API surface area or the URI so that clients don't have to change except when

it comes to including that Accept Header.

The con here is that it adds complexity. Understanding how headers work and

adding headers isn't easy on all platforms, and isn't easy for all levels of

developers. This type of Versioning could also encourage more versions

throughout your code, so you might have specific versions of a number of your

different kinds of resources. This is good in one sense in that you can have

finer-grained versioning, but it also means you're going to have to support and understand the

complex nature of Versioning across your API. This can encourage your developers to

create more versions for different small parts of your API, instead of understanding that

making no change to your API version is often better so that clients don't have to make

their changes.

Versioning with Request Headers

And finally, the last type of Versioning we'll look at is using Custom Headers inside the

request.

This should be a header that only has meaning to the API, so it is specific to your

API. You're going to use an x- type of header; that's a name that most routers or

interrogators of traffic are going to ignore.

So here's an example of a header that includes a name that your application is going to

look for. Here's MyApp-Version, and then some text after it that's going to indicate that

version.

Now, it's pretty common for these sorts of custom headers to use dates or numbers, so

what you include as the actual App-Version is completely up to you; it does not have to

be a numbering scheme like the ones we as developers may be used to with product

versions, or assembly versions, or jar versions; we should get away from that and just

think of something that is semantically important to what this specific call should be

pointed to.

The pro here is that it separates the Versioning from the API call signatures much

like the Content Negotiation Versioning does, and in this case it's not tied to the

resource versioning, so you're really talking about the version of the API itself, not just

the version of the resource.

The con here is that it adds complexity; much like the Content Negotiation, adding

headers isn't easy on all platforms or for all developers.

Which to Choose?

So ultimately you're going to be asking yourself which one of these patterns should I

choose?

And there isn't an easy answer for you. Versioning with Content Negotiation and

Custom Headers is very popular right now, it's sort of the trend of where Versioning is

going, but it does add that complexity.

Versioning with URI components is more common because there are more APIs out

there that have chosen that pattern. Versioning with URI components tends to be easier

to implement but can add technical debt to the backend of your project.

Ultimately you're going to need to make a decision based on the kind of

requirements you have. In many cases I would probably start with URI Component

Versioning to see whether the technical debt is a hindrance to your project, and switch to

something like Content Negotiation if you need something finer grained, as well as if you

find that the sophistication of your users is high enough that headers aren't a big deal.

An important part of your decision here is how you're going to do Versioning, but

understand it's incredibly important that you version your API from the very first release,

because that makes it easier for your users to move from version to version as your API

matures.

Versioning Resources

So we've talked about Versioning of the API itself, but what about versions of your

Resources?

In most cases, unless the nature of your Resources is very strict or set in stone by other

standards, your Resources Should Be Versioned as well. So the Versioning of the

API calls usually isn't enough.

The structures and constraints of the kinds of objects you're dealing with and

returning via your API and accepting from your API tend to change, and so

Versioning your Resources becomes important.

If you're already using Versioning with Content Negotiation or Custom Content

Types, this is pretty easy, because the Accept header or the Content-Type header

tells the API which version of the object you're expecting or sending, but this does add

complexity as we've talked about. Including a version number in the entity body is

another option, but it does pollute the data; it adds a piece of data that is about the API

and not about the nature of the data, so I don't tend to recommend this approach. If you

need Resource Versioning separate from your API, you should probably be doing

Content Negotiation Versioning.
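To make the idea concrete, here is a small Python sketch of reading a resource version out of an Accept header. The vendor media type format (application/vnd.myapp.<resource>.v<N>+json) and the resource_version helper are hypothetical, chosen only for illustration; a real API would define its own media types.

```python
import re

# Hypothetical vendor media type: application/vnd.myapp.<resource>.v<N>+json
MEDIA_TYPE = re.compile(
    r"application/vnd\.myapp\.(?P<resource>[a-z]+)\.v(?P<version>\d+)\+json"
)


def resource_version(accept_header, default_version=1):
    """Return (resource, version) for the first versioned media type listed."""
    for part in accept_header.split(","):
        # Drop parameters such as ";q=0.9" before matching.
        media_type = part.split(";")[0].strip().lower()
        match = MEDIA_TYPE.fullmatch(media_type)
        if match:
            return match.group("resource"), int(match.group("version"))
    # Plain types such as application/json carry no version information.
    return None, default_version
```

The version travels with the representation rather than with the URI, which is why this approach suits resource versioning that is separate from the API version.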

Summary

So to wrap up this module, you must version your API; that's sort of the mantra I'm

trying to push towards the viewers of this video. Version your API whether you like it or

not; it will help with the maturation of your API as time goes on. If your API is public, it

has to be versioned, period.

There is no one way to do this API Versioning, so starting with something simple

and moving to something more complex is a good approach, but if you feel like

you're going to have a lot of version churn, choosing one of the more complex

approaches like Content Negotiation or Custom Headers is probably the place to start.

You're going to want to pick one that matches the maturity level of your users as

well as your internal team. If your internal team is not well versed in dealing with a large

set of code, you may decide on one of the approaches that leans on less

technical debt, but if your team is not as comfortable with routing

based on things like Content Headers, then choosing one of the simpler approaches, like

the URI Path approach, may be better, so understanding that maturity level is going to

help you pick the right one. Using complex versioning isn't evil in itself, but it can

increase friction with developers. So, if you decide on using a versioning scheme that

is more complex to implement, you're going to have a tougher time reaching out and

getting more developers to work with your API.

Ultimately, you have to be pragmatic about these decisions. I usually start new API

projects with just enough Versioning, which then allows us to

make changes as the API matures. Remember that as long as you have a resilient

community around your APIs, you can sunset APIs at a certain point and choose a

whole new scheme.

If we look at the way that GitHub went from one version of an API scheme they had

several years ago to the Content Negotiation type they're using now, they knew that the

API wasn't the one thing holding their customers to their product. So you have to be

pragmatic about how much Versioning you take on to protect your users, weighed

against the extra effort it costs them to use your API.

There's a balance there that you're going to have to make, and understanding who your

users really are is going to be part of that.

4. Securing Web APIs

Introduction

In this module we're going to be talking about how to secure your Web APIs, which

Threats are coming after your APIs, how to Protect Your API, Cross Domain Security,

Who Should You actually Authenticate with, Working with API Keys, understanding User

Authentication, and finally making sense of OAuth.

Understanding the Threats

Before we can look at how to secure your API, we need to really understand the nature

of security as it relates to developing Web APIs. Who are the people that are going to

come after your secrets and your work, and who are the people that are going to come

purely to create disruption to your business?

To begin with, do you even need to secure your API? You may be thinking I'm

creating an API, I'm going to use it within my own enterprise for my own applications,

who is going to care about these APIs?

Ask yourself some questions and we can talk about whether you should secure them or

not. Are you using any private or personalized data, data that represents individual

people that could be at risk? This could be social security numbers; this could be

information about your users or employees. If you are, then you should secure it.

Are you sending any of this sensitive data across the wire to your applications? If

you are, then you need to secure it.

If you're using credentials of any kind in order to do authentication, you need to

secure it.

And finally, are you trying to prevent people from getting to your servers and

overwhelming them, maybe even to the point of stopping you from being able to serve

your real customers? If so, you're going to need to secure it.

So, securing a Web API typically becomes a 1st class citizen of your design. Security

isn't something you can just throw on top of your existing design and hope that it will

work. You have to think about security through the entire process.

Who's coming after your API?

Well we have users and the browser that are coming across the internet to get at these

APIs, and we are going to have threats from different places here.

We have the typical man in the middle attacks where we have Eavesdroppers that are

looking at the traffic as it goes back and forth and seeing whether there is interesting

data there. So, if you're trading any sensitive sort of data across that wire, you're going

to have to protect against these eavesdroppers.

In addition, you can have Hackers or even your own Personnel that are going after that

personal data directly at the servers themselves; this is often behind the API some

place. This includes intrusion into your systems through your firewalls or even physical

security of your server locations.

And finally, you have the Users and Hackers themselves, which are working on the

other side of the internet, that are taking the code that you may be publishing, or maybe

looking at the website that is using those APIs in order to access your servers through

your API.

These are the kinds of threats you're going to need to decide how to protect

against.

So at the end of the day you're going to want to protect your API in almost every case.

Securing your server infrastructure itself, protecting your data centers with firewalls, and

protecting it against physical intrusion, is outside the scope of protecting your API. We'll

assume that you're working in an organization that knows that the data center needs to

be protected.

When you're communicating with your API, you need to have Security In-Transit, so as

the clients are calling into your servers, how can you protect that data while it's traveling

across the wire?

o And this is usually where SSL is used to protect the actual payloads of the

API calls, so that they can't be modified, or even inspected, as they

cross the internet.

o SSL does have a cost to it, but is usually worth the expense. There is

overhead in doing the encryption and decryption

on both sides, and in the handshake between the browser and your server that

sets up the SSL encryption. But, in terms of

protection from people interrogating your traffic, you're going to want to do this as

much as possible.

And finally, and what we're going to mostly talk about in this module, is Securing the

API itself.

o And part of the security is to protect yourself from Cross Origin calls, so

knowing what domains are using your API and allowing them to make those calls

where appropriate.

o Additionally, you're going to want to have methods for dealing with Authorization

and Authentication, so determining who is coming into the system and what

rights to the system they have.

Cross Domain Security

So, the first piece we'll talk about is Cross Domain Security.

The question you have is should you allow your API to be called from different

domains. You may be creating your API directly for your public website, and then

maybe this isn't something you want to deal with; you only want to allow your actual

website to go after it. The way browsers work is that when a page makes a call,

an ajax call, into a Web API, if the page is hosted in the same domain as the call

that's being made, the browser simply allows it. If it's in a different domain, if you're crossing

domains, let's say going from foo.com to bar.com, the browser itself is not going to allow it,

unless there are some special circumstances that will allow it to happen.

Making the decision about whether to allow your API from different domains really

depends on whether it's a public or a private API. If it's an API simply for use by your

application or your web property, then you probably don't need to worry about it. But if

it's a public API, because it's going to be called from different parts of the web, you're

going to want this to be supported. Now this whole notion of Cross Domain Security only

matters when it's being called from a piece of client script on someone else's web

property. If someone is writing an app, like an iOS, an Android, or a Windows Phone

app, to get at this, the API is going to work in either case; this really is about Cross

Domain access from within the browser.

There are Two Approaches to solve this.

o The first is to support a different format called JSONP as the type of data

coming back.

o The other is to allow something called Cross-origin Resource Sharing, which is

a standard out there for doing sort of a handshaking to see whether a domain is

allowed to make those calls.

So let's talk about each of these individually.

What is this format we're talking about with JSONP?

It's simply JSON with Padding, JSON being JavaScript Object Notation. JSONP is

actually JavaScript. It is a small snippet of JavaScript that's returned, instead of a

JSON-formatted body. It typically contains a JSON-formatted body, but it's surrounded

with a small function call. The expectation is that when it comes back from the server it

will be executed, and so the browsers deal with it in a different way, because we very

commonly go get JavaScript that we're going to execute in the browser from different

domains. If you're getting JQuery from a CDN, or using other sources of CSS or

JavaScript, the browser expects to get those from a variety of different domains, so

allows this call to go across to that domain, if the return type is JavaScript.

When the data comes back, this JSONP package, which again is just a small piece of

JavaScript, is evaluated, which ends up calling a function that contains all the data that

you are looking for.

So, let's see how this works. I've created a function ahead of time in my client code

called updateUser, and this is going to accept some data that I want from a cross

domain server. I can then issue a GET to some API, and here I'm calling an API called

games, and the host is going to be some different host than I'm actually hosted on. I

might be hosted at foo.com, but I'm going after some cross domain host. Notice that

part of the API call is passing a callback into the API: the name of the function I

want to call. Here, updateUser matches the function that I already have

on my page. The Accept header here also includes the information about what kind

of data I want to come back, and this is application/javascript, not application/json,

which is the way we would get normal JSON. The content type of the response will

actually be JavaScript, because when this GET is executed,

what is returned is a small snippet of JavaScript. The server has used that callback parameter

to wrap the results in a call to a local function, in this case updateUser, and

inside is a JSON-formatted object that will be passed in as the data to my

updateUser call; that's the core of what JSONP does.
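On the server side, the wrapping step can be sketched in a few lines of Python. This is only an illustration: the to_jsonp helper and the sample payload are hypothetical, and the callback-name check is an added precaution (not something the course describes) so that arbitrary script cannot be injected through the callback parameter.

```python
import json
import re

# Only plain JavaScript identifiers are accepted as callback names.
CALLBACK_NAME = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def to_jsonp(callback, data):
    """Wrap JSON-serializable data in a call to the named client function."""
    if not CALLBACK_NAME.fullmatch(callback):
        raise ValueError("invalid callback name")
    # The body is JavaScript, so it would be served as application/javascript.
    return f"{callback}({json.dumps(data)});"
```

Calling to_jsonp("updateUser", {"id": 1}) yields the text updateUser({"id": 1}); which the browser then evaluates as script.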

Let's look at this in a live API. If we go over to Fiddler, I can make a call here to get an

object from an API. We've been doing this throughout the course here or there, and if I

tell it that I want JSON as the data, when I Execute this, the result is going to be a JSON

object that is returned, and in fact, if you look at the JSON result, we'll see we're getting

this object from the server, and it's formatted as JSON. But suppose we go back to the

Composer and change this to JavaScript. It's important to add the callback parameter

here; this is the parameter that APIs usually use to define the name of

the callback to use when returning JSONP. What am I going to wrap my JSON result

in? Now, you may decide to name this parameter differently, but the convention is to call it

callback. So, I'll set it to foo in this case, and Execute this. Again, I'm including the

callback, and specifying that I want JavaScript not JSON. And when I execute this, the

Raw body that is returned is wrapped in a function called foo. This assumes that when

this is evaluated, that I actually do have a function that I called foo that will accept that

data as the callback. Now when you're not calling cross domain, this little extra bit of

code and ceremony around using a callback may seem kind of odd and unnecessary,

and in that case it is. But if we were going to do the same thing cross domain, calling

from a separate domain, the browsers would allow us to do this, whereas if we tried to

request plain JSON, it would fail because it is a cross domain call.

When you're designing your data for JSONP, remember that the data in JSONP is just JSON; it's

just the same sort of results you're going to return to the clients, but they're going to be

wrapped by the single function. So, the data passes just the same as with

JSON, it's just packaged as this JavaScript callback.

The other approach is to use something called CORS.

CORS allows Cross Site support from any of the browsers, but it involves a little

handshaking to make it actually work. Now, the different platforms implement CORS in

different ways if you want to add it. So, we're not going to talk about how to actually

write the CORS support, but I want you to understand what's going on in order for this to actually

work.

There is some handshaking that goes on between the browser and your service

before the browser is allowed to make the cross domain call.

Implementing this yourself is possible, but usually if you look at your platform, there

are plugins to help you implement it, because it is not a matter of

changing the way your servers work, but of implementing the handshaking that's

going to happen before your service is executed.

So let's talk about how it works so you can get your head around what the browser is

actually doing.

This is a little difficult to see because the browsers hide the handshaking part, and even

something like Fiddler hides the handshake, so that if it doesn't work you can sort

of see what's going on, but if it does work, you aren't going to see the handshake at all.

So, CORS starts by making a Cross-Origin Request as it's called. I'm on foo.com,

and I'm calling Ebay.com to make a request.

The server is asked if this Cross Domain call is allowed. This is done by

a request issued from the browser; this isn't something you write, it's something the

browser does automatically, because CORS is a standard. What it does is issue

an OPTIONS call to the server, naming the type of method that it was attempting to

use. In this example, the original Cross-Origin Request was a POST request. This would

say GET if it were a GET request, etc. And the Origin is the name of my site, the site that

I'm coming from, whereas the Host is pointing at where I'm going to.

The Server Responds with what the Rules are. We're going to allow these methods,

and we're going to allow them from this Origin, and as long as the calls on the

page after this adhere to these rules, it will continue to work.

So, then the browser actually makes my request, using the method it named in the

Access-Control-Request-Method header of the OPTIONS call, in this case the POST of some data

to the Games API. It also includes the Origin so the server knows where this is actually coming

from, and can still check that this is allowed. This handshaking of

receiving the OPTIONS call and returning the rules, which the browser then caches, is the part that

needs to be implemented on the server for CORS to be allowed.

Typically this handshake is done at a pretty coarse level. You're not necessarily

going to allow it just for individual API calls or methods, but you may decide to do things

like allow Cross Domain only for GET but not allow things like POST or PUT or DELETE.
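The server's half of that handshake can be sketched as a function that answers the browser's OPTIONS call. This is a simplified Python illustration, not a full CORS implementation: the allowed origin and method sets are made up, the preflight_response helper is hypothetical, and answering a disallowed preflight with 403 is just one possible choice. The header names themselves are the ones defined by the CORS standard.

```python
# Hypothetical policy: one trusted origin, read-only access across domains.
ALLOWED_ORIGINS = {"http://foo.com"}
ALLOWED_METHODS = {"GET"}


def preflight_response(request_headers):
    """Return (status, headers) for a CORS preflight (OPTIONS) request."""
    origin = request_headers.get("Origin")
    method = request_headers.get("Access-Control-Request-Method")
    if origin not in ALLOWED_ORIGINS or method not in ALLOWED_METHODS:
        # Without the Allow headers the browser blocks the real request.
        return 403, {}
    return 204, {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(sorted(ALLOWED_METHODS)),
        # Lets the browser cache these rules for ten minutes.
        "Access-Control-Max-Age": "600",
    }
```

With this policy a cross domain GET from http://foo.com passes the preflight, while a POST from the same origin is refused.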

Who Should You Authenticate?

So in many cases you're also going to want to guarantee who the caller is to your API.

You need to figure out who is calling in order to figure out who I'm really authenticating

as.

You might be doing Server-to-Server, or you might think of it as Service-to-Service

Authentication, and in this case, it's most common to design it to work with API Keys

and Shared Secrets, and we'll talk a bit about how that works in a minute.

There's also this thing called User Proxy Authentication. So I've written some piece of

code and I want to work with some 3rd party API, but I don't want to have to collect and

be responsible for storing the user information, so I want to simply have the right to go

over to this 3rd party API and use that API, and that may be your API. And so in this

case you're going to use something like OAuth, something that allows the

actual authentication to be delegated to the API itself.

And finally, there is Direct User Authentication as well. And this is where you're going

to simply piggyback on existing systems. So your API may use cookies or tokens that

you use as part of normal Authentication with your website, so, if you're using something like

ASP.NET, you may be using forms authentication here and also use that same cookie

for your API authentication. This Direct User Authentication is almost always used when

you're writing an API for your same property; it's not a public API, it's more of a private

API that you're using to communicate for your own single page applications or your own

apps.

There are some important definitions for us to get our head around before we dive in

here. First, what is a Credential?

We talk about Credentials an awful lot, but I want to make sure that you, the viewers,

have a sense of what that word really means. And a Credential is a fact that can

describe an entity. Most commonly this fact is something like an identifier, such as an

email address or a user name, and another fact may be something like a

password. So, a set of Credentials is really a list of those facts that helps the server

determine that you are who you say you are.

Authentication is the way the server will validate a set of credentials to figure out

who you actually are. Now this who you actually are is a curious one, because it's not

necessarily a user of the system, it also may be a developer API Key, so it may be

validating that when you signed up for an API Key that it is you, the developer, that

created that relationship. So, this authentication idea of credentials is true whether you're

calling server-to-server, or app-to-server, or whether you're actually authenticating with

user credentials.

And Authorization. Authorization is the verification that some known entity, an entity

that has been validated with authentication, has rights to access a certain

resource or a certain action, so that I can say that Bob is logged into the system, and

Authentication has validated that it is in fact Bob on the other side of the wire. Now Bob

wants to delete a customer. Does Bob have the right to delete that customer or not? And

that's where Authorization comes in. Is that entity allowed to do these certain things? Can

it read this, can it delete this, can it insert this, can it modify this?
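The difference between the two checks can be sketched in Python as two separate functions. Everything here is hypothetical: the in-memory USERS store, the fixed salt, and the "customer:read" style of naming rights are assumptions for the example, and a real system would use a random per-user salt and a proper user database.

```python
import hashlib
import hmac

SALT = b"demo-salt"  # assumption: fixed salt, for illustration only


def hash_password(password):
    """Derive a password hash with PBKDF2 from the standard library."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), SALT, 100_000)


# Hypothetical user store: Bob may read customers but not delete them.
USERS = {
    "bob": {"password_hash": hash_password("s3cret"), "rights": {"customer:read"}},
}


def authenticate(username, password):
    """Authentication: validate the credentials and return who the caller is."""
    user = USERS.get(username)
    if user is None:
        return None
    if hmac.compare_digest(user["password_hash"], hash_password(password)):
        return username
    return None


def authorize(username, right):
    """Authorization: does this already-authenticated entity hold the right?"""
    return right in USERS[username]["rights"]
```

Here authenticate("bob", "s3cret") establishes that the caller is Bob, and authorize("bob", "customer:delete") then answers the separate question of whether Bob may delete a customer.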

Working with API Keys

So let's talk about API Keys.

A very common method when developing Web APIs is to issue developers a set of

credentials to identify who the developer is instead of the user, and those are normally

thought of as these API Keys. There are even a number of services that APIs can

register with that will do this management of the API Keys for you. So whether you

implement it yourself or use some service, understanding how API Keys work is a pretty

important part of it.

API Keys are for non-user specific API usage.

o For example, if you're writing some code to go after Amazon's Web Services, or

to look at the Amazon catalog, you're not representing a user that wants to look

at what orders they've made, you're simply using the API to get at some data,

and those APIs, instead of being truly open and public, still require a relationship

with those APIs so that it knows who the person calling the API is. And this is

primarily so that when someone uses the API, then it can monitor their usage. If I

see someone is looking at the catalog of my products and they're just walking

through and reading them all, I should be able to look at logs and see who is just

dumping data out, or maybe calling it so often that it's slowing down the service

for others, and identify who the developer that's causing that problem is, and then

mitigate it in one of a number of ways.

o These API Keys are just to verify who the developer making the call is, so that

I can make some of those decisions.

So typically, and you're going to see this from lots and lots of public APIs, and you can

implement this yourselves, there's this notion of having an API Key and Signing your

requests.

So, to start out, the developer will go to the API's website and sign up for the API.

They're going to give it some personal information, so we can figure out who the person

is, and then they will be returned two pieces of information: a

magic string that contains an API Key, and a Shared Secret. The Shared

Secret is normally used for encryption, and we'll see how that encryption works in just a

moment.

So using these two bits of information, when I make a call to one of these services, I'm

going to need to use my API Key to make a request and to sign my request so that they

can guarantee it was me in fact making the request.

So the developer is going to create a request, maybe a call just to a REST-based

service, and this is going to include what do I want to do, what my API Key is, and

what the Timestamp is. So, the API key is being used here to say who I am, but it's not

being secured yet. This API key itself is going to be transmitted across the wire so that

the API itself can determine who I am. And then the developer is going to sign the

request with the Shared Secret. The Shared Secret that was passed to the developer

when they registered is not actually passed in as part of the request; it's going to sign

that request. Now what signing the request means is to take the complete request itself

and use a Shared Secret to run it through a one way encryption, to get a signature for

this request when it is being signed. The developer then takes the request that it

generated, plus this signature, which is this one way encryption that they have

determined, and sends that whole thing to the service. The service then looks up

who's making the call through the API Key, oh, there's Bob the developer, I know

who Bob is, and I can also get that Shared Secret that I had given them before, because

I know who Bob is. We're using the Shared Secret on both sides of the wire, but we're

never transmitting it. The service then takes the request that was given and signs

the request with the Shared Secret just like the developer did. And it does this so it

can then look at the two signatures and make sure they are the same. So it's doing

one way encryption on its side, the developer did theirs, and then it sees what the

developer sent in as the signature to the request and verifies that the signatures are the

same. It does this to verify that the Shared Secret is the same one that the developer is

using, so that it knows, oh, this is actually the developer, because the developer wouldn't

just give out their Shared Secret. So the developer knows something that only they and the

API know, and the API is using the signature to verify that. It also looks at the timestamp of

this request and verifies that the request is within the allotted time. When the

developer created the request it included a timestamp that described the time of the

request so that it could make this check to see that the signed request isn't old, so that

someone couldn't steal the request that was signed and try to issue it an hour, or two

hours, or two weeks later, and mimic that they are actually the developer. If it's valid, it

goes ahead and executes the request and returns the data. If it's not valid it then

returns an error.
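The signing steps above can be sketched with Python's standard hmac module. This is one possible reading of the "one way encryption" described here, using HMAC-SHA256; the message layout, the SECRETS table, the five-minute window, and the sign and verify helpers are all assumptions for the example rather than any specific API's scheme.

```python
import hashlib
import hmac
import time

# Hypothetical server-side lookup: API Key -> Shared Secret.
SECRETS = {"bob-key": b"bob-shared-secret"}
MAX_AGE_SECONDS = 300  # assumption: signed requests expire after five minutes


def sign(method, path, api_key, timestamp, secret):
    """Compute a signature over the request; the secret itself is never sent."""
    message = f"{method}\n{path}\n{api_key}\n{timestamp}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(method, path, api_key, timestamp, signature, now=None):
    """Server side: look up the secret, re-sign, compare, and check the age."""
    secret = SECRETS.get(api_key)
    if secret is None:
        return False  # unknown developer
    now = time.time() if now is None else now
    if abs(now - timestamp) > MAX_AGE_SECONDS:
        return False  # too old: possibly a replayed request
    expected = sign(method, path, api_key, timestamp, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)
```

The developer calls sign with their own copy of the Shared Secret and sends the API Key, the timestamp, and the resulting signature along with the request; the service calls verify.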

API signing is a way to verify that the developer is who they say they are.

When you're designing your APIs, and you're designing APIs that are going to be used

outside the scope of individual users, simply using an API Key and a Shared Secret will

allow you to validate that the developer calling into your service is actually them, so that

you can have a way to register developers, and have them use the service, and be able

to monitor what developer is using your service, without the need for going down and

creating user authentication for each user of the system.

User Security

Identifying individual users is a little different.

So if you have the notion of users, how do you verify that a call to the API is made as them? So the

developers themselves might be identifying themselves with API Keys, but you also are

asking those developers to act in the role of the user they're trying to serve, and how do

you verify that that user is actually them?

If you're building an API for only use on your website, don't worry too much about it,

because you can piggyback on the existing website security. Again, you can take

the forms authentication, in the case of ASP.NET, or any sort of authentication scheme

that you're using on the website and apply it to the API, because if they're logged into

the system that means that they can then use the API in the same way. And if you're

building clients for these 1st party APIs, those clients might be able to collect those

credentials and send them in as header information when it calls into your API. If you're

developing Apps against these 1st party APIs, it tends to be a little bit more

painful because your Apps will need to collect user credentials and secure them.

Securing them is often the harder part. You might decide to

keep the user name but force the user to type the password every time, which isn't

necessarily a clean and easy way to do it, or maybe store the password in, hopefully, a

secure way, depending on the platform you're on; either choice can cause additional

problems.

If you're expecting 3rd party developers to use your API, you're not going to want

them to identify individual users themselves. You don't want to ask them to collect

those credentials, because you don't know how good they are at protecting those

credentials, and since they are a window into getting into your system, you don't want

them to know user names or passwords at any point, and that's where you would use

something like OAuth. OAuth will allow these 3rd party API developers to have access

to your system while maintaining that only your code is actually accepting those user

credentials and mapping them back to something that the developer can use as that

individual user; let's see how that works.

OAuth

So, in order to protect the user, we need a way to allow the developer to act as the

user in the system, but allow you to maintain control over accepting those actual

user credentials. Once you accept the user credentials, you can then trust that 3rd

party with some magic token that represents the developer, and the developer,

whenever they call into you with this magic token, you'll know that this 3rd party

developer is acting as some real user in your system. The developer themselves

won't ever receive these user credentials, and more importantly won't be responsible

for storing them and securing them.

So this is how it works. Let's talk about what the Developer will do, what the API will do,

and ultimately what the User will do; they all have a role in how OAuth works.

o The developer is going to request an API Key from the API, much like we saw

earlier with pure API Key authentication.

o And the API is going to supply an API Key and a Shared Secret, again just like it

was before, because you still need to have a way for the developer and the API

to know who is who.

o Using this API Key and Shared Secret, the developer requests a token called a

Request Token. This Request Token is the magic string that the API is going to

return to allow it to make this handshaking by forcing the identification of a user

in their system and having the API give them permission to act as that user.

o The API looks at the API Key and Shared Secret being signed, and returns that

token.

o That token is then used to redirect the user to a specific page in the API to allow

them to give the credentials.

o So the developer redirects to the API's authentication URI, and the API is going to

display a UI for the user.

o The user is going to supply their credentials if they're not logged in, and once they

are logged in they're going to confirm that they want to give the developer the rights

to call as them. If you've done anything like Facebook or Twitter integration, or

even, as a user, allowed a Facebook App to be installed or allowed an application to

use Twitter, you have done this before. It forwards you over to the

Twitter.com page; it will say Bob's development check wants you to give access

to your Twitter account, you say yes because you want whatever Bob is going to

give you.

o Once the user has confirmed this authorization, the API itself redirects back to

the developer. So the developer can then request an Access Token.

o This is a separate token that the developer is going to keep, sometimes for quite

a while, in order to make requests to the API.

o When the developer requests this Access Token, the Access Token is going to

come back with a Timeout. Here's a token to make calls into my API, and this is

for how long you can use it.

o From that point the developer can use the API with the Access Token until that

timeout occurs.
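
The handshake above can be sketched as a small in-memory simulation. This is only an illustration of the order of the steps, not a real OAuth implementation: the class ToyOAuthApi, its method names, and the sign helper are all hypothetical, and signing just the API Key with HMAC is a simplifying assumption (real OAuth signs much more of the request). In practice you'd let a library do this.

```python
import hashlib
import hmac
import secrets
import time


def sign(shared_secret, message):
    # Hypothetical helper: HMAC-SHA256 of a message using the Shared Secret.
    return hmac.new(shared_secret.encode(), message.encode(), hashlib.sha256).hexdigest()


class ToyOAuthApi:
    """In-memory stand-in for the API side of the flow (illustration only)."""

    def __init__(self, access_token_lifetime_seconds=1200):
        self.lifetime = access_token_lifetime_seconds
        self.shared_secrets = {}   # api_key -> shared secret
        self.request_tokens = {}   # request token -> {"api_key", "user"}
        self.access_tokens = {}    # access token -> {"user", "expires_at"}

    def register_developer(self):
        # The developer requests, and the API supplies, an API Key and Shared Secret.
        api_key, secret = secrets.token_hex(8), secrets.token_hex(16)
        self.shared_secrets[api_key] = secret
        return api_key, secret

    def issue_request_token(self, api_key, signature):
        # The API checks the signature made with the Shared Secret, then returns a Request Token.
        expected = sign(self.shared_secrets[api_key], api_key)
        if not hmac.compare_digest(expected, signature):
            raise PermissionError("bad signature")
        token = secrets.token_urlsafe(16)
        self.request_tokens[token] = {"api_key": api_key, "user": None}
        return token

    def user_authorizes(self, request_token, username):
        # Stands in for the API's own login/confirm page; the developer never sees credentials.
        self.request_tokens[request_token]["user"] = username

    def exchange_for_access_token(self, request_token, now=None):
        # After the redirect back, the developer swaps the Request Token for an Access Token.
        entry = self.request_tokens.pop(request_token)
        if entry["user"] is None:
            raise PermissionError("user has not authorized this request token")
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(16)
        self.access_tokens[token] = {"user": entry["user"], "expires_at": now + self.lifetime}
        return token, self.lifetime

    def user_for(self, access_token, now=None):
        # Every later call is resolved to a user until the timeout occurs.
        now = time.time() if now is None else now
        entry = self.access_tokens.get(access_token)
        if entry is None or now >= entry["expires_at"]:
            return None
        return entry["user"]
```

A developer would call register_developer once, then issue_request_token, send the user off to authorize, and finally call exchange_for_access_token; user_for returns None once the timeout has passed.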

They can make multiple calls as the user, as far as the API is concerned, until that

timeout happens, and sometimes this Access Token is good for quite a while; sometimes

it's 20 minutes, sometimes it's two weeks, it depends on the nature of your API. If you're

developing a banking system, it should be good for a couple of minutes. If you're

developing something like Twitter you might want to make it a sliding expiration so that

the token stays valid for quite a while. But using that Access Token allows them to use the

API, and when you're designing your API, you have to look at this Access

Token and be able to determine which user they're calling the API as.

When you're developing your API, you shouldn't expect that the user credentials are

going to be part of the header, or the user credentials especially are not going to be part

of the URI. You're not going to develop GetAllMessagesFromUser?User=Bob, right?

You're going to assume that the Access Token that's going to be sent in is going to be

mapped to Bob before you determine what data that resource is going to return. And so

developing your API as it relates to individual users is going to be very clear and obvious

when you start to work with something like OAuth, because it's going to assume that

you're going to identify the user without the need for identifying the user by name using

something like a query string or a path variable.
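
Here's a minimal sketch of that idea: the handler works out who the user is from the Access Token and never takes a user name from the URI. The token store, the message data, and the use of an Authorization header with a Bearer scheme are assumptions for illustration; your OAuth library will decide how the token actually arrives.

```python
# Hypothetical data: which user each Access Token was issued for, and their messages.
ACCESS_TOKENS = {"token-abc": "bob"}
MESSAGES = {"bob": ["Lunch at noon?"], "alice": ["Build is green"]}


def get_messages(headers):
    """Return (status, messages) for the user the Access Token maps to."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    user = ACCESS_TOKENS.get(token) if scheme == "Bearer" else None
    if user is None:
        # Unknown or missing token: no user, so no data.
        return 401, []
    # The resource is /messages; "whose messages" comes from the token, not the URI.
    return 200, MESSAGES.get(user, [])
```

Notice there is no user parameter anywhere in the call; a caller with Bob's token can only ever get Bob's messages.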

So how do you design for OAuth?

First of all, I want to make it clear that you should probably not implement OAuth directly.

Most platforms are going to have a way to implement the OAuth for you. Understanding

the flow is going to be useful, but depending on your platform, you're going to want to

allow a library or service to implement the OAuth for you, because there are a lot of

little moving pieces. Most of the time when I'm developing an API, the last thing I want to

do is build a lot of plumbing code. Because it's complex and there are a lot of moving

pieces, getting it wrong means you're likely going to have an insecure API, so rely

on more mature code to make sure your OAuth is as secure as

OAuth can be.

You might also decide to integrate your OAuth using 3rd party identities, so you may

use Facebook, Google, or Microsoft ID to determine who the logged in person is.

Even if they are an individual person in your system, you may be using these 3rd party

identities. Using 3rd party identities can be very helpful when you don't want to store

your own identities; you just want to be able to individually ID users. In most cases users

don't want their own IDs with your system anyway; they don't necessarily want to

have to remember a username and password for your system and your system alone,

unless it's a big part of your environment, like if you're building Enterprise Apps. Users

will do it if there's a big payoff. So if you're providing them a service, especially if it's a

free service that has a lot of benefit, they will want their own IDs; it's just not that

common.

Summary

So let's wrap up some of these ideas.

When you're securing your API, you should make it part of your original design.

Don't hope that you can tack on security later, or that some higher level piece will just

make it secure on its own.

Don't try and just drop security on top of your API and hope it works well, think

about securing it from the very first step to the last step of developing your API.

You want to make sure that the default behavior is secure, so that developers don't

have to go an extra step to make it secure. That means never returning results that may be

insecure. A common case for this, I've seen quite a lot, is if you have an existing system

that already has data resources that it wants to return, let's say employees as an

example. Even though you probably wouldn't ever write code through the API that used

something like a social security number or a spouse's name, you may end up leaking

some of that information through the API because you simply just want to return the

same entity objects you're using throughout the system. So be sure that your APIs are

secured by default by making sure that the data you're returning is pruned, so that

data that may be fine inside your organization or behind the firewall doesn't get out

over the internet.
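
One way to get that secure default is to copy only an explicit list of fields into what the API returns, so a new sensitive column never leaks just because it was added to the entity. This is a small sketch; the field names and the employee record are made up for illustration.

```python
# Hypothetical allow-list: only these fields ever leave the API.
PUBLIC_EMPLOYEE_FIELDS = ("id", "name", "title")


def to_public_employee(entity):
    """Build the API representation from an internal employee entity."""
    return {field: entity[field] for field in PUBLIC_EMPLOYEE_FIELDS if field in entity}


internal_employee = {
    "id": 17,
    "name": "Bob",
    "title": "Developer",
    "social_security_number": "000-00-0000",  # fine behind the firewall, not on the internet
    "spouse_name": "Alice",
}
public_employee = to_public_employee(internal_employee)
```

An allow-list is used here rather than a list of fields to strip, because forgetting to update an allow-list fails safe.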

5. Hypermedia

Introduction

In this module we're going to be talking about Hypermedia, and what that means to you

as the designer of a Web API. This module is going to cover the notions of Hypermedia,

and this is going to include explaining what exactly we mean by Hypermedia, how this

relates to REST and HATEOAS, what are Links, looking at some standard formats for

HATEOAS, including what is HAL, and what is Collection+JSON?

Let's get started.

REST and HATEOAS

So what exactly is Hypermedia?

When the web was being envisioned, part of the magic that makes the web work so well

is the idea that pages can have hyperlinks over to different parts of the web, so that the

web becomes interconnected. And so if we look at something like a standard HTML

page, we can see that using typical anchors allows us to link over to other parts of the

web just by using an href.

We can also use a property of the anchor tag called rel to describe the kind of link we're

talking about.

There's actually a formalized link tag in HTML that many of you probably already use.

The link tag is often used to link over to the style sheet for your page, but this link syntax

is actually used for a variety of reasons to link this document over with other documents,

so I can say that there's an alternate version of this page, give its language using

hreflang (say, that there's a version of this for Arabic), and give the URL for that, and so we're linking

this document to another version of the same document but that is for a different type of

reader. We can also do the same thing for an alternate link to a print version of this

page.

There's also the notion of a type of link that can indicate what the next or previous

page in a sequence of pages is.

If you're doing something like an article where there's a page 1, and a page 2, and a

page 3, you can use these links to tell the browser that this is one page of a series

and how to go to the next or back to the previous page. So these

are different ways that HTML allows us to link a single document or a single URL to

other URLs by what is returned back, and hypermedia is meant to do this at the API

level.
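
To make the HTML side concrete, here's a small sketch that reads the rel, href, and hreflang of the anchors and link tags in a page using Python's standard html.parser. The page markup and its URLs are invented for illustration; they are not from the course slides.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects rel/href/hreflang from <a> and <link> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag in ("a", "link"):
            attributes = dict(attrs)
            if "href" in attributes:
                self.links.append({
                    "tag": tag,
                    "rel": attributes.get("rel"),
                    "href": attributes["href"],
                    "hreflang": attributes.get("hreflang"),
                })


# Hypothetical page: a stylesheet, an Arabic alternate, paging links, and an anchor.
PAGE = """
<html><head>
<link rel="stylesheet" href="/site.css">
<link rel="alternate" hreflang="ar" href="/ar/article">
<link rel="prev" href="/article?page=1">
<link rel="next" href="/article?page=3">
</head><body>
<a rel="author" href="/about">About the author</a>
</body></html>
"""

collector = LinkCollector()
collector.feed(PAGE)
```

Each collected entry pairs a URL with what the link is for, which is exactly the shape Hypermedia borrows for APIs.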

So, essentially hypermedia are just links for an API. These links are essentially

documentation to the developer so that they can know how to use your API. In

many ways this will help achieve the goal of having the API being as self-describing as

possible. In most cases, you can't have the API becoming completely self-describing, but

this can really help inform the users of the API how to do different things with your API.

These links become the State of your Application, and become the model for how to

take the data that you may be returning as the core of what your API does and also

indicate verbs, so that you know how to insert a new invoice or delete an invoice. This

allows you to include state that's not just the state of the data on the server, but also

ways to take that data and do something with it. This is where the notion of HATEOAS

comes in, or Hypermedia As The Engine of Application State. This is something that

the REST thesis talks about as an important idea of creating these APIs that are

interrelated. Unfortunately, this is a really awful acronym, and you're going to hear me

saying it a lot in this course, mostly with my teeth clenched. A long acronym like this can

make it a little more confusing than it needs to be. Essentially, you should think of

Hypermedia as simply a way to link results with other results or operations in your API.

So let's go back to a slide that you may have seen in the first module to really

understand where this Hypermedia or HATEOAS fits in.

We talked about how simple HTTP and remote procedure calls allow us to create verbs

in the API, have URI endpoints, and that the REST-ful nature of APIs allows us to sort of

layer on top of that, so that we can have resource-based URIs, verbs, and

statelessness, and even caching in our APIs, much as you've probably learned up to

this point. This last piece is allowing you to have these relationships between parts of your

API using these things called links, and that's really what HATEOAS is adding to the

picture. So we're taking what we've learned about creating these REST-ful but pragmatic APIs,

and adding the ability for you to indicate information about the results you're giving back,

and how to do things with that data.

What are Links?

So in designing your API, if you want to include links, what are we really talking about?

Any links you include should be about helping the developer use the API, so that

they don't have to go craft URLs and go look up documentation, when what they may be

doing may be a next logical step.

Some common scenarios for this are things like Paging, Creating New Items,

Retrieving Associations, or other sorts of actions like updating an invoice, submitting

a new work order, those sorts of things, are common scenarios that may be

communicated as these links.

Let's look at a simple example using JSON.

Now these links aren't limited to only JSON, you could also use them with XML APIs. I'm

going to use examples that are going to use JSON, but you can apply the same idea to

XML as well. So here we start with just a simple begin and end of a JSON object, and

we might have some data that you might be familiar with returning. So, we might have

some data about the result, like totalResults or Success, and then a list of results that

I've abbreviated here with an ellipsis. In here we may also include a set of links that

indicate things that can be done with the results. Links typically are going to have at a

minimum two pieces: an href, which is typically a URI to a specific operation,

and a rel, which is going to indicate what the link is for. So in this case we can see

that we're including an href to the previous page of results that we wanted. You could

certainly document what this URI looks like and force people to build it themselves, but because going to

the previous page or going to the next page is such a common occurrence, including it

as links here can self-document your API, and make it easier for your developers to use.

Instead of having to write code in, let's say, the JavaScript of a webpage to determine and

to craft this URI, they can just take it as part of the results and go, oh, when someone

wants the previous or the next page, I already have a pre-constructed URI that is valid.

You can also have it for other sorts of operations like insert, and in this case the URI

here is going to be related to a POST instead of a GET.

We don't have indications here for the method, but it's also fairly common for these links

to include another parameter called method so that they know what the URI is, and what

HTTP verb to use.
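
Since the slide isn't reproduced in these notes, here's a hedged sketch of what such a result could look like when built on the server. The base URI /api/games, the rel names prevPage, nextPage, and insert, and the page query parameter are all assumptions; only totalResults, results, href, rel, and method come from the discussion above.

```python
import json


def build_page(items, page, page_size, base_uri="/api/games"):
    """Return one page of results plus links for paging and inserting."""
    total = len(items)
    start = (page - 1) * page_size
    links = []
    if page > 1:
        links.append({"rel": "prevPage", "href": f"{base_uri}?page={page - 1}", "method": "GET"})
    if start + page_size < total:
        links.append({"rel": "nextPage", "href": f"{base_uri}?page={page + 1}", "method": "GET"})
    # Inserting uses the collection URI itself, but with POST instead of GET.
    links.append({"rel": "insert", "href": base_uri, "method": "POST"})
    return {
        "totalResults": total,
        "links": links,
        "results": items[start:start + page_size],
    }


games = [{"id": n, "name": f"Game {n}"} for n in range(1, 26)]
second_page = build_page(games, page=2, page_size=10)
body = json.dumps(second_page, indent=2)  # what would go out on the wire
```

The client never has to work out how paging is expressed in the URI; it just follows the link whose rel it cares about.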

Here's another example, but instead of being with a collection that is returned, this is

going to be with an individual object that may be returned. And in this case I'm showing

some data that is being returned, and then we have that same sort of idea of links. And

here, we can indicate a self link, and this is very common where the item or the object

that is returned will include a link to its own resource URI, so that if you needed to do an

update, or insert, or delete, you'd have the URI that represents this object that you

returned. You may also have other links that relate to associations, so in order to look at

this game's rating, you may have a URI that indicates that this is the rating link, and then

use this to go ahead and get that additional data if necessary.
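
As a sketch of the individual-object case, here's a game with a self link and a rating link, plus a small helper a client could use to pick a link by its rel. The URIs, the field values, and the helper name find_link are hypothetical.

```python
# Hypothetical single resource with a self link and an association link.
game = {
    "id": 42,
    "name": "Halo",
    "price": 59.99,
    "links": [
        {"rel": "self", "href": "/api/games/42"},
        {"rel": "rating", "href": "/api/games/42/rating"},
    ],
}


def find_link(resource, rel):
    """Return the href of the first link with the given rel, or None."""
    for link in resource.get("links", []):
        if link["rel"] == rel:
            return link["href"]
    return None


self_uri = find_link(game, "self")      # the URI to use for an update or delete
rating_uri = find_link(game, "rating")  # the URI to follow for the associated rating
```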

Standard HATEOAS Formats

I want to introduce an idea here that is related to the Versioning story, but ends up being

important when we're talking about standard ways of looking at Hypermedia data, and

that is something called Profile Media Types.

The idea here is that profiles are simply the descriptions of what the data is that

you're returning. This is an alternative to using the custom MIME type that we saw in

the versioning section, and it's usually used in coordination with a MIME type. Profile

Media Types are typically included at the back of an existing MIME type in an

Accept header, and servers can return this type as the content type that was retrieved,

so that your client code could automatically know these Profile Media Types.

So as an example, if we were going to get this order, we could use an Accept header to

say, what is the format we're looking for?

In this case it's JSON we're looking for, but at the end of the JSON content type we're

going to include the profile of the schema we want. This is often used to separate the

idea of the type of format we want versus the object identifier or version that we want.

And when we look at the standard formats, they use these pretty commonly to define the

type of data that is returned, or more interestingly, the version of the data type that is

returned.
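
Here's a small sketch of building and reading a header value with a profile on the end. The profile URI below is a placeholder (the actual one from the slide isn't in these notes), and the two helper names are hypothetical.

```python
def with_profile(media_type, profile_uri):
    """Append a profile parameter to a media type, e.g. for an Accept header."""
    return f'{media_type}; profile="{profile_uri}"'


def split_profile(header_value):
    """Split a header value into (media type, profile URI or None)."""
    media_type, *params = [part.strip() for part in header_value.split(";")]
    profile = None
    for param in params:
        name, _, value = param.partition("=")
        if name.strip().lower() == "profile":
            profile = value.strip().strip('"')
    return media_type, profile


# Placeholder profile URI, for illustration only.
accept = with_profile("application/json", "http://example.com/profiles/order")
media_type, profile = split_profile(accept)
```

The format (JSON) and the shape of the data (the order profile) travel in the same header but stay separate pieces of information.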

So there are a few people out there that are creating new standards for how to return

Hypermedia data. There are a handful of them out there, but I've chosen to focus on just

a couple. These standards are emerging, so they aren't final or set in stone; there

isn't a single standard for you to go after, and in some ways by developing APIs now,

you're hitching your wagon to a standard that may or may not become the prevailing

standard. The two out there we're going to talk about are HAL, or Hypertext

Application Language, and Collection+JSON. These are the two that right now have the

most community support, and they do things in a fairly different way and for different

reasons, so we're going to explore both and talk about why you may want to choose one

over the other.

These standards are based on Custom Content Types, and then using the Profile

Media Type to define the structure of the data that is returned or accepted. The Content

Type defines the data formatting, and in the case of something like HAL or

Collection+JSON, it's going to define which of those standards to use, and then the

Profile Media Type is going to define the structure of that data. In this way it's going

to keep the format and the versioning of the structure of the data

separate. We saw in the Versioning chapter that using Content Types we could version

what we were getting back, but invariably that was really mixing the two metaphors, we

were mixing both the formatting and the versioning into a single sort of entity; this

separates the two.

Let's look at these two format types.

HAL

So HAL stands for Hypertext Application Language.

This language is meant to be a lean Hypermedia type. It wants to be more brief than

some of the other proposed standards out there, and in fact it's the one that I'm leaning most

heavily towards because I like the brevity of what it's trying to do; it's trying to do just

enough Hypermedia to be helpful without inundating you with a lot of structure or forcing

you to restructure your code with a lot of ceremony.

It supports formats you're already used to like JSON and XML to include

resources and links together. The Content Type is called application/hal+json when

you're expecting or sending back JSON, or application/hal+xml for XML. More typically,

you're going to use a Profile Media Type to define the kind of data you're actually looking

for; in this case looking for the order type from the Wilder Minds site.

Here's a useful picture that I got actually from the HAL specification itself; you can see

the link on the bottom there, to stateless.co.

This defines pretty much what HAL is going to look like. It is resources that have links

associated with them, and then embedded resources that may have the same structure,

and this can continue down the chain. So, the top level object may have embedded

resources, and each of those objects may themselves have links as well as other

embedded resources.

Let's look at an actual example.

So here, much like the examples we looked at earlier, we're just looking at a simple

JSON object. And the first thing you'll notice is that _links is the standard name they want to

give to any of the sort of links that are being passed down as this result. Now one of the

first things you'll see is the self link, and this is going to indicate that this is the URL that

was retrieved, and the object that is returned represents this self URL. Now this href

could be relative or absolute; in this example they are relative, and then it can include

other links. Much like we saw in our earlier examples, this is more formalized, and one of

the things you'll notice is that instead of having a separate property called rel, they're

simply making an object that you can look up by name. This makes it a little easier to

consume from JavaScript, which is one of the more common languages that you're

going to consume this from. And it can even have what are called template links. This is

part of the HAL Spec that actually points to a different specification for defining how to

define URLs that have optional or templatable elements, and we'll look at what the

templates look like in a minute, but being able to say that this is a templated URI is

useful to tell the user of this API that here is a URI that's useful to you, and then you're

going to be able to template it with your own data. You're still going to include data that

is simply part of that return, whether it's things like total count or it's the error code, or

time to execute, those sorts of things, you're still going to include a simple property, but

they have the special _embedded property that's going to include the actual results, the

embedded results to the sort of top level object that's being returned, and so we may

have something like games or results that is the actual data that you actually asked for.

And in this case, the game object itself just has basic data that you might be used to, in

this case price, currency, and name of the game, but like the top level object, each of

these individual items may also have links. Here we can see just a single self link, but

you may have other links for things like deletion and insertion, or updating, and those

sorts of links, or even links to associated data, much like we saw in the earlier example

before we introduced HAL.
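
The slide with the example isn't reproduced here, so this is a hedged reconstruction of the shape being described: _links looked up by name, a templated link, some simple properties, and _embedded results that carry their own links. The URIs, the rel names next and find, the totalCount property, and the game data are assumptions.

```python
# Hypothetical HAL-style response for a page of games.
hal_response = {
    "_links": {
        "self": {"href": "/games?page=2"},
        "next": {"href": "/games?page=3"},
        "find": {"href": "/games{?query}", "templated": True},
    },
    "totalCount": 2,
    "_embedded": {
        "games": [
            {
                "_links": {"self": {"href": "/games/1"}},
                "name": "Halo",
                "price": 59.99,
                "currency": "USD",
            },
            {
                "_links": {"self": {"href": "/games/2"}},
                "name": "Portal",
                "price": 19.99,
                "currency": "USD",
            },
        ]
    },
}


def link_href(resource, rel):
    """Look a link up by name; there's no separate rel property to search for."""
    return resource["_links"][rel]["href"]


next_page = link_href(hal_response, "next")
game_uris = [link_href(game, "self") for game in hal_response["_embedded"]["games"]]
```

Because the rel is the key, finding a link is a plain dictionary lookup rather than a scan through a list.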

So template links are something I want to kind of bring together and help you

understand what they're all about. Defining a link as being templated is really pointing at

template URIs, and here you can see the URI if you want to see the full Spec of how

they're defined. HAL doesn't define what templates look like, HAL simply points at

another standard that is out there for template links. But essentially, you're taking what is

inside the curly braces and saying, this is where you're going to put data; in this case,

the games URI is going to get ?query= appended, followed by whatever data is supplied

for query as a query parameter. So, if I was searching for games that had the name halo

in them, I could indicate that here. And there's a full syntax for adding multiple parts of

the query, the query may be part of the URI, it may be part of the query string, it may

have multiple parts, all that is discussed in RFC 6570, so I don't want to duplicate that

effort a lot here, but understand that the power of what HAL is doing with templated links

here is allowing you to not only have simple links over to discrete operations, but also

having links that may include variable information like a search link would have here.
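
To show what expanding a templated link involves, here's a minimal sketch that handles only the {?name,name} query form of RFC 6570 and nothing else from the spec. The function name and the /games template are assumptions; for real use you'd want a library that implements the full RFC.

```python
import re
from urllib.parse import quote


def expand_query_template(template, **values):
    """Expand {?a,b} expressions; variables without a value are left out."""

    def replace(match):
        names = match.group(1).split(",")
        pairs = [
            f"{name}={quote(str(values[name]), safe='')}"
            for name in names
            if name in values
        ]
        return "?" + "&".join(pairs) if pairs else ""

    return re.sub(r"\{\?([^}]+)\}", replace, template)


search_uri = expand_query_template("/games{?query}", query="halo")
```

Here search_uri comes out as /games?query=halo; with no value supplied for query, the expression simply disappears and you get /games back.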

Collection+JSON

The next standard we'll look at for communicating Hypermedia in your results is

Collection+JSON.

And this standard is really for allowing the standard reading and writing of collections; it's

meant to be very self-describing. It defines a standard way to communicate lists and

individual items, as well as includes UI information, so that if you're trying to fill in an

object, some object that's inside of a list, you're able to know what sort of labels for input

controls you should use in order to gather that information. And so, it contains a lot more

metadata than HAL does. It uses that MIME type of application/vnd.collection+json to

indicate that, and you can use the Profile Media Type as well, that's very common in

Collection+JSON. Collection+JSON doesn't have a Collection+XML counterpart.

Some people have sort of talked about creating that standard, but Collection+JSON is

fairly tied to the way that JSON works.

Well let's take a look at an example.

In Collection+JSON, it always starts with that collection element as the top object, so it

sort of creates one nested object below it, that's including information like what the

version is and what the href of this collection is. It also has a set of links similar to the

way that HAL does, but they're defining them in the typical rel and href properties, and

then items in that collection are defined in another property, and each item itself has its

own href, and then the data for the items in that collection. So in this case it's just simple

JSON describing the objects of the data. And each of those items can still include their

own links. In this case we're saying the link is to a blog, and notice the last piece of this

where it says prompt:Blog; this is an indicator to you of how you could create a link and

what to show to the user. When they say prompt, they mean what is visible to a user that

wants to know where this link goes.

In addition to what we've seen, they also have the notion of queries, and so similar to

what you saw in the templated links in HAL, queries allow you to define the kinds of

different search semantics you have. Again, they're including the prompt property here,

so that you know what to show to the user when they're using this link. It also includes a

data section so you know what query string parameters to include. And the last one is

the notion of what's called template, and the idea behind template is really to give you

information about how to build a UI. So this is saying for the property that you want to

create, called full-name, the prompt should be full-name, and the value should be, in this

case, a string. Now, it's giving you empty values because this is the same data structure

once filled in with the values you'll actually post up to the server to make those changes.

Again, Collection+JSON is focused on how to create and maintain certain collections up

on the web.
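
Pulling those pieces together, here's a hedged sketch of a Collection+JSON document with items, links, queries, and a template, plus a helper that fills the template in for posting back. The URIs, the data values, and the helper name fill_template are assumptions; the property names follow the structure described above.

```python
import copy

# Hypothetical Collection+JSON document for a list of friends.
collection_doc = {
    "collection": {
        "version": "1.0",
        "href": "/friends/",
        "links": [{"rel": "feed", "href": "/friends/rss"}],
        "items": [
            {
                "href": "/friends/jdoe",
                "data": [{"name": "full-name", "value": "J. Doe", "prompt": "Full Name"}],
                "links": [{"rel": "blog", "href": "/blogs/jdoe", "prompt": "Blog"}],
            }
        ],
        "queries": [
            {
                "rel": "search",
                "href": "/friends/search",
                "prompt": "Search",
                "data": [{"name": "search", "value": ""}],
            }
        ],
        "template": {"data": [{"name": "full-name", "value": "", "prompt": "Full Name"}]},
    }
}


def fill_template(doc, values):
    """Copy the template and fill in values by field name, ready to post back."""
    template = copy.deepcopy(doc["collection"]["template"])
    for field in template["data"]:
        field["value"] = values.get(field["name"], field["value"])
    return {"template": template}


new_friend = fill_template(collection_doc, {"full-name": "Bob Smith"})
```

The prompt values are what a generated UI would show next to each input, which is where the extra verbosity compared to HAL comes from.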

The sweet spot for Collection+JSON is for simple machine-driven lists; this is really

where it excels. So, if you want to be able to point it at an arbitrary Collection+JSON

data source, you should be able to build a UI based on what they're giving you, simply

from what the API is returning. I find Collection+JSON a bit too verbose for real data-driven

REST services, so as I'm creating most REST services, if I want to include Hypermedia,

I'm really leaning on HAL instead of Collection+JSON, because of the additional

verbosity. The idea of including these UI elements really allows you to create automated

code, but I don't find that this works all that well in practice unless you're dealing with

maybe SharePoint lists or the kind of information you're dealing with is fairly fixed. In

practice, in most APIs I've developed, using something like Collection+JSON is just

going to sort of bloat the API and not going to be as informative to my users. And at the

end of the day, you just find that the payloads you're returning to your developers using

your APIs end up being bloated because of it.

Summary

So when designing an API where you want to include Hypermedia to create these

better versions of APIs that can be more self-describing, and help your users build

systems based on your APIs in a more simple way, you can really leverage this

information from HATEOAS to create better APIs. There's a lot going on in the thinking

around Hypermedia and HATEOAS right now, and so these ideas are emerging. Tying

yourself too much to one concept in Hypermedia or another concept in Hypermedia may

make you a bit of a star in your company today, but it might end up biting you later when

the thinking around HATEOAS might change. Although APIs are going to be longer lived

and are going to develop, I'm not sure I would hang my hat on making sure everything

adhered to a strict interpretation of what Hypermedia or HATEOAS is.

If you're going to do Hypermedia, I really like HAL as the middle ground: it's a

brief, lean version of a Hypermedia-driven language that is small and

consistent. I really think that HAL is the right choice when you're doing HATEOAS today.

But at the end of the day when it comes down to it, if your API is an internal one or you

have a small number of users that maybe can deal with reading documentation or even

you already have documentation written, I find that using Hypermedia sometimes may

be more about the ceremony of making sure that I have this fully REST-ful API than

purely useful, and I really want to focus on what is pragmatic when I build and design an

API, not just what people say about whether it is valid, or good, or true; I'm trying to stay

away from what the community thinks about my API, as long as developers are willing

and able to use those APIs.

Now in many cases, using Hypermedia to decorate the result of my APIs can make them

easier to work with, and developers are going to be happy about this, but I would limit

the amount of use of Hypermedia to the things that are only truly useful; things like

paging, and insert and deletes, or associations, are really sweet spots there, but if you

start thinking that every result should return a full list of every operation that could be

done on some entity or item that you're returning, you're probably working way too hard

to make this happen.

6. References

https://app.pluralsight.com/library/courses/web-api-design
