You want to develop a RESTful web API for developers that is secure to use, but doesn’t require the complexity of OAuth and takes a simple “pass the credentials in the query” approach… or something equally easy for people to use, but it needs to be secure.
You are a smart guy, so you start to think…
You realize that literally passing the credentials over HTTP leaves that data open to being sniffed in plain text; after the Gawker incident, you realize that plain-text or weakly-hashed anything is usually a bad idea.
You realize that hashing the password and sending the hash over the wire in lieu of the plain-text password still gives people sniffing at least the username for the account and a hash of the password that could (in a disturbing number of cases) be looked up in a rainbow table.
Then you realize that a lot of popular public APIs seem to use a combination of two values passed along with each command request: one public value and one (hopefully) private value that only the account owner is supposed to know.
“Still not quite right!” you exclaim, because in this case (which is really a username/password scenario all over again) you still suffer from the same problems (sniffed traffic) that sending the username and password in plain text had.
At this point you are about to give up and concede to using OAuth, but you insist that there has to be a secure but relatively easy way to design a public web API that can keep credentials private.
After doing Peyote for 2 days straight (you should find better ways to relax) it finally dawns on you: Amazon Web Services has one of the largest and most used web APIs online right now, and they don’t support OAuth at all!
After a long afternoon of fever-dreams, you finally come down enough to see how Amazon keeps its API requests secure.
You aren’t sure why, but after reading the entire page on how to assemble a request for an AWS service, it still doesn’t make total sense to you. What’s with this “signature” thing? What is the data argument in the code examples?
So you keep searching for articles on “secure API design“…
You do run across a distillation of the basic concept that makes sense (yay!) and it goes something like this in plain English:
A server and a client know a public and private key; only the server and client know the private key, but everyone can know the public key… who cares what they know.
A client creates a unique HMAC (hash) representing its request to the server. It does this by combining the request data (arguments and values or XML/JSON or whatever it was planning on sending) and hashing the blob of request data along with the private key.
The client then sends that hash to the server, along with all the arguments and values it was going to send anyway.
The server gets the request and re-generates its own unique HMAC (hash) based on the submitted values using the same methods the client used.
The server then compares the two HMACs; if they are equal, then the server trusts the client, and runs the request.
That seems pretty straightforward. What was confusing you originally is that you thought the original request was being encrypted and sent, but really all the HMAC method does is create some unique checksum (hash) out of the arguments using a private key that only the client and server know.
Then it sends the checksum along with the original parameters and values to the server, and then the server double-checks the checksum (hash) to make sure it agrees with what the client sent.
Since, hypothetically, only the client and server know the private key, we assume that if their hashes match, then they can both trust each other, so the server then processes the request normally.
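To make the checksum idea concrete, here is a minimal sketch in Python using the standard library's hmac module. The key and the request data are made-up placeholder values, and HMAC-SHA256 is just one reasonable choice of algorithm.

```python
import hashlib
import hmac


def sign(private_key: bytes, request_data: bytes) -> str:
    """Return the hex HMAC-SHA256 of the request data under the shared private key."""
    return hmac.new(private_key, request_data, hashlib.sha256).hexdigest()


def verify(private_key: bytes, request_data: bytes, received_hmac: str) -> bool:
    """Recompute the HMAC on the server side and compare it to what the client sent."""
    expected = sign(private_key, request_data)
    # compare_digest does a constant-time comparison to avoid leaking timing information.
    return hmac.compare_digest(expected.encode("ascii"), received_hmac.encode("utf-8"))


# Placeholder values for illustration only.
private_key = b"hypothetical-private-key"
request_data = b"mode=start&number=4&order=desc"
client_hmac = sign(private_key, request_data)
```

Note that nothing here is encrypted: the request data travels as-is, and the HMAC only proves that whoever built the request knew the private key and that the data was not altered along the way.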
You realize that in real-life, this is basically like someone coming up to you and saying: “Jimmy told me to tell you to give the money to Johnny Two-toes“, but you have no idea who this guy is, so you hold out your hand and test him to see if he knows the secret handshake.
If he does, then he must be part of your gang and you do what he says… if he doesn’t know the secret handshake, you decide to shoot him in the face (you have anger issues).
You sort of get it, but then you wonder: “What is the best way to combine all the parameters and values together when creating the giant blob?” and luckily the guy behind tarsnap has your back and explains to you how Amazon screwed this up with Signature Version 1.
Now you re-read how Amazon Web Services does authentication and it makes sense. It goes something like:
- [CLIENT] Before making the REST API call, combine a bunch of unique data together (this is typically all the parameters and values you intend on sending, it is the “data” argument in the code snippets on AWS’s site)
- [CLIENT] Hash (HMAC-SHA1 or SHA256 preferably) the blob of data (from the previous step) with the private key assigned to you by the system.
- [CLIENT] Send the server the following data:
- Some user-identifiable information like an “API Key”, client ID, user ID or something else it can use to identify who you are. This is the public API key, never the private API key. It is a public value that anyone, even evil masterminds, can know without you minding. It is just a way for the system to know WHO is sending the request, not whether it should trust the sender (it will figure that out based on the HMAC).
- Send the HMAC (hash) you generated.
- Send all the data (parameters and values) you were planning on sending anyway. Probably unencrypted if they are harmless values, like “mode=start&number=4&order=desc” or other operating nonsense. If the values are private, you’ll need to encrypt them.
- (OPTIONAL) A straightforward way to protect against “replay attacks” on your API is to include a timestamp of some kind along with the request so the server can decide if this is an “old” request, and deny it. The timestamp must be included in the HMAC generation (effectively stamping a created-on time on the hash) in addition to being checked “within acceptable bounds” on the server.
- [SERVER] Receive all the data from the client.
- [SERVER] (see OPTIONAL) Compare the current server’s timestamp to the timestamp the client sent. Make sure the difference between the two timestamps is within an acceptable time limit (5-15mins maybe) to hinder replay attacks.
- [SERVER] Using the user-identifying data sent along with the request (e.g. API Key) look the user up in the DB and load their private key.
- [SERVER] Re-combine the same data together that the client did in the same way the client did it. Then hash (generate HMAC) that data blob using the private key you looked up from the DB.
- (see OPTIONAL) If you are protecting against replay attacks, include the timestamp from the client in the HMAC re-calculation on the server. Since you already determined this timestamp was within acceptable bounds to be accepted, you have to re-apply it to the hash calculation to make sure it was the same timestamp sent from the client originally, and not a made-up timestamp from a man-in-the-middle attack.
- [SERVER] Run that mess of data through the HMAC hash, exactly like you did on the client.
- [SERVER] Compare the hash you just got on the server, with the hash the client sent you; if they match, then the client is considered legit, so process the command. Otherwise reject the command!
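The whole round trip above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the parameter names (api_key, timestamp, signature), the in-memory PRIVATE_KEYS lookup standing in for the DB, the 15-minute window, and the choice to sign the sorted, URL-encoded query string are all made up for the example.

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

MAX_SKEW_SECONDS = 15 * 60  # assumed "acceptable bounds" for the timestamp

# Hypothetical stand-in for the server's DB of public API key -> private key.
PRIVATE_KEYS = {"my-public-api-key": b"my-private-key"}


def canonical_blob(params: dict) -> bytes:
    """Combine parameters the same way on both sides: sorted and URL-encoded."""
    return urlencode(sorted(params.items())).encode("utf-8")


def client_build_request(api_key: str, private_key: bytes, params: dict, now=None) -> dict:
    """[CLIENT] Add the API key and a timestamp, then sign everything."""
    signed = dict(params)
    signed["api_key"] = api_key
    signed["timestamp"] = str(int(time.time() if now is None else now))
    signature = hmac.new(private_key, canonical_blob(signed), hashlib.sha256).hexdigest()
    return {**signed, "signature": signature}


def server_verify(request: dict, now=None) -> bool:
    """[SERVER] Check the timestamp, look up the private key, recompute and compare."""
    received = dict(request)
    signature = received.pop("signature", "")
    current = time.time() if now is None else now
    try:
        timestamp = int(received.get("timestamp", ""))
    except ValueError:
        return False
    if abs(current - timestamp) > MAX_SKEW_SECONDS:
        return False  # too old (or too far in the future): possible replay
    private_key = PRIVATE_KEYS.get(received.get("api_key", ""))
    if private_key is None:
        return False  # unknown public API key
    expected = hmac.new(private_key, canonical_blob(received), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode("ascii"), signature.encode("utf-8"))
```

Because the timestamp is part of the signed blob, a middle man who swaps in a fresh timestamp to get past the time check will fail the HMAC comparison instead.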
REMINDER: Be consistent and careful with how you combine all parameters and values together. Don’t do what Amazon did with Auth Signature version 1 and open yourself up to hash-collisions! (Suggestion: just hash the whole URL-encoded query string!)
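Here is a small Python sketch of the general kind of ambiguity that warning is about. It is not a reproduction of Amazon's exact Signature Version 1 format; the parameter names are invented, and the point is only that gluing names and values together with no delimiters lets two different requests produce the same blob (and therefore the same HMAC).

```python
from urllib.parse import urlencode


def naive_blob(params: dict) -> str:
    """Glue names and values together with no delimiters (ambiguous)."""
    return "".join(name + value for name, value in sorted(params.items()))


def delimited_blob(params: dict) -> str:
    """URL-encode the sorted query string, so '=' and '&' inside values get escaped."""
    return urlencode(sorted(params.items()))


honest = {"name": "x", "role": "guest"}
forged = {"name": "xroleguest"}

# Both requests collapse to the same string, so they would share one signature.
naive_collides = naive_blob(honest) == naive_blob(forged)

# With delimiters and escaping, the two requests stay distinct.
delimited_collides = delimited_blob(honest) == delimited_blob(forged)
```

In this sketch naive_collides comes out True and delimited_collides comes out False, which is why signing the whole URL-encoded query string is the safer default.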
SUPER-REMINDER: Your private key should never be transferred over the wire. It is just used to generate the HMAC; the server looks the private key back up itself and recalculates its own HMAC. The public key is the only key that goes across the wire to identify the user making the call; it is OK if a nefarious evil-doer gets that value, because it doesn’t imply his messages will be trusted. They still have to be hashed with the private key and hashed in the same manner both the client and server are using (e.g. prefix, postfix, multiple times, etc.)
Update 10/13/11: Chris correctly pointed out that if you don’t include the URI or HTTP method in your HMAC calculation, it leaves you open to more hard-to-track man-in-the-middle attacks where an attacker could modify the endpoint you are operating on as well as the HTTP method… for example change an HTTP POST to /issue/create to /user/delete. Great catch Chris!
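A minimal Python sketch of Chris's point: fold the HTTP method and the path into the string that gets signed. The newline-separated layout and the function names are assumptions made for this example, not a format taken from any particular API.

```python
import hashlib
import hmac
from urllib.parse import urlencode


def string_to_sign(method: str, path: str, params: dict) -> bytes:
    """Build the blob from the HTTP method, the path and the sorted query string."""
    query = urlencode(sorted(params.items()))
    return "\n".join([method.upper(), path, query]).encode("utf-8")


def sign_request(private_key: bytes, method: str, path: str, params: dict) -> str:
    """HMAC-SHA256 over method + path + parameters."""
    return hmac.new(private_key, string_to_sign(method, path, params), hashlib.sha256).hexdigest()


key = b"hypothetical-private-key"
params = {"id": "42"}
original = sign_request(key, "POST", "/issue/create", params)
redirected = sign_request(key, "POST", "/user/delete", params)
```

With this layout, a signature made for POST /issue/create no longer validates if someone rewrites the request to hit /user/delete or changes the HTTP method, because the server's recomputed HMAC will differ.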
It’s been a long few days, but you finally figured out a secure API design and you are proud of yourself. You are super-extra proud of yourself because the security method outlined above actually protects against another popular way of hacking API access: side-jacking.
Session sidejacking is where a man-in-the-middle sniffs network traffic and doesn’t steal your credentials, but rather steals the temporary Session ID the API has given you to authenticate your actions with the API for a temporary period of time (e.g. 1hr). With the method above, because the individual requests themselves are checksummed, there is no Session ID to steal and re-use by a nefarious middle man.
I am relatively new to the RESTful API game, focusing primarily on client-side libraries. If I missed something please point it out and I’ll fix it right up. If you have questions, suggestions or ideas that you think should go into the story above, please leave a comment below.
Alternatively you can email me and we can talk about friendship, life and canoeing.
Gotchas (Problems to Watch For)
<This section was removed, because by using UTC time you avoid the daylight-savings-time issue altogether and my solution proposed here was stupid anyway.>
Additional Thoughts for APIs
What about the scenario where you are writing a public-facing API like Twitter, where you might have a mobile app deployed on thousands of phones and you have your public and private keys embedded in the app?
On a rooted device, those users could likely decompile your app and pull your private key out. Doesn’t that leave the private key open to being compromised?
Yes, yes it does.
So what’s the solution?
Taking a hint from Twitter, it looks like to some degree you cannot avoid this. Your app needs to have its private key (they call it a secret key) and that means you are open to getting your private key compromised.
What you can do though is to issue private keys on a per-application-basis, instead of on a per-user-account basis. That way if the private key is compromised, that version of the application can be banned from your API until new private keys are generated, put into an updated version of the app and re-released.
What if the new set of keys gets compromised again?
Well yes, that is very possible. You would have to combat this in some way on your own, like encrypting the keys with another private key… or praying to god people will stop hacking your software.
Regardless, you would have to come up with some 2nd layer of security to protect that new private key, but at least there is a way to get the apps deployed in the wild working again (new version) instead of the root account being locked and NONE of the apps being able to access the service again.
Update #1: There is some fantastic feedback and there are some great ideas on securing a web API down in the comments; I would highly recommend reading them.
Some highlights are:
- Use “nonce” (1-time-use-server-generated) tokens to stop replay attacks AND implement idempotency in your API.
- The algorithm above is “95% similar to ‘two-legged’ OAuth 1.0“, so maybe look at that.
- Remove all the security complexity by sending all traffic over SSL (HTTPS)!
Update #2: I have since looked at “2-legged OAuth” and it is, as a few readers pointed out, almost exactly the process described above. The advantage being that if you write your API to this spec, there are plenty of OAuth client libraries available for implementors to use.
The only OAuth-specific things of note being:
- The OAuth spec is super-specific about how you need to encode your params, order them and then combine them all together when forming the HMAC (called the “method signature” in OAuth)
- OAuth, when using the HMAC-SHA1 signature method, requires that you send along a nonce. The server or “provider” must keep the nonce value along with the timestamp associated with the request that used that nonce on-record to verify that no other requests come in with the SAME nonce and timestamp (indicating a “replay” attempt). Naturally you can expire these values from your data store eventually, but it would probably be a good idea to keep them on-file for a while.
- The nonce doesn’t need to be a secret. It is just a way to associate some unique token to a particular timestamp; the combination of the two is like a thumbprint saying “at 12:22pm a request with a nonce token of HdjS872djas83 was received”. And since the nonce and timestamp are included in the HMAC hash calculation, no nefarious middle-man can ever try and “replay” that previous message AND successfully hash his request to match yours without the server seeing the same timestamp + nonce combination come back in; at which point it would say “Hey! A request with this thumbprint showed up two hours ago, what are you trying to do?!”
- Instead of passing all this as GET params, all these values get jammed into one giant “Authorization” HTTP header and comma-separated.
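As a rough illustration of that last point, here is a Python sketch that packs a set of OAuth-style values into a single Authorization header value. The parameter values are placeholders, the helper name is made up, and a real OAuth 1.0 client has to follow the spec's full rules (including how the signature itself is computed), so treat this as a picture of the header's shape rather than a compliant implementation.

```python
from urllib.parse import quote


def oauth_authorization_header(oauth_params: dict) -> str:
    """Join percent-encoded name="value" pairs, comma-separated, behind the OAuth scheme."""

    def encode(value) -> str:
        # safe="" percent-encodes everything except unreserved characters.
        return quote(str(value), safe="")

    pairs = ", ".join(
        f'{encode(name)}="{encode(value)}"' for name, value in sorted(oauth_params.items())
    )
    return "OAuth " + pairs


# Placeholder values for illustration only.
header_value = oauth_authorization_header({
    "oauth_consumer_key": "my-public-api-key",
    "oauth_nonce": "HdjS872djas83",
    "oauth_signature": "placeholder/signature=",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": "1318467427",
})
```

The client would then send header_value as the value of the Authorization header instead of appending these values to the query string.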
Those are pretty much the high points of 2-legged OAuth. The HMAC generation using the entire request and all the params is still there, sending along the timestamp and a nonce is still there and sending along the original request args is still there.
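The nonce bookkeeping described above might look something like this on the server. It is a sketch under assumptions: an in-memory dict stands in for a real data store, the class and method names are invented, and it is meant to run only after the HMAC has already been verified.

```python
import time


class NonceCache:
    """Remember (api_key, nonce, timestamp) thumbprints for as long as they could be replayed."""

    def __init__(self, ttl_seconds: int = 15 * 60):
        self.ttl_seconds = ttl_seconds
        self._seen = {}  # (api_key, nonce, timestamp) -> time after which it can be forgotten

    def check_and_store(self, api_key: str, nonce: str, timestamp: int, now=None) -> bool:
        """Return True for a fresh request, False for a stale or replayed one."""
        current = time.time() if now is None else now
        # Forget thumbprints whose timestamps are now too old to be accepted anyway.
        self._seen = {key: expiry for key, expiry in self._seen.items() if expiry >= current}
        if abs(current - timestamp) > self.ttl_seconds:
            return False  # outside the acceptable time window
        thumbprint = (api_key, nonce, timestamp)
        if thumbprint in self._seen:
            return False  # same nonce + timestamp seen before: replay attempt
        self._seen[thumbprint] = timestamp + self.ttl_seconds
        return True
```

Expiring entries once their timestamp falls outside the accepted window keeps the store from growing forever, since a request that old would be rejected on the timestamp check alone.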
When I finally get around to implementing 2-legged OAuth from a server perspective, I’ll write up another article on it.