Last night I was reading through the Amazon Web Services Forums and ran across this post that has vexed me since then. I imagine this post doesn’t come as a surprise to folks very familiar with AWS, but since I didn’t previously know how Elastic Load Balancers were implemented, it came as a surprise to me. It is a potential security hole for your customers, and I thought it deserved more publicity so that an improvement could be worked towards.
I’ll sum up the thread and the problem:
- AWS hosts a lot of websites. Some very popular ones that see billions of pageviews a month.
- Many websites on AWS use Elastic Load Balancing in conjunction with Auto Scaling to scale their applications in real-time and spread in-bound traffic across all the backing EC2 instances used to host their site.
- Elastic Load Balancers are like any other EC2 resource; Amazon scales them up and down to best handle the level of traffic you are getting. (I wasn’t aware of this detail previously.)
- Just like any other EC2 resource, the IP addresses assigned to Elastic Load Balancers are subject to change. To avoid issues as best they can, Amazon sets very short TTLs on the ELB DNS records so that clients don’t cache them for too long; when a site’s ELB IP address changes, a well-behaved client sees the change and stops trying to hit the old IP address.
- PROBLEM: Not all clients are well-behaved and some cache IP addresses longer than they should. As a side effect, it doesn’t seem uncommon for other customers on AWS to temporarily receive traffic intended for another site, sometimes very popular sites, after having their ELB assigned the IP address previously belonging to another account’s ELB.
- SECURITY RISK: Assuming you were a particularly nasty individual, you could set up a single EC2 instance fronted by an ELB in each availability zone in each region in EC2, simply waiting for traffic that wasn’t yours. You could then either capture that data to snoop on other people’s customers or proxy the requests through (as best you can) to the real site and relay the replies, becoming a man-in-the-middle and continuing to sniff or manipulate the traffic.
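To make the caching problem in the list above concrete, here is a minimal sketch of a single client that sends one request per second and caches its DNS answer for a fixed time. Everything in it is an assumption made for illustration: the hostname, the documentation-range IP addresses, the swap time, and the `CachingClient` and `simulate` names are hypothetical and have nothing to do with how AWS actually implements ELBs.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachingClient:
    """Hypothetical client that caches a DNS answer for `cache_seconds`."""
    cache_seconds: int
    cached_ip: Optional[str] = None
    cached_at: int = 0

    def resolve(self, now: int, dns: Dict[str, str], host: str) -> str:
        # Re-query DNS only when the cache is empty or has expired.
        if self.cached_ip is None or now - self.cached_at >= self.cache_seconds:
            self.cached_ip = dns[host]
            self.cached_at = now
        return self.cached_ip


def simulate(cache_seconds: int, swap_at: int = 100, duration: int = 400) -> int:
    """Count requests (one per second) sent to the old IP after the swap."""
    host = "www.example.com"          # placeholder hostname
    dns = {host: "203.0.113.10"}      # IP currently held by the site's ELB
    client = CachingClient(cache_seconds)
    misdirected = 0
    for now in range(duration):
        if now == swap_at:
            # The ELB gets a new IP; the old one goes back to the pool
            # and may be handed to another account.
            dns[host] = "203.0.113.20"
        ip = client.resolve(now, dns, host)
        if now >= swap_at and ip != dns[host]:
            misdirected += 1
    return misdirected


if __name__ == "__main__":
    for seconds in (1, 60, 300):
        print(seconds, simulate(seconds))
```

In this toy model a client that honors a 60-second TTL still sends 20 requests to the old address, and one that holds the answer for 300 seconds sends 200. Only a client that re-resolves on every request sends none.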
I’ll admit there is some hand-waving here with regard to how easy this would be. It’s not a walk in the park and you would need to know what you were doing, but the security risk to your users still stands: AWS customers you don’t know can potentially receive some of your traffic when you use an Elastic Load Balancer on AWS.
The real gotcha is that there isn’t likely anything AWS can do about this. No matter how well-behaved the client and how short the TTL on the ELB IP address, there is still a window after an IP address is swapped out during which client traffic will be directed to the new owner of the IP address (an unintended middle man) before the client updates and starts sending it to the correct endpoint. In the case of poorly behaved clients that cache the IP addresses too long, this just gets worse.
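As a rough back-of-the-envelope sketch of that window, assume every client honors the TTL and that their cache refresh times are spread evenly across it. At `s` seconds after the swap, a fraction `1 - s / ttl` of clients still holds the old address, which averages out to half of the traffic over one TTL. The function name and the example numbers below are hypothetical illustrations, not measurements from AWS.

```python
def misdirected_requests(requests_per_second: float, ttl_seconds: float) -> float:
    """Estimate requests sent to the old IP after a swap.

    Assumes well-behaved clients whose cache expiry times are uniformly
    distributed over the TTL, so the stale fraction falls linearly from
    1 to 0 during the TTL window (an average of one half).
    """
    return requests_per_second * ttl_seconds / 2


if __name__ == "__main__":
    # Hypothetical site: 1,000 requests per second, 60-second TTL.
    print(misdirected_requests(1000, 60))
```

Under those assumptions, a site serving 1,000 requests per second with a 60-second TTL would send roughly 30,000 requests to the old address per swap, before counting any clients that ignore the TTL.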
The severity of the issue scales in proportion to the sensitivity of the information your web application is sending back and forth between client and server. For example, login credentials, account IDs, private API keys and other customer-specific information that isn’t intended to be shared publicly could all be logged by other AWS customers.
I don’t have a workaround for this problem, and the only “solution” I can come up with is for Amazon to change ELB behavior to automatically create and use an Elastic IP address that the AWS system keeps updated every time the ELBs are swapped. This is just a thought-solution that I’d be curious to hear opinions on from folks more familiar with AWS; I have no idea if it would actually work. Again, this depends on how EIPs and ELBs are implemented.
If you wanted to close this hole right now, I think you’d have to stop using an ELB, but who are we kidding… if you are on EC2 then Auto Scaling and ELBs are the whole reason you are using AWS. Otherwise you would just go to Linode (or your favorite VPS shop), buy up a ton of nodes yourself and do everything manually.
I want this post to shed some more light on this issue and to help collect ideas on how to close this hole. The AWS team, from what I’ve seen over the last few years, is extremely responsive to high-profile features the community is demanding. I am sure that with more eyes looking at this, a good solution can be born.
I will keep this post updated with any thoughts or feedback collected.
Update #1: Eric from the AWS team has responded, pointing out that load-balancers already use elastic IPs and that a more robust solution is in the works:
ELBs do use Elastic IPs, but we can’t unmap the EIP and remap the EIP without breaking existing connections. Therefore, we do the DNS updates to point to new EIPs and that way we don’t break existing connections.
Switching IPs isn’t ideal either, as failing to respect the DNS TTL issue can lead to issues. We’re currently working on variations that can help us do immense scaling and not break connections in use, while at the same time mitigating the impact of DNS caching.
Many thanks to the AWS team for being on top of this already.