In recent years, Amazon has become nearly synonymous with the cloud. Hundreds of major Internet services like Dropbox, Netflix, and Instagram rely on AWS for all or major portions of their infrastructure. The Amazon cloud is so important that outages make the cover of the Wall Street Journal.
In this post, we ask the question: how big is Amazon’s cloud?
Amazon is clearly big.
For example, Amazon’s press releases like to tout highly technical statistics like the rate of S3 requests (650,000 per second!) and number of objects (900 billion!). Somewhat more insightfully, outside analysts have estimated the number of servers and revenue ($200 million in 2010). But none of this really gives a picture of Amazon’s growing role underpinning the global Internet economy.
So in collaboration with several network provider research partners, we conducted one of the largest studies of its kind, analyzing multiple weeks’ worth of network data to AWS from a broad cross section of several million Internet end-users (mainly in North America). Our goal was to characterize AWS traffic, understand the major companies using AWS infrastructure, and ultimately gauge the importance of AWS to the Internet infrastructure and to the daily services and browsing of end-users.
The chart below summarizes some of our key findings.
One way to gauge the importance of Amazon is to ask how frequently a typical Internet user visits a web site based on Amazon infrastructure. The answer: an amazing one-third of all users do so every day. This number is all the more impressive when you consider that our data includes millions of users and end devices with limited scope or activity, such as users who only check mail, and home game consoles.
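As a rough illustration of how a daily-reach figure like this could be computed, here is a minimal sketch. It is not the code used in the study; the function name `average_daily_reach`, the `(day, user, site)` record layout, and the sample site names are all hypothetical, and it assumes each record can already be attributed to a user and a site.

```python
from collections import defaultdict

def average_daily_reach(visits, aws_sites):
    """Average, over days, of the fraction of active users who
    visited at least one AWS-hosted site that day.

    visits:    iterable of (day, user_id, site) tuples (hypothetical layout)
    aws_sites: set of site names assumed to run on AWS
    """
    all_users = defaultdict(set)   # day -> every user seen that day
    aws_users = defaultdict(set)   # day -> users who touched an AWS site
    for day, user, site in visits:
        all_users[day].add(user)
        if site in aws_sites:
            aws_users[day].add(user)
    if not all_users:
        return 0.0
    daily = [len(aws_users[d]) / len(all_users[d]) for d in all_users]
    return sum(daily) / len(daily)

# Made-up sample data: three users per day, one of whom reaches an AWS site.
sample_visits = [
    ("day1", "u1", "aws-hosted.example"),
    ("day1", "u2", "other.example"),
    ("day1", "u3", "other.example"),
    ("day2", "u1", "other.example"),
    ("day2", "u2", "aws-hosted.example"),
    ("day2", "u3", "other.example"),
]
reach = average_daily_reach(sample_visits, {"aws-hosted.example"})
```

Counting distinct users per day, rather than raw connections, keeps one heavy user from inflating the figure.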
[Note: Since our study focused on subscriber traffic, we excluded servers (such as consumers hosting web sites) and Internet "background noise", including the nearly constant barrage of scanning and intrusion attempts from China, botnets, machine-to-machine communication for software updates, etc. Although they use a different dataset, our earlier academic papers provide more background on the related methodology.]
Traffic volume provides another metric, albeit an indirect one, of Amazon’s growing Internet presence. As of April 2012, Amazon contributes more than one percent of all consumer Internet traffic in North America. This is a huge number given that Amazon, unlike, say, Google, does not typically host massive video content. Instead, this one percent represents the broad reach of Amazon infrastructure across hundreds of client companies. By comparison, we found that all of Google’s sprawling YouTube infrastructure contributed six percent of Internet traffic in 2010.
Finally, we looked at Amazon’s growing content distribution network (CDN). Over the last several years, CDNs have evolved into the workhorses of the Internet, delivering the majority of images, video and other content to end users. Since its launch in 2008, Amazon’s CloudFront CDN, together with its S3 distributed storage service, has steadily gained in popularity. As of today, Amazon ranks as the fourth largest CDN by traffic volume (behind Akamai, Limelight and Level 3).
Now on to our final question: what companies are using Amazon cloud infrastructure?
In the table below, we show the 40 largest corporate users of Amazon’s cloud infrastructure (contact us for a complete list).
As an estimate of the importance of AWS to each company, we calculated the average percentage of all subscriber AWS connections that access one or more of each site’s AWS components each day. So, for example, in the top spot, 21% of subscriber connections to AWS go to truste.com. Like many of the top AWS corporate users, truste.com is an advertising / analytics company (as are InviteMedia, Chartbeat, Evidon, etc.). Although most consumers remain blissfully unaware, almost every web page they visit is tracked, analyzed and scored by dozens of analytics and marketing companies (a large number of them using Amazon infrastructure).
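To make the per-company metric concrete, the sketch below shows one way it could be computed. This is an illustration under stated assumptions, not the study's actual pipeline: the function name `average_daily_share` and the `(day, site)` record layout are hypothetical, and it assumes every subscriber connection to AWS has already been attributed to exactly one site.

```python
from collections import defaultdict

def average_daily_share(records):
    """For each site, the percentage of that day's AWS connections going
    to the site, averaged over all days in the data.

    records: iterable of (day, site) tuples, one per subscriber
             connection to AWS (hypothetical layout)
    """
    total = defaultdict(int)                         # day -> all AWS connections
    by_site = defaultdict(lambda: defaultdict(int))  # day -> site -> connections
    for day, site in records:
        total[day] += 1
        by_site[day][site] += 1

    days = list(total)
    sites = {s for counts in by_site.values() for s in counts}
    shares = {}
    for site in sites:
        # A day on which the site was never seen counts as 0%.
        daily = [100.0 * by_site[d].get(site, 0) / total[d] for d in days]
        shares[site] = sum(daily) / len(days)
    return shares

# Made-up sample data: two days, four AWS connections per day.
sample = [
    ("day1", "a.example"), ("day1", "a.example"),
    ("day1", "b.example"), ("day1", "b.example"),
    ("day2", "a.example"), ("day2", "b.example"),
    ("day2", "b.example"), ("day2", "b.example"),
]
shares = average_daily_share(sample)
# a.example: (50% + 25%) / 2 = 37.5%; b.example: (50% + 75%) / 2 = 62.5%
```

Averaging the daily percentages, instead of pooling all connections across the whole period, gives each day equal weight regardless of how busy it was.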
Many of the companies above are familiar consumer names like Dropbox, Netflix, Instagram and Pinterest. Others, like Heroku, provide behind-the-scenes platform as a service (PaaS) to hundreds of other companies running cloud applications. And still many other companies (including my own) use Amazon infrastructure for their internal enterprise applications and back-office support.
Overall, Amazon enjoys a commanding lead in the much ballyhooed, mind-blowingly large $200 billion anticipated cloud computing market. But the war for cloud dominance is just beginning. Companies like Rackspace, CSC, Microsoft and Google are investing billions in datacenters and software to compete. In upcoming posts, we’ll explore the infrastructure and Internet footprint of some of these other large cloud players.