Running Magento in an AWS Environment

I hosted Magento on AWS from 2011 until 2013. Many of the benefits you gain from cloud infrastructure over traditional dedicated or shared hosting are better described under the topic of DevOps; they are not exclusive to AWS, but AWS makes them much easier to achieve.

  • Complete and flexible control over your capacity planning: scale up ahead of marketing events, enable dynamic provisioning via Elastic Beanstalk, scale down during low-volume periods, spin up a site in a couple of weeks, then tear it down and throw it away.
  • Easily set up dev/test/staging environments and replicate changes between them.
  • Host your admin pages on a separate host.
  • Use DynamoDB for session management and ElastiCache for caching without incurring additional operations overhead; be aware, though, that ElastiCache doesn't currently work inside a VPC.
  • Use ELBs for load balancing, but be aware that requests taking longer than 60 seconds are terminated ungracefully.
  • Use Amazon SES for sending email (now even easier via regular SMTP), though gaps still exist in the tooling for tracking bounces and complaints.
  • Use Amazon S3 for hosting media and CloudFront as a CDN.
  • Reduce operations cost for database activity via RDS, which features point-in-time restore, automated failover, automated backups, and automated upgrades.
  • DNS management is super easy in Route 53, which is especially recommended for ease of mapping subdomains to load balancers.
  • Use a VPC to put all your machines on their own private network, giving you more granular control and letting you expose access as you see fit via your own VPN tunnels.
  • Easy performance metrics and alerting via CloudWatch and SNS (memory usage and disk usage aside, which aren't reported out of the box).
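To make the DynamoDB session idea above concrete, here is a minimal sketch of the read/write/destroy pattern a DynamoDB-backed session handler follows. A plain dict stands in for the DynamoDB table, and the function names and TTL value are illustrative, not a real library API:

```python
import json
import time

# A plain dict stands in for the DynamoDB sessions table in this sketch;
# a real handler would issue GetItem/PutItem/DeleteItem calls via the AWS SDK.
_fake_table = {}

SESSION_TTL = 1440  # seconds, mirroring PHP's default gc_maxlifetime


def session_read(session_id):
    """Fetch a session, treating expired rows as missing."""
    item = _fake_table.get(session_id)
    if item is None or item["expires_at"] < time.time():
        return {}
    return json.loads(item["data"])


def session_write(session_id, data):
    """Persist the session with a fresh expiry timestamp."""
    _fake_table[session_id] = {
        "data": json.dumps(data),
        "expires_at": time.time() + SESSION_TTL,
    }


def session_destroy(session_id):
    """Remove the session row entirely (e.g. on logout)."""
    _fake_table.pop(session_id, None)
```

In production the same three operations map onto DynamoDB item calls, and expired rows are cleaned up by a periodic sweep rather than on read alone.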

A minimal footprint is 1 ELB, 2 EC2 web servers in separate AZs, 1 multi-AZ RDS instance, and a Route 53 hosted zone for the domain. Initially you can use sticky sessions on the ELB to keep session management simple, but as your traffic increases you'll want to move media into a CDN (S3 and CloudFront) and move sessions off the individual machines.
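That minimal footprint can be captured in a CloudFormation template. The fragment below is only a sketch: the AMI ID, AZs, instance sizes, and credentials are placeholders you would replace, and a real template would also wire up security groups and the Route 53 record:

```json
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Sketch of a minimal Magento footprint: ELB + 2 web servers + multi-AZ RDS",
  "Resources": {
    "WebServerA": {
      "Type": "AWS::EC2::Instance",
      "Properties": { "ImageId": "ami-xxxxxxxx", "InstanceType": "c3.large",
                      "AvailabilityZone": "us-east-1a" }
    },
    "WebServerB": {
      "Type": "AWS::EC2::Instance",
      "Properties": { "ImageId": "ami-xxxxxxxx", "InstanceType": "c3.large",
                      "AvailabilityZone": "us-east-1b" }
    },
    "LoadBalancer": {
      "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
      "Properties": {
        "AvailabilityZones": ["us-east-1a", "us-east-1b"],
        "Instances": [{ "Ref": "WebServerA" }, { "Ref": "WebServerB" }],
        "Listeners": [{ "LoadBalancerPort": "80", "InstancePort": "80",
                        "Protocol": "HTTP", "PolicyNames": ["MagentoSession"] }],
        "AppCookieStickinessPolicy": [{ "PolicyName": "MagentoSession",
                                        "CookieName": "frontend" }]
      }
    },
    "Database": {
      "Type": "AWS::RDS::DBInstance",
      "Properties": { "Engine": "MySQL", "MultiAZ": true,
                      "DBInstanceClass": "db.m1.large", "AllocatedStorage": "20",
                      "MasterUsername": "magento", "MasterUserPassword": "changeme" }
    }
  }
}
```

The app-cookie stickiness policy pins each visitor to one web server by Magento's frontend session cookie, which is what lets you defer a shared session store at the start.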

Areas I haven't explored yet but that look promising: CloudFormation scripts for easier deployment of a Magento stack; offloading order creation to DynamoDB and worker queues for greater checkout throughput (someone recently started a project to do this with MongoDB at one of the hackathons); and setting up a multi-region presence via latency-based routing in Route 53.

I guess I'm something of a cloud evangelist. Specific to AWS, c3.large is a decent instance size for production web servers, but I'd start with the smallest instance in each class, measure performance, and scale up or optimize code as needed; this is why I constantly point everyone to xhgui.


This is how we do it for the Angry Birds webshop:

English presentation at Magento Imagine 2012.

German presentation at Meet Magento #6.12

The current issue of the German "PHP Magazin" also has a 6-page article with some details.

Having read all of Fabrizio's presentations linked above many times over, I think this answer is truly the best one, though I agree it could use more explanation and an extraction of the key ideas from the presentations (especially since the original first link had already 404'd by the time I posted this update).

The only thing I would add to the key concepts in the presentations is that modern advancements in AWS and competing technologies suggest some tweaks. For example, CloudFront now supports gzip for CDN performance improvements, though it's not as fast as CloudFlare, nor does it give you the free SSL termination that CloudFlare offers. Route 53 DNS is likewise not as fast or feature-rich as CloudFlare's, and AWS has no comparable Web Application Firewall or DDoS protection, all of which are included in CloudFlare's offerings.

There are a few other possible ways to improve on Fabrizio's original presentations, but I wouldn't be a good consultant if I gave away EVERYTHING I knew in every StackExchange post I answered, now would I? Plus, some of the newest offerings would substantially change the suggestions in the original presentations, all of which STILL offer great performance, even if more could be squeezed out of AWS with different options.

Summary of Key Concepts:

  1. Know Your Bottlenecks Intimately, and optimize appropriately: each tier of the stack has its own bottlenecks (bandwidth, CPU, database), and solving them requires a different solution optimized for each specific challenge. Really, though, caching is the common element at every level, which leads to...

  2. Cache All The Things: leverage AWS systems where possible (ElastiCache for Redis/Memcached-style data caching; CloudFront for caching image, JS, and CSS assets nearest to end users via the CDN) and Varnish for speeding up server instance responses to the initial asset-level caching requests from the CDN. Also, be sure to compress and minify in your deployment systems BEFORE deploying to the CDN.
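The data-caching half of that point boils down to the cache-aside pattern. Here is a minimal sketch of it, with an in-process dict standing in for an ElastiCache node; the function names and TTL are illustrative, not any particular client library's API:

```python
import hashlib
import time

# In-process dict stands in for ElastiCache (memcached/Redis) in this sketch.
_cache = {}


def cache_key(*parts):
    """Build a stable, length-safe cache key from its identifying parts."""
    raw = "|".join(str(p) for p in parts)
    return hashlib.md5(raw.encode()).hexdigest()


def get_or_set(key, loader, ttl=300):
    """Cache-aside: return the cached value, or compute, store, and return it."""
    entry = _cache.get(key)
    if entry is not None and entry[1] > time.time():
        return entry[0]          # cache hit, still fresh
    value = loader()             # expensive work: DB query, block render, etc.
    _cache[key] = (value, time.time() + ttl)
    return value
```

Typical usage would be something like `get_or_set(cache_key("product", 42, "en_US"), lambda: render_product(42))`, where `render_product` is whatever expensive call you are shielding.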

  3. Autoscaling is Essential: demand changes frequently, and faster than you can monitor and react to manually. Adapting to these changes in real time requires the automation tools available in AWS, like Auto Scaling Groups, to spin up the pieces of the system best suited to the task. AWS handles this transparently for CloudFront, Route 53, Elastic Load Balancers, and S3 buckets; you have to handle it yourself through sizing and auto-scaling for EC2 instances, and through sizing and tuning alone for the RDS and ElastiCache tiers.
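The core of the EC2 scaling decision is simple arithmetic, which a target-tracking-style sketch makes explicit. The target of 50% CPU and the fleet bounds below are illustrative numbers, not AWS defaults:

```python
import math


def desired_capacity(current_instances, avg_cpu_percent, target_cpu=50.0,
                     min_instances=2, max_instances=10):
    """Target-tracking sketch: size the fleet so average CPU lands near target_cpu.

    If the fleet averages 80% CPU against a 50% target, you need
    80/50 = 1.6x the current capacity, rounded up, clamped to the
    configured minimum and maximum fleet sizes.
    """
    if avg_cpu_percent <= 0:
        return min_instances
    wanted = math.ceil(current_instances * avg_cpu_percent / target_cpu)
    return max(min_instances, min(max_instances, wanted))
```

An Auto Scaling Group does this for you from CloudWatch metrics; the value of the sketch is seeing why the floor of two instances matters — it preserves the two-AZ minimal footprint even when traffic is idle.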

  4. Automation is the only way to tie all of this together effectively: with so many interrelated components, some of which have to be initialized at deploy time and some right after deployment, managing a system tuned for optimum performance requires automation. Leveraging deployment and systems automation for cache clearing, cache warming, image processing, and so on is the only reasonable way to manage this many subsystems and keep them well-oiled and problem-free.
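As one concrete example of that post-deploy automation, here is a cache-warming sketch: after clearing caches, request the most important URLs so real visitors never hit a cold cache. The fetch function is injected (in a real pipeline it would be urllib or curl, and the URL list would come from your sitemap), which also keeps the logic testable without a live site:

```python
def warm_cache(urls, fetch):
    """Request each URL once after a deploy; return the ones that failed
    so the pipeline can flag them instead of silently shipping a cold or
    broken page."""
    failures = []
    for url in urls:
        try:
            status = fetch(url)
        except Exception:
            failures.append(url)
            continue
        if status != 200:
            failures.append(url)
    return failures
```

Wiring this into the deploy script means a bad template change shows up as a failed warm-up in the pipeline, not as a slow or blank homepage for the first customers.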

  5. But really, even that's not possible without test automation: with this many moving parts, something will break with almost any change. And you will need to change to keep up with developments in Magento and AWS, and those happen OFTEN. So to keep the cost of change to a minimum, all forms of testing need to be implemented and fully automated, from unit tests to integration tests to Selenium-based functional tests of the actual site, launched in testing configurations that mimic the production environment. Now you're REALLY glad you automated all your deployment processes, right?
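At the bottom of that pyramid sit plain unit tests. The sketch below shows the shape of one; the function under test is a hypothetical toy, not Magento code, and exists only to illustrate the pattern:

```python
import unittest


def order_total(subtotal, shipping, tax_rate):
    """Toy checkout calculation used only to illustrate the testing layer."""
    if subtotal < 0 or shipping < 0 or not (0 <= tax_rate < 1):
        raise ValueError("invalid order inputs")
    return round((subtotal + shipping) * (1 + tax_rate), 2)


class OrderTotalTest(unittest.TestCase):
    def test_happy_path(self):
        self.assertEqual(order_total(100.0, 10.0, 0.10), 121.0)

    def test_rejects_negative_subtotal(self):
        with self.assertRaises(ValueError):
            order_total(-1.0, 0.0, 0.0)
```

Run via `python -m unittest` in CI on every push; the Selenium and integration layers then only have to catch what the cheap tests can't.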


A slightly simpler (!) solution is to install just as you would on any other VPS. I've been offering a free image for a few years now; lately I've concentrated on the new Sydney DC, as it's local to me. More details at http://www.greengecko.co.nz/magento_on_amazon_ec2 if you're interested. Practically zero pain getting started: one click and you're there. Point your browser at the instance for more details. It makes a good starting point, but look at and modify /etc/rc.local if you're going to build upon it.

The important thing to realise is that the instances are pretty low-powered. Obviously throwing a lot of money at the app improves this, but for even a moderately small webshop a medium instance is the absolute minimum, just to get multiple cores, and really a large is the smallest size that's workable.

Also, Amazon storage is slow, so it's even more important than usual to deliver everything you possibly can from memory: tuned databases, memory-backed caches, and so on are imperative.
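On the database side, that mostly means MySQL settings along these lines. The fragment below is illustrative only; the actual values depend entirely on your instance's RAM and workload, so treat them as placeholders to tune, not recommendations:

```ini
# Illustrative my.cnf fragment for a memory-heavy Magento MySQL setup
[mysqld]
innodb_buffer_pool_size        = 2G    # keep the working set in RAM, not on slow EBS
innodb_flush_log_at_trx_commit = 2     # relax durability slightly to cut fsyncs
query_cache_size               = 64M
query_cache_type               = 1
tmp_table_size                 = 256M  # keep Magento's big GROUP BY temp tables off disk
max_heap_table_size            = 256M
```

The buffer pool size is the single biggest lever: if the catalog and sales tables fit in it, most reads never touch EBS at all.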

Once you get that sorted, it works OK. The requirement to run in a VPC if you want more than one IP address is really annoying (especially if you don't realise it when you start out!), but that's really the only gotcha you'll come across.

It's simple to expand the platform on the fly. Eventually the only bottleneck becomes the amount of processing power available to PHP (inefficient code aside!), and running multiple 'engines' in parallel is probably the simplest option, bringing extras online when necessary.

Enjoy!

Steve