RockThePost.com is a LAMP stack hosted on EC2. We're preparing to be featured in an email that will be sent to ~1M investors, all at the same time. For our two-person engineering department, that meant we had to do a quick sanity check to see just how many people we can support concurrently.
Our app uses PHP's Zend Framework 2. Our web servers are two m1.medium EC2 machines with an ELB in front of them, set up to split the load. On the backend, we're running a master/slave MySQL database configuration. This is very typical for most of the small data shops I've worked at.
Here are my opinions and thoughts, in no particular order.
- Use PHP's APC (Alternative PHP Cache) extension. I don't understand why this isn't on by default, but having an opcode cache like APC enabled is really a requirement for a PHP website to have a chance at performing well.
- Put everything that's not a .php request on a CDN. No need to bog down your origin server with static file requests. Our deployer puts everything on S3 and we use absolute URLs that point to CloudFront. Disclaimer: CloudFront has had some issues lately, so for now we've set the site to serve all static files directly from S3 until those issues are resolved.
- Don't make connections to other servers in your PHP code, such as database and memcache servers, unless it's mandatory and there's really NO other way to do what you're trying to do. I'd guess that the majority of PHP web apps out there use a MySQL database for backend session management and a memcache pool for caching. Making network connections to other servers during a request is not efficient: the request blocks while it waits, which ties up the web server process and tends to hold up the line, as it were. Instead, use the APC key/value store for storing data in PHP and Varnish for full page caching.
- I repeat... Use Varnish. In all likelihood, most of the pages on your site do not change, or hardly ever change. Varnish is the Memcache/ModRewrite of web server caching. Putting Varnish on my web server made the single biggest difference to my web app when I was load testing it.
- Use a c1.xlarge. The m1.medium has only 1 CPU to handle all of the requests. I found that upgrading to a c1.xlarge, which has 8 CPUs, really paid off. During my load test, I ran apt-get install nmon, started nmon, and monitored the CPU. You can literally watch the server use all 8 of its cores to churn out pages.
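To make the "keep it local" point a bit more concrete, here is a rough sketch of the pattern, written in Python rather than PHP purely for illustration. It is not our actual code. The class LocalCache, the key "homepage_listings", and the 60 second TTL are all made up for the example; in PHP, the equivalent calls on the APC key/value store would be apc_fetch and apc_store.

```python
import time


class LocalCache:
    """Tiny in-process key/value cache with per-entry expiry.

    Hypothetical stand-in for a local store such as APC's user cache.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, ttl_seconds, compute):
        """Return the cached value, calling compute() only on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: no trip to another server
        value = compute()  # miss: pay for the expensive lookup once
        self._entries[key] = (now + ttl_seconds, value)
        return value


backend_calls = []


def load_homepage_listings():
    """Pretend this is a query against the database server (illustrative data)."""
    backend_calls.append(1)
    return ["listing-1", "listing-2"]


cache = LocalCache()
first = cache.get_or_compute("homepage_listings", 60, load_homepage_listings)
second = cache.get_or_compute("homepage_listings", 60, load_homepage_listings)
print(len(backend_calls))  # the backend was only asked once
```

The trade-off is staleness: anything served from the local cache can be up to one TTL out of date, which is fine for pages that hardly ever change.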
Google Analytics shows us how many seconds an average user spends on an average page. Using this information, we ran some load tests with Siege and were able to extrapolate that with the above recommendations, we should be able to handle 30,000 users concurrently per web server on the "exterior" of the site. For "interior" pages which are PHP powered, we're expecting to see something more along the lines of 1,000 concurrent sessions per server. In advance of the email blast, we'll launch a bunch of c1.xlarges and then kill them off slowly as we get more comfortable with the actual load we're seeing on blast day.
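For anyone who wants to repeat the extrapolation, the arithmetic is roughly the following. This is a back-of-the-envelope sketch, not our exact spreadsheet: it assumes each user triggers one origin request per page view (static files are on the CDN), and the input numbers below are placeholders, not our measured figures. The function names are made up for the example.

```python
import math


def concurrent_users_supported(pages_per_second, seconds_per_page):
    """If a user requests a new page every `seconds_per_page` seconds, a server
    that sustains `pages_per_second` can keep that many users busy at once."""
    return pages_per_second * seconds_per_page


def servers_needed(expected_concurrent_users, users_per_server):
    """Round up, since you can't launch a fraction of an instance."""
    return math.ceil(expected_concurrent_users / users_per_server)


# Placeholder inputs: throughput from a load test, time on page from analytics.
per_server = concurrent_users_supported(pages_per_second=200, seconds_per_page=45)
print(per_server)                         # 9000
print(servers_needed(20000, per_server))  # 3
```

Plug in the sustained throughput your load test reports and the average time on page from your analytics, then pad the server count generously, since real traffic is burstier than an average.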
Finally, we will do more work to make our code itself more scalable. However, the code is only about 4 months old, and this is the first time we've ever been challenged to actually make it scale.