Page Caching: Varnish vs Nginx FastCGI Cache
Varnish has long been a part of the stack we use here on our site, handling full-page caching, but after some benchmarking it looks like Nginx FastCGI Cache is actually a better choice.
If you followed along with Ashley’s Hosting WordPress Yourself series, you’re probably familiar with the stack but here’s a diagram as a refresher:
Nginx employs FastCGI Cache for full-page caching, PHP-FPM processes the PHP, Redis manages the object cache, and MySQL is at the very back. This site (deliciousbrains.com) has been running a similar stack since I set it up two and a half years ago:
There are a couple of minor differences:
- Linode and CentOS rather than Digital Ocean and Ubuntu
- HTTP requests are served directly rather than always being redirected to HTTPS
- Apache mod_php rather than PHP-FPM
- APC rather than Redis for object cache
But the biggest difference is definitely that we use Varnish rather than FastCGI Cache for full-page caching. Because Varnish doesn’t support HTTPS, we have Nginx sitting in front of it, handling the HTTPS bits and proxying requests to Varnish. Varnish then proxies requests to Apache on the backend.
As I’ve written previously, I had doubts about managing my own server, especially one that my company and its employees depend on to bring in revenue. I went with Apache because I knew it well. Nginx + PHP-FPM was relatively new in comparison and I didn’t know it at all.
I was also seeing “502 Bad Gateway” errors (an Nginx error, often tied to a backend timeout) now and then while browsing the web. This could have been due to lots of things, but I assumed it was mainly due to the set_time_limit() PHP function not having any effect when running PHP-FPM. Definitely a strike against.
I’ve since played with Nginx + PHP-FPM a bit and have more confidence using it in production. Especially where I have control over PHP-FPM’s timeout settings.
I had been reading good things about Varnish. It did full-page caching really well and could handle massive traffic without breaking a sweat. Web hosts were adding it to their setups. I believe WP Engine was using it.
So in 2012 I decided it would be worth a try. I set up an Amazon EC2 instance with Varnish and ran my blog on it for a year. I got comfortable with Varnish. It worked well.
When I set up the server for deliciousbrains.com, I felt good about running it there as well. And like I said above, I didn’t know Nginx well at all, let alone the FastCGI Cache options. It’s also possible that FastCGI Cache wasn’t mature in 2012; I’m not sure.
Why Varnish Today?
If I set up a new server today, would I still go with Varnish?
When I first reviewed part 4 of Ashley’s series, I thought Varnish would destroy FastCGI Cache in performance because it stores cached pages in memory while FastCGI Cache stores them on disk.
Well, after asking Ashley about that, it turns out you can configure the FastCGI Cache folder to be stored in memory. Time for some benchmarking!
FastCGI Cache (Disk) Benchmark
I tried to use a benchmark similar to the one Ashley used in his article. I ran a Digital Ocean 2GB server with Ubuntu and ramped from 1 to 1,000 concurrent requests over a 60 second time period. All requests were over regular HTTP.
A similar result to Ashley’s benchmark. The response time was double Ashley’s, but this was likely due to the difference in distance between my Digital Ocean data center (Toronto) and the origin of the benchmark requests (Virginia). Ashley was running his between Ireland and London. I also transferred 3x the data in my test, so that could have had an impact as well. In any case, that’s our baseline for the following benchmarks.
Configure FastCGI Cache to Use Memory
To get the FastCGI Cache folder to be served from memory, we use Linux’s ability to mount a folder into memory. In my favorite editor, I edited /etc/fstab and added the following line:
tmpfs /sites/bradt.ca/cache tmpfs defaults,size=100M 0 0
This allows up to 100MB of cache files to be stored in memory for quicker access. Obviously you can tweak the folder path and size for your own site.
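If you would rather script that edit than open an editor, here is a minimal sketch. The ensure_line helper is a name I made up for illustration; it only appends the line when that exact line is not already in the file, so running it twice doesn’t leave a duplicate entry.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present.
ensure_line() {
    line="$1"
    file="$2"
    # -x: match the whole line, -F: treat the pattern as a fixed string,
    # -q: no output; "--" protects lines that start with a dash.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

I would try it on a scratch copy first, e.g. ensure_line 'tmpfs /sites/bradt.ca/cache tmpfs defaults,size=100M 0 0' /tmp/fstab.test, since writing to /etc/fstab itself needs root and a bad entry can cause trouble at boot.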
Now I saved and quit the editor and ran the following command:
This mounts all filesystems configured in /etc/fstab. Now let’s see if it worked:
You should see your folder in the output.
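As a rough sketch of the mount-and-verify step, assuming GNU stat and sudo are available: the fs_type and mount_cache helpers below are names I invented for illustration, not necessarily the commands I ran at the time. mount -a mounts everything listed in /etc/fstab, and the check confirms the cache folder is now backed by tmpfs.

```shell
# Print the type of the filesystem that backs a path.
# "stat -f" queries the filesystem rather than the file; %T is its type name.
fs_type() {
    stat -f -c %T "$1"
}

# Assumed workflow after saving /etc/fstab. It needs root, so it is wrapped
# in a function instead of running as soon as the file is loaded.
mount_cache() {
    cache_dir="$1"                # e.g. /sites/bradt.ca/cache
    sudo mount -a || return 1     # mount everything listed in /etc/fstab
    [ "$(fs_type "$cache_dir")" = "tmpfs" ]
}
```

Usage would look like: mount_cache /sites/bradt.ca/cache && echo "cache folder is in memory".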
FastCGI Cache (Memory) Benchmark
Running the same benchmark now, we get the following:
Surprisingly it performed a tiny bit worse, but so slightly that it’s not significant at all (i.e. if we ran it again it would probably perform slightly better).
Ashley guesses that this is likely because “disk” is actually solid-state (SSD) and much closer to memory performance than ye olde spinning hard disks. Sounds like a good guess to me.
Now time to try the same benchmark with Varnish. For my Varnish config, I’m using this template.
Again we see a dip in performance, but this time it’s significant. The average response time has gone up from 82ms to 100ms, roughly 22% slower. Let’s take a look at the response times over time to see what happened:
Looks like things are fine up to around 500 concurrent users, then it starts to struggle a bit. Looking at New Relic, Varnish causes a pretty big CPU spike:
As a side note, HTTPS has a huge impact on the server at this scale (1,000 concurrent users). Running the first FastCGI Cache (Disk) benchmark over HTTPS I got a staggeringly different result:
And looking at the response times you can see there’s a pretty solid relationship between the response time and the number of concurrent users.
Looking at New Relic we can see that Nginx is causing a big CPU spike:
So it looks like Nginx requires some significant extra CPU to do the encryption/decryption. There’s also some extra network latency for each request to do the TLS handshake.
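To get a feel for the handshake cost on a single request, curl can report where the time goes. This is a minimal sketch and request_timing is a hypothetical helper name. It shows per-request latency only, not the CPU load you see with 1,000 concurrent users.

```shell
# Hypothetical helper: print timing milestones for one request.
#   time_connect    - seconds until the TCP connection was established
#   time_appconnect - seconds until the TLS handshake finished (0 for plain HTTP)
#   time_total      - seconds for the whole request
request_timing() {
    curl -s -o /dev/null \
        -w 'connect=%{time_connect}s tls_done=%{time_appconnect}s total=%{time_total}s\n' \
        "$1"
}
```

Calling it once with the http:// URL of a page and once with the https:// URL gives a rough comparison; the gap between connect and tls_done on the HTTPS request is approximately the handshake time.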
To put things in perspective here, the server did fine up to 250 concurrent users before it started to gradually get slower. That’s pretty damn good for $20/month (Digital Ocean 2GB).
I’m pretty surprised that Varnish has been outperformed here. This was supposed to be the main selling point for having Varnish in the stack.
Varnish is definitely more configurable, but how much configuration do you really need? FastCGI Cache is plenty flexible for most sites.
I guess if you were interested in fragment caching, you might want to use Varnish so that you could use its Edge Side Includes (ESI) feature. Rachel Andrew did a nice article for Smashing Magazine about ESI if you’re interested in an intro.
There is one nice feature that Varnish added in 4.0 that FastCGI Cache doesn’t have yet: the ability to serve stale content when the cache has expired and trigger a fetch of fresh content. With FastCGI Cache, when you request content and the cache has expired, you have to wait for it to fetch fresh content from the backend which slows down the request a lot. It’s the difference between a 40ms response time and 200ms. Five times slower is huge.
You can add updating to the fastcgi_cache_use_stale directive, but it doesn’t solve this problem. The first request that hits expired content still has to wait for the fresh content to be fetched from the backend and any concurrent requests that come in during that time will get the stale content. A nice feature, but again, doesn’t solve the problem.
With what I know right now, I wouldn’t bother adding Varnish to the next server stack I set up. It’s just not worth the extra daemon running on the server, the extra configuration to manage, and most importantly the extra point of failure. But who knows, I could learn something tomorrow that changes my mind.
Have you used Varnish and/or Nginx FastCGI Cache? Maybe you’ve used something else for page caching? Let us know in the comments.