Caching, Zipping, and (Amazon CloudFront) CDN For A Rails App
Friday, June 18th, 2010
This is Article #4 of a 4-part series. For a good primer, check out the first two articles listed below. For the reasoning and analysis behind the “Recommended” option in this article, check out Part 3, How to Combine GZip + CDN for Fastest Page Loads. Otherwise, jump right in!
- The Importance of Page Load Speed
- Improve Page Load Speed (by 80%) by Improving Component Load Speed
- How to Combine GZip + CDN for Fastest Page Loads
- Caching, Zipping, and (Amazon CloudFront) CDN For A Rails App
- Prerequisites
- Cached stylesheets and javascripts
- Creating an Amazon AWS Account
- Setup S3 Buckets
- Setup CloudFront Distributions
- Create CNAME records (optional)
- Install Rails S3 Synch Plugin
- Installing AWS-S3 Gem
- Configure S3 Synch Plugin
- Add S3 Synch to Deployment
- Option A: Compressible Assets from App Server, Images from CloudFront (recommended)
- Configure Rails Asset Host
- Create A-name Record
- Configure Apache
- Option B: Serve Everything from CloudFront (easier, but not recommended)
- Configure Rails Asset Host
- Pre-compile Cached Stylesheet and Javascript File
- Conclusion
In this article, we’re going to speed up our Rails application by up to 75%, simply by optimizing our Rails asset host. We’re going to serve our components (stylesheets, javascripts, images, etc.) from a combination of our app’s server and Amazon CloudFront (Option A, recommended), or entirely from CloudFront (Option B – easier).
The best option for you may depend on your specific needs, but I’ll cover both processes below. For an in-depth analysis of why Option A is recommended over Option B, see the previous article in this series, How to Combine GZip + CDN for Fastest Page Loads.
Prerequisites
Cached Stylesheets and Javascripts
Another way to reduce page load time is to combine all of your components into as few files as possible. In other words, combine all of your stylesheets into a single css file, and likewise with your javascripts. Remember from the last article, that each request takes 50-150ms, not including the response and download time. If you have 10 separate javascripts, this equates to 0.5-1.5 seconds just to request the files (not to mention all the time to download them). If you can combine all of the files into one, that means you need just one request to get the same amount of data.
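The arithmetic above can be sketched in a few lines of Ruby. The method name is hypothetical, and the 50-150ms default simply reuses the per-request figure quoted in this series; it is a back-of-the-envelope estimate, not a measurement tool.

```ruby
# Rough request overhead for a number of separate asset files, assuming the
# 50-150 ms per-request latency quoted above. Illustrative only.
def request_overhead_seconds(file_count, latency_ms = 50..150)
  [latency_ms.first, latency_ms.last].map { |ms| file_count * ms / 1000.0 }
end

low, high = request_overhead_seconds(10)
puts "10 separate files: #{low}-#{high} seconds of request overhead"

low, high = request_overhead_seconds(1)
puts "1 combined file:   #{low}-#{high} seconds of request overhead"
```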
Luckily, this is easy in Rails: simply add :cache => 'cached-file-name' to your stylesheet_link_tag and javascript_include_tag in your application layout. For example:
<%= javascript_include_tag 'jquery', 'jquery-ui', 'application', :cache => 'all-app-javascripts' %>
Now, as long as config.action_controller.perform_caching is set to true in your environment.rb, or more likely in production.rb, Rails will load your combined files in the layout, creating them first if they don’t already exist.
Simply packing all stylesheets and javascripts into one file each reduced page load time of one of our production applications from 10.1 to 8.3 seconds (an 18% reduction in load time alone).
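To make the idea concrete, here is a minimal sketch of what combining assets amounts to: reading each source file and writing the contents into one bundle. The bundle_assets method is hypothetical and only mimics the idea behind the :cache option; it is not Rails’ own implementation.

```ruby
require 'tmpdir'

# Joins the given asset files into a single bundle file.
# Hypothetical helper that illustrates the concept behind :cache.
def bundle_assets(source_paths, bundle_path)
  combined = source_paths.map { |path| File.read(path) }.join("\n\n")
  File.open(bundle_path, 'w') { |file| file.write(combined) }
  bundle_path
end

Dir.mktmpdir do |dir|
  # Create three small stand-in javascript files.
  sources = %w[jquery jquery-ui application].map do |name|
    path = File.join(dir, "#{name}.js")
    File.open(path, 'w') { |file| file.write("// contents of #{name}.js") }
    path
  end

  bundle = bundle_assets(sources, File.join(dir, 'all-app-javascripts.js'))
  puts File.read(bundle)
end
```

The browser then makes one request for all-app-javascripts.js instead of three separate requests.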
Creating an Amazon AWS Account
If you do not yet have an Amazon AWS account, you will need to create one and enable the S3 and CloudFront services. See this writeup on creating and setting up an Amazon S3 account for helpful instructions.
Setup S3 Buckets
Once you’ve signed up for your Amazon AWS account and activated S3 and CloudFront, you’ll want to set up 4 S3 buckets for your application, using Amazon’s S3 management console.
We’re going to set up 4 buckets and CDN distributions because some old browsers still have an artificial limitation that allows only 2 concurrent connections to each domain, meaning our components will take longer to download from Amazon if they can only be downloaded 2 at a time. By creating 4 different domains pointing to 4 different buckets/distributions, we allow our components to download up to 8 at a time in those browsers that still enforce this limitation.
Avoid dots in your bucket names. If you name your bucket cdn0.yourapp.com, then your components will have the URL https://cdn0.yourapp.com.s3.amazonaws.com/stylesheet.css. Over HTTPS, this will give you a warning message saying the connection is not trusted, because the browser treats each dot-separated part of your bucket name as another subdomain level, and Amazon’s wildcard certificate for *.s3.amazonaws.com covers only a single level. A dot-free name such as yourapp-com-cdn0 avoids the problem.
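A quick way to check bucket names before creating them is sketched below, assuming the warning is caused by dots in the bucket name adding extra subdomain levels. Both method names are hypothetical helpers written for this illustration.

```ruby
# True when the bucket name contains no dots, so that
# https://<bucket>.s3.amazonaws.com stays a single subdomain level.
def https_safe_bucket_name?(bucket_name)
  !bucket_name.include?('.')
end

# Builds the S3 URL for a file in a bucket (hypothetical helper).
def s3_url(bucket_name, file_name)
  "https://#{bucket_name}.s3.amazonaws.com/#{file_name}"
end

%w[cdn0.yourapp.com yourapp-com-cdn0].each do |bucket|
  status = https_safe_bucket_name?(bucket) ? 'ok' : 'certificate warning'
  puts "#{s3_url(bucket, 'stylesheet.css')} -> #{status}"
end
```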
Setup CloudFront Distributions
Once your S3 buckets are created, click over to the CloudFront tab and create one distribution for each S3 bucket. You can type any comment to help you quickly identify each distribution.
Create CNAME Records (optional)
Once you’ve created your 4 CloudFront distributions, you may create a CNAME record for each distribution. This allows you to serve files from CloudFront using your own asset subdomains, like cdn0.yourapp.com, instead of raNDomString1234.cloudfront.net. We’ll use the following format of cdn%d.yourapp.com, where %d stands for digits 0-3:
cdn0.yourapp.com
cdn1.yourapp.com
cdn2.yourapp.com
cdn3.yourapp.com
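The %d placeholder works like an ordinary format string. As a small sketch (the constant name is made up for this example), the four host names can be generated like this:

```ruby
# Expands the cdn%d.yourapp.com pattern for the digits 0-3.
ASSET_HOST_FORMAT = 'cdn%d.yourapp.com'

asset_hosts = (0..3).map { |number| format(ASSET_HOST_FORMAT, number) }
puts asset_hosts
```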
Install Rails S3 Synch Plugin
This plugin adds some Capistrano recipes to synch our application’s public directory with our four S3 buckets automatically every time we deploy our app. See Spatten Design’s documentation for more information. I’ve made some updates to their original plugin to properly set the Cache-control and Expires headers for our assets on S3, as well as to properly set the Content-encoding header for Gzipped assets.
Update: I’ve updated the S3 Synch Plugin further; it can now handle unique S3 buckets for different Rails environments (e.g. one set of buckets for production and another for staging). Be sure to update your synch_s3_asset_host.yml file as shown below.
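To illustrate the kind of headers mentioned above, here is a minimal sketch. It is not the plugin’s actual code: the method name, the one-year lifetime, and the "public, max-age" value are all assumptions chosen for this example.

```ruby
require 'time'

ONE_YEAR_IN_SECONDS = 365 * 24 * 60 * 60

# Builds far-future caching headers for an uploaded asset, and marks
# pre-gzipped files (names ending in .gz) with Content-Encoding: gzip.
# Hypothetical helper; the real plugin may set different values.
def far_future_headers(file_name, now = Time.now)
  headers = {
    'Cache-Control' => "public, max-age=#{ONE_YEAR_IN_SECONDS}",
    'Expires'       => (now + ONE_YEAR_IN_SECONDS).httpdate
  }
  headers['Content-Encoding'] = 'gzip' if file_name.end_with?('.gz')
  headers
end

p far_future_headers('stylesheets/all-app-stylesheets.css.gz')
p far_future_headers('images/logo.png')
```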
Installing AWS-S3 Gem
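The synch_s3_asset_host plugin requires the AWS-S3 gem, so add it to your environment.rb. In Rails 2.x, this is typically a config.gem line such as:
config.gem 'aws-s3', :lib => 'aws/s3'
…and then run the following from the terminal to install the S3 Synch plugin’s gem dependency (typically via the Rails gems:install task):
sudo rake gems:install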
Configure S3 Synch Plugin
Create a config/synch_s3_asset_host.yml file like this:
AWS_ACCESS_KEY_ID: 'YourAccessKeyIdHere'
AWS_SECRET_ACCESS_KEY: 'YourSecretAccessKeyHere'
production:
  asset_host_name: "yourapp-com-cdn%d" # This is whatever you named your S3 buckets, using %d in place of the numbers 0-3
  # dry_run: false # Set to true if you want to test the asset_host uploading without doing anything on Amazon S3
Update: The “production” part in the file above has been added for my latest update of the S3 Asset Synch Plugin.
Add S3 Synch to Deployment
Now, in your Capistrano deploy.rb script, add the following line to the :deploy namespace:
namespace :deploy do
  ...
  before "deploy:symlink", "s3_asset_host:synch_public"
  ...
end
…and then add the :asset_host_syncher => true flag to the :web role:
role :web, "yourapp.com", :asset_host_syncher => true
...
Option A: Compressible Assets from App Server, Images from CloudFront (recommended)
For more detail about why this method is recommended, see the previous article in this series (Part 3).
Configure Rails Asset Host
Use the following configuration in your production.rb file to configure the way Rails writes the URLs for asset_tags:
# config.action_controller.asset_host = "http://assets.example.com"
ActionController::Base.asset_host = Proc.new { |source, request|
  # Images are served from Amazon S3 + CloudFront (set up with CNAMEs as domains cdn0-cdn3).
  # Everything else (js, css, html) is served from the cache subdomain on our own
  # app server so that the files can be gzipped.
  if source.starts_with?('/images')
    unless request.ssl?
      # Plain HTTP: use our own CNAME aliases
      "http://cdn#{source.hash % 4}.yourapp.com"
    else
      # For SSL we want the certificate to match the hosting domain for CloudFront
      [ "https://yourcloudfrontdist0.cloudfront.net",
        "https://yourcloudfrontdist1.cloudfront.net",
        "https://yourcloudfrontdist2.cloudfront.net",
        "https://yourcloudfrontdist3.cloudfront.net" ][source.hash % 4]
    end
  else
    # use the cached and zipped subdomain for assets that can be zipped (i.e. non-binary filetypes)
    # => text/html text/css application/x-javascript application/javascript
    "#{request.protocol}cache.yourapp.com"
  end
}
If your CloudFront distributions serve both HTTP and HTTPS, a simpler variant uses the request’s protocol with the CloudFront domains directly:
ActionController::Base.asset_host = Proc.new { |source, request|
  if source.starts_with?('/images')
    [ "#{request.protocol}yourcloudfrontdist0.cloudfront.net",
      "#{request.protocol}yourcloudfrontdist1.cloudfront.net",
      "#{request.protocol}yourcloudfrontdist2.cloudfront.net",
      "#{request.protocol}yourcloudfrontdist3.cloudfront.net" ][source.hash % 4]
  else
    # use the cached and zipped subdomain for assets that can be zipped (i.e. non-binary filetypes)
    # => text/html text/css application/x-javascript application/javascript
    "#{request.protocol}cache.yourapp.com"
  end
}
Note the source.hash % 4 in the code above. This ensures that the same component is always served from the same subdomain to take full advantage of client-side caching for that component, rather than randomly selecting from which subdomain to serve each component on each page load.
For more information on configuring Rails’s asset_host, see the documentation for ActionController::Base.asset_host.
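One caveat about source.hash: on Ruby 1.9 and later, String#hash is seeded differently in each Ruby process, so the same file can map to a different subdomain after a restart or on another app server. A minimal sketch of a stable alternative follows, assuming a checksum such as Zlib.crc32 is acceptable for spreading assets across hosts. The constant and method names are made up for this example.

```ruby
require 'zlib'

ASSET_HOST_COUNT = 4

# Maps an asset path to a host number in 0...ASSET_HOST_COUNT.
# Zlib.crc32 returns the same value in every Ruby process, so a given
# asset always lands on the same subdomain.
def asset_host_index(source)
  Zlib.crc32(source) % ASSET_HOST_COUNT
end

%w[/images/logo.png /images/header.jpg /images/footer.gif].each do |source|
  puts "#{source} -> cdn#{asset_host_index(source)}.yourapp.com"
end
```

Inside the asset_host Proc, asset_host_index(source) would take the place of source.hash % 4.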
Create A-name Record
We will also need to create an A-name record for the cache.yourapp.com subdomain, which points to your application server’s IP address.
Configure Apache
Now we need to configure Apache to accept incoming requests to our “cache” subdomain, setting the appropriate far-future Expires and Cache-control headers. We also need to tell Apache to automatically compress and serve any compressible filetype on the fly. Add this to your site’s Apache conf file:
# gzip html, css, and js
AddOutputFilterByType DEFLATE text/html text/css application/x-javascript application/javascript
<VirtualHost *:80>
  ServerName cache.yourapp.com
  DocumentRoot /path/to/yourapp/public
  <FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
    ExpiresActive On
    ExpiresDefault "access plus 1 year"
  </FilesMatch>
  FileETag None
</VirtualHost>
...
Note the FileETag directive above, which turns ETags off. An ETag is meant to tell the browser whether its cached copy of a file is still current. However, the ETag’s uniqueness usually depends not just on the file but also on the server it’s being served from. This means that if your assets are copied to several asset domains on different servers, and a file is downloaded and cached from one server while the next page pulls the same asset from another asset domain, the file’s ETag will not match the ETag of the cached copy, so the browser will re-download the file instead of serving it from cache.
Furthermore, Rails does a very good job of appending each asset’s last-modified timestamp to its URL (using the asset tag helpers), which effectively caches and invalidates the assets for you as necessary. So, we’re much better off just turning ETags off for our Rails app.
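The timestamp idea can be sketched as follows. This is a simplified illustration, not the code Rails itself uses, and the method name is hypothetical: the file’s modification time becomes part of the URL, so a changed file gets a new URL and the old cached copy is simply never requested again.

```ruby
require 'fileutils'
require 'tmpdir'

# Appends the file's modification time (as a Unix timestamp) to its public
# path, so the URL changes whenever the file changes.
def timestamped_asset_path(public_dir, asset_path)
  mtime = File.mtime(File.join(public_dir, asset_path)).to_i
  "#{asset_path}?#{mtime}"
end

Dir.mktmpdir do |public_dir|
  css = File.join(public_dir, 'stylesheets', 'application.css')
  FileUtils.mkdir_p(File.dirname(css))
  File.open(css, 'w') { |file| file.write('body { margin: 0; }') }

  puts timestamped_asset_path(public_dir, '/stylesheets/application.css')
end
```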
Now we need to make sure the appropriate Apache modules are enabled (mod_deflate for compression and mod_expires for the Expires headers) and restart Apache.
sudo a2enmod deflate
sudo a2enmod expires
sudo /etc/init.d/apache2 force-reload
Option B: Serve Everything from CloudFront (easier, but not recommended)
For more detail about why this is not recommended, see the previous article in this series (Part 3). Basically, though, it’s because it requires you to make one of the following compromises:
- a) Serve all files uncompressed, resulting in file sizes up to 4x bigger than necessary.
- b) Serve Gzipped assets from CloudFront without first detecting whether or not the visitor’s browser supports Gzip encoding.
That being said, if this is acceptable for you, this method is simpler to set up and configure.
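For context on compromise (b), the check that gets skipped is an inspection of the browser’s Accept-Encoding request header. A minimal sketch of that check follows; the method name is hypothetical, and the parsing is deliberately simplified (it ignores the * wildcard, for example).

```ruby
# True when an Accept-Encoding header value lists gzip with a non-zero
# quality value. Simplified: does not handle the "*" wildcard.
def accepts_gzip?(accept_encoding)
  accept_encoding.to_s.split(',').any? do |entry|
    coding, *params = entry.strip.downcase.split(/\s*;\s*/)
    next false unless coding == 'gzip'

    quality = params.find { |param| param.start_with?('q=') }
    quality.nil? || quality.sub('q=', '').to_f > 0
  end
end

['gzip, deflate', 'deflate', 'gzip;q=0', nil].each do |header|
  puts "#{header.inspect} -> #{accepts_gzip?(header)}"
end
```

A server that compresses on the fly performs a check like this on every request; files pre-gzipped and uploaded to S3 are served the same way to every visitor.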
Configure Rails Asset Host
Add the following to your production.rb:
# Enable serving of images, stylesheets, and javascripts from an asset server
# config.action_controller.asset_host = "http://assets.example.com"
ActionController::Base.asset_host = Proc.new { |source, request|
  unless request.ssl?
    # Plain HTTP: use our own CNAME aliases
    "http://cdn#{source.hash % 4}.yourapp.com"
  else
    # For SSL we want the certificate to match the hosting domain for CloudFront
    [ "https://yourcloudfrontdist0.cloudfront.net",
      "https://yourcloudfrontdist1.cloudfront.net",
      "https://yourcloudfrontdist2.cloudfront.net",
      "https://yourcloudfrontdist3.cloudfront.net" ][source.hash % 4]
  end
}
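If your CloudFront distributions serve both HTTP and HTTPS, a simpler variant uses the request’s protocol with the CloudFront domains directly:
ActionController::Base.asset_host = Proc.new { |source, request|
  [ "#{request.protocol}yourcloudfrontdist0.cloudfront.net",
    "#{request.protocol}yourcloudfrontdist1.cloudfront.net",
    "#{request.protocol}yourcloudfrontdist2.cloudfront.net",
    "#{request.protocol}yourcloudfrontdist3.cloudfront.net" ][source.hash % 4]
}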
Pre-compile Cached Stylesheet and Javascript File
If you’re serving every component from CloudFront, you will need to pre-compile your stylesheets and javascripts on every deploy. Otherwise, Rails will try to compile and save the files to your application server, but try to serve them from S3 (where they won’t exist).
To solve this, we’ll add some Capistrano scripts to our deploy.rb to compile our files for us before the synch_s3_asset_host plugin copies our public directory over to our S3 buckets. But this means we’d have to copy the list of asset files to be compiled into our Capistrano script, as well as listing them in our application.html.erb layout. To DRY things up a little, we’re going to create some project-wide constants:
lib/assets_for_cache.rb
module AssetsForCache
  JAVASCRIPT_FILES = ['jquery', 'jquery-ui', 'application']
  STYLESHEET_FILES = ['reset', 'application']
  JAVASCRIPT_CACHE_FILE = 'all-app-javascripts'
  STYLESHEET_CACHE_FILE = 'all-app-stylesheets'
end
And then replace your javascript_include_tag and stylesheet_link_tag in your application layout with the following:
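<%= stylesheet_link_tag AssetsForCache::STYLESHEET_FILES, :cache => AssetsForCache::STYLESHEET_CACHE_FILE %>
<%= javascript_include_tag AssetsForCache::JAVASCRIPT_FILES, :cache => AssetsForCache::JAVASCRIPT_CACHE_FILE %>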
Add this to your deploy.rb script:
require File.dirname(__FILE__) + '/../lib/assets_for_cache.rb'
set :stylesheets, AssetsForCache::STYLESHEET_FILES
set :javascripts, AssetsForCache::JAVASCRIPT_FILES
namespace :assets do
  # Builds both bundles in one step
  task :package_cached_assets do
    package_stylesheets
    package_javascripts
  end
  # Concatenates all stylesheets into one file, then writes a gzipped copy
  task :package_stylesheets, :roles => :web do
    sudo %{rm -f #{release_path}/public/stylesheets/#{AssetsForCache::STYLESHEET_CACHE_FILE}.css}
    stylesheets.each do |stylesheet|
      run %{cat #{release_path}/public/stylesheets/#{stylesheet}.css >> \
        #{release_path}/public/stylesheets/#{AssetsForCache::STYLESHEET_CACHE_FILE}.css}
    end
    run %{gzip -c #{release_path}/public/stylesheets/#{AssetsForCache::STYLESHEET_CACHE_FILE}.css > #{release_path}/public/stylesheets/#{AssetsForCache::STYLESHEET_CACHE_FILE}.css.gz}
  end
  # Concatenates all javascripts into one file, then writes a gzipped copy
  task :package_javascripts, :roles => :web do
    sudo %{rm -f #{release_path}/public/javascripts/#{AssetsForCache::JAVASCRIPT_CACHE_FILE}.js}
    javascripts.each do |javascript|
      run %{cat #{release_path}/public/javascripts/#{javascript}.js >> \
        #{release_path}/public/javascripts/#{AssetsForCache::JAVASCRIPT_CACHE_FILE}.js}
    end
    run %{gzip -c #{release_path}/public/javascripts/#{AssetsForCache::JAVASCRIPT_CACHE_FILE}.js > #{release_path}/public/javascripts/#{AssetsForCache::JAVASCRIPT_CACHE_FILE}.js.gz}
  end
end
…and then add this to the :deploy namespace in your deploy.rb file, before calling the s3_asset_host synch script:
namespace :deploy do
  ...
  before "deploy:symlink", "assets:package_cached_assets"
  before "deploy:symlink", "s3_asset_host:synch_public"
  ...
end
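The gzip step in the packaging tasks above shells out to gzip -c. For readers who want to see what that step does, here is a rough in-memory Ruby equivalent using the standard zlib library. The method names and the sample stylesheet are made up for this sketch, and the Capistrano tasks do not depend on it.

```ruby
require 'zlib'
require 'stringio'

# Gzips a string in memory, roughly what `gzip -c file > file.gz` does on disk.
def gzip_string(content)
  buffer = StringIO.new
  buffer.set_encoding('BINARY')
  writer = Zlib::GzipWriter.new(buffer)
  writer.write(content)
  writer.close
  buffer.string
end

# Reverses gzip_string, which is what a Gzip-capable browser does on receipt.
def gunzip_string(data)
  Zlib::GzipReader.new(StringIO.new(data)).read
end

css = "body { margin: 0; padding: 0; }\n" * 200
compressed = gzip_string(css)

puts "original:   #{css.bytesize} bytes"
puts "compressed: #{compressed.bytesize} bytes"
puts "round trip ok: #{gunzip_string(compressed) == css}"
```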
Conclusion
Now simply save your project and deploy it! The first deploy will take quite a while, as your entire /public directory will be copied to all 4 buckets on Amazon S3, one at a time. But after that, it’s a painless process.
If there are files you don’t want copied to S3, add them to the --exclude list in the synch_s3_asset_host plugin on line 186 of vendor/plugins/synch_s3_asset_host/recipes/synch_s3_asset_host.rb.
Whether you chose the “recommended” or the “easier” option, you should immediately notice a significant increase in the performance of your Rails app. Thanks for sticking with me through this 4-part series! Please let me know if you have any thoughts, questions, or feedback in the comments.




