Create a Cloudfront distribution for an S3 website

Tuesday, May 9 2017 in devops frontend

The HTTPS protocol encrypts the data between your website and the browser; it makes it harder for third parties to inject their content into your website and may boost search rankings. You can’t use HTTPS if you’re hosting a website on an S3 bucket on your own domain, because S3 website endpoints only serve plain HTTP. To serve your S3 website with HTTPS, one solution is to serve it through the Amazon Cloudfront CDN.

What’s a CDN?

CDN stands for Content Delivery Network; a CDN replicates your website on servers around the world, so visitors should see your pages faster. When a visitor requests one of your assets, the CDN fetches the data from your S3 bucket, then passes it along to the client. On subsequent requests from the same region, the CDN serves the data directly from its cache. Cloudfront supports HTTPS: it encrypts the data between the CDN and the browser, but not between Cloudfront and the S3 bucket. I will describe a few sticking points I encountered during the setup, as well as how the solution fared in practice and how much it cost.

Setup

A couple of points slowed me down. To use HTTPS, you need an SSL certificate. AWS Certificate Manager supplies free certificates, so I used it, expecting it to integrate well with other Amazon products. The first issue is that you must create the certificate in the US East (N. Virginia) region, us-east-1. Cloudfront itself has no region, but it can only use certificates from N. Virginia, no matter the region of your S3 website. To verify that you own the domain, AWS sends an email to the contact addresses your registrar lists for the domain. Hopefully your registrar forwards it to you even if you aren’t hosting email on your own domain.
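The region constraint is easy to forget when scripting the setup. Here is a minimal sketch of the certificate request, assuming the boto3 library is installed and configured with your credentials; the helper names and the example domain are placeholders of mine, not part of any AWS tooling.

```python
# Region that Cloudfront reads ACM certificates from (US East, N. Virginia).
CLOUDFRONT_CERT_REGION = "us-east-1"


def certificate_request(domain, alternative_names=()):
    """Build the arguments for ACM's RequestCertificate call (hypothetical helper)."""
    params = {"DomainName": domain, "ValidationMethod": "EMAIL"}
    if alternative_names:
        # ACM rejects an empty list, so only add the key when there are names.
        params["SubjectAlternativeNames"] = list(alternative_names)
    return params


def request_certificate(domain, alternative_names=()):
    """Request the certificate in us-east-1, whatever region the bucket is in."""
    import boto3  # imported lazily so certificate_request works without boto3

    acm = boto3.client("acm", region_name=CLOUDFRONT_CERT_REGION)
    response = acm.request_certificate(
        **certificate_request(domain, alternative_names)
    )
    return response["CertificateArn"]
```

Calling request_certificate("example.com", ["www.example.com"]) would trigger the validation email; the certificate only becomes usable in Cloudfront once you approve it.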

When you host a website on an S3 bucket, AWS creates two host names. One points to the bucket and the other to the website. The difference is that the website endpoint applies redirection rules: for example, it serves index.html when someone requests a directory. When you create the distribution, you must select the S3 website as the origin, not the bucket.
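To make the difference concrete, this sketch builds both host names for a bucket. The function name is hypothetical, and the host name patterns are my assumption of the common forms: some regions separate "s3-website" from the region with a dash and others with a dot, so check the S3 documentation for yours.

```python
def s3_endpoints(bucket, region, website_separator="-"):
    """Return the bucket and website host names for an S3 bucket.

    website_separator is "-" in some regions and "." in others
    (assumption: look up the right form for your region).
    """
    return {
        # Plain bucket endpoint: serves objects, no index.html handling.
        "bucket": f"{bucket}.s3.{region}.amazonaws.com",
        # Website endpoint: applies index documents and redirection rules.
        "website": f"{bucket}.s3-website{website_separator}{region}.amazonaws.com",
    }


endpoints = s3_endpoints("example-bucket", "eu-west-1")
print("Use as Cloudfront origin:", endpoints["website"])
```

The website host name is the one to paste into the origin field of the distribution.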

Bonus features

There were a few nice free features that I did not expect. Access logs are really nice. You can see graphs of the most requested resources as well as errors. You’ll notice that CSS is often a top requested resource. There are also a lot of 404 errors, which can be deceptive because many come from requests for favicons and iOS icons.

Noticing that almost every page visit requested the CSS too, I first tried to inline the critical CSS with Critical, but the script to load the non-critical CSS took as much space as the whole stylesheet, so I ended up inlining everything.

In addition to the requested objects, you can also access some data about your visitors, for example which browser they’re using. Finally, detailed billing reports let you see which regions request your content.

Costs and annoyances

The website performed nicely, but it ended up costing twice as much as serving from an S3 bucket directly. While the price for data transfer is lower on Cloudfront, the price per request is higher. There were also a few costs and annoyances that I did not anticipate; although in my case they were negligible, it’s interesting to learn more about how AWS works.
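The trade-off between transfer price and request price can be sketched with a small cost model. The prices and traffic figures below are illustrative placeholders, not current AWS pricing and not my actual bill; plug in the numbers from the pricing pages for your region.

```python
def monthly_cost(gigabytes, requests, price_per_gb, price_per_10k_requests):
    """Estimate a monthly bill from data transfer and request counts."""
    transfer = gigabytes * price_per_gb
    request_fees = requests / 10_000 * price_per_10k_requests
    return transfer + request_fees


# Hypothetical small site: little data, many small requests.
traffic = {"gigabytes": 0.5, "requests": 1_000_000}

# Placeholder prices: cheaper transfer but pricier requests on the CDN.
s3 = monthly_cost(**traffic, price_per_gb=0.09, price_per_10k_requests=0.004)
cdn = monthly_cost(**traffic, price_per_gb=0.085, price_per_10k_requests=0.01)

print(f"S3 only: ${s3:.2f}, through the CDN: ${cdn:.2f}")
```

With a profile like this one, request fees dominate the bill, so the lower transfer price on the CDN does not compensate for the higher per-request price.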

First, HTTPS costs more. All HTTPS requests to Cloudfront incur a surcharge compared to unencrypted HTTP requests.

When visitors request an asset, Cloudfront makes a new request to the origin even if Cloudfront already caches the asset in another region. If you don’t have a lot of visitors, but they come from all over the world, you might not use the Cloudfront caches as often as you expect!
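A toy simulation shows why scattered visitors hurt the hit rate. It is a deliberately simplified model of the behavior described above (one independent cache per region, nothing ever expires), with made-up class and region names; it is not how Cloudfront is implemented.

```python
from collections import defaultdict


class EdgeCaches:
    """One independent cache per region, each filled from the origin."""

    def __init__(self):
        self._cached = defaultdict(set)  # region -> set of cached paths
        self.origin_requests = 0

    def get(self, region, path):
        if path in self._cached[region]:
            return "hit"
        # A copy held by another region does not help: fetch from the origin.
        self.origin_requests += 1
        self._cached[region].add(path)
        return "miss"


def count_origin_requests(visits):
    """Replay (region, path) visits and count the fetches that reach the origin."""
    cdn = EdgeCaches()
    for region, path in visits:
        cdn.get(region, path)
    return cdn.origin_requests


local = [("paris", "/index.html")] * 4
scattered = [(r, "/index.html") for r in ("paris", "tokyo", "sydney", "paris")]
print(count_origin_requests(local), count_origin_requests(scattered))
```

Four visits from one region reach the origin once, while the same number of visits spread over three regions reach it three times.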

When Cloudfront receives a file, it caches it according to the Cache-Control headers you set on your website. To replace a file before it expires, you need to create a cache invalidation in the Cloudfront console. This adds a step every time you update your website. Cache invalidations will cost you money once you invalidate more than 1,000 paths in a month. They aren’t instantaneous either, which is understandable given the distributed nature of the CDN. One solution is to use a unique name for each version of your CSS and JavaScript files.
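One common way to get a unique name per version is to put a hash of the file’s content in its name, so a changed file gets a new URL and never needs an invalidation. This is a minimal sketch of that idea; the function name and the eight-character digest length are arbitrary choices of mine.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(filename, content, digest_length=8):
    """Insert a short content hash before the file extension."""
    path = PurePosixPath(filename)
    digest = hashlib.sha256(content).hexdigest()[:digest_length]
    # "css/style.css" becomes "css/style.<digest>.css"
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


old = versioned_name("css/style.css", b"body { color: black; }")
new = versioned_name("css/style.css", b"body { color: navy; }")
print(old)
print(new)
```

The HTML pages that reference these files still need a short cache lifetime or an invalidation, since their own URLs don’t change.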

Although it was a good learning experience, and the statistics about your most requested assets are informative, Cloudfront did not justify the costs, and especially the administration overhead, for my small website when free static website hosting solutions like Netlify, Firebase and Surge all offer HTTPS.