Configuring CloudFront for use with S3

Recently I spent some time preparing the ground for scown.dev (coming soon to an Internet near you). While I was at the ~~[T-437 Safety Command]~~ AWS console I realised that the scown.space CloudFront Distribution was set up in a sub-optimal fashion, allowing direct read access to the underlying bucket. Having spent some time getting a better approach working, I thought I’d document it here.

Prior Configuration

Previously the scown.space setup consisted of:

An S3 bucket with static website hosting enabled and public read-only access to all objects in the bucket
A CloudFront Distribution in front of the S3 bucket using a website origin
A Lambda@Edge function to allow the HTML files in the bucket to be accessed without specifying their file extension

This worked, but also allowed access to the site using S3 URLs. While this was not really an issue (everything in the bucket is accessible via the CloudFront Distribution anyway), it was also not quite what I was after.

Accessing S3 Buckets using CloudFront Origin Access Identities

While setting up the scown.dev CloudFront Distribution I spotted a setting I didn’t remember seeing before: Restrict Bucket Access. Essentially this locks down the S3 bucket so that objects can only be fetched through CloudFront, which authenticates to S3 using an Origin Access Identity (Origin Identity for short). This was exactly what I was after.

Eager to try this out, I went back to the scown.space distribution and configured the same settings. Unfortunately it wasn’t quite that straightforward - turning off the static website hosting resulted in some nasty-looking XML and 403 errors. Fun times.

After some experimentation turning various permissions on and off, it transpired that there was a step I had missed.

In addition to adding a policy granting the Origin Identity access to the bucket objects, I also needed to add an ACL granting the Origin Identity permission to list objects in the bucket.
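
For reference, here is a minimal sketch of what such a set of permissions could look like, written as a small Node.js script that prints a bucket policy document. It is not a copy of my actual configuration: the bucket name and identity ID are placeholders, the function name buildBucketPolicy is made up for illustration, and the list permission is expressed as a second policy statement rather than as an ACL.

```javascript
'use strict';

// Hypothetical placeholders: substitute your own bucket name and
// Origin Access Identity ID.
const BUCKET = 'example-bucket';
const OAI_ID = 'EXAMPLEOAIID';

// Builds a bucket policy that lets one Origin Access Identity read
// objects from the bucket and list its contents.
function buildBucketPolicy(bucket, oaiId) {
    const principal = {
        AWS: `arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity ${oaiId}`
    };

    return {
        Version: '2012-10-17',
        Statement: [
            {
                // Read access applies to the objects, hence the /* suffix.
                Sid: 'AllowOriginIdentityRead',
                Effect: 'Allow',
                Principal: principal,
                Action: 's3:GetObject',
                Resource: `arn:aws:s3:::${bucket}/*`
            },
            {
                // List access applies to the bucket itself.
                Sid: 'AllowOriginIdentityList',
                Effect: 'Allow',
                Principal: principal,
                Action: 's3:ListBucket',
                Resource: `arn:aws:s3:::${bucket}`
            }
        ]
    };
}

console.log(JSON.stringify(buildBucketPolicy(BUCKET, OAI_ID), null, 2));
```

The printed JSON can be pasted into the bucket policy editor once the placeholders are replaced. As far as I can tell, the list permission matters because S3 answers a request for a missing key with a 404 only when the caller is allowed to list the bucket, and with a 403 otherwise.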

This got me most of the way there, but there was still an issue accessing the homepage.

Index document handling

S3 buckets configured for static website hosting specify an index document, which I was relying on for serving the main page. Accessing it presented me with some different XML and 404 errors. Sigh.

This turned out to be an issue with my Lambda function. I had unconsciously set the Lambda up to rely on the index document property of the bucket. Thankfully the Lambda is so trivial that this was easy to fix.

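'use strict';

exports.handler = (event, context, callback) => {
    // The request CloudFront is about to forward to the origin.
    const request = event.Records[0].cf.request;

    let uri = request.uri;

    // Serve the index document for the site root and for any "directory" URL,
    // since the bucket no longer does this for us.
    if (uri === '') {
        uri = request.uri = '/index.html';
    }
    else if (uri.match(/\/$/)) {
        uri = request.uri += 'index.html';
    }

    // Look for a file extension on the final path segment.
    const extension = uri.match(/\.[^.\/]+$/);

    // No extension means an HTML page requested without its suffix.
    if (extension === null) {
        request.uri += '.html';
    }

    return callback(null, request);
};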
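
To sanity-check the rewrite rules without deploying anything, the same logic can be pulled out into a plain function and run locally with Node.js. This is only a sketch: rewriteUri is a hypothetical helper name, and the sample paths are invented for illustration rather than taken from the real site.

```javascript
'use strict';

// Hypothetical helper that mirrors the URI rules used by the handler.
function rewriteUri(uri) {
    // Site root and "directory" URLs map to their index document.
    if (uri === '') {
        uri = '/index.html';
    } else if (uri.endsWith('/')) {
        uri += 'index.html';
    }

    // If the final path segment has no file extension, treat it as an HTML page.
    if (!/\.[^.\/]+$/.test(uri)) {
        uri += '.html';
    }

    return uri;
}

// Made-up sample paths covering each branch.
for (const sample of ['/', '/about', '/blog/', '/css/site.css']) {
    console.log(`${sample} -> ${rewriteUri(sample)}`);
}
```

Running this should show / becoming /index.html, /about becoming /about.html, /blog/ becoming /blog/index.html, and /css/site.css passing through unchanged.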

Conclusion

At this point I was able to access scown.space in full without a public bucket. Result.