Lambda@Edge prototype

Recently I built an MVP to replace an ELB/EC2/Docker-based static site preview stack with a CloudFront/Lambda/S3-based one.

Background

The goals are to:

  1. reduce the maintenance we have to do on the EC2 stack, such as regular AMI updates.
  2. reduce the complexity of the stack, as the previous one involves building a custom image, storing the image, using CloudFormation to bring up the stack, and using EC2 user data to initialize the system (pull the image, run docker compose, etc.).
  3. reduce the cost, as the ELB and EC2 instances have to run 24/7.
  4. increase stability, as Lambda does not tie us to any specific host that we run, whereas our Docker containers still have to run on some instance, even though Docker does a pretty good job of isolation.

EC2/Docker Stack

The existing stack is shown below. On init, the Docker containers pull the code from GitHub, install the Node dependencies, and run the preview command, which starts a browser-sync server that pulls data from the CMS and returns the combined HTML to the client browser.

When the code on GitHub is updated, we have to restart the EC2 instance to pick it up.

EC2-Docker-based-preview

CloudFront/Lambda@Edge Stack

In the new stack, Jenkins builds the bundle from the GitHub code and pushes it to S3, which is fronted by a CloudFront distribution that invokes the Lambda@Edge function on requests. When a user requests a page that is an entry point (/bank/xxx), it has no extension, so CloudFront has a cache miss and forwards the request to the origin. At this point, the Lambda function we registered on the origin request life cycle receives the request before it goes to the origin, which is the perfect time to manipulate it. In the Lambda function, we request the HTML file from the origin by adding the .html extension, then request the dynamic data from the CMS, combine the two in the function, and return the result to the user directly. Next, the browser parses the HTML and sends requests for the resources to CloudFront, which can either serve them from the CDN cache or fetch them from the S3 origin.
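The flow above can be sketched roughly as the following origin request handler. This is a minimal sketch, not the actual function from the prototype: fetch_html, fetch_cms_data and render are hypothetical callables standing in for the S3 fetch, the CMS call and the combining step, none of which are spelled out here. The "no extension" check and the Cache-Control header are also assumptions. The event and response shapes follow the CloudFront Lambda@Edge event structure.

```python
import posixpath
from typing import Any, Callable, Dict


def is_entry_point(uri: str) -> bool:
    """Treat a URI with no file extension (e.g. /bank/xxx) as an entry point."""
    return posixpath.splitext(uri)[1] == ""


def make_handler(
    fetch_html: Callable[[str], str],                 # hypothetical: reads the .html file from the origin
    fetch_cms_data: Callable[[str], Dict[str, Any]],  # hypothetical: reads dynamic data from the CMS
    render: Callable[[str, Dict[str, Any]], str],     # hypothetical: combines the HTML and the CMS data
) -> Callable[[Dict[str, Any], Any], Dict[str, Any]]:
    """Build an origin request handler around the three injected helpers."""

    def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
        request = event["Records"][0]["cf"]["request"]
        uri = request["uri"]

        # Real assets (js, css, images...) pass through untouched, so
        # CloudFront forwards the request to the S3 origin as usual.
        if not is_entry_point(uri):
            return request

        # Entry point: build the page in the function and return it,
        # which short-circuits the trip to the origin.
        html = fetch_html(uri + ".html")
        data = fetch_cms_data(uri)
        return {
            "status": "200",
            "statusDescription": "OK",
            "headers": {
                "content-type": [
                    {"key": "Content-Type", "value": "text/html; charset=utf-8"}
                ],
                # Assumption: keep the generated entry page from being reused
                # from cache, in line with a TTL of 0 for the entry HTML.
                "cache-control": [{"key": "Cache-Control", "value": "max-age=0"}],
            },
            "body": render(html, data),
        }

    return handler
```

Injecting the helpers keeps the routing logic testable without touching S3 or the CMS. In a deployed function, the module-level handler would be something like make_handler(...) called with real implementations.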

When the code on GitHub is updated, we just need a hook that triggers a Jenkins build to push the new artifacts to S3. One thing to note is that we need to set the TTL of the entry HTML file to 0 on CloudFront so that we do not have to invalidate it explicitly when deploying new code. It is a trade-off: every request for the entry page goes back to the origin, but new code shows up without an invalidation.

lambda@edge-based-preview

Logging

I had a hard time with Lambda@Edge logging in CloudWatch. A function I triggered from the Lambda test console logs fine; however, when the function is triggered via CloudFront, its logs do not appear under the /aws/lambda/Function_Name log path. I had to open an enterprise AWS support ticket for it. It turns out that the log group of a function triggered by CloudFront has a region prefix, like: /aws/lambda/us-east-1.Function_Name
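To make the naming concrete, here is a small sketch that builds the log group name to look for. The helper names are made up for illustration. The prefix is the region the function was created in (us-east-1 in the example above). As far as I understand, the logs themselves are written in the region closest to the edge location that ran the function, so it may be necessary to look in several regions; the region list in the usage is only an example.

```python
from typing import Iterable, List, Tuple

# Assumption: the function was created in us-east-1, as in the example path.
HOME_REGION = "us-east-1"


def edge_log_group(function_name: str, home_region: str = HOME_REGION) -> str:
    """Log group used when the function is invoked through CloudFront."""
    return f"/aws/lambda/{home_region}.{function_name}"


def log_groups_to_check(
    function_name: str, regions: Iterable[str]
) -> List[Tuple[str, str]]:
    """Pair every candidate region with the (identical) log group name."""
    group = edge_log_group(function_name)
    return [(region, group) for region in regions]


if __name__ == "__main__":
    for region, group in log_groups_to_check(
        "Function_Name", ["us-east-1", "eu-west-1", "ap-southeast-2"]
    ):
        print(region, group)
```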

CloudFront Trigger Selection

There are currently (as of 09/15/2018) 4 triggers we can choose from:

  1. viewer request: when CloudFront receives a request from a viewer
  2. origin request: when there is a cache miss, before CloudFront sends the request to the origin
  3. origin response: when CloudFront receives the response from the origin, before it caches the object
  4. viewer response: before CloudFront returns the content to the viewer

Types 1 and 4 are the expensive, heavy hooks: they are triggered on every request, no matter what! Be careful when selecting them, as they may increase latency as well as cost. The origin request is the perfect life cycle hook for this use case, as we only want the entry point to be manipulated. The subsequent requests for the actual assets can still be handled by CloudFront and leverage its caching capability.
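As a rough illustration of why the trigger choice matters, the sketch below estimates how often a function would run for each event type. The event type strings are the ones CloudFront uses; the function name and the request counts are made up, and the model deliberately ignores edge cases such as error responses.

```python
# Viewer-side triggers fire for every request; origin-side triggers
# fire only when CloudFront has to go to the origin (a cache miss).
EVERY_REQUEST = frozenset({"viewer-request", "viewer-response"})
CACHE_MISS_ONLY = frozenset({"origin-request", "origin-response"})


def estimated_invocations(event_type: str, total_requests: int, cache_misses: int) -> int:
    """Rough number of Lambda invocations for a given trigger type."""
    if not 0 <= cache_misses <= total_requests:
        raise ValueError("cache_misses must be between 0 and total_requests")
    if event_type in EVERY_REQUEST:
        return total_requests
    if event_type in CACHE_MISS_ONLY:
        return cache_misses
    raise ValueError(f"unknown event type: {event_type}")


if __name__ == "__main__":
    # Hypothetical traffic: 1000 requests, 100 of which miss the cache.
    for event_type in ("viewer-request", "origin-request"):
        print(event_type, estimated_invocations(event_type, 1000, 100))
```

With these made-up numbers, a viewer request trigger would run ten times as often as an origin request trigger, which is the cost and latency difference described above.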
