Lambda@Edge prototype

Recently I was doing an MVP for replacing an ELB/EC2/Docker-based static site preview stack with a CloudFront/Lambda/S3-based one.

Background

The purpose of this is to:

  1. reduce the maintenance we have to do on the EC2 stack, such as regular AMI updates.
  2. reduce the complexity of the stack, as the previous one involves building a custom image, storing the image, CloudFormation to bring up the stack, and EC2 user data to init the system (pull the image, run docker-compose, etc.).
  3. reduce the cost, as the ELB and EC2 instances have to run 24/7.
  4. increase the stability: Lambda does not tie us to any specific instance we maintain, whereas our Docker containers still have to run on some instance, even though Docker does a pretty good job of isolation.

EC2/Docker Stack

The existing stack looks like the diagram below. On init, the Docker containers pull code from GitHub, install the Node dependencies, and run the preview command, which starts a browser-sync server that pulls data from the CMS and returns the combined HTML to the client browser.

When the code in GitHub is updated, we have to restart the EC2 instance to pick it up.

EC2-Docker-based-preview

CloudFront/Lambda@Edge Stack

The new stack works like this: we build the bundle from the GitHub code via Jenkins and push it to S3, which is fronted by a CloudFront distribution that invokes our Lambda@Edge function on request. When a user requests a page, if it is the entry point (/bank/xxx), which has no file extension, CloudFront has a cache miss and forwards the request to the origin. At this point, the Lambda function we registered on the origin-request lifecycle receives the request before it goes to the origin, which is the perfect time to do manipulation. So in the Lambda function, we request the HTML file from the origin by adding the .html extension, request the dynamic data from the CMS, combine them in the function, and return the result to the user directly. What happens next is that the browser parses the HTML and sends requests for the static resources to CloudFront, where we can either serve them from the CDN cache or fetch them from the S3 origin.
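Below is a minimal sketch of what such an origin-request function could look like in Node.js (the runtime Lambda@Edge supports). The origin host, the CMS endpoint, and the __CMS_DATA__ placeholder are hypothetical, and error handling is reduced to simply falling back to the origin.

'use strict';

const https = require('https');

// Hypothetical endpoints -- substitute the real S3 origin and CMS API.
const ORIGIN_HOST = 'YOUR-BUCKET.s3.amazonaws.com';
const CMS_HOST = 'cms.example.com';

// Tiny promise wrapper around https.get that resolves with the response body.
const fetchBody = (host, path) => new Promise((resolve, reject) => {
    https.get({ host: host, path: path }, (res) => {
        let body = '';
        res.on('data', (chunk) => body += chunk);
        res.on('end', () => resolve(body));
    }).on('error', reject);
});

exports.handler = (event, context, callback) => {
    const request = event.Records[0].cf.request;

    // Only intercept extension-less entry points like /bank/xxx; requests for
    // real assets (js/css/images) are forwarded untouched so CloudFront/S3 can
    // serve and cache them normally.
    if (/\.\w+$/.test(request.uri)) {
        return callback(null, request);
    }

    Promise.all([
        fetchBody(ORIGIN_HOST, request.uri + '.html'),     // pre-built shell from S3
        fetchBody(CMS_HOST, '/api/content' + request.uri)  // dynamic data from the CMS (hypothetical path)
    ]).then((results) => {
        // Naive merge: inject the CMS payload into a placeholder in the shell.
        const combined = results[0].replace('__CMS_DATA__', results[1]);
        callback(null, {
            status: '200',
            statusDescription: 'OK',
            headers: {
                'content-type': [{ key: 'Content-Type', value: 'text/html' }],
            },
            body: combined,
        });
    }).catch(() => callback(null, request)); // on any failure, just forward to the origin
};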

When the code in GitHub updates, we just need a hook to trigger a Jenkins build that pushes the new artifacts to S3. One thing to note is that we need to set the entry HTML file's TTL to 0 on CloudFront so that we do not have to invalidate it explicitly when deploying new code. It is a trade-off.
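One way to get that TTL-0 behaviour, assuming the cache behavior's minimum TTL is 0, the distribution honors origin Cache-Control headers, and the non-HTML assets are fingerprinted, is to have the publish step mark the entry HTML as non-cacheable while giving everything else a long max-age. A rough Node.js sketch of such a publish step follows (bucket name and dist/ layout are hypothetical; a real script would also set Content-Type per extension):

'use strict';

const fs = require('fs');
const path = require('path');
const AWS = require('aws-sdk');

const s3 = new AWS.S3();
const BUCKET = 'YOUR-PREVIEW-BUCKET'; // hypothetical bucket name
const DIST_DIR = 'dist';              // hypothetical build output directory

// Recursively collect every file in the build output.
const listFiles = (dir) =>
    fs.readdirSync(dir).reduce((acc, name) => {
        const full = path.join(dir, name);
        return fs.statSync(full).isDirectory() ? acc.concat(listFiles(full)) : acc.concat(full);
    }, []);

listFiles(DIST_DIR).forEach((file) => {
    const key = path.relative(DIST_DIR, file).replace(/\\/g, '/');
    const isHtml = key.endsWith('.html');
    const params = {
        Bucket: BUCKET,
        Key: key,
        Body: fs.createReadStream(file),
        // Entry HTML: effectively TTL 0 so a new deploy shows up without an invalidation.
        // Everything else: long-lived, assuming the bundles are fingerprinted.
        CacheControl: isHtml ? 'no-cache, max-age=0' : 'public, max-age=31536000',
    };
    if (isHtml) {
        params.ContentType = 'text/html';
    }
    s3.putObject(params).promise()
        .then(() => console.log('uploaded ' + key))
        .catch((err) => { console.error(err); process.exitCode = 1; });
});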

lambda@edge-based-preview

Logging

I had a hard time with Lambda@Edge logging in CloudWatch. A function invoked from the Lambda test console logs fine; however, when the function is triggered via CloudFront, nothing shows up under the /aws/lambda/Function_Name log path. I had to open an enterprise AWS support ticket for it. It turns out that logs for functions triggered by CloudFront go to a log group with a region prefix, like /aws/lambda/us-east-1.Function_Name.

CloudFront Trigger Selection

There are currently (as of 09/15/2018) 4 triggers we can choose from:

  1. viewer request: when CloudFront receives a request from the viewer
  2. origin request: on a cache miss, before CloudFront forwards the request to the origin
  3. origin response: when CloudFront receives the response from the origin, before it caches the object
  4. viewer response: before CloudFront returns the content to the viewer.

So types 1 and 4 (the viewer-facing hooks) are the expensive, heavy ones that are triggered on every request no matter what! Be careful when selecting them, as they may increase latency as well as cost. The origin request is the perfect lifecycle hook in this use case, as we only want the entry point to be manipulated. The subsequent requests for the real assets can still be handled by CloudFront and leverage its caching capability.

nginx reverse proxy S3 files

China access issue

Recently some of our church site users reported that the sermon audio/video download feature no longer works. We had recently moved our large files from the file system to S3. After some research, it looks like AWS S3 is blocked by the famous Chinese Great Firewall (GFW).

Possible Solutions

Moving the files back to the file system (EBS) is one option, but that seems like too much, as we already decided to store files in S3, which is much cheaper and easier to maintain.

I tried the second way, which is using our existing web server, Nginx, as a reverse proxy. The directives in nginx are quite tricky, and there are a lot of conventions that are not easy to document or debug.

The config is quite simple, though:

location ^~ /s3/ {
    rewrite /s3/(.*) /$1 break;
    proxy_pass http://YOUR-BUCKET.s3.amazonaws.com;

    proxy_set_header Host 'YOUR-BUCKET.s3.amazonaws.com';
    proxy_set_header Authorization '';
    proxy_hide_header x-amz-id-2;
    proxy_hide_header x-amz-storage-class;
    proxy_hide_header x-amz-request-id;
    proxy_hide_header Set-Cookie;
    proxy_ignore_headers "Set-Cookie";
}

Basically, we match any URL that starts with /s3 and rewrite it with a regex to strip the /s3 prefix and get the real key in S3. We then use the break flag so that processing continues in this location block, rather than last, which would try to find another matching location. Next, proxy_pass proxies the request to S3 with the rewritten key. The remaining directives just hide the Amazon response headers, cookies, etc.

Nginx location order

From the HttpCoreModule docs:

  1. Directives with the “=” prefix that match the query exactly. If found, searching stops.
  2. All remaining directives with conventional strings. If this match used the “^~” prefix, searching stops.
  3. Regular expressions, in the order they are defined in the configuration file.
  4. If #3 yielded a match, that result is used. Otherwise, the match from #2 is used.

Example from the documentation:

location  = / {
  # matches the query / only.
  [ configuration A ] 
}
location  / {
  # matches any query, since all queries begin with /, but regular
  # expressions and any longer conventional blocks will be
  # matched first.
  [ configuration B ] 
}
location /documents/ {
  # matches any query beginning with /documents/ and continues searching,
  # so regular expressions will be checked. This will be matched only if
  # regular expressions don't find a match.
  [ configuration C ] 
}
location ^~ /images/ {
  # matches any query beginning with /images/ and halts searching,
  # so regular expressions will not be checked.
  [ configuration D ] 
}
location ~* \.(gif|jpg|jpeg)$ {
  # matches any request ending in gif, jpg, or jpeg. However, all
  # requests to the /images/ directory will be handled by
  # Configuration D.   
  [ configuration E ] 
}

aws cli ProfileNotFound

I was trying to do some KMS encryption for some of our prod credentials with the aws cli. After pulling down the temporary AWS STS token for the prod role and running aws --profile SOME_PROD_ROLE kms encrypt xxx, the error botocore.exceptions.ProfileNotFound: The config profile (SOME_DEV_ROLE) could not be found constantly popped up.

I checked the ~/.aws/credentials file and made sure the [default] block is the one I need. I was still getting the error, so it looked like something else was telling the cli to use that SOME_DEV_ROLE.

It turns out that while I was using some other cli tool, AWS_PROFILE had been set in my environment, so the cli tried to locate that profile. Even when another profile is explicitly set with --profile, it still verifies that the profile from the environment variable exists, and errors out otherwise. This is not ideal and should be considered a bug in the aws cli IMHO.

So after unsetting that AWS_PROFILE var, everything worked again.

A quick KMS bash command on macOS:

echo "Decrypted: $(aws kms decrypt --ciphertext-blob fileb://<(echo $ENCRYPTED_DATA | base64 -D) --query Plaintext --output text | base64 -D)"

Serverless EMR Cluster monitoring with Lambda

Background

One issue we typically have is that our EMR cluster stops consuming Hive queries due to the overload from metastore loading/refreshing. This is partially caused by the use of a shared metastore which hosts many teams’ schemas/tables inside our organization. When this happens in prod, we have to ask RIM for help to terminate our persistent cluster and create a new one, because we do not have the prod pem file. This becomes a pain for our team (preparing a lot of emergency-release material, getting onto the bridge line, then waiting for all kinds of approvals). Our clients also lose the time it takes us to go through all of the above (usually hours).

To mitigate this, we created Nagios alerts to ping our cluster every 10 minutes and have OPS watch it for us. This is quite helpful since we know the state of the cluster at all times. However, when the Hive server is down, we still have to go through the emergency release process. Even though we created a Jenkins pipeline to create/terminate/add-step, RIM does not allow us or OPS to use the Jenkins pipeline to do the recovery.

Proposals

We have a few ways to resolve this ourselves:

  1. have a dedicated server (EC2) to monitor and take action
  2. deploy the monitoring code on the launcher box and do the monitoring/recovery there
  3. run the code in Lambda

Dedicated EC2

Method 1 is the traditional way. It requires a lot of setup with Puppet/provisioning to get everything automated, which does not seem worth it.

Launcher box

Method 2 is less work because we typically have a launcher box in each env, and we could put our existing JS code into a Node.js server managed by PM2. The challenges are (1) the launcher box is not a standard app-server instance, so it is not reliable, and (2) the Node.js Hive connector does not currently have good support for the SSL and custom authentication that we are using in HiveServer2.

Moreover, methods 1 and 2 share a common weak point: we could end up needing yet another monitoring service just to make sure the monitor itself is up and running and doing its job.

Lambda

All the above analysis brings us to Method 3, where we use serverless monitoring with Lambda. This gives us:

  1. lower operational and development costs
  2. smaller cost to scale
  3. works with agile development and allows developers to focus on code and deliver faster
  4. fits with microservices, which can be implemented as functions

It is not a silver bullet, of course. One drawback is that with Lambda we could not reuse our Node.js script, which is written in ES6: AWS Lambda’s Node environment is 4.x, and we have only tested and run our script in a Node 6.x env. Because of this, we had to rewrite our cluster recovery code in Java, which fortunately is not difficult thanks to the nice design of the aws-java-sdk.

Implementation

The implementation is quite straightforward. 

Components

At a high level, we simulate our app path and initiate the connection from our ELB to our cluster.

lambda-overview

Flow

The basic flow is:

lambda-flow

The code is at https://github.com/vcfvct/lambda-emr-monitoring
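The linked implementation is in Java; purely to illustrate the recovery calls involved, here is a rough Node.js sketch using the AWS SDK for JavaScript. The ELB host, port, cluster name, region, and runJobFlow parameters are placeholders, and the health check is reduced to a simple connectivity probe instead of a real Hive query through the ELB.

'use strict';

const tls = require('tls');
const AWS = require('aws-sdk');

const emr = new AWS.EMR({ region: 'us-east-1' });  // assumed region
const CLUSTER_NAME = 'persistent-hive-cluster';    // hypothetical cluster name
const HIVE_ELB = 'hive-elb.example.com';           // hypothetical ELB in front of HiveServer2
const HIVE_ELB_PORT = 443;                         // assumed ELB listener port

// Simplified health check: just verify we can complete a TLS handshake with the ELB.
const hiveIsHealthy = () => new Promise((resolve) => {
    const socket = tls.connect({ host: HIVE_ELB, port: HIVE_ELB_PORT }, () => {
        socket.end();
        resolve(true);
    });
    socket.setTimeout(5000, () => { socket.destroy(); resolve(false); });
    socket.on('error', () => resolve(false));
});

exports.handler = (event, context, callback) => {
    hiveIsHealthy().then((healthy) => {
        if (healthy) {
            return callback(null, 'cluster ok');
        }
        // Find the stuck persistent cluster by name, terminate it, and bring up a replacement.
        emr.listClusters({ ClusterStates: ['WAITING', 'RUNNING'] }).promise()
            .then((data) => {
                const stuck = data.Clusters.find((c) => c.Name === CLUSTER_NAME);
                return stuck ? emr.terminateJobFlows({ JobFlowIds: [stuck.Id] }).promise() : null;
            })
            .then(() => emr.runJobFlow({
                Name: CLUSTER_NAME,
                ReleaseLabel: 'emr-5.x.x',         // placeholder release label
                Applications: [{ Name: 'Hive' }],
                JobFlowRole: 'EMR_EC2_DefaultRole',
                ServiceRole: 'EMR_DefaultRole',
                Instances: {
                    MasterInstanceType: 'm4.xlarge',
                    SlaveInstanceType: 'm4.xlarge',
                    InstanceCount: 3,
                    KeepJobFlowAliveWhenNoSteps: true,
                },
            }).promise())
            .then((res) => callback(null, 'recreated cluster as ' + res.JobFlowId))
            .catch(callback);
    });
};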

Credentials

For the Hive password, the original thought was to pass it in via Spring Boot args. Since we are running in Lambda, the run args would be plain text in the console, which is not ideal. We could also choose to use Lambda’s KMS key to encrypt it and then decrypt it in the app, but it looks like that is not allowed at the org level. After discussing with the infrastructure team, our Hive password will be stored in credstash (DynamoDB).

stateful firewall with inbound/outbound traffic

Background

I have been working as DevOps on a cloud migration for the last 3 months without really writing much code. Even though I have been exposed to many AWS services like EMR/EC2/ASG (auto scaling group)/LC (launch config)/CF (CloudFormation), etc., and have needed to set up security groups (SG), I still find myself a bit confused by inbound and outbound traffic rules. I was wondering: if I allow inbound traffic, I have to send the response back to the client, so does that mean I also have to allow outbound traffic? Some Google searching on that question turned up the keyword: stateful firewall.

So basically with a stateful firewall, when a connection is established, the firewall will automatically let packets out back to the client’s port. You don’t need to create rules for that because the firewall knows.

History

Before the development of stateful firewalls, firewalls were stateless. A stateless firewall treats each network frame or packet individually. Such packet filters operate at the OSI Network Layer (layer 3) and function more efficiently because they only look at the header part of a packet. They do not keep track of the packet context such as the nature of the traffic. Such a firewall has no way of knowing if any given packet is part of an existing connection, is trying to establish a new connection, or is just a rogue packet. Modern firewalls are connection-aware (or state-aware), offering network administrators finer-grained control of network traffic.

Early attempts at producing firewalls operated at the application layer, which is the very top of the seven-layer OSI model. This method required exorbitant amounts of computing power and is not commonly used in modern implementations.

Description

A stateful firewall keeps track of the state of network connections (such as TCP streams or UDP communication) and is able to hold significant attributes of each connection in memory. These attributes are collectively known as the state of the connection, and may include such details as the IP addresses and ports involved in the connection and the sequence numbers of the packets traversing the connection. Stateful inspection monitors incoming and outgoing packets over time, as well as the state of the connection, and stores the data in dynamic state tables. This cumulative data is evaluated, so that filtering decisions would not only be based on administrator-defined rules, but also on context that has been built by previous connections as well as previous packets belonging to the same connection.

The most CPU intensive checking is performed at the time of setup of the connection. Entries are created only for TCP connections or UDP streams that satisfy a defined security policy. After that, all packets (for that session) are processed rapidly because it is simple and fast to determine whether it belongs to an existing, pre-screened session. Packets associated with these sessions are permitted to pass through the firewall. Sessions that do not match any policy are denied, as packets that do not match an existing table entry.

In order to prevent the state table from filling up, sessions will time out if no traffic has passed for a certain period. These stale connections are removed from the state table. Many applications therefore send keepalive messages periodically in order to stop a firewall from dropping the connection during periods of no user-activity, though some firewalls can be instructed to send these messages for applications.

Depending on the connection protocol, maintaining a connection’s state is more or less complex for the firewall. For example, TCP is inherently a stateful protocol as connections are established with a three-way handshake (“SYN, SYN-ACK, ACK”) and ended with a “FIN, ACK” exchange. This means that all packets with “SYN” in their header received by the firewall are interpreted to open new connections. If the service requested by the client is available on the server, it will respond with a “SYN-ACK” packet which the firewall will also track. Once the firewall receives the client’s “ACK” response, it transfers the connection to the “ESTABLISHED” state as the connection has been authenticated bidirectionally. This allows tracking of future packets through the established connection. Simultaneously, the firewall drops all packets which are not associated with an existing connection recorded in its state table (or “SYN” packets), preventing unsolicited connections with the protected machine by black hat hacking.

By keeping track of the connection state, stateful firewalls provide added efficiency in terms of packet inspection. This is because for existing connections the firewall need only check the state table, instead of checking the packet against the firewall’s rule set, which can be extensive. Additionally, in the case of a match with the state table, the firewall does not need to perform deep packet inspection.

Add username/password auth to Hive

In my previous post, we achieved end-to-end SSL encryption from the client to the ELB to the EMR master. Our next requirement is to add username/password authentication. There are different ways to do this in Hive: 1. LDAP, 2. PAM, 3. CUSTOM mode. After some evaluation we finally chose the CUSTOM mode.

We first need to create a class that implements the org.apache.hive.service.auth.PasswdAuthenticationProvider interface. Notice that we store the MD5-hashed password here so that we do not expose the credential in the code.


import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Hashtable;

import javax.security.sasl.AuthenticationException;
import javax.xml.bind.DatatypeConverter;

import org.apache.hive.service.auth.PasswdAuthenticationProvider;

public class RcHiveAuthentication implements PasswdAuthenticationProvider
{
    Hashtable<String, String> store = null;

    public RcHiveAuthentication()
    {
        store = new Hashtable<>();
        store.put("rcapp", "7451d491e7052ac047caf36a5272a1bf");
    }

    @Override
    public void Authenticate(String user, String password)
        throws AuthenticationException
    {
        String storedPasswd = store.get(user);
        //since MD5 is just some HEX, we ignore case.
        if (storedPasswd == null || !storedPasswd.equalsIgnoreCase(getMd5(password)))
        {
            // if we do not have the user or the password does not match, auth fails.
            throw new AuthenticationException("SampleAuthenticator: Error validating user");
        }
    }

    /**
     * @param input input string to be hashed
     *
     * @return  the MD5 of the input. if any error, return empty String.
     */
    public static String getMd5(String input)
    {
        String resultMd5 = "";
        try
        {
            resultMd5 = DatatypeConverter.printHexBinary(MessageDigest.getInstance("MD5").digest(input.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e)
        {
            e.printStackTrace();
        }

        return resultMd5;
    }

    // package access for unit test
    Hashtable<String, String> getStore()
    {
        return store;
    }
}

Package the above class into a jar and put it into the $HIVE_HOME/lib directory.

Then we need to add some config to hive-site.xml so that Hive knows to use our custom provider for authentication:

    <property>
        <name>hive.server2.authentication</name>
        <value>CUSTOM</value>
    </property>
    <property>
        <name>hive.server2.custom.authentication.class</name>
        <value>org.finra.rc.hiveserver2.RcHiveAuthentication</value>
    </property>

At this stage, if you try to connect to the server, you will probably get the error below:

Error while compiling statement: FAILED: RuntimeException Cannot create staging directory 'hdfs://namenode:

This is because the user we set in the custom auth provider does not have write access in HDFS. To get around this, we need to set Hive impersonation to false:

    <property>
        <name>hive.server2.enable.doAs</name>
        <value>false</value>
    </property>

After adding this, we should now be able to access Hive with our custom-defined username/password.