Pandas & AWS Lambda

The repo mthenw/awesome-layers lists several publicly available aws lambda layers.

In particular, keithrozario/Klayers has pandas+numpy and is up-to-date as of today with pandas 0.25.

Its ARN is arn:aws:lambda:us-east-1:113088814899:layer:Klayers-python37-pandas:1


I believe you should be able to use the recent pandas version (or likely, the one on your machine). You can create a lambda package with pandas by yourself like this,

  1. First find where the pandas package is installed on your machine i.e. Open a python terminal and type

    import pandas
    pandas.__file__
    

    That should print something like '/usr/local/lib/python3.4/site-packages/pandas/__init__.py'

  2. Now copy the pandas folder from that location (in this case '/usr/local/lib/python3.4/site-packages/pandas) and place it in your repository.
  3. Package your Lambda code with pandas like this:

    zip -r9 my_lambda.zip pandas/
    zip -9 my_lambda.zip my_lambda_function.py
    

You can also deploy your code to S3 and make your Lambda use the code from S3.

aws s3 cp  my_lambda.zip s3://dev-code//projectx/lambda_packages/

Here's the repo that will get you started


After some tinkering around and lot's of googling I was able to make everything work and setup a repo that can just be cloned in the future.

Key takeaways:

  1. All static packages have to be compiled on an ec2 amazon Linux instance
  2. The python code needs to load the libraries in the lib/ folder before executing.

Github repo: https://github.com/moesy/AWS-Lambda-ML-Microservice-Skeleton