Build a custom AWS Lambda layer for scikit-image

An interesting couple of days figuring this out. ...hopefully the answer below will be of some help to anyone struggling to figure out how to make a custom layer (for Python, but also for other languages).


Where is the best place to find the latest AWS AMI docker image?

As Greg points out above, the "right" Docker image to use to build layers is lambci/lambda:build-python3.7. That is the official repo for the Docker images SAM uses.

The full list for all AWS Lambda runtime environments, not just Python, is here
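
If you just want that image locally, pulling it straight from Docker Hub works (a one-liner sketch; swap the tag for your target runtime):

docker pull lambci/lambda:build-python3.7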


What's the best way to build your own AWS Lambda layer? ...What's the best way to build a custom Python module layer?

The best way I have found, to date, is to use AWS's SAM in combination with some tweaks from a great blog here.

The tweaks are needed because (at the time I'm writing this) AWS SAM lets you define your layers, but won't actually build them for you. ...See this request on the SAM project's GitHub.

I'm not going to try to explain this in huge detail here - instead please check out the bryson3gps blog. He explains it well, and all the credit goes to him.


OK, a quick background on the process to use:

At present, AWS SAM won't build your layer for you.

Meaning, if you define a requirements.txt for a set of modules to install in a layer, it won't actually install/build them into a local directory ready to upload to AWS (as it does when you use it to define a Lambda function).

But, if you define a layer in SAM, it will package (zip everything and upload to S3) and deploy (define it within AWS with an ARN etc. so it can be used) that layer for you.


The way to get SAM to build your layers too

The hack, at present, to "fool" SAM into also building your layer for you, from the bryson3gps blog here, is to:

  1. Define a dummy AWS Lambda function in your SAM template. Then, for that function, make a pip requirements.txt that SAM will use during the build to load the modules you want into your layer. You won't actually use this function for anything.

This entails making a SAM template.yaml file that defines a basic function. Check out the SAM tutorial, then look at bryson3gps' blog. It's pretty easy.

  2. Define an AWS layer in the same template.yaml file. Again, not too hard - check out the blog

  3. In the SAM spec for your layer definition, set ContentUri (ie where it looks for the files/directories to zip and upload to AWS) to the build location for the function you defined in (1). There's a sketch of what this looks like just after this list.
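
To make that concrete, here is a minimal sketch of the shape the template takes. The resource names (DummyFunction, MyModuleLayer) and the dummy/ directory are hypothetical, and the exact ContentUri path depends on where sam build puts your function's output:

Resources:
  DummyFunction:
    Type: AWS::Serverless::Function      # never invoked; exists only so sam build runs pip
    Properties:
      Runtime: python3.7
      Handler: app.handler
      CodeUri: dummy/                    # holds the requirements.txt listing your layer's modules
  MyModuleLayer:
    Type: AWS::Serverless::LayerVersion
    Properties:
      ContentUri: .aws-sam/build/DummyFunction   # the directory sam build fills for DummyFunction
      CompatibleRuntimes:
        - python3.7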

So, when you run sam build, it will build the function for you (ie process requirements.txt for the function) and put the resulting function packages in a directory, ready to be zipped up and sent to AWS later.

But (this is the key) the layer you defined has its ContentUri pointing at the same directory that sam build populated for the (dummy) function.

So then, when you tell SAM to package (send to S3) and deploy (configure within AWS) the template as a whole, it will upload/create the layer you defined, and the layer's contents will be exactly what was built for the (dummy) function.

It works well.
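
For reference, the full round trip then looks something like this (a sketch - the bucket and stack names are placeholders):

sam build -u
sam package --template-file template.yaml --s3-bucket <your bucket> --output-template-file packaged.yaml
sam deploy --template-file packaged.yaml --stack-name <your stack> --capabilities CAPABILITY_IAM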

A few extra tips

1

In his blog, bryson3gps points out that this method doesn't put the layer's packages in the location where the Lambda runtime finds them by default (for Python that is /opt/python). Instead they end up directly in /opt.

His way around this is to add /opt to sys.path in your Lambda scripts prior to importing:

import sys

sys.path.append('/opt')   # make packages installed at the layer root importable
import <a module in your layer>

Instead of doing that, after sam build but before sam package uploads to S3, you can go into the appropriate .aws-sam/<your package subdir> directory and move everything into a new python directory within that package directory. This results in the layer modules being placed in /opt/python correctly, instead of just /opt.

cd .aws-sam/<wherever your package is>/
mkdir .python        # hidden name, so the glob on the next line won't match it
mv * .python         # shell globs skip dotfiles, so this won't try to move .python into itself
mv .python python    # rename to the python directory Lambda expects under /opt

2

If you're making a Python layer with compiled code (eg scikit-image, which I'm using), make sure you use sam build -u (with the -u flag).

That will make sure the build (pip installing requirements.txt) happens inside a Docker container matching the AWS Lambda runtime, and so downloads the correct compiled libraries for that runtime.

3

If you're including any modules that depend on numpy or scipy (as scikit-image does), then after sam build -u, but before package/deploy, make sure you go into the appropriate .aws-sam/<your package> directory that was built and remove the numpy and scipy packages that the dependency pulled in:

cd .aws-sam/<wherever your package is>/
rm -r numpy*         # removes the numpy package and its dist-info
rm -r scipy*         # scipy is a directory too, so -r rather than -f

Instead, you should tell your Lambda function to use the AWS-supplied numpy/scipy layer (see the sketch below).

I couldn't find a way to tell SAM to run pip with --no-deps, so this has to be done manually
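
For example, attaching the AWS-published SciPy/NumPy layer to a function looks roughly like this. The ARN placeholder is deliberate - look up the exact ARN for your region and runtime in the Lambda console - and MyFunction/MyModuleLayer are the hypothetical names from the sketch above:

MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: python3.7
    Handler: app.handler
    CodeUri: src/
    Layers:
      - <ARN of the AWS-supplied AWSLambda-Python37-SciPy1x layer in your region>
      - !Ref MyModuleLayer   # the custom layer defined earlier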


I'm not an expert at this, but I happened to have the very same set of questions on the same day. However, I can answer questions #1 and #2. Taking them out of order:

2) An AMI is not a Docker image; it's for use in an EC2 instance.

1) Here is how I got the appropriate Docker image:

I installed SAM cli and executed the following commands:

sam init --runtime python3.7   # sets up the hello world example
sam build -u                   # builds the app; -u means build inside a container

Output from sam build -u:

Fetching lambci/lambda:build-python3.7 Docker container image

So there you go. You can either get the image from Docker Hub directly or, if you have the SAM CLI installed, execute "sam build -u". Now that you have the image, you don't have to follow the full SAM workflow if you don't want the overhead.


As of v0.50.0, the SAM CLI has direct support for building layers. You decorate your AWS::Serverless::LayerVersion resource with metadata about which runtime to use for the build.

MyLayer:
  Type: AWS::Serverless::LayerVersion
  Properties:
    Description: Layer description
    ContentUri: 'my_layer/'
    CompatibleRuntimes:
      - python3.8
  Metadata:
    BuildMethod: python3.8
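
With that metadata in place, sam build builds the layer itself - assuming my_layer/ contains a requirements.txt, the packages it lists are installed and laid out under python/ for you, so neither the dummy-function trick nor the manual directory shuffling above is needed:

sam build -u   # -u is still worth using for compiled packages like scikit-image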