How to install external modules in a Python Lambda Function created by AWS CDK?

You do not even need the experimental PythonFunction construct for this: CDK has built-in support for bundling dependencies into a plain Lambda package (not a Docker image). It uses Docker to do the build, but the final result is still a simple zip of files. The documentation shows it here: https://docs.aws.amazon.com/cdk/api/latest/docs/aws-lambda-readme.html#bundling-asset-code ; the gist is:

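// CDK v1 import path; in CDK v2 use 'aws-cdk-lib/aws-lambda'.
import { Code, Function, Runtime } from '@aws-cdk/aws-lambda';
import * as path from 'path';

new Function(this, 'Function', {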
  code: Code.fromAsset(path.join(__dirname, 'my-python-handler'), {
    bundling: {
      image: Runtime.PYTHON_3_9.bundlingImage,
      command: [
        'bash', '-c',
        'pip install -r requirements.txt -t /asset-output && cp -au . /asset-output'
      ],
    },
  }),
  runtime: Runtime.PYTHON_3_9,
  handler: 'index.handler',
});

I have used this exact configuration in my CDK deployment and it works well.

The Python equivalent is:

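# CDK v1 imports; in CDK v2, BundlingOptions is exported from aws_cdk directly.
from aws_cdk import aws_lambda, core

# Inside a Stack (or other construct) method:
aws_lambda.Function(
    self,
    "Function",
    runtime=aws_lambda.Runtime.PYTHON_3_9,
    handler="index.handler",
    code=aws_lambda.Code.from_asset(
        "function_source_dir",
        bundling=core.BundlingOptions(
            image=aws_lambda.Runtime.PYTHON_3_9.bundling_image,
            # Install the dependencies into /asset-output, then copy the source next to them.
            command=[
                "bash", "-c",
                "pip install --no-cache -r requirements.txt -t /asset-output && cp -au . /asset-output"
            ],
        ),
    ),
)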
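
The bundling command does two things inside the container: it installs the packages listed in requirements.txt into /asset-output, then copies the function source alongside them. If you reuse the pattern for several functions, a small helper can build the command list. This is only a sketch: `bundling_command` and its parameters are hypothetical names, not part of the CDK API.

```python
import shlex
from typing import List


def bundling_command(
    requirements: str = "requirements.txt",
    output_dir: str = "/asset-output",
    no_cache: bool = True,
) -> List[str]:
    """Build the `command` list for the bundling options (hypothetical helper)."""
    pip_parts = ["pip", "install"]
    if no_cache:
        # --no-cache-dir is the full spelling of pip's "do not cache" flag.
        pip_parts.append("--no-cache-dir")
    pip_parts += ["-r", requirements, "-t", output_dir]
    # Step 1: install the dependencies into the output directory.
    pip_step = " ".join(shlex.quote(part) for part in pip_parts)
    # Step 2: copy the handler source next to the installed packages.
    copy_step = "cp -au . " + shlex.quote(output_dir)
    return ["bash", "-c", pip_step + " && " + copy_step]
```

The returned list could then be passed as `command=bundling_command()` in the bundling options.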

UPDATE:

The CDK now has a new (experimental) type of Lambda function, the PythonFunction construct. It supports a requirements.txt file and uses a Docker container to install the listed modules into your function. Its documentation says, specifically:

If requirements.txt or Pipfile exists at the entry path, the construct will handle installing all required modules in a Lambda compatible Docker container according to the runtime.
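
A minimal sketch of what using that construct might look like. The module path below is the CDK v1 one (`aws_cdk.aws_lambda_python`); in CDK v2 the construct lives in the separate `aws_cdk.aws_lambda_python_alpha` package. The helper name `add_python_function` and the entry directory are illustrative assumptions, not taken from the docs.

```python
def add_python_function(scope, construct_id: str, entry: str):
    """Create a PythonFunction whose dependencies are bundled by the CDK.

    `entry` is a directory containing index.py and, optionally, a
    requirements.txt or Pipfile. The imports are local so that this
    module can be loaded without the CDK packages installed.
    """
    # Assumption: CDK v1 package names. For CDK v2, import PythonFunction
    # from aws_cdk.aws_lambda_python_alpha instead.
    from aws_cdk import aws_lambda
    from aws_cdk.aws_lambda_python import PythonFunction

    return PythonFunction(
        scope,
        construct_id,
        entry=entry,          # directory with the handler source
        index="index.py",     # file containing the handler
        handler="handler",    # function name inside that file
        runtime=aws_lambda.Runtime.PYTHON_3_9,
    )
```

Inside a stack you would call it as `add_python_function(self, "Function", "function_source_dir")`.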

Original Answer:

So this is the awesome bit of code my manager wrote that we now use:


    import os
    import subprocess

    from aws_cdk import aws_lambda

    def create_dependencies_layer(self, project_name: str, function_name: str) -> aws_lambda.LayerVersion:
        requirements_file = "lambda_dependencies/" + function_name + ".txt"
        output_dir = ".lambda_dependencies/" + function_name

        # Install requirements for the layer in the output_dir, unless SKIP_PIP is set
        if not os.environ.get("SKIP_PIP"):
            # Note: pip will create the output dir if it does not exist.
            # Lambda looks for a layer's Python packages under a top-level "python/" directory.
            # Passing an argument list (rather than splitting a string) keeps paths with spaces intact.
            subprocess.check_call(
                ["pip", "install", "-r", requirements_file, "-t", f"{output_dir}/python"]
            )
        return aws_lambda.LayerVersion(
            self,
            project_name + "-" + function_name + "-dependencies",
            code=aws_lambda.Code.from_asset(output_dir),
        )

It's a method of the Stack class (defined alongside __init__, not inside it). We have a folder called lambda_dependencies that contains one text file per Lambda function we deploy; each file is just a list of dependencies, like a requirements.txt.
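
To make the layout concrete, here is a small sketch of how the paths fit together for one function. `LayerPaths` and `layer_paths` are hypothetical names; the directory names are the ones used in the method, and the python/ subdirectory is where Lambda looks for a layer's Python packages.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class LayerPaths(NamedTuple):
    requirements_file: str  # input: one dependency list per function
    asset_dir: str          # directory handed to Code.from_asset
    pip_target: str         # directory pip installs the packages into


def layer_paths(function_name: str) -> LayerPaths:
    """Compute the per-function paths used when building the layer."""
    asset_dir = PurePosixPath(".lambda_dependencies") / function_name
    requirements = PurePosixPath("lambda_dependencies") / f"{function_name}.txt"
    return LayerPaths(
        requirements_file=str(requirements),
        asset_dir=str(asset_dir),
        pip_target=str(asset_dir / "python"),
    )
```

For a function named `get_data`, the dependency list would live in `lambda_dependencies/get_data.txt` and the packages would be installed under `.lambda_dependencies/get_data/python`.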

To use create_dependencies_layer, we include it in the Lambda function definition like this:


        get_data_lambda = aws_lambda.Function(
            self,
            .....
            layers=[self.create_dependencies_layer(PROJECT_NAME, GET_DATA_LAMBDA_NAME)]
        )