AWS Lambda not importing LXML

I faced the same issue.

The link posted by Raphaël Braud was helpful and so was this one: https://nervous.io/python/aws/lambda/2016/02/17/scipy-pandas-lambda/

Using the two links I was able to successfully import lxml and other required packages. Here are the steps I followed:

  • Launch an EC2 instance with the Amazon Linux AMI
  • Run the following script to collect the dependencies:

    set -e -o pipefail
    sudo yum -y upgrade
    sudo yum -y install gcc python-devel libxml2-devel libxslt-devel
    
    virtualenv ~/env && cd ~/env && source bin/activate
    pip install lxml
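    # Zip everything from both site-packages directories into deps.zip
    for dir in lib64/python2.7/site-packages \
         lib/python2.7/site-packages
    do
        if [ -d "$dir" ] ; then
            pushd "$dir"; zip -r ~/deps.zip .; popd
        fi
    done
    mkdir -p local/lib
    # Copy the required .so files (substitute the actual list for the placeholder)
    cp /usr/lib64/<required .so files> local/lib/
    zip -r ~/deps.zip local/lib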
    
  • Create handler and worker files as specified in the link. Sample file contents:

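handler.py:

import os
import subprocess

# Directory inside the deployment package that holds the bundled .so files
libdir = os.path.join(os.getcwd(), 'local', 'lib')

def handler(event, context):
    # Run the worker in a subprocess so LD_LIBRARY_PATH is set before lxml loads
    command = 'LD_LIBRARY_PATH={} python worker.py '.format(libdir)
    output = subprocess.check_output(command, shell=True)

    print(output)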

worker.py:

import lxml

def sample_function(input_string=None):
    return "lxml import successful!"

if __name__ == "__main__":
    result = sample_function()
    print(result)

  • Add handler and worker to zip file.

Here is how the structure of the zip file looks after the above steps:

deps 
├── handler.py
├── worker.py 
├── local
│   └── lib
│       ├── libanl.so
│       ├── libBrokenLocale.so
│       ....
├── lxml
│   ├── builder.py
│   ├── builder.pyc
│   ....
├── <other python packages>

  • Make sure you specify the correct handler name when creating the Lambda function. In the above example, it would be "handler.handler".
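
Before uploading, the layout can be sanity-checked locally. The following is only a minimal sketch and not part of the original steps: the function name check_package_layout and its parameters are made up for illustration, and it assumes handler.py and worker.py sit at the root of the archive, as in the tree above.

```python
import zipfile

# Files that Lambda must find at the root of the archive (assumed names
# taken from the example above).
REQUIRED_AT_ROOT = ("handler.py", "worker.py")


def check_package_layout(zip_path, required=REQUIRED_AT_ROOT,
                         lib_prefix="local/lib/", package="lxml"):
    """Return a list of human-readable problems; an empty list means OK."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()

    problems = []
    # handler.py and worker.py must not be nested inside a folder.
    for filename in required:
        if filename not in names:
            problems.append("missing at zip root: " + filename)
    # At least one file should live under local/lib/.
    if not any(n.startswith(lib_prefix) and n != lib_prefix for n in names):
        problems.append("no shared libraries under " + lib_prefix)
    # The Python package itself should be at the top level too.
    if not any(n.startswith(package + "/") for n in names):
        problems.append("package not found at zip root: " + package)
    return problems
```

Calling check_package_layout on the path to deps.zip and printing the returned list shows whether a file ended up nested in a subfolder, which is a common reason for a handler that cannot be found.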

Hope this helps!


I have solved this using the serverless framework and its built-in Docker feature.

Requirement: You have an AWS profile in your .aws folder that can be accessed.

First, install the serverless framework as described here. You can then create a project skeleton using the command serverless create --template aws-python3 --name my-lambda. It will create a serverless.yml file and a handler.py with a simple "hello" function. Run sls deploy to check that it works; if it does, serverless is ready to use.

Next, we'll need an additional plugin named "serverless-python-requirements" for bundling Python requirements. You can install it via sls plugin install --name serverless-python-requirements.

This plugin does the work needed to fix the missing lxml package. In the custom->pythonRequirements section, add the dockerizePip: non-linux property. Your serverless.yml file could look something like this:

service: producthunt-crawler

provider:
  name: aws
  runtime: python3.8

functions:
  hello:
    # some handler that imports lxml
    handler: handler.hello

plugins:
  - serverless-python-requirements

custom:
  pythonRequirements:
    fileName: requirements.txt
    dockerizePip: non-linux

    # Omits tests, __pycache__, *.pyc etc from dependencies
    slim: true

This bundles the Python requirements inside a pre-configured Docker container (with non-linux, Docker is only used when the build machine is not running Linux). After this, run sls deploy, then sls invoke -f hello to check that it works.
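
For reference, a handler that imports lxml could look like the sketch below. The function name hello matches handler.hello in the config above; the response shape and the sample XML are assumptions for illustration, not something the plugin requires.

```python
import json


def hello(event, context):
    """Report whether lxml can be imported inside the Lambda runtime."""
    try:
        # Imported lazily so a missing package shows up in the response
        # instead of as an opaque import error at cold start.
        from lxml import etree
    except ImportError as exc:
        body = {"lxml_imported": False, "error": str(exc)}
        return {"statusCode": 500, "body": json.dumps(body)}

    # Parse a tiny document to confirm the compiled extension really works.
    root = etree.fromstring("<root><item>ok</item></root>")
    body = {"lxml_imported": True, "item": root.findtext("item")}
    return {"statusCode": 200, "body": json.dumps(body)}
```

If the bundling worked, invoking the function should return a 200 response; a 500 response with an ImportError message suggests the requirements were built for the wrong platform.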

If you already deployed with serverless before adding the dockerizePip: non-linux option, make sure to clean up the previously built requirements with sls requirements clean. Otherwise, the plugin just reuses what it built earlier.


Building on these answers, I found the following to work well.

The key is to have pip compile lxml with static libs and to install it into the current directory rather than site-packages.

It also means you can write your Python code as usual, without the need for a separate worker.py or fiddling with LD_LIBRARY_PATH.

sudo yum -y groupinstall 'Development Tools'
sudo yum -y install python36-devel python36-pip
sudo ln -s /usr/bin/pip-3.6 /usr/bin/pip3
mkdir lambda && cd lambda
STATIC_DEPS=true pip3 install -t . lxml
zip -r ~/deps.zip *
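
Before zipping, it may be worth checking that the package really imports from the target directory. The helper below is a hypothetical sketch (the name can_import_from is made up here, and it assumes Python 3.5+ for subprocess.run): it starts a fresh interpreter with the given directory first on sys.path and reports whether the import succeeds.

```python
import subprocess
import sys


def can_import_from(directory, module_name):
    """Return True if module_name imports with directory first on sys.path."""
    # Run in a separate interpreter so modules already loaded in this
    # process cannot mask a broken install.
    code = "import sys; sys.path.insert(0, sys.argv[1]); __import__(sys.argv[2])"
    result = subprocess.run(
        [sys.executable, "-c", code, directory, module_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Run from inside the lambda directory, can_import_from(".", "lxml.etree") should return True if the static build succeeded. Note that this only proves the import works on the build machine, so the build still has to happen on Amazon Linux.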

To take it to the next level, use serverless and Docker to handle everything. Here is a blog post demonstrating this: https://serverless.com/blog/serverless-python-packaging/