Download a folder from S3 using Boto3

Quick and dirty, but it works:

import os

import boto3

def downloadDirectoryFroms3(bucketName, remoteDirectoryName):
    s3_resource = boto3.resource('s3')
    bucket = s3_resource.Bucket(bucketName)
    for obj in bucket.objects.filter(Prefix=remoteDirectoryName):
        if obj.key.endswith('/'):  # skip zero-byte "directory" placeholder objects
            continue
        dirname = os.path.dirname(obj.key)
        if dirname:  # guard against keys with no directory part
            os.makedirs(dirname, exist_ok=True)
        bucket.download_file(obj.key, obj.key)  # save to the same relative path locally

Assuming you want to download the directory foo/bar from S3, the for-loop iterates over all objects whose key starts with the prefix foo/bar. Keys ending in / are zero-byte placeholder objects that some tools (such as the S3 console) create for "folders", which is why they're skipped before downloading.
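
For example, with a hypothetical bucket name and prefix:

# 'my-bucket' and 'foo/bar' are placeholders -- substitute your own.
downloadDirectoryFroms3('my-bucket', 'foo/bar')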


You could also use cloudpathlib, which wraps boto3 for S3. For your use case, it's pretty simple:

from cloudpathlib import CloudPath

cp = CloudPath("s3://bucket/folder/folder2/")
cp.download_to("local_folder")
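
Because CloudPath follows the pathlib interface, you can also filter what gets downloaded. A small sketch, assuming the same hypothetical bucket layout and that you only want the .csv files:

from cloudpathlib import CloudPath

folder = CloudPath("s3://bucket/folder/folder2/")
for p in folder.glob("**/*.csv"):     # recursive match, pathlib-style
    p.download_to("local_folder")     # downloads each matching file into local_folder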


A slightly less dirty modification of the accepted answer by Konstantinos Katsantonis:

import os

import boto3

s3 = boto3.resource('s3')  # assumes credentials & configuration are handled outside Python, e.g. in the .aws directory or environment variables

def download_s3_folder(bucket_name, s3_folder, local_dir=None):
    """
    Download the contents of an S3 folder to a local directory
    Args:
        bucket_name: the name of the s3 bucket
        s3_folder: the folder path in the s3 bucket
        local_dir: a relative or absolute directory path in the local file system
    """
    bucket = s3.Bucket(bucket_name)
    for obj in bucket.objects.filter(Prefix=s3_folder):
        target = obj.key if local_dir is None \
            else os.path.join(local_dir, os.path.relpath(obj.key, s3_folder))
        target_dir = os.path.dirname(target)
        if target_dir:  # create intermediate directories as needed
            os.makedirs(target_dir, exist_ok=True)
        if obj.key.endswith('/'):  # skip "directory" placeholder objects
            continue
        bucket.download_file(obj.key, target)

This downloads nested subdirectories too; I was able to download a directory with over 3,000 files in it. You'll find other solutions at Boto3 to download all files from a S3 Bucket, but I don't know whether they're any better.
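
For example (the bucket name and paths here are hypothetical):

# Mirror s3://my-bucket/foo/bar/ into ./data/bar, preserving subdirectories.
download_s3_folder('my-bucket', 'foo/bar', local_dir='data/bar')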