Serving dynamically generated ZIP archives in Django

Here's a Django view to do this:

import os
import zipfile
import StringIO

from django.http import HttpResponse


def getfiles(request):
    # Files (local path) to put in the .zip
    # FIXME: Change this (get paths from DB etc)
    filenames = ["/tmp/file1.txt", "/tmp/file2.txt"]

    # Folder name in ZIP archive which contains the above files
    # E.g [thearchive.zip]/somefiles/file2.txt
    # FIXME: Set this to something better
    zip_subdir = "somefiles"
    zip_filename = "%s.zip" % zip_subdir

    # Open StringIO to grab in-memory ZIP contents
    s = StringIO.StringIO()

    # The zip compressor
    zf = zipfile.ZipFile(s, "w")

    for fpath in filenames:
        # Calculate path for file in zip
        fdir, fname = os.path.split(fpath)
        zip_path = os.path.join(zip_subdir, fname)

        # Add file, at correct path
        zf.write(fpath, zip_path)

    # Must close zip for all contents to be written
    zf.close()

    # Grab ZIP file from in-memory, make response with correct MIME-type
    resp = HttpResponse(s.getvalue(), mimetype = "application/x-zip-compressed")
    # ..and correct content-disposition
    resp['Content-Disposition'] = 'attachment; filename=%s' % zip_filename

    return resp

The solution is as follows.

Use Python module zipfile to create zip archive, but as the file specify StringIO object (ZipFile constructor requires file-like object). Add files you want to compress. Then in your Django application return the content of StringIO object in HttpResponse with mimetype set to application/x-zip-compressed (or at least application/octet-stream). If you want, you can set content-disposition header, but this should not be really required.

But beware, creating zip archives on each request is bad idea and this may kill your server (not counting timeouts if the archives are large). Performance-wise approach is to cache generated output somewhere in filesystem and regenerate it only if source files have changed. Even better idea is to prepare archives in advance (eg. by cron job) and have your web server serving them as usual statics.

Tags:

Python

Django