Can I restore my source code that has been uploaded into Google AppEngine?

Update: Google App Engine now allows you to download your code (for Python, Java, PHP and Go apps).

Tool documentation here.


Since I just went to all the trouble of figuring out how to do this, I figure I may as well include it as an answer, even if it doesn't apply to you:

Before continuing, swear on your mother's grave that next time you will back your code up, or better, use source control. I mean it: Repeat after me "next time I will use source control". Okay, with that done, let's see if it's possible to recover your code for you...

If your app was written in Java, I'm afraid you're out of luck: for Java apps, the source code isn't even uploaded to App Engine.

If your app was written in Python, and had both the remote_api and deferred handlers defined, it's possible to recover your source code through the interaction of these two APIs. The basic trick goes like this:

  1. Start the remote_api_shell
  2. Create a new deferred task that reads in all your files and writes them to the datastore
  3. Wait for that task to execute
  4. Extract your data from the datastore, using remote_api

Looking at them in order:

Starting the remote_api_shell

Simply type the following from a command line:

remote_api_shell.py your_app_id

If the shell isn't in your path, prefix the command with the path to the App Engine SDK directory.

Writing your source to the datastore

Here we're going to take advantage of the fact that you have the deferred handler installed, that you can use remote_api to enqueue tasks for deferred, and that you can defer an invocation of the Python built-in function 'eval'.

This is made slightly trickier by the fact that 'eval' evaluates only a single expression, not an arbitrary block of statements, so we need to formulate our entire code as a single expression. Here it is:

expr = """
[type(
    'CodeFile',
    (__import__('google.appengine.ext.db').appengine.ext.db.Expando,),
    {})(
        name=dp+'/'+fn,
        data=__import__('google.appengine.ext.db').appengine.ext.db.Text(
            open(dp + '/' + fn).read()
        )
    ).put()
 for dp, dns, fns in __import__('os').walk('.')
 for fn in fns]
"""

from google.appengine.ext.deferred import defer
defer(eval, expr)

Quite the hack. Let's look at it a bit at a time:

First, we use the 'type' builtin function to dynamically create a new subclass of db.Expando. The three arguments to type() are the name of the new class, a tuple of parent classes, and the dict of class variables. The first four lines of the expression are equivalent to this:

from google.appengine.ext import db
class CodeFile(db.Expando): pass
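
The equivalence between the class statement and the three-argument type() call can be checked locally without the App Engine SDK. This is only a sketch: Base is a made-up stand-in for db.Expando that accepts arbitrary keyword attributes, and the CodeFileA/CodeFileB names are for illustration only.

```python
class Base(object):
    """Made-up stand-in for db.Expando: accepts arbitrary keyword attributes."""
    def __init__(self, **attrs):
        for key, value in attrs.items():
            setattr(self, key, value)

# Statement form, which eval() cannot run.
class CodeFileA(Base):
    pass

# Expression form: type(name, bases, namespace) builds the same kind of class.
CodeFileB = type('CodeFileB', (Base,), {})

a = CodeFileA(name='./main.py', data='print(1)')
b = CodeFileB(name='./main.py', data='print(1)')
```

Both classes behave the same way; the second one can simply be created in the middle of an expression.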

The use of '__import__' here is another workaround for the fact that we can't use statements: the expression __import__('google.appengine.ext.db') imports the referenced module, but returns the top-level module (google), which is why it is followed by .appengine.ext.db to reach the module we actually want.
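
Both points can be tried out in plain Python. The sketch below uses the standard library's xml.dom.minidom in place of google.appengine.ext.db, and is_expression is a hypothetical helper written just for this illustration.

```python
def is_expression(source):
    """Hypothetical helper: True if `source` compiles as a single eval() expression."""
    try:
        compile(source, '<string>', 'eval')
        return True
    except SyntaxError:
        return False

# An import statement is rejected, but the equivalent __import__ call is accepted.
statement_ok = is_expression('import xml.dom.minidom')
call_ok = is_expression("__import__('xml.dom.minidom')")

# With a dotted name, __import__ returns the top-level package ...
top = eval("__import__('xml.dom.minidom')")
# ... so the requested submodule is reached by attribute access.
minidom = top.dom.minidom
```

Here top is the xml package, just as the call in the expression above returns google rather than google.appengine.ext.db.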

Since type() returns the new class, we now have an Expando subclass we can use to store data to the datastore. Next, we call its constructor, passing it two arguments, 'name' and 'data'. The name we construct from the concatenation of the directory and file we're currently dealing with, while the data is the result of opening that filename and reading its content, wrapped in a db.Text object so it can be arbitrarily long. Finally, we call .put() on the returned instance to store it to the datastore.

In order to read and store all the source, instead of just one file, this whole expression takes place inside a list comprehension, which iterates first over the result of os.walk, which conveniently returns all the directories and files under a base directory, then over each file in each of those directories. The return value of this expression - a list of keys that were written to the datastore - is simply discarded by the deferred module. That doesn't matter, though, since it's only the side-effects we care about.
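
To see the shape of that comprehension in isolation, here is a small local sketch. It assumes nothing about App Engine: a plain dict stands in for the datastore, the directory tree is a throwaway one created with tempfile, and read_file and save are illustrative helpers.

```python
import os
import tempfile

# Build a small throwaway tree: root/main.py and root/lib/util.py
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, 'lib'))
for rel, text in [('main.py', 'print(1)\n'),
                  (os.path.join('lib', 'util.py'), 'X = 2\n')]:
    with open(os.path.join(root, rel), 'w') as fh:
        fh.write(text)

store = {}  # stand-in for the datastore

def read_file(path):
    """Read a whole file and close it again."""
    with open(path) as fh:
        return fh.read()

def save(path):
    """Stand-in for CodeFile(...).put(): record the file and return its key."""
    store[path] = read_file(path)
    return path

# Same shape as the deferred expression: the outer loop walks directories,
# the inner loop visits each file, and the real work is a side effect.
keys = [save(os.path.join(dp, fn))
        for dp, dns, fns in os.walk(root)
        for fn in fns]
```

The keys list plays the role of the discarded return value; store is what we actually care about.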

Finally, we call the defer function, deferring an invocation of eval, with the expression we just described as its argument.
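
Why can a builtin like eval be deferred at all? As far as I understand it, the deferred library pickles the callable together with its arguments and unpickles them in the task handler. The sketch below imitates that idea locally; fake_defer and fake_run are hypothetical stand-ins, not the library's API.

```python
import pickle

def fake_defer(func, *args, **kwargs):
    """Hypothetical stand-in for defer(): serialize the call instead of running it."""
    return pickle.dumps((func, args, kwargs))

def fake_run(payload):
    """Hypothetical stand-in for the task handler: unpickle the call and execute it."""
    func, args, kwargs = pickle.loads(payload)
    return func(*args, **kwargs)

# Builtins are pickled by reference, so eval plus a string survives the round trip.
payload = fake_defer(eval, "sum(n for n in range(5))")
result = fake_run(payload)
```

Because only a reference to eval and the expression string travel in the payload, none of your own application code has to be importable on the receiving side.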

Reading out the data

After executing the above, and waiting for it to complete, we can extract the data from the datastore, again using remote_api. First, we need a local version of the CodeFile model:

import os
from google.appengine.ext import db
class CodeFile(db.Model):
  name = db.StringProperty(required=True)
  data = db.TextProperty(required=True)

Now, we can fetch all its entities, storing them to disk:

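for cf in CodeFile.all():
  dirname = os.path.dirname(cf.name)
  # Create the parent directory unless it already exists
  if dirname and not os.path.isdir(dirname):
    os.makedirs(dirname)
  fh = open(cf.name, "w")
  fh.write(cf.data)
  fh.close()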
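
If you'd like to rehearse the write-out step before pointing it at real data, the following sketch does the same thing with made-up (name, data) pairs instead of CodeFile entities. restore_files and target_dir are names invented for this example, and it writes into a temporary directory rather than the current one.

```python
import os
import tempfile

def restore_files(records, target_dir):
    """Write (name, data) pairs under target_dir, creating directories as needed."""
    written = []
    for name, data in records:
        # Names produced by os.walk('.') look like './lib/util.py'.
        path = os.path.normpath(os.path.join(target_dir, name))
        parent = os.path.dirname(path)
        if not os.path.isdir(parent):
            os.makedirs(parent)
        with open(path, 'w') as fh:
            fh.write(data)
        written.append(path)
    return written

# Made-up records standing in for CodeFile entities.
records = [('./main.py', 'print(1)\n'), ('./lib/util.py', 'X = 2\n')]
target = tempfile.mkdtemp()
paths = restore_files(records, target)
```

Restoring into a separate directory also avoids overwriting anything in your current working directory by accident.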

That's it! Your local filesystem should now contain your source code.

One caveat: The downloaded code will only contain your code and data files. Static files aren't included, though you should be able to simply download them over HTTP, if you remember what they all are. Configuration files, such as app.yaml, are similarly not included, and can't be recovered, so you'll need to rewrite them. Still, a lot better than rewriting your whole app, right?