External Backups/Snapshots for Google Cloud Spanner

Today, you can stream out a consistent snapshot by reading out all the data with your favorite tool (MapReduce, Spark, Dataflow), using reads at a specific timestamp (via Timestamp Bounds) to keep the snapshot consistent.

https://cloud.google.com/spanner/docs/timestamp-bounds

You have roughly one hour to complete the export before the old data versions are garbage-collected.
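For illustration, here is a minimal sketch of a timestamp-bounded read with the google-cloud-spanner Python client; the instance, database, and table names are placeholders:

    import datetime

    from google.cloud import spanner

    client = spanner.Client()
    database = client.instance("my-instance").database("my-database")

    # Pin all reads to a single timestamp so the exported data is consistent.
    snapshot_time = datetime.datetime.now(datetime.timezone.utc)

    with database.snapshot(read_timestamp=snapshot_time) as snapshot:
        results = snapshot.execute_sql("SELECT SingerId, FirstName FROM Singers")
        for row in results:
            print(row)

Every worker doing the export would use the same read_timestamp; that shared timestamp is what makes the overall snapshot consistent.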

In the future, we will provide an Apache Beam/Dataflow connector to do this in a more scalable fashion. This will be our preferred method for importing and exporting data to and from Cloud Spanner.
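For reference, here is a sketch of what such a pipeline can look like with the experimental Python connector that later shipped in Beam (apache_beam.io.gcp.experimental.spannerio); the project, instance, database, query, and bucket are placeholders, and the module path is worth verifying against your Beam version:

    import apache_beam as beam
    from apache_beam.io.gcp.experimental.spannerio import ReadFromSpanner

    with beam.Pipeline() as pipeline:
        rows = pipeline | "ReadFromSpanner" >> ReadFromSpanner(
            project_id="my-project",
            instance_id="my-instance",
            database_id="my-database",
            sql="SELECT SingerId, FirstName FROM Singers",
        )
        # Format each row as CSV and write the shards to Cloud Storage.
        csv_lines = rows | "ToCsv" >> beam.Map(
            lambda row: ",".join(str(field) for field in row)
        )
        csv_lines | "WriteToGCS" >> beam.io.WriteToText("gs://my-bucket/singers")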

Longer term, we will support backups and the ability to restore to a backup but that functionality is not currently available.


As of July 2018, Cloud Spanner offers Import and Export functionality, which allows you to export a database to Avro format. If you open the specific Cloud Spanner database in the Google Cloud Console, you will see Import and Export buttons toward the top. Click Export, fill in the requested information such as a destination Google Cloud Storage bucket, and the database will be backed up in Avro format to Google Cloud Storage. To restore a database, use the corresponding Import functionality in the Cloud Spanner section of the Google Cloud Console.

Note: The actual backup and restore (i.e., export and import) are done using Google Cloud Dataflow, and you will be charged for the Dataflow operation.
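If you prefer to script the export instead of clicking through the console, the same Google-provided Dataflow template can be launched through the Dataflow API. Here is a hedged sketch in Python: Cloud_Spanner_to_GCS_Avro is the Google-provided export template, but the parameter names and region shown are assumptions you should check against the Import/Export documentation:

    from googleapiclient.discovery import build

    # Launch the Google-provided Spanner-to-Avro export template.
    # Project, region, instance, database, and bucket are placeholders.
    dataflow = build("dataflow", "v1b3")
    request = dataflow.projects().locations().templates().launch(
        projectId="my-project",
        location="us-central1",
        gcsPath="gs://dataflow-templates/latest/Cloud_Spanner_to_GCS_Avro",
        body={
            "jobName": "spanner-export-job",
            "parameters": {
                "instanceId": "my-instance",
                "databaseId": "my-database",
                "outputDir": "gs://my-bucket/spanner-export",
            },
        },
    )
    response = request.execute()
    print(response["job"]["id"])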

See the documentation for the Import and Export functionality: https://cloud.google.com/spanner/docs/export and https://cloud.google.com/spanner/docs/import


Google Cloud Spanner now has two methods for taking backups.

https://cloud.google.com/spanner/docs/backup

You can either use the built-in backups or do an export/import using a Dataflow job.
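As a worked example of the built-in option, here is a minimal backup-and-restore sketch with the google-cloud-spanner Python client; the instance, database, and backup IDs are placeholders:

    import datetime

    from google.cloud import spanner

    client = spanner.Client()
    instance = client.instance("my-instance")
    database = instance.database("my-database")

    # Create a backup; Spanner requires an expiration time for every backup.
    expire_time = datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(days=14)
    backup = instance.backup("my-backup", database=database, expire_time=expire_time)
    operation = backup.create()
    operation.result(timeout=3600)  # wait for the backup to complete

    # Restore the backup into a new database on the same instance.
    restored = instance.database("my-restored-database")
    restore_op = restored.restore(backup)
    restore_op.result(timeout=3600)

Note that built-in backups live on the same instance as the database, so export/import via Dataflow remains the way to get a copy out of Spanner entirely (e.g., into Cloud Storage).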