Google Cloud Bigtable vs Google Cloud Datastore

Bigtable is optimized for high volumes of data and analytics

  • By default, a Cloud Bigtable instance has a single cluster in one zone and doesn’t replicate data across zones or regions (data within the cluster is still replicated and durable). That makes it faster, more efficient, and much cheaper at scale, but in this default configuration it is less available and less resilient to a zone outage. Replication across zones or regions is available by adding clusters to the instance.
  • It supports the HBase API, so there are few new paradigms to learn and the risk of lock-in is reduced
  • It integrates with open-source big data tools, so data stored in Bigtable can be analyzed with the analytics tools most customers already use (Hadoop, Spark, etc.)
  • Bigtable rows are indexed by a single row key; there are no secondary indexes
  • A single-cluster Bigtable instance runs in a single zone
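
To make the single-index point concrete, here is a minimal sketch in plain Python. It does not use the Bigtable client; it only models a table as rows sorted by key, which is how Bigtable orders them. The key layout ("<device>#<timestamp>"), the sample data, and the helper names are all hypothetical.

```python
from bisect import bisect_left

# Hypothetical rows keyed as "<device>#<timestamp>".
rows = {
    "sensor-a#2024-01-01T00:00": {"temp": 20.5},
    "sensor-a#2024-01-01T01:00": {"temp": 21.0},
    "sensor-b#2024-01-01T00:00": {"temp": 18.2},
    "sensor-c#2024-01-01T00:00": {"temp": 25.1},
}
sorted_keys = sorted(rows)  # rows are kept in lexicographic key order


def prefix_scan(prefix):
    """Return (key, row) pairs whose key starts with prefix, as one contiguous range read."""
    start = bisect_left(sorted_keys, prefix)
    result = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so the matching range has ended
        result.append((key, rows[key]))
    return result


def filter_by_value(column, predicate):
    """Without a secondary index, filtering on a column value visits every row."""
    return [(k, r) for k, r in sorted(rows.items()) if predicate(r[column])]


sensor_a = prefix_scan("sensor-a#")                    # cheap: uses the key order
warm = filter_by_value("temp", lambda t: t > 21)       # expensive: full scan
```

The practical consequence is that the row key has to be designed around the queries you expect to run, since anything not expressible as a key lookup or key-range scan turns into a scan with filtering.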

Cloud Bigtable is designed for larger companies and enterprises that often have large data volumes and complex backend workloads.

Datastore is optimized to serve high-value transactional data to applications

  • Cloud Datastore has extremely high availability with replication and data synchronization
  • Datastore, because of its versatility and high availability, is more expensive
  • Datastore writes are slower because data is replicated synchronously
  • Datastore has much richer support for transactions and queries (it maintains secondary indexes)
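
As a rough illustration of what secondary indexes buy you, the sketch below models a property index as a map from (property, value) to entity keys. This is a toy in-memory model, not how Datastore implements its indexes, and the entities and the `query` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical entities, addressed by entity key.
entities = {
    "task-1": {"owner": "alice", "done": False},
    "task-2": {"owner": "bob", "done": True},
    "task-3": {"owner": "alice", "done": True},
}

# index[property][value] -> set of entity keys, maintained at write time.
index = defaultdict(lambda: defaultdict(set))
for key, props in entities.items():
    for name, value in props.items():
        index[name][value].add(key)


def query(**filters):
    """Return sorted keys of entities matching every property=value filter."""
    matches = None
    for name, value in filters.items():
        keys = index[name][value]
        matches = set(keys) if matches is None else matches & keys
    return sorted(matches or [])


alice_tasks = query(owner="alice")
alice_done = query(owner="alice", done=True)
```

Queries on non-key properties are answered from the index rather than by scanning every entity; the trade-off is that each write also has to update the index entries.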

Based on experience with Datastore and reading the Bigtable docs, the main differences are:

  • Bigtable was originally designed for HBase compatibility, but now has client libraries in multiple languages. Datastore was initially geared more towards Python/Java/Go web app developers (it started out as part of App Engine)
  • Bigtable is 'a bit more IaaS' than Datastore in that it's not 'just there' but requires a cluster to be configured.
  • Bigtable supports only one index - the 'row key' (the equivalent of the entity key in Datastore)
    • This means queries can only be on the row key, unlike Datastore's queries on indexed properties
  • Bigtable supports atomicity only within a single row - there are no multi-row transactions
  • Mutations and deletions that span multiple rows are not atomic in Bigtable, whereas Datastore supports transactions across entities and provides eventual or strong consistency, depending on the read/query method
  • The billing model is very different:
    • Datastore charges for read/write operations, storage and bandwidth
    • Bigtable charges for 'nodes', storage and bandwidth
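
The difference between the two billing models can be sketched as two small formulas. Everything here is an assumption for illustration: the function names, the per-100k-operation pricing unit, the 730-hour month, and every price passed in are placeholders, not actual Google Cloud prices.

```python
def datastore_monthly_cost(reads, writes, stored_gb, egress_gb,
                           price_per_100k_reads, price_per_100k_writes,
                           price_per_gb_stored, price_per_gb_egress):
    """Operation-based model: cost grows with the number of reads and writes."""
    return (reads / 100_000 * price_per_100k_reads
            + writes / 100_000 * price_per_100k_writes
            + stored_gb * price_per_gb_stored
            + egress_gb * price_per_gb_egress)


def bigtable_monthly_cost(nodes, stored_gb, egress_gb,
                          price_per_node_hour, price_per_gb_stored,
                          price_per_gb_egress, hours_per_month=730):
    """Capacity-based model: nodes are billed for every hour they are provisioned."""
    return (nodes * hours_per_month * price_per_node_hour
            + stored_gb * price_per_gb_stored
            + egress_gb * price_per_gb_egress)


# Placeholder prices only; check the current pricing pages for real figures.
datastore_cost = datastore_monthly_cost(
    reads=1_000_000, writes=200_000, stored_gb=50, egress_gb=10,
    price_per_100k_reads=0.05, price_per_100k_writes=0.15,
    price_per_gb_stored=0.2, price_per_gb_egress=0.1)
bigtable_cost = bigtable_monthly_cost(
    nodes=3, stored_gb=50, egress_gb=10,
    price_per_node_hour=0.5, price_per_gb_stored=0.2, price_per_gb_egress=0.1)
```

The shape of the formulas is the point: Datastore cost tracks request volume and can be near zero for an idle app, while Bigtable has a fixed floor set by the provisioned nodes regardless of traffic.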