Blockchain (Bitcoin) as a database?

I'm very familiar with cryptocurrency and databases, and I can tell you it's not a great DB engine at all.

Using the blockchain as a live database:

Think of the blockchain as a table in first normal form without any really good built-in search or indexing capability. It is basically an Excel sheet without any computation capabilities that just gives you read/write access with lots of verification and validation. A blockchain is a great way to validate that your data is sanitized and correct before you put it in a database, which then lets you query it differently, index it, etc.

Benefits of the blockchain:

The blockchain in this case is purely a ledger and an API for PUT and GET requests. That's about it. The blockchain is interesting because you need a majority of nodes to pass the transaction as valid and there aren't any rollbacks: once it's committed, it's committed. Thus if someone tries to put in a fake transaction, it will be caught unless the person doing it controls a pool with a strong majority share, in which case they can validate it in their pool before anyone else can reject it. That is the strong point of the blockchain: verification that the data is accurate. It is also typically pretty slow. You're looking at about 10 minutes under normal load for a transaction to get validated. Under heavy load the time goes up quite a bit.
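To make the "ledger with PUT and GET" idea concrete, here is a minimal sketch of an append-only, hash-linked ledger. It is a toy under stated assumptions: the class `ToyLedger` and its methods are hypothetical names, there is no network, no majority voting, and no mining, so it only illustrates why a committed record cannot be changed quietly.

```python
import hashlib
import json


class ToyLedger:
    """Append-only, hash-linked list of records (hypothetical toy, no network)."""

    GENESIS_HASH = "0" * 64

    def __init__(self):
        self.blocks = []

    def put(self, record):
        """Append a record and return its position in the ledger."""
        prev_hash = self.blocks[-1]["hash"] if self.blocks else self.GENESIS_HASH
        body = {"index": len(self.blocks), "prev_hash": prev_hash, "record": record}
        self.blocks.append(dict(body, hash=self._digest(body)))
        return body["index"]

    def get(self, index):
        """Read a record back by position; there is no other lookup path."""
        return self.blocks[index]["record"]

    def verify(self):
        """Recompute every hash and check each link to the previous block."""
        prev_hash = self.GENESIS_HASH
        for block in self.blocks:
            body = {key: block[key] for key in ("index", "prev_hash", "record")}
            if block["prev_hash"] != prev_hash or block["hash"] != self._digest(body):
                return False
            prev_hash = block["hash"]
        return True

    @staticmethod
    def _digest(body):
        # Canonical JSON so the same content always hashes to the same value.
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


ledger = ToyLedger()
ledger.put({"from": "A", "to": "B", "amount": 42})
ledger.put({"from": "B", "to": "C", "amount": 7})
print(ledger.verify())  # the untouched chain checks out

# Editing an already committed record breaks the stored hash for that block.
ledger.blocks[0]["record"]["amount"] = 4200
print(ledger.verify())
```

Note what is missing: there is no way to ask "all transfers to B" without scanning every block, which is the lack of indexing described above.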

After you have used the blockchain to validate that the transactions are legitimate and not fraudulent, you can import that data into a database and work with it however you like. I have some experience with this. Note that every single transaction on the current bitcoin architecture is recorded, so there is some interesting information to analyze.

Querying data out of the blockchain schema in a DBMS:

Here is the bitcoin diagram you can use to create the schema in PostgreSQL. Using this you can then put it in a relational DBMS: https://bitcointalk.org/index.php?topic=38246

This code repo is also helpful if you want to import the data into a real RDBMS: https://github.com/bitcoin-abe/bitcoin-abe
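As a rough illustration of what querying imported chain data looks like, here is a minimal sketch. It uses SQLite from the Python standard library instead of PostgreSQL so that it runs on its own, and the three tables and their columns are a heavily simplified, hypothetical schema with made-up rows, not the schema from the linked diagram or the one bitcoin-abe creates.

```python
import sqlite3

# Hypothetical, simplified schema: blocks contain transactions, which have outputs.
SCHEMA = """
CREATE TABLE block (
    height INTEGER PRIMARY KEY,
    hash   TEXT NOT NULL UNIQUE
);
CREATE TABLE tx (
    tx_id        TEXT PRIMARY KEY,
    block_height INTEGER NOT NULL REFERENCES block(height)
);
CREATE TABLE txout (
    tx_id         TEXT NOT NULL REFERENCES tx(tx_id),
    n             INTEGER NOT NULL,
    address       TEXT NOT NULL,
    value_satoshi INTEGER NOT NULL,
    PRIMARY KEY (tx_id, n)
);
-- The index the raw blockchain does not give you.
CREATE INDEX txout_address_idx ON txout(address);
"""


def build_demo_db():
    """Create an in-memory database and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO block VALUES (1, 'hash-1')")
    conn.executemany("INSERT INTO tx VALUES (?, ?)", [("tx-a", 1), ("tx-b", 1)])
    conn.executemany(
        "INSERT INTO txout VALUES (?, ?, ?, ?)",
        [
            ("tx-a", 0, "addr-1", 5000),
            ("tx-a", 1, "addr-2", 2500),
            ("tx-b", 0, "addr-1", 1000),
        ],
    )
    conn.commit()
    return conn


def received_per_address(conn):
    """Total satoshi received by each address, as a dict."""
    rows = conn.execute(
        "SELECT address, SUM(value_satoshi) FROM txout "
        "GROUP BY address ORDER BY address"
    )
    return dict(rows.fetchall())


conn = build_demo_db()
print(received_per_address(conn))
```

An aggregate like this is a single indexed query in an RDBMS, whereas on the raw chain it would mean walking every block.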

As for which DBMS you should put it in, that depends on your use case. If you want to analyze the transactions/wallet IDs to see some patterns or do B.I. work, I would recommend a relational DB. If you want to set up a live ingest with multiple cryptocoins, I would recommend something that doesn't need the transaction log, so a MongoDB solution would be good. I don't think you need to worry about Elasticsearch unless you want to start doing live recording of all cryptocoins at the same time and use it for auto trading or something equally crazy. :)


Is blockchain a potentially viable database solution for modern, high transaction volume applications?

Blockchain technology in general has some characteristics that make it difficult to use with high transaction volumes.

Take a look at Bitcoin, for example. The number of transactions per day has never exceeded 300K: Transactions per day (source blockchain.info)

[Chart: Bitcoin transactions per day, source blockchain.info]

Even more important, the median confirmation time for a transaction is around 8 minutes! See Median Transaction Confirmation Time (With Fee Only) and a nice image from Quandl:

[Chart: median transaction confirmation time, from Quandl]

Now, how many computers around the world are responsible for keeping the bitcoin database? I'm no expert on bitcoin, but I think the complete history of transactions is stored in the blockchain, so all computers that participate in the bitcoin network essentially keep a copy of the entire database (the transactions part of course, not the account info and secret keys, which are kept in the personal wallets).

We can only estimate how many there are, but I'd guess more than a million. 300K transactions a day across a million computers does not sound like high volume. And 8 minutes for confirmation?

A modern RDBMS on decent hardware can easily reach 1K transactions per second. That's about 86M transactions per day. The confirmation time? That depends on the size of the transaction (how many tables and rows it affects), but for a small transaction of the bitcoin type (remove 42 coins from account A and add 42 coins to account B), it will be milliseconds.

In conclusion, by these figures the gap today is roughly 300-fold in daily volume and on the order of 100,000-fold in confirmation time.
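The arithmetic behind this comparison can be checked in a few lines. This is a back-of-the-envelope sketch using the figures quoted above; the 5 ms RDBMS commit time is an assumed stand-in for "milliseconds", so the latency multiple shifts with whatever value is plugged in there.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the answer above.
rdbms_tx_per_second = 1_000
bitcoin_tx_per_day = 300_000
bitcoin_confirm_ms = 8 * 60 * 1000  # median confirmation of about 8 minutes

# Assumption: "milliseconds" taken as 5 ms for a small two-row transfer.
rdbms_commit_ms = 5

rdbms_tx_per_day = rdbms_tx_per_second * SECONDS_PER_DAY
volume_ratio = rdbms_tx_per_day / bitcoin_tx_per_day
latency_ratio = bitcoin_confirm_ms / rdbms_commit_ms

print(f"RDBMS transactions per day: {rdbms_tx_per_day:,}")
print(f"Volume ratio (RDBMS / Bitcoin): {volume_ratio:.0f}x")
print(f"Latency ratio (Bitcoin / RDBMS): {latency_ratio:,.0f}x")
```

With these inputs the daily volume differs by a few hundred-fold and the confirmation time by tens of thousands-fold or more; the exact multiples depend on the assumed commit time and on the hardware behind the 1K transactions per second figure.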

If blockchain technology solves this issue in the future, it might become usable in medium or high volume applications. We can read discussions and suggestions for how the problem should be solved - many of the companies mentioned in the links are actually working on these issues - but we haven't yet seen an actual working solution or product that offers high volume and speed.


In 2014 we built ascribe.io on the premise of using Bitcoin as a database for intellectual property claims. Upon release, we clogged the network because it couldn't handle the throughput; latency was at least 10 minutes, and we were limited by what we could put into OP_RETURN, which forced us to store the actual digital file relating to the claim in Amazon S3. We realized that Bitcoin in its current form could never be a high-transaction database.
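To illustrate the OP_RETURN constraint: the payload is limited to a few dozen bytes, so a file cannot go on-chain, but a fixed-size fingerprint of it can. The sketch below shows that general pattern only and is not a description of what ascribe actually stored; the function names are hypothetical, and the 40-byte limit is an assumed parameter, since the exact limit depends on the node software version.

```python
import hashlib

# Assumption: payload limit in bytes; the real value depends on the Bitcoin
# node version and relay policy, so treat this as a parameter.
MAX_PAYLOAD_BYTES = 40


def claim_payload(file_bytes, max_bytes=MAX_PAYLOAD_BYTES):
    """Return a SHA-256 digest of the file, small enough to embed on-chain."""
    digest = hashlib.sha256(file_bytes).digest()  # always 32 bytes
    if len(digest) > max_bytes:
        raise ValueError(f"digest is {len(digest)} bytes, limit is {max_bytes}")
    return digest


def matches_claim(file_bytes, payload):
    """Check that a file fetched from external storage matches the payload."""
    return hashlib.sha256(file_bytes).digest() == payload


artwork = b"pretend this is a multi-megabyte image" * 1000
payload = claim_payload(artwork)
print(len(artwork), "bytes of file ->", len(payload), "bytes on-chain")
print(matches_claim(artwork, payload))
print(matches_claim(artwork + b"tampered", payload))
```

The file itself still has to live somewhere else, such as the S3 bucket mentioned above, which is exactly the split between "claim on the chain" and "data off the chain" that made Bitcoin awkward as a database.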

But the question of whether we could have a blockchain-style database, with decentralized control, immutability (tamper-resistance) and live assets on the network, stuck with us. So in mid-2014, we started working on BigchainDB.

Long story short - we can process 100k tps with 100 ms latency and have petabytes of capacity. The code is on our BigchainDB GitHub, the technical documentation is available, and the foundational thinking is laid out in our whitepaper.

If you have a use case for a high-transaction, decentralized database - we built BigchainDB exactly for this.