One Big Database vs. Several Smaller Ones

I ran a system with the best part of a thousand databases, spread across multiple servers. They were all an identical structure and were synchronised with a template database which was on each of the machines.

This allowed me the ability to migrate databases from one db to another if one was getting excessively over-loaded, and as the client mix changed, I could create new databases on different servers to load balance across the servers. This was the biggest advantage I got from the system, in that I had multiple large lumps of tin performing multiple complicated queries simultaneously on the separate servers.

The great thing about this, is that you can add servers to the configuration at your own speed, as each server starts to get over-loaded, add another into the mix, migrate some dbs across to the new server and end up with a nicely load balanced set of servers. A really nice and simple way to add scale to the system as and when it is required!

The reason I went with this approach rather than the single huge database approach, was the sheer size of the potential database that would have been created... each of the 1000 databases had 200 tables, and many of the individual tables within each of the databases comprised many hundreds of millions of rows of data!

A single database configuration would have required certain tables (approx 8 of them) to have multi-billions of rows of data, and the total db size would have been over 10Tb. We were able to have multiple servers with 5Tb of RAID 10 storage, with many databases on each.

That's what I would do! Hope it helps your decision making... :)


Is the application you are building a SaaS application? If so, I would suggest you consider a third approach - have one DB, with a common structure for all application instances with one difference - add a userid/applicationid column in all tables. This will greatly reduce your application development / maintenance costs. This in my experience is one of the best approaches to storing multi-tenant data.

Also see this great white paper by Microsoft on multi-tenant data architecture

It also highlights the advantages/ disadvantages on the approaches you have mentioned.


Setup B is way easier to manage

Each tablen sits in a different folder. That can be very beneficial if you do no want to test OS limits.

For example, my employer hosts MySQL for a CRM system of car dealerships. Client has 800 dealerships. Each dealership database has 160 tables. That's 128,000 tables.

  • Under Setup A, all 128,000 tables would sit under one database.
  • Under Setup B, each set of 160 tables sits in a subfolder under /var/lib/mysql.

From the perspective of the OS and its ability to handle i-nodes (or FAT tables for Windows), which includes having a maximum number of files per folder:

  • Under Setup A, you would worry about 128,000 files under one folder. Can your OS support that many files under a single folder?
  • Under Setup B, no such worry.

If you had to tweek table structures using ALTER TABLE or some other DDL:

  • Under Setup A, you would have to script the needed DDL using PHP (or specialized MySQL scripts) against the specific table name and corresponding queries before accessing it and making changes
  • Under Setup B, Connect to right database, then access the same named table every time. The access paradigm would always be clean:
    • Specific Database
    • Specific Folder under /var/lib/mysql
    • Specfic TableName.

If you want to put different databases on different disks:

  • Under Setup A, symlinks for every table moved to a separate disk will only exacerbate the "number of inodes in a folder" problem. Disk I/O and overall table access complicates more and increases overall server load since .frm files are repeatedly accessed.
  • Under Setup B, simply move a whole database folder into a separate data mount. Disk I/O can be distributed on demand.
  • CAVEAT : Highly discouraged for InnoDB

Speaking metaphorically, which would you rather have?

  • a gigantic apartment with one bedroom, one bathroom and one kitchen (SetupA)
  • multiple apartments, each with its own bedroom, bathroom and kitchen (SetupB)

When it comes to fixing a radiator in an apartment:

  • With Setup A, every tenant can be inconvenienced and must be involved because you have to talk with affected tenants in front of everyone like it's everybody's business
  • With Setup B, other than hearing some banging on the wall or in the pipes, tenants can go on with their private lives
  • This list and its metaphors can go on and on

IHMO Although budgets may be a driving force to design/infrastructure decisions, I would easily be in favor of separates databases per client.