SaaS database design - Multiple Databases? Split?

For SaaS applications, you typically use a separate database for each tenant, but you usually don't split a tenant's data module-wise.

This is the most common model I have seen in SaaS application design. Your base schema is replicated for each tenant that you add to your application.
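As a rough illustration of the database-per-tenant model, here is a minimal Python sketch. It uses in-memory SQLite databases as stand-ins for real per-tenant databases, and the schema, the `TenantRegistry` class, and its method names are all hypothetical, made up for this example rather than taken from any particular framework.

```python
import sqlite3

# Hypothetical base schema, applied identically to every tenant database.
BASE_SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE invoices (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    amount_cents INTEGER NOT NULL
);
"""


class TenantRegistry:
    """Maps each tenant to its own database holding a copy of the base schema."""

    def __init__(self):
        self._connections = {}

    def add_tenant(self, tenant_id):
        """Create a fresh database for a new tenant and replicate the schema."""
        if tenant_id in self._connections:
            raise ValueError(f"tenant {tenant_id!r} already exists")
        # ":memory:" is an assumption for the demo; a real system would
        # connect to a separate database (or server) per tenant.
        conn = sqlite3.connect(":memory:")
        conn.executescript(BASE_SCHEMA)
        self._connections[tenant_id] = conn
        return conn

    def connection_for(self, tenant_id):
        """Return the database connection that belongs to this tenant."""
        return self._connections[tenant_id]
```

Every request would first resolve the tenant (for example from the login or subdomain) and then run its queries on `connection_for(tenant_id)`, so one tenant's rows never share tables with another's.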


Start with one database. Split data/functionality when the project requires it.

Here is what we can learn from LinkedIn:

  • A single database does not work
  • Referential integrity will not be possible
  • Any data loss is a problem
  • Caching is good even when it's modestly effective
  • Never underestimate growth trajectory

Source:

LinkedIn architecture

LinkedIn communication architecture


High Scalability is a good blog for scaling SaaS applications. As mentioned, splitting tables across databases as you suggested is generally a bad idea. But a similar concept is sharding, where you keep the same (or similar) schema, but split the data across multiple servers. For example, users 1-5000 are on server1, and users 5001-10000 are on server2. Depending on the queries your application uses, it can be an efficient way to scale.
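To make the range-based sharding idea concrete, here is a small Python sketch of the lookup step. It assumes non-overlapping ranges (user IDs 1 to 5000 on server1, 5001 to 10000 on server2); the names `SHARD_UPPER_BOUNDS`, `SHARD_SERVERS`, and `shard_for_user` are hypothetical.

```python
import bisect

# Hypothetical shard map: the inclusive upper bound of each user ID range,
# and the server that holds that range. Both lists share the same order.
SHARD_UPPER_BOUNDS = [5000, 10000]
SHARD_SERVERS = ["server1", "server2"]


def shard_for_user(user_id):
    """Return the name of the server that stores the given user ID."""
    if user_id < 1 or user_id > SHARD_UPPER_BOUNDS[-1]:
        raise ValueError(f"user_id {user_id} is outside every shard range")
    # bisect_left finds the first range whose upper bound is >= user_id.
    index = bisect.bisect_left(SHARD_UPPER_BOUNDS, user_id)
    return SHARD_SERVERS[index]
```

Queries that touch a single user only need to talk to one server. Queries that span many users have to fan out to every shard and merge the results, which is why the benefit depends on your query patterns.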


Having a single database is best for data integrity because then you can use foreign keys. You can't have this built-in data integrity if you split the data into multiple databases. This isn't an issue if your data isn't related, but if it is related, it would be possible for your one database to contain data that is inconsistent with another database. In this case, you would need to write some code that scans your databases for inconsistent data on a regular basis so you can handle it appropriately.
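A consistency scan of this kind might look like the following Python sketch. It assumes a hypothetical split in which a `users` table lives in one database and an `orders` table, with a `user_id` column, lives in another; the table names and the `find_orphaned_orders` function are made up for illustration.

```python
import sqlite3


def find_orphaned_orders(users_db, orders_db):
    """Return (order_id, user_id) pairs whose user is missing from users_db.

    This is the check a foreign key would enforce automatically if both
    tables lived in the same database.
    """
    # Load all known user IDs from the first database.
    user_ids = {row[0] for row in users_db.execute("SELECT id FROM users")}

    # Walk the orders in the second database and collect dangling references.
    orphans = []
    query = "SELECT id, user_id FROM orders ORDER BY id"
    for order_id, user_id in orders_db.execute(query):
        if user_id not in user_ids:
            orphans.append((order_id, user_id))
    return orphans
```

Loading every user ID into memory is only reasonable for small tables; for larger data sets you would probably compare the IDs in batches. You would run a scan like this on a schedule and then decide how to handle each orphan (delete it, flag it, or restore the missing row).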

However, multiple databases may be necessary if you need your site/application to be highly scalable (e.g. internet scale). For example, you could host each database on a different physical server.