Database design for big table

There are a couple of patterns for splitting a single logical entity across several physical tables.

Vertical partitioning puts some of the columns in one table and some in another. There can be several such additional tables, if required. All these tables share the same primary key. Columns that are used together are stored together so one page fetch read all the required values. The advantage is there are more rows per page so scans and aggregates require fewer IOs. Sometimes it can be tricky deciding which columns should go together. Every INSERT becomes proportionately more expensive as the number of partitions increases.

Horizontal partitioning splits the data by key range. For example, all users with a surname beginning A through M go into table User_1 and N through Z go into User_2. The application figures out which sub-table to use at run-time, usually algorithmically or through a look-up, though most DBMS these days will offer this as a build-in feature, implemented through DDL, transparent to the application. If you have hot spots in data writes this can spread the pain out. The optimiser can eliminate whole partitions from range scans, improving response time. Loading and removing whole partitions can be very fast metadata operations.

Sharding is where key ranges are moved, not just to a different table, but to a whole different instance of the DBMS. The application decides which instance to connect to, depending on key range. This is a scale-out technique. Some DBMS support this as a feature, such as Galera for MariaDB. The obvious cost is in additional hardware to run the other instances, duplication of reference data across all nodes to maintain RI and application complexity.

The techniques may be combined, of course, so a table could be both horizontally and vertically partitioned.

General advantages of splitting:

  • Indexes may becomes shallower, saving costs on index look-ups.
  • Each partition can write to separate disk, spreading IO load.

Disadvantages include:

  • Application complexity
  • It will be more difficult to ensure data integrity
  • Upgrades become trickier.
  • Purging data is incrementally more difficult to get right.

Basically, if you don't have a problem, and don't see one coming, don't bother. Horizontal partitioning using DBMS functionality is the only one worth considering as a pre-emptive measure. Even then it will require a re-write of 200GB. Do you have the disk and service window to accomplish this?


"If it ain't broke, don't fix it."

You say the billion-row table is not having a performance problem? It sounds like either all activity is at the end of the table, hence easily cacheable, or you are rarely hitting the table and hitting it only with trivial 'point queries'. It is quite possible that your innodb_buffer_pool_size (which should be 20-25G for your 32GB machine) is big enough. Without more info on access patterns, we can't say whether 32GB will suffice as the table grows.

Partitioning does not provide any performance, except in very limited cases.

To rebut common misconceptions:

  • Indexes may become shallower, saving costs on index look-ups. -- But the savings is eaten up by picking the sub-table to look into. A billion-row BTree is about 5 levels deep.
  • Each partition can write to separate disk, spreading IO load. -- Even better would be to use RAID "striping" (-0, -5, -10). And that will benefit non-partitioned tables just as much.
  • Purging data is incrementally more difficult to get right. -- On the contrary, if you are purging "old" data and you partition on date, then DROP PARTITION is very fast. (OK, it is somewhat complicated.)
  • If you have large fields in the table (which you seem not to have), "vertical partitioning" is somewhat handled by InnoDB -- see discussion of ROW_FORMAT.

Before embarking on any changes, let's see the main queries. Better yet, set long_query_time=1 and turn on the slowlog. Wait a while, the see what is in the slowlog.

More on partitioning

Tags:

Mysql