Cluster design: if I expect to insert data into several tables every week, is it a bad idea to cluster them?

I have yet to see a real-world scenario where the benefit of using a cluster (saving a bit of disk space, I/O, or block access) instead of regular tables (or IOTs) with joins is so significant that it is worth the hassle of dealing with it.

5-20 records per week: that is nothing. Paper and pencil can do that.

FYI: The data dictionary tables use a few clusters for identifiers. These identifiers never change. They are inserted and deleted, but never updated. In some environments, 5-20 records are inserted or deleted in a matter of seconds or minutes (due to dynamically creating and dropping objects) without causing any problem. So 5-20 records per week will not be a problem. The question is: do you really want to use something that is almost never used, that may not improve performance noticeably (or may even make it worse), and that requires extra attention?


caveat emptor

Whenever you have a schema design idea, run benchmarks to (dis)prove its usefulness.

For me, using a non-standard schema design needs to be proven to have a significant benefit prior to implementing.

For your really-really tiny data amount, I expect you to save only a few jiffies per year.

Again, run benchmarks.
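A minimal sketch of what such a benchmark could look like, in Python. This is only an illustration of the harness shape: it uses SQLite as a stand-in, since SQLite has no Oracle-style clusters, and compares a plain indexed child table against a `WITHOUT ROWID` table (rows stored in primary-key order). The table names, row counts, and the `build`, `time_lookups`, and `compare` helpers are all hypothetical. On a real system you would swap in your own driver, DDL, and representative queries.

```python
import sqlite3
import time

N_PARENTS = 200       # hypothetical sizes; scale to your real data volume
N_CHILDREN = 20_000


def build(key_ordered):
    """Create and load a parent/child schema in one of two layouts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT)")
    if key_ordered:
        # Stand-in for key-ordered storage: rows live in the primary-key b-tree.
        conn.execute(
            "CREATE TABLE child (parent_id INTEGER, seq INTEGER, payload TEXT, "
            "PRIMARY KEY (parent_id, seq)) WITHOUT ROWID"
        )
    else:
        # Baseline: ordinary table plus a secondary index on the join key.
        conn.execute(
            "CREATE TABLE child (parent_id INTEGER, seq INTEGER, payload TEXT)"
        )
        conn.execute("CREATE INDEX child_parent ON child (parent_id)")
    conn.executemany(
        "INSERT INTO parent VALUES (?, ?)",
        [(i, f"p{i}") for i in range(N_PARENTS)],
    )
    conn.executemany(
        "INSERT INTO child VALUES (?, ?, ?)",
        [(i % N_PARENTS, i // N_PARENTS, "x" * 50) for i in range(N_CHILDREN)],
    )
    conn.commit()
    return conn


def time_lookups(conn, keys, repeats=5):
    """Return (best wall-clock seconds, total joined rows) over several runs."""
    query = (
        "SELECT COUNT(*) FROM parent p JOIN child c ON c.parent_id = p.id "
        "WHERE p.id = ?"
    )
    best = float("inf")
    total = 0
    for _ in range(repeats):
        start = time.perf_counter()
        total = sum(conn.execute(query, (k,)).fetchone()[0] for k in keys)
        best = min(best, time.perf_counter() - start)
    return best, total


def compare(keys=range(0, N_PARENTS, 10)):
    """Run the same lookups against both layouts."""
    results = {}
    for label, key_ordered in (("plain", False), ("key_ordered", True)):
        conn = build(key_ordered)
        results[label] = time_lookups(conn, keys)
        conn.close()
    return results


if __name__ == "__main__":
    for label, (seconds, rows) in compare().items():
        print(f"{label}: {seconds:.6f}s for {rows} joined rows")
```

The row totals should match between layouts, which is a cheap sanity check that both designs answer the same question. The timings only mean something when the data volume, cache state, and query mix resemble production.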

TL;DR: When to use a clustered table? Never (unless proven otherwise).


Clustering and partitioning both exist to create locality of reference in huge datasets.

Clustering stores together all rows associated with each key value. Depending on the RDBMS, it may be applied per table, as an index whose leaves are the rows themselves, or across multiple tables, where the data for each key value from several tables is kept together. With clustering, the table is still huge.
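As a rough mental model of the multi-table case, here is a toy Python sketch. It is not how any RDBMS implements clustering; the `cluster_rows` helper and the `dept`/`emp` sample data are made up purely to show rows from several tables being kept together under each key value.

```python
from collections import defaultdict


def cluster_rows(tables, key):
    """Group rows from several tables by a shared key value.

    Returns {key_value: {table_name: [rows]}}, a toy stand-in for
    "one storage block per key value".
    """
    blocks = defaultdict(lambda: defaultdict(list))
    for table_name, rows in tables.items():
        for row in rows:
            blocks[row[key]][table_name].append(row)
    return blocks


# Hypothetical sample data: two tables sharing the key "deptno".
tables = {
    "dept": [
        {"deptno": 10, "dname": "ACCOUNTING"},
        {"deptno": 20, "dname": "RESEARCH"},
    ],
    "emp": [
        {"deptno": 10, "ename": "CLARK"},
        {"deptno": 10, "ename": "KING"},
        {"deptno": 20, "ename": "SMITH"},
    ],
}

blocks = cluster_rows(tables, "deptno")
# Everything for department 10, from both tables, now sits under one key.
dept_10 = blocks[10]
```

A join on the key then reads one group instead of visiting each table separately, which is the locality benefit. The total amount of data is unchanged.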

Partitioning puts the table's data in separate spaces, so it acts like many small tables. For instance, at the exchanges we partitioned by trading day. This is great for speeding up queries and handling churn, as old partitions are quiescent. When partitioning on date, it is also very handy for efficiently purging data and recycling the space to serve a new partition key value.
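The same idea as a toy Python sketch, assuming one partition per trading day. The `PartitionedTrades` class and its methods are hypothetical and only model the behavior: a query for one day touches one small partition, and purging old data drops whole partitions rather than deleting row by row.

```python
from datetime import date


class PartitionedTrades:
    """Toy model of a table partitioned by trading day."""

    def __init__(self):
        self.partitions = {}  # trading day -> list of rows

    def insert(self, trade_day, row):
        # New rows only ever land in the partition for their own day.
        self.partitions.setdefault(trade_day, []).append(row)

    def query_day(self, trade_day):
        # Only one partition is read; the others are never touched.
        return self.partitions.get(trade_day, [])

    def purge_before(self, cutoff):
        # Drop entire partitions older than the cutoff in one step each.
        old_days = [d for d in self.partitions if d < cutoff]
        for d in old_days:
            del self.partitions[d]
        return len(old_days)


trades = PartitionedTrades()
trades.insert(date(2024, 1, 2), {"symbol": "ABC", "qty": 100})
trades.insert(date(2024, 1, 3), {"symbol": "ABC", "qty": 50})
trades.insert(date(2024, 1, 3), {"symbol": "XYZ", "qty": 75})
trades.insert(date(2024, 1, 4), {"symbol": "XYZ", "qty": 20})

dropped = trades.purge_before(date(2024, 1, 3))
```

In a real database the equivalent of `purge_before` is typically a partition-level drop or truncate, which avoids the cost of deleting individual rows.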