Cassandra: maintenance

In general, a well-designed cluster can live for YEARS without being touched. I've had clusters that ran hands-off for years. That said, here are some guidelines:

Monitoring is hugely important:

1) Monitor latencies. Use OpsCenter or your favorite metrics tool to keep track of latencies. Latencies creeping up can be an early sign of trouble, including GC pauses (more common in read-heavy workloads than write-heavy ones), SSTable problems, and the like.

2) Monitor SSTable counts. SSTable counts will climb if writes outrun compaction (each SSTable is immutable and written exactly once - updates and deletes are reconciled by merging old SSTables into new ones during compaction).

3) Monitor node state changes (up/down, etc.). If you see nodes flapping, investigate - it's not normal.

4) Keep track of your disk usage. Traditionally you want to stay under 50% utilization, especially with STCS compaction, since a worst-case compaction can temporarily need about as much free space as the data it is rewriting. (A sketch of commands covering these four checks follows the list.)
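
To make these four checks concrete, here is a minimal sketch of the nodetool commands I would start with - my_ks and my_table are placeholder names, and a couple of these commands go by their older names (cfstats, cfhistograms) on older versions:

    # Node and ring state - investigate anything that is not "UN" (Up/Normal)
    nodetool status

    # Coordinator-level read/write latency percentiles
    nodetool proxyhistograms

    # Per-table latencies and SSTable counts (placeholder keyspace/table names)
    nodetool tablestats my_ks.my_table
    nodetool tablehistograms my_ks my_table

    # Pending compactions and thread-pool backlog
    nodetool compactionstats
    nodetool tpstats

    # Disk usage on the data volume (path depends on your install)
    df -h /var/lib/cassandra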

There are some basic things you should and shouldn't do regularly:

1) Don't explicitly run nodetool compact. You mention that you've already done it; it's not fatal, but it does create very large SSTables, which are then less likely to participate in compaction going forward. You don't need to keep running it, though occasionally it can help get rid of deleted/overwritten data.

2) nodetool repair is typically recommended at least once every gc_grace_seconds (10 days by default). There are workloads where this is less important - the biggest reason you NEED repair is to make sure deletion markers (tombstones) are propagated before they expire (they live for gc_grace_seconds; if a node missed the delete because it was down, that data may come back to life without a repair!). If you don't issue deletes, and you read and write with sufficient consistency levels (QUORUM for both, for example), you can actually live a life without repair.

3) If you are going to repair, consider using incremental repair, and repair small token ranges at a time (a command sketch follows this list).

4) Compaction strategies matter - a lot. STCS (size-tiered) is great for write-heavy workloads, LCS (leveled) is great for read-heavy ones. DTCS has some quirks.

5) Data models matter - just like RDBMS/SQL environments get into trouble when unindexed queries hit large tables, Cassandra gets into trouble with very large rows/partitions.

6) Snapshots are cheap. Very cheap. Nearly instant - just hard links - they cost almost no disk space up front. Take a snapshot before you upgrade versions, especially across major versions.

7) Be careful with deletes. As hinted in #2, a delete actually writes more data to disk (a tombstone) and doesn't free anything for AT LEAST gc_grace_seconds.
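
To make #2-#5 a bit more concrete, here is a rough sketch - the keyspace/table names (my_ks, my_table, events), the token range, and the example schema are all placeholders, and repair flags vary somewhat between Cassandra versions:

    # Primary-range repair; run it on each node in turn so every range gets
    # repaired exactly once, and schedule it well inside gc_grace_seconds
    nodetool repair -pr my_ks

    # Repairing one small token sub-range at a time keeps the load manageable
    # (token values here are placeholders)
    nodetool repair -st -9223372036854775808 -et -4611686018427387904 my_ks

    # Depending on your version, incremental repair is either the default or
    # opt-in; -full forces a full repair where incremental is the default
    nodetool repair -full -pr my_ks

    # Compaction strategy and gc_grace_seconds are per-table CQL settings
    cqlsh -e "ALTER TABLE my_ks.my_table WITH compaction = {'class': 'LeveledCompactionStrategy'};"
    cqlsh -e "ALTER TABLE my_ks.my_table WITH gc_grace_seconds = 864000;"

    # For #5, one common way to keep partitions bounded is to fold a time
    # bucket into the partition key (this schema is purely illustrative)
    cqlsh -e "CREATE TABLE IF NOT EXISTS my_ks.events (sensor_id text, day date, ts timestamp, value double, PRIMARY KEY ((sensor_id, day), ts));"

If you do end up repairing on a schedule, a tool like Cassandra Reaper automates the sub-range bookkeeping so you don't have to script it yourself.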

When all else fails:

I've seen articles suggesting that Cassandra in prod requires a dedicated person to manage a cluster of any size - I don't know that it's necessarily true, but if you're concerned, you may want to hire a third-party consultant (TheLastPickle, Pythian) or take out a support contract (DataStax) to give you some peace of mind.


According to the Cassandra repair documentation, nodetool repair should be run in the following situations:

  • As a best practice, you should schedule repairs weekly. Note: If deletions never occur, you should still schedule regular repairs. Be aware that setting a column to null is a delete.
  • During node recovery. For example, when bringing a node back into the cluster after a failure.
  • On nodes containing data that is not read frequently.
  • To update data on a node that has been down.

I should think that read/write loads cause fragmentation in the storage.

Data in Cassandra does not "fragment" in the way that you are thinking. However, deletes do leave tombstones behind, and the normal compaction process eliminates them (once gc_grace_seconds has passed).
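
As a tiny illustration of that, both an explicit DELETE and setting a column to null leave tombstones behind (my_ks.users and its columns are made-up names here):

    # Both statements write tombstones rather than removing data in place;
    # compaction can only purge them after gc_grace_seconds has elapsed
    cqlsh -e "DELETE FROM my_ks.users WHERE id = 42;"
    cqlsh -e "UPDATE my_ks.users SET email = null WHERE id = 42;"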

I understand now that the compaction is a big deal and runs automagically

Correct. I was told by a DataStax rep that once you run compact manually, you will always have to run it manually. The reason is that a manual (major) compaction merges all of a table's existing SSTables into a single large SSTable. That file then has to grow well beyond the compaction thresholds before size-tiered compaction will consider it again, so the likelihood of automatic compaction ever touching it is very low.

Essentially, make sure to schedule a regular nodetool repair, never run nodetool compact, and implement a backup strategy (snapshots, incremental backups, or both).
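
For the backup piece, a minimal snapshot workflow looks something like this - the tag and keyspace name are placeholders, and incremental backups are a separate switch (incremental_backups) in cassandra.yaml:

    # Take a named snapshot before an upgrade - it's just hard links, so it
    # completes almost instantly and uses no extra space up front
    nodetool snapshot -t pre_upgrade my_ks

    # Snapshots start holding real disk space once the original SSTables are
    # compacted away, so list and clear old ones periodically
    nodetool listsnapshots
    nodetool clearsnapshot -t pre_upgrade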