Is it possible to delete old records from a ClickHouse table?

ClickHouse doesn't offer UPDATE/DELETE the way a MySQL database does, but you can still delete data by organising it into partitions. I don't know how you are managing your data, so as an example I'll assume it is stored in month-wise partitions.

With the "DROP PARTITION" command you can delete a month's data by dropping that month's partition. A complete explanation of how to drop partitions is here: https://clickhouse.yandex/blog/en/how-to-update-data-in-clickhouse


Example of creating a partitioned table and dropping a partition:

    CREATE TABLE test.partitioned_by_month (d Date, x UInt8)
    ENGINE = MergeTree
    PARTITION BY toYYYYMM(d)
    ORDER BY x;

    INSERT INTO test.partitioned_by_month VALUES ('2000-01-01', 1), ('2000-01-02', 2), ('2000-01-03', 3);
    INSERT INTO test.partitioned_by_month VALUES ('2000-02-03', 4), ('2000-02-03', 5);
    INSERT INTO test.partitioned_by_month VALUES ('2000-03-03', 4), ('2000-03-03', 5);

    SELECT * FROM test.partitioned_by_month;

    d          | x
    -----------|---
    2000-02-03 | 4
    2000-02-03 | 5

    d          | x
    -----------|---
    2000-03-03 | 4
    2000-03-03 | 5

    d          | x
    -----------|---
    2000-01-01 | 1
    2000-01-02 | 2
    2000-01-03 | 3

    ALTER TABLE test.partitioned_by_month DROP PARTITION 200001;

    SELECT * FROM test.partitioned_by_month;


    d          | x
    -----------|---
    2000-03-03 | 4
    2000-03-03 | 5

    d          | x
    -----------|---
    2000-02-03 | 4
    2000-02-03 | 5
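
Before dropping anything, it can help to see which partitions the table actually has. A minimal sketch, assuming the test.partitioned_by_month table from the example above and the built-in system.parts table:

```sql
-- List the active partitions of the example table with row counts and on-disk size.
SELECT
    partition,
    sum(rows) AS rows,
    formatReadableSize(sum(bytes_on_disk)) AS size_on_disk
FROM system.parts
WHERE database = 'test'
  AND table = 'partitioned_by_month'
  AND active
GROUP BY partition
ORDER BY partition;
```

With toYYYYMM partitioning, the values in the partition column (such as 200002) are the ones to pass to DROP PARTITION.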

Lightweight delete

Available since v22.8

Standard DELETE syntax for MergeTree tables was introduced in #37893. In v22.8 it is experimental and has to be enabled first:

    SET allow_experimental_lightweight_delete = 1;
    DELETE FROM merge_table_standard_delete WHERE id = 10;

Altering data using Mutations

See the docs on the Mutations feature: https://clickhouse.yandex/docs/en/query_language/alter/#mutations
The feature was implemented in Q3 2018.

Delete data

    ALTER TABLE <table> DELETE WHERE <filter expression>
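
For the original question (removing old records), the filter is usually a date cutoff. A sketch against the test.partitioned_by_month table from the first answer; the cutoff date is only an illustration:

```sql
-- Schedule a mutation that deletes every row older than 1 March 2000.
ALTER TABLE test.partitioned_by_month DELETE WHERE d < toDate('2000-03-01');
```

When the cutoff lines up with a partition boundary, as it does here, dropping the partition is usually the cheaper option, because a mutation has to rewrite the affected data parts.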

"Dirty" delete all

You always have to specify a filter expression. If you want to delete all the data through a Mutation, specify something that is always true, e.g.:

    ALTER TABLE <table> DELETE WHERE 1=1

Update data

It's also possible to update rows in a similar way:

    ALTER TABLE <table> UPDATE column1 = expr1 [, ...] WHERE <filter expression>
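
One restriction worth knowing: columns that are part of the partition key or the sorting (primary) key cannot be updated this way. The sketch below uses a hypothetical test.events table (not taken from the answers above) so that there is a non-key column to change:

```sql
-- Hypothetical table for illustration only.
CREATE TABLE test.events (d Date, id UInt32, status String)
ENGINE = MergeTree
PARTITION BY toYYYYMM(d)
ORDER BY id;

INSERT INTO test.events VALUES ('2000-01-01', 1, 'new'), ('2000-03-02', 2, 'new');

-- status is neither in the partition key nor in the sorting key, so it can be mutated.
ALTER TABLE test.events UPDATE status = 'archived' WHERE d < toDate('2000-02-01');
```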

Mind it's async

Please note that none of the commands above executes the data mutation directly (synchronously). Instead, each one schedules a ClickHouse Mutation that runs independently (asynchronously) in the background. That is why the ALTER TABLE syntax was chosen instead of the typical SQL UPDATE/DELETE. You can check the progress of unfinished Mutations via:

    SELECT *
    FROM system.mutations
    WHERE is_done = 0

...unless

you change the mutations_sync setting to

  • 1, so the query waits synchronously for the mutation to finish on the current server
  • 2, so it waits for all replicas
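
As a sketch, the setting can be attached to a single statement with a SETTINGS clause. This assumes the test.partitioned_by_month table from the first answer; the cutoff date is illustrative:

```sql
-- Return only after the mutation has finished on all replicas.
ALTER TABLE test.partitioned_by_month
    DELETE WHERE d < toDate('2000-03-01')
    SETTINGS mutations_sync = 2;

-- Once the statement above returns, this mutation should no longer be listed as unfinished.
SELECT mutation_id, command, parts_to_do, latest_fail_reason
FROM system.mutations
WHERE database = 'test'
  AND table = 'partitioned_by_month'
  AND is_done = 0;
```

Running SET mutations_sync = 2 beforehand should have the same effect for every mutation issued in that session.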

Altering data without using Mutations

There's the TRUNCATE TABLE statement, with the following syntax:

    TRUNCATE TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]

This truncates the table synchronously. ClickHouse checks the table size first and will refuse to truncate if the table is larger than max_table_size_to_drop. See the docs here:

https://clickhouse.tech/docs/en/sql-reference/statements/truncate/
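
A minimal sketch of emptying the example table, assuming test.partitioned_by_month from the first answer. The size query is only there to show how much data is about to be removed:

```sql
-- Check how much data the table currently holds on disk.
SELECT formatReadableSize(sum(bytes_on_disk)) AS size_on_disk
FROM system.parts
WHERE database = 'test'
  AND table = 'partitioned_by_month'
  AND active;

-- Remove all rows; the table definition itself is kept.
TRUNCATE TABLE IF EXISTS test.partitioned_by_month;
```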
