Is it possible to cap the number of records in a PostgreSQL table?

You can define a trigger that keeps the table at your desired row count:

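CREATE OR REPLACE FUNCTION trf_keep_row_number_steady()
RETURNS TRIGGER AS
$body$
DECLARE
    -- maximum number of rows to keep; 1000 is only an example value
    rownum_limit CONSTANT integer := 1000;
BEGIN
    -- delete only when there are too many rows
    IF (SELECT count(id) FROM log_table) > rownum_limit
    THEN
        -- I assume here that id is an auto-incremented value in log_table
        DELETE FROM log_table
        WHERE id = (SELECT min(id) FROM log_table);
    END IF;
    -- an AFTER trigger's return value is ignored, but the function must still return
    RETURN NULL;
END;
$body$
LANGUAGE plpgsql;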

CREATE TRIGGER tr_keep_row_number_steady 
AFTER INSERT ON log_table
FOR EACH ROW EXECUTE PROCEDURE trf_keep_row_number_steady();

This is probably not the best-performing option, since it counts the rows on every insert, but once you reach the limit, it will never be exceeded. If some fluctuation is acceptable, you can instead check the row count periodically and delete the oldest excess rows.
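The periodic variant could look roughly like the sketch below. It assumes the same log_table with an auto-incremented id, and a limit of 1000 rows picked purely for illustration; how you schedule it (cron, an application job, and so on) is up to you.

```sql
-- Periodic cleanup: keep only the 1000 newest rows (1000 is an example limit).
-- Assumes id is auto-incremented, so a larger id means a newer row.
DELETE FROM log_table
WHERE id <= (
    SELECT id
    FROM log_table
    ORDER BY id DESC
    LIMIT 1 OFFSET 1000   -- the 1001st newest row, i.e. the newest row to drop
);
```

If the table holds 1000 rows or fewer, the subquery returns no row, the comparison yields NULL, and nothing is deleted.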

EDIT: If you have really large logs (say a million rows per month), then partitioning can be the easiest solution. You can then simply drop the tables you no longer need (say, where max(timestamp) < CURRENT_DATE - INTERVAL '1 year'). You can use your timestamp (or a derived date) as the condition for range partitioning.
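As a rough sketch of the partitioning idea, assuming PostgreSQL 10 or later (which has declarative partitioning): the table name log_table_partitioned, the columns logged_at and message, and the monthly partition names are all made up for illustration.

```sql
-- Hypothetical log table, range-partitioned on its timestamp column.
CREATE TABLE log_table_partitioned (
    logged_at timestamptz NOT NULL,
    message   text
) PARTITION BY RANGE (logged_at);

-- One partition per month; the upper bound is exclusive.
CREATE TABLE log_2024_01 PARTITION OF log_table_partitioned
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE log_2024_02 PARTITION OF log_table_partitioned
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- Discarding a whole month drops the partition instead of deleting row by row.
DROP TABLE log_2024_01;
```

On older versions the same layout can be built with table inheritance and CHECK constraints, at the cost of more manual setup.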

But be careful before discarding old logs. Are you sure you will never need those?


I created a more generic, table-independent function:

CREATE OR REPLACE FUNCTION keep_row_number_steady()
RETURNS TRIGGER AS
$body$
DECLARE
    tab text;
    keyfld text;
    nritems INTEGER;
    rnd DOUBLE PRECISION;
BEGIN
    tab := TG_ARGV[0];
    keyfld := TG_ARGV[1];
    nritems := TG_ARGV[2]; 
    rnd := TG_ARGV[3];

    IF random() < rnd
    THEN 
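        -- %I quotes identifiers safely (pass an unqualified table name).
        -- <= keeps exactly nritems rows: the row at OFFSET nritems is the newest one to drop.
        EXECUTE format('DELETE FROM %I WHERE %I <= (SELECT %I FROM %I ORDER BY %I DESC LIMIT 1 OFFSET %s)', tab, keyfld, keyfld, tab, keyfld, nritems);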
    END IF;
    RETURN NULL;
END;
$body$
LANGUAGE plpgsql;

CREATE TRIGGER log_table_keep_row_number_steady_trigger
AFTER INSERT ON log_table
FOR EACH STATEMENT EXECUTE PROCEDURE keep_row_number_steady('log_table', 'id', 1000, 0.1);

The function takes 4 parameters:

  • tab: table name
  • keyfld: numeric, progressive key field
  • nritems: number of items to retain
  • rnd: probability, from 0 to 1, that an insert statement triggers a cleanup; the larger it is, the more often the table is cleaned (0 = never, 1 = always, 0.1 = about 10% of the time)

This way you can create as many triggers as you want, all calling the same function.
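For instance, a second trigger on another table might look like the sketch below. The table audit_table, its key column audit_id, and the values 5000 and 0.5 are hypothetical and only show how the arguments vary per table.

```sql
-- Hypothetical second table: keep about the newest 5000 rows of audit_table,
-- ordered by audit_id, cleaning up on roughly half of the insert statements.
CREATE TRIGGER audit_table_keep_row_number_steady_trigger
AFTER INSERT ON audit_table
FOR EACH STATEMENT EXECUTE PROCEDURE keep_row_number_steady('audit_table', 'audit_id', 5000, 0.5);
```

Because the cleanup only runs with probability rnd, the table can temporarily grow past nritems between cleanups.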

Hope this helps.
