Deleting repeating values in database

You can do this with some creative self-join logic.

Think of three hypothetical rows in the table.

  • Row a you want to keep.
  • Row b has the same product name and price, and a date 1 day after a. You want to delete this.
  • Row c has the same product name and price, and a date 1 day after b. You want to keep this.

So if you can do a self-join to match these three rows, then delete row b.

DELETE b FROM MyTable AS a 
JOIN MyTable AS b ON a.name=b.name AND a.price=b.price AND a.date=b.date + INTERVAL 1 DAY 
JOIN MyTable AS c ON b.name=c.name AND b.price=c.price AND b.date=c.date + INTERVAL 1 DAY;

This works even if there are multiple rows that fit the conditions for row b. It'll delete the first one, and then continue to delete subsequent rows that also fit the conditions.

This works if you use the DATE data type and store your dates as 'YYYY-MM-DD', not 'DD-MM-YYYY'. You should do this anyway.


You want to delete the rows where the product name and price are the same as the rows with the date plus/minus one day.

DELETE row_mid
FROM 
  record_table AS row_mid
  JOIN record_table AS row_prev
  JOIN record_table AS row_next
WHERE
  row_mid.name = row_prev.name 
  AND row_mid.price = row_prev.price
  AND row_mid.date = DATE_SUB(row_prev.date, INTERVAL 1 DAY)
  AND row_mid.name = row_next.name
  AND row_mid.price = row_next.price
  AND row_mid.date = DATE_ADD(row_next.date, INTERVAL 1 DAY);

Is your MySQL new enough to support CTE? This is a pretty interesting problem I've seen with date scheduling. The code always looks awkward. To check the results without a delete you can switch the comment mark with the select and delete and comment out the t.[Name] is null line.

WITH

cte AS  (
        SELECT a.ID
            , a.[Name]
            , a.[Date]
            , a.Price
            , NextDate = max(npc.[Date])    -- Next Price change
            , PrevDate = max(lpc.[Date])    -- Next Price change
        FROM    yourTable as a  -- Base Table
            LEFT JOIN
                yourTable as npc    -- Looking for Next Price Change
            ON a.[Name] = npc.[Name]
                and a.[Date] < npc.[Date]
                and a.Price <> npc.Price
            LEFT JOIN
                yourTable as lpc    -- Looking for Last Price Change
            ON a.[Name] = lpc.[Name]
                and a.[Date] > lpc.[Date]
                and a.Price <> lpc.Price
        GROUP BY a.ID, a.[Name], a.[Date], a.Price
    ) 

----SELECT f.*, [Check] = CASE WHEN t.[Name] is null THEN 'DELETE' ELSE '' END
DELETE f
FROM 
        yourTable as f
    LEFT JOIN
        (
            SELECT [Name], [GoodDate] = Max([Date])
            FROM cte
            GROUP BY [Name], PrevDate
            UNION
            SELECT [Name], [GoodDate] = Min([Date])
            FROM cte
            GROUP BY [Name], PrevDate
            UNION
            SELECT [Name], [GoodDate] = Max([Date])
            FROM cte
            GROUP BY [Name], NextDate
            UNION
            SELECT [Name], [GoodDate] = Min([Date])
            FROM cte
            GROUP BY [Name], NextDate
        ) as t
    ON t.[Name] = f.[Name] and t.[GoodDate] = f.[Date]
WHERE t.[Name] is null
--ORDER BY f.[Name], f.[Date]