Trouble wrapping head around complex SQL delete query

You want to know all registrations that need to be kept. So your first query returns registrations within 5 previous years :

SELECT
  Id
FROM
  Registration
WHERE
  Registration_date >= '2011-10-08'

then all registrations with clients related to the previous query :

SELECT
  a2.Registration_id as Id
FROM
  AssociatedClient AS a1
  INNER JOIN AssociatedClient AS a2
    ON a1.Client_id = a2.Client_id 
WHERE
  a1.Registration_id IN
  (  
    SELECT
      Id
    FROM
      Registration
    WHERE
      Registration_date >= '2011-10-08'
 )

Then you have all registrations that you must not delete by combining the previous queries in an UNION, and you want all clients that are not part of this query :

SELECT
  Client_id
FROM
  AssociatedClient
WHERE
  Registration_id NOT IN
  (
    SELECT
      Id
    FROM
      Registration
    WHERE
      Registration_date >= '2011-10-08'
    UNION
    SELECT
      a2.Registration_id as Id
    FROM
      AssociatedClient AS a1
      INNER JOIN AssociatedClient AS a2
        ON a1.Client_id = a2.Client_id 
    WHERE
      a1.Registration_id IN
      (  
        SELECT
          Id
        FROM
          Registration
        WHERE
          Registration_date >= '2011-10-08'
      )
  )

you can see the results in this SQL fiddle

Then you can delete the lines of clients without registration correspondig to the criterias using the following query :

DELETE FROM
  AssociatedClient
WHERE
  Client_id IN (<previous query>);

and all registrations not present in AssociatedClient :

DELETE FROM
  Registration
WHERE
  Id NOT IN (SELECT Registration_id FROM AssociatedClient)

Begin by identifying the registrations of the other clients of a registration. Here's a view:

create view groups as 
select   a.Client_id
       , c.Registration_id
from AssociatedClient as a 
join AssociatedClient as b on a.Registration_id = b.Registration_id 
join AssociatedClient as c on b.Client_id = c.Client_id;

That gives us:

select Client_id
    , min(Registration_id) as first
    , max(Registration_id) as last
    , count(distinct Registration_id) as regs
    , count(*) as pals
from  groups 
group by Client_id;
Client_id   first       last        regs        pals      
----------  ----------  ----------  ----------  ----------
2           2           8           4           5         
3           2           8           4           18        
4           5           5           1           1         
5           2           8           4           5         
7           10          10          1           1         
8           9           9           1           1         

You dont' need a view, of course; it's just for convenience. You could just use a virtual table. But inspect it carefully to convince yourself it produces the right range of "pal registrations" for each client. Note that the view does not reference Registration. That's significant because it produces the same results even after we use it to delete from Registration, so we can use it for the second delete statement.

Now we have a list of clients and their "pal registrations". What's the date of each pal's last registration?

select g.Client_id, max(Registration_date) as last_reg
from groups as g join Registration as r
on g.Registration_id = r.Id
group by g.Client_id;
g.Client_id  last_reg  
-----------  ----------
2            2011-10-14
3            2011-10-14
4            2011-10-07
5            2011-10-14
7            2011-10-17
8            2011-10-14

Which ones have a latest date before a time certain?

select g.Client_id, max(Registration_date) as last_reg
from groups as g join Registration as r
on g.Registration_id = r.Id
group by g.Client_id
having max(Registration_date) < '2011-10-08';
g.Client_id  last_reg  
-----------  ----------
4            2011-10-07

IIUC that would mean that client #4 should be deleted, and anything he registered for should be deleted. Registrations would be

select * from Registration
where Id in (
      select Registration_id from groups as g
      where Client_id in ( 
            select g.Client_id
            from groups as g join Registration as r
            on g.Registration_id = r.Id
            group by g.Client_id
            having max(Registration_date) < '2011-10-08'
      )
);
Id          Registration_date
----------  -----------------
5           2011-10-07       

And, sure enough, client #4 is in Registration #5, and is the only client subject to deletion by this test.

From there you can work out the delete statements. I think the rule is "delete the client and anything he registered for". If so, I'd probably write the Registration IDs to a temporary table, and write the deletes for both Registration and AssociatedClient by joining to it.

Tags:

Mysql

Sql

Bash