Removing duplicates from SQL Join

There are some situations when you have to do grouping in a subquery

SELECT distinct b.recid, b.first, b.last 
FROM table1 a
INNER JOIN (
  SELECT MAX(recid) as recid, first, last 
  FROM table2
  GROUP BY first, last
) b
ON a.first = b.first AND a.last = b.last

SELECT t2.recid, t2.first, t2.last 
FROM  table1 t1
INNER JOIN table2 t2 ON t1.first = t2.first AND t1.last = t2.last
GROUP BY t2.recid, t2.first, t2.last

EDIT: Added picture

enter image description here


You don't want to do a join per se, you're merely testing for existence/set inclusion.

I don't know what current flavor of SQL you're coding in, but this should work.

SELECT MAX(recid), firstname, lastname 
FROM table2 T2
WHERE EXISTS (SELECT * FROM table1 WHERE firstname = T2.firstame AND lastname = T2.lastname)
GROUP BY lastname, firstname

If you want to implement as a join, leaving the code largely the same:

i.e.

SELECT max(t2.recid), t2.firstame, t2.lastname 
FROM Table2 T2 
INNER JOIN Table1 T1 
    ON T2.firstname = t1.firstname and t2.lastname = t1.lastname
GROUP BY t2.firstname, t2.lastname 

Depending on the DBMS, an inner join may be implemented differently to an Exists (semi-join vs join) but the optimizer can sometimes figure it out anyway and chose the correct operator regardless of which way you write it.

Tags:

Sql

Join