Proving SQL query equivalency

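This sounds to me like a very hard problem: equivalence of arbitrary SQL queries is undecidable in general, and even for simple conjunctive queries it is NP-complete. I'm not sure there is a surefire way to prove this kind of thing.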


1) Real equivalency proof with Cosette:
Cosette checks whether two SQL queries are equivalent: it produces a proof when they are and a counterexample when they are not. It's the only way to be absolutely sure, well, almost ;) You can even enter two queries on their website and check (formal) equivalence right away.

Link to Cosette: https://cosette.cs.washington.edu/

Link to article that gives a good explanation of how Cosette works: https://medium.com/@uwdb/introducing-cosette-527898504bd6


2) Or if you're just looking for a quick practical fix:
Try this Stack Overflow answer: [sql - check if two select's are equal]
Which comes down to:

(select * from query1 MINUS select * from query2) 
UNION ALL
(select * from query2 MINUS select * from query1)

This query gives you all rows that are returned by only one of the queries.
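As a minimal sketch of the query above, here is the same symmetric difference run on an in-memory SQLite database through Python's sqlite3 module. SQLite spells MINUS as EXCEPT and does not accept bare parenthesized compound selects, so each half is wrapped in a subquery. The tables query1 and query2 and their sample rows are made up for illustration; they stand in for the results of the two queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE query1 (id INTEGER, name TEXT);
    CREATE TABLE query2 (id INTEGER, name TEXT);
    INSERT INTO query1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO query2 VALUES (2, 'b'), (3, 'c'), (4, 'd');
""")

# Rows only in query1, followed by rows only in query2.
SYMMETRIC_DIFF = """
    SELECT * FROM (SELECT * FROM query1 EXCEPT SELECT * FROM query2)
    UNION ALL
    SELECT * FROM (SELECT * FROM query2 EXCEPT SELECT * FROM query1)
"""

only_in_one = sorted(conn.execute(SYMMETRIC_DIFF).fetchall())
print(only_in_one)  # [(1, 'a'), (4, 'd')]
```

An empty result means neither query returned a row that the other did not.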


The best you can do in practice is compare the two queries' outputs for a given set of inputs and look for differences. Matching results on one data set do not show that the queries will return the same results for all inputs; that really depends on the data.
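To illustrate how much the outcome depends on the data, here is a small sketch that compares two queries' outputs on the client side as multisets (order ignored, duplicates counted). The helper same_results, the table t and both queries are hypothetical names chosen for this example, and SQLite is used only so the snippet is self-contained.

```python
import sqlite3
from collections import Counter

def same_results(conn, sql_a, sql_b):
    """True if both queries return the same rows with the same multiplicities."""
    rows_a = Counter(conn.execute(sql_a).fetchall())
    rows_b = Counter(conn.execute(sql_b).fetchall())
    return rows_a == rows_b

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (x INTEGER);
    INSERT INTO t VALUES (1), (2), (3);
""")

q1 = "SELECT x FROM t"
q2 = "SELECT x FROM t WHERE x = x"   # drops rows where x IS NULL

agree_before = same_results(conn, q1, q2)   # no NULLs yet, outputs match

conn.execute("INSERT INTO t VALUES (NULL)")
agree_after = same_results(conn, q1, q2)    # the NULL row exposes the difference

print(agree_before, agree_after)  # True False
```

The two queries agree on the first data set and disagree on the second, so a comparison like this can only ever refute equivalence, never prove it.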

For Oracle, one of the better (if not the best) approaches, and a very efficient one, is described here (search the page for "Comparing the Contents of Two Tables"):
http://www.oracle.com/technetwork/issue-archive/2005/05-jan/o15asktom-084959.html

Which boils down to:

select c1,c2,c3, 
       count(src1) CNT1, 
       count(src2) CNT2
  from (select a.*, 
               1 src1, 
               to_number(null) src2 
          from a
        union all
        select b.*, 
               to_number(null) src1, 
               2 src2 
          from b
       )
group by c1,c2,c3
having count(src1) <> count(src2);
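A rough, runnable adaptation of that query for SQLite, driven from Python's sqlite3 module, might look like the following. It is a sketch rather than the article's exact code: to_number(null) becomes a plain NULL, and the contents of tables a and b are invented for the example. Unlike a MINUS-based comparison, the counts also reveal rows that appear a different number of times on each side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (c1 INTEGER, c2 TEXT, c3 TEXT);
    CREATE TABLE b (c1 INTEGER, c2 TEXT, c3 TEXT);
    INSERT INTO a VALUES (1, 'x', 'p'), (2, 'y', 'q'), (2, 'y', 'q');
    INSERT INTO b VALUES (1, 'x', 'p'), (2, 'y', 'q'), (3, 'z', 'r');
""")

# COUNT ignores NULLs, so cnt1 counts occurrences in a and cnt2 those in b.
COMPARE = """
    SELECT c1, c2, c3,
           COUNT(src1) AS cnt1,
           COUNT(src2) AS cnt2
      FROM (SELECT a.*, 1 AS src1, NULL AS src2 FROM a
            UNION ALL
            SELECT b.*, NULL AS src1, 2 AS src2 FROM b)
     GROUP BY c1, c2, c3
    HAVING COUNT(src1) <> COUNT(src2)
     ORDER BY c1
"""

diff = conn.execute(COMPARE).fetchall()
print(diff)  # [(2, 'y', 'q', 2, 1), (3, 'z', 'r', 0, 1)]
```

Each output row shows a tuple together with how often it occurs in a and in b; an empty result means the two tables hold the same rows with the same multiplicities.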

This is pretty easy to do.

Let's assume your queries are named a and b.

a minus b

should give you an empty set. If it does not, then the queries return different sets, and the result set shows you the rows that a returns and b does not.

Then do

b minus a

That should also give you an empty set. If both differences are empty, then the queries return the same set of rows. If this one is not empty, then the queries differ in some respect, and the result set shows you the rows that b returns and a does not.
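One caveat worth keeping in mind: MINUS (EXCEPT in SQLite and standard SQL) is a set operation that eliminates duplicates, so both differences can be empty even when the queries return a row a different number of times. The sketch below shows this on an in-memory SQLite database; the tables a and b and their values are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (v INTEGER);
    CREATE TABLE b (v INTEGER);
    INSERT INTO a VALUES (1), (2), (2);   -- the value 2 appears twice
    INSERT INTO b VALUES (1), (2);
""")

a_minus_b = conn.execute("SELECT v FROM a EXCEPT SELECT v FROM b").fetchall()
b_minus_a = conn.execute("SELECT v FROM b EXCEPT SELECT v FROM a").fetchall()

count_a = conn.execute("SELECT COUNT(*) FROM a").fetchone()[0]
count_b = conn.execute("SELECT COUNT(*) FROM b").fetchone()[0]

print(a_minus_b, b_minus_a)  # [] []
print(count_a, count_b)      # 3 2
```

If duplicates matter, comparing row counts as well (or grouping and counting each row) closes that gap.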

Tags:

Sql

Oracle