Difference between "TOP" and "SAMPLE" in Teradata SQL

SAMPLE command:

sel * from tablename
sample 100;

That will give you a random sample of 100 rows from the table. Because the rows are picked at random, the SAMPLE command will generally give DIFFERENT results each time you run it.

TOP command:

sel top 100 * from tablename;

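This will give 100 rows from the table. Without an ORDER BY there is no defined "first", so these can be any 100 rows. The TOP command will often give you THE SAME results each time you run it, but that is not guaranteed.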


From TOP vs SAMPLE:

TOP 10 means "first 10 rows in sorted order". If you don't have an ORDER BY, then by extension it will be interpreted as asking for "ANY 10 rows" in any order. The optimizer is free to select the cheapest plan it can find and stop processing as soon as it has found enough rows to return.

If this query is the only thing running on your system, TOP may appear to always give you exactly the same answer, but that behavior is NOT guaranteed.
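
If you need the same rows on every run, one option is to give TOP an explicit ORDER BY. The sketch below assumes that tablename has a column called id with unique values; that column name is hypothetical and not part of the original example. If the sort column contains ties, the rows returned at the cutoff can still vary between runs.

```sql
-- Hypothetical column "id", assumed unique, so the sort order is fully determined.
-- With an ORDER BY, TOP 100 means "the first 100 rows in that order".
sel top 100 *
from tablename
order by id;
```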

SAMPLE, as you have observed, does extra processing to try to randomize the result set yet maintain the same approximate distribution. At a very simple level, for example, it could pick a random point at which to start scanning the table and a number of rows to skip between rows that are returned.
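
SAMPLE can also be given a fraction between 0 and 1 instead of a row count, which is handy when the goal is a roughly proportional slice of the table rather than a fixed number of rows. A minimal sketch, using the same placeholder table name as above:

```sql
-- Ask for roughly 10 percent of the rows, chosen at random.
-- The rows returned will generally differ from one run to the next.
sel *
from tablename
sample 0.10;
```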
