SQL Read Where IN (Long List from .TXT file)

You have a few options; which one fits best depends on your database engine and the size of your table.

Option 1

Create a table in your database like so:

create table ID_Comparer (
    ID int primary key
);

Using a programming language of your choice, empty the table and then load into it the 5000+ IDs that you want to query.

Then, write one of these queries to extract the data you want:

select *
from main_table m
where exists (
    select 1 from ID_Comparer where ID = m.ID
)

or

select *
from main_table m
inner join ID_Comparer c on m.ID = c.ID

Since ID_Comparer's ID is a primary key and (assuming that) main_table's ID is indexed/keyed as well, matching should be relatively fast.
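As a rough sketch of the load-then-join flow above, here is one way to do it in Python with the standard sqlite3 module. SQLite, the `name` column on main_table, the `ids.txt` file name and the helper names `parse_ids` and `fetch_matching` are all assumptions made for the demo; swap in your own driver and schema.

```python
import sqlite3
from typing import Iterable, List


def parse_ids(lines: Iterable[str]) -> List[int]:
    """Turn lines of text (one ID per line) into a de-duplicated list of ints."""
    seen = set()
    ids = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        value = int(text)
        if value not in seen:  # ID is a primary key, so duplicates would fail
            seen.add(value)
            ids.append(value)
    return ids


def fetch_matching(conn: sqlite3.Connection, ids: List[int]) -> list:
    """Empty ID_Comparer, reload it with ids, then join it against main_table."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM ID_Comparer")
        conn.executemany(
            "INSERT INTO ID_Comparer (ID) VALUES (?)", [(i,) for i in ids]
        )
    return conn.execute(
        "SELECT m.* FROM main_table m "
        "INNER JOIN ID_Comparer c ON m.ID = c.ID ORDER BY m.ID"
    ).fetchall()


# Demo on an in-memory database; main_table's columns are invented here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ID_Comparer (ID INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE main_table (ID INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO main_table VALUES (?, ?)", [(i, f"row{i}") for i in range(1, 11)]
)

# With a real file you would write: with open("ids.txt") as f: ids = parse_ids(f)
ids = parse_ids(["2\n", "5\n", "\n", "5\n", "42\n"])
rows = fetch_matching(conn, ids)
print(rows)
```

IDs that are not present in main_table (42 in the demo) simply drop out of the join.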

Option 1 modified

This option is like the one above but handles concurrency better. If application 1 wants to compare 2000 IDs while application 2 wants to compare 5000 IDs against your main table at the same time, neither should delete the other's rows from the comparer table. So, change the table a bit.

create table ID_Comparer (
    ID int,
    token char(32), -- index this
    entered date default current_date(), -- use the syntax of your DB
    primary key (ID, token) -- so two batches can hold the same ID
);

Then, use your favorite programming language to create a GUID. Load every ID together with that same GUID into the table like so:

1, 7089e5eced2f408eac8b390d2e891df5
2, 7089e5eced2f408eac8b390d2e891df5
...

Another process doing the same thing loads its own IDs with its own GUID:

2412, 96d9d6aa6b8d49ada44af5a99e6edf56
9434, 96d9d6aa6b8d49ada44af5a99e6edf56
...

Now, your select:

select *
from main_table m
where exists (
    select 1 from ID_Comparer where ID = m.ID and token = '<your guid>'
)

OR

select *
from main_table m
inner join ID_Comparer c on m.ID = c.ID and c.token = '<your guid>'

After you receive your data, be sure to run delete from ID_Comparer where token = '<your guid>' as cleanup.

You could create a nightly task to remove all data that's more than 2 days old or some such for additional housekeeping.

Since ID_Comparer is keyed and (assuming that) main_table's ID is indexed/keyed as well, matching should be relatively fast even with the GUID as an additional keyed lookup.
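A minimal sketch of the token variant, again in Python with sqlite3. It assumes the comparer table is keyed on (ID, token) so that two batches may contain the same ID; the `name` column and the helper name `fetch_with_token` are made up for the demo. `uuid.uuid4().hex` gives a 32-character hex string like the ones shown above.

```python
import sqlite3
import uuid
from typing import List


def fetch_with_token(conn: sqlite3.Connection, ids: List[int]) -> list:
    """Load ids under a fresh token, join on that token, then clean up."""
    token = uuid.uuid4().hex  # 32 hex characters, fits char(32)
    try:
        with conn:  # commit the batch of (ID, token) rows
            conn.executemany(
                "INSERT INTO ID_Comparer (ID, token) VALUES (?, ?)",
                [(i, token) for i in ids],
            )
        return conn.execute(
            "SELECT m.* FROM main_table m "
            "INNER JOIN ID_Comparer c ON m.ID = c.ID AND c.token = ? "
            "ORDER BY m.ID",
            (token,),
        ).fetchall()
    finally:
        with conn:  # remove only this batch, even if the query failed
            conn.execute("DELETE FROM ID_Comparer WHERE token = ?", (token,))


# Demo on an in-memory database; main_table's columns are invented here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ID_Comparer ("
    " ID INTEGER, token CHAR(32),"
    " entered DATE DEFAULT CURRENT_DATE,"
    " PRIMARY KEY (ID, token))"
)
conn.execute("CREATE INDEX idx_comparer_token ON ID_Comparer (token)")
conn.execute("CREATE TABLE main_table (ID INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO main_table VALUES (?, ?)", [(i, f"row{i}") for i in range(1, 11)]
)

# Rows left behind by "another process", sharing ID 2 with our batch.
other_token = "96d9d6aa6b8d49ada44af5a99e6edf56"
conn.executemany(
    "INSERT INTO ID_Comparer (ID, token) VALUES (?, ?)",
    [(2, other_token), (9, other_token)],
)

rows = fetch_with_token(conn, [2, 5])
remaining = conn.execute("SELECT COUNT(*) FROM ID_Comparer").fetchone()[0]
print(rows, remaining)
```

The other process's rows are untouched afterwards, which is the point of the token.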

Option 2

Instead of creating a table, you could create a large SQL query like so:

select * from main_table where id = <first id>
union select * from main_table where id = <second id>
union select * from main_table where id = <third id>
...

OR

select * from main_table where id IN (<first 5 ids>)
union select * from main_table where id IN (<next 5 ids>)
union select * from main_table where id IN (<next 5 ids>)
...

If the performance is acceptable and if creating a new table like in option 1 doesn't feel right to you, you could try one of these methods.

Assuming main_table's ID is indexed/keyed, matching IDs individually or in small groups might be faster than matching against one long comma-separated list. That's speculation: check the query plan and run it against a test case.
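If you go this route, you probably don't want to type the query by hand. Here is a small sketch that builds the chunked IN ... UNION statement with bound parameters rather than pasted values. The helper name `build_union_query`, the chunk size of 5 and the SQLite demo schema are assumptions; note that some drivers cap the number of bound parameters per statement, so check yours before sending thousands at once.

```python
import sqlite3
from typing import List, Tuple


def build_union_query(ids: List[int], chunk_size: int = 5) -> Tuple[str, List[int]]:
    """Return (sql, params): one IN (...) select per chunk, glued with UNION."""
    if not ids:
        raise ValueError("ids must not be empty")
    parts = []
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        placeholders = ", ".join("?" for _ in chunk)
        parts.append(f"SELECT * FROM main_table WHERE id IN ({placeholders})")
    # Parameters are consumed left to right, so the original order works as-is.
    return "\nUNION\n".join(parts), list(ids)


# Demo on an in-memory database; main_table's columns are invented here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO main_table VALUES (?, ?)", [(i, f"row{i}") for i in range(1, 11)]
)

ids = [3, 1, 4, 15, 9, 2, 6]
sql, params = build_union_query(ids, chunk_size=5)
rows = sorted(conn.execute(sql, params).fetchall())
print(sql)
print(rows)
```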

Which option to choose?

Testing these options should be fast. I'd recommend trying all of them with your database engine and the size of your table and seeing which one suits your use case best.


Step 1: Copy all your values into Sublime Text or Notepad++.

Step 2: Press Ctrl+H and choose the "Regular expressions" option.

Step 3: To add "," to the end of each line, type $ in the "Find what" field and "," in the "Replace with" field. Then hit "Replace All".

Then paste the values into your SQL query:

SELECT COUNT(*) FROM `admins` WHERE id in (4,
5,
6,
9,
10,
14,
62,
63,
655,
656,
657,
658,
659,
661,
662)

PS: Remember to remove the comma after the last value.
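If you'd rather not do the find-and-replace by hand, the same comma-joining can be sketched in a few lines of Python. The helper name `to_in_list` and the sample values are made up for illustration; joining with ", " also means there is no trailing comma to delete.

```python
def to_in_list(text: str) -> str:
    """Join one-value-per-line text into a comma-separated list for IN (...)."""
    values = [line.strip() for line in text.splitlines() if line.strip()]
    # int() rejects anything that is not a plain integer before it reaches the SQL.
    return ", ".join(str(int(v)) for v in values)


raw = "4\n5\n6\n9\n10\n"  # stands in for the contents of your .txt file
query = f"SELECT COUNT(*) FROM `admins` WHERE id IN ({to_in_list(raw)})"
print(query)
```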