How do I limit the number of active threads in python?

If you want to limit the number of parallel threads, use a semaphore:

threadLimiter = threading.BoundedSemaphore(maximumNumberOfThreads)

class EncodeThread(threading.Thread):

    def run(self):
        threadLimiter.acquire()
        try:
            <your code here>
        finally:
            threadLimiter.release()

Start all threads at once. All but maximumNumberOfThreads will wait in threadLimiter.acquire() and a waiting thread will only continue once another thread goes through threadLimiter.release().


"Each of my threads is using subprocess.Popen to run a separate command line [process]".

Why have a bunch of threads manage a bunch of processes? That's exactly what an OS does that for you. Why micro-manage what the OS already manages?

Rather than fool around with threads overseeing processes, just fork off processes. Your process table probably can't handle 2000 processes, but it can handle a few dozen (maybe a few hundred) pretty easily.

You want to have more work than your CPU's can possibly handle queued up. The real question is one of memory -- not processes or threads. If the sum of all the active data for all the processes exceeds physical memory, then data has to be swapped, and that will slow you down.

If your processes have a fairly small memory footprint, you can have lots and lots running. If your processes have a large memory footprint, you can't have very many running.