Why is this singleton implementation "not thread safe"?

I'm posting this just to simplify the solution suggested by @OlivierMelançon and @se7entyse7en: it avoids the overhead of importing functools and wrapping the method in a decorator.

import threading

lock = threading.Lock()

class SingletonOptmizedOptmized(type):
    _instances = {}
    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            with lock:
                if cls not in cls._instances:
                    cls._instances[cls] = super(SingletonOptmizedOptmized, cls).__call__(*args, **kwargs)
        return cls._instances[cls]

class SingletonClassOptmizedOptmized(metaclass=SingletonOptmizedOptmized):
    pass

Difference:

>>> from timeit import timeit
>>> timeit('SingletonClass()', globals=globals(), number=1000000)
0.4635776
>>> timeit('SingletonClassOptmizedOptmized()', globals=globals(), number=1000000)
0.192263300000036

If you're concerned about performance, you can improve the solution of the accepted answer by using the check-lock-check pattern to minimize lock acquisition:

class SingletonOptmized(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._locked_call(*args, **kwargs)
        return cls._instances[cls]

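    # synchronized(lock) is assumed to be a decorator that runs the method
    # while holding the module-level lock; it is not defined in this snippet.
    @synchronized(lock)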
    def _locked_call(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(SingletonOptmized, cls).__call__(*args, **kwargs)

class SingletonClassOptmized(metaclass=SingletonOptmized):
    pass
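
The synchronized decorator and the lock are not defined in the snippet above. Here is a minimal self-contained sketch, assuming synchronized simply runs the wrapped method while holding the given lock. That definition is my assumption and not necessarily the one from the accepted answer, and init_count is a hypothetical counter added only so the result can be checked:

```python
import functools
import threading

lock = threading.Lock()

def synchronized(lock):
    """Assumed helper: run the decorated function while holding `lock`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with lock:
                return func(*args, **kwargs)
        return wrapper
    return decorator

class SingletonOptmized(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:      # unlocked fast path
            cls._locked_call(*args, **kwargs)
        return cls._instances[cls]

    @synchronized(lock)
    def _locked_call(cls, *args, **kwargs):
        if cls not in cls._instances:      # re-check while holding the lock
            cls._instances[cls] = super(SingletonOptmized, cls).__call__(*args, **kwargs)

class SingletonClassOptmized(metaclass=SingletonOptmized):
    init_count = 0  # hypothetical counter, only used for the check below

    def __init__(self):
        type(self).init_count += 1

# Construct the class from several threads at once.
threads = [threading.Thread(target=SingletonClassOptmized) for _ in range(16)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(SingletonClassOptmized.init_count)  # 1
```

Because the second check runs under the lock, __init__ runs only once no matter how many threads reach the first check at the same time.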

Here's the difference:

In [9]: %timeit SingletonClass()
488 ns ± 4.67 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [10]: %timeit SingletonClassOptmized()
204 ns ± 4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

I suggest you choose a better singleton implementation. The metaclass-based implementation is the most frequently used.

As for thread safety, neither your approach nor any of the ones suggested in the above link is thread safe: it is always possible that one thread sees that there is no existing instance and starts creating one, and that another thread does the same before the first instance is stored.
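
To make that race concrete, here is a small sketch. The names UnsafeSingleton, Config, created and barrier are hypothetical and not taken from the question; the Barrier is there only to force both threads to be inside __init__ at the same time, which in real code would happen only occasionally:

```python
import threading

class UnsafeSingleton(type):
    """Metaclass singleton WITHOUT a lock (hypothetical, for illustration)."""
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            # Another thread can pass the same check before this assignment runs.
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

created = []                     # records every object that gets constructed
barrier = threading.Barrier(2)   # releases only once both threads are in __init__

class Config(metaclass=UnsafeSingleton):
    def __init__(self):
        barrier.wait(timeout=5)  # hold the first thread until the second arrives
        created.append(self)

threads = [threading.Thread(target=Config) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(created))  # 2: the "singleton" was constructed twice
```

The first thread cannot store its instance until the second one has also passed the check, so two distinct objects are created and the later assignment silently replaces the earlier one.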

You can protect the __call__ method of a metaclass-based singleton class with a lock, acquired in a with statement.

import threading

lock = threading.Lock()

class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            with lock:
                if cls not in cls._instances:
                    cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]


class SingletonClass(metaclass=Singleton):
    pass

As suggested by se7entyse7en, you can use a check-lock-check pattern. Since a singleton is only created once, your only concern is that the creation of the initial instance must be locked. Once this is done, however, retrieving the instance requires no lock at all. For that reason we accept the duplication of the check on the first call so that all further calls do not even need to acquire the lock.
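
One way to see that later calls skip the lock is to count acquisitions. This is only a sketch: CountingLock and Service are hypothetical names introduced for the illustration, and the metaclass is the same check-lock-check pattern written with the zero-argument form of super():

```python
import threading

class CountingLock:
    """Hypothetical wrapper that counts how often the lock is acquired."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1   # incremented while holding the lock
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False             # do not swallow exceptions

lock = CountingLock()

class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:            # first check, no lock
            with lock:
                if cls not in cls._instances:    # second check, under the lock
                    cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class Service(metaclass=Singleton):
    pass

first = Service()
for _ in range(1000):
    assert Service() is first    # every later call returns the same object

print(lock.acquisitions)  # 1
```

Only the very first call enters the with block; the other 1000 calls return after the unlocked membership test.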