Bytecode optimization

Python is a dynamic language. This means that you have a lot of freedom in how you write code. Due to the crazy amounts of introspection that python exposes (which are incredibly useful BTW), many optimizations simply cannot be performed. For example, in your first example, python has no way of knowing what datatype list is going to be when you call it. I could create a really weird class:

class CrazyList(object):
    def append(self, value):
        def new_append(value):
            print "Hello world"

        self.append = new_append

Obviously this isn't useful, but I can write this and it is valid python. If I were to pass this type to your above function, the code would be different than the version where you "cache" the append function.

We could write a similar example for += (it could have side-effects that wouldn't get executed if the "compiler" optimized it away).

In order to optimize efficiently, python would have to know your types ... And for a vast majority of your code, it has no (fool-proof) way to get the type data so it doesn't even try for most optimizations.


Please note that this is a micro optimization (and a well documented one). It is useful in some cases, but in most cases it is unnecessary if you write idiomatic python. e.g. your list example is best written using the .extend method as you've noted in your post. Most of the time, if you have a loop that is tight enough for the lookup time of a method to matter in your overall program runtime, then either you should find a way to rewrite just that loop to be more efficient or even push the computation into a faster language (e.g. C). Some libraries are really good at this (numpy).


With that said, there are some optimizations that can be done safely by the "compiler" in a stage known as the "peephole optimizer". e.g. It will do some simple constant folding for you:

>>> import dis
>>> def foo():
...     a = 5 * 6
... 
>>> dis.dis(foo)
  2           0 LOAD_CONST               3 (30)
              3 STORE_FAST               0 (a)
              6 LOAD_CONST               0 (None)
              9 RETURN_VALUE        

In some cases, it'll cache values for later use, or turn one type of object into another:

>>> def translate_tuple(a):
...   return a in [1, 3]
... 
>>> import dis
>>> dis.dis(translate_tuple)
  2           0 LOAD_FAST                0 (a)
              3 LOAD_CONST               3 ((1, 3))
              6 COMPARE_OP               6 (in)
              9 RETURN_VALUE

(Note the list got turned into a tuple and cached -- In python3.2+ set literals can also get turned into frozenset and cached).


In general Python optimises virtually nothing. It won't even optimise trivial things like x = x. Python is so dynamic that doing so correctly would be extremely hard. For example the list.append method can't be automatically cached in your first example because it could be changed in another thread, something which can't be done in a more static language like Java.

Tags:

Python