Python sum of ASCII values of all characters in a string

print(sum(map(ord, my_string)))

This would be the easiest.
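As a slightly fuller sketch of the same idea, the one-liner can be wrapped in a function. This assumes Python 3.7 or later (for `str.isascii()`); the name `ascii_sum` is made up for illustration. Note that `ord` returns Unicode code points, which only coincide with ASCII values for ASCII characters, so the sketch rejects anything else.

```python
def ascii_sum(text):
    """Sum ord() of every character; reject non-ASCII input."""
    if not text.isascii():  # str.isascii() requires Python 3.7+
        raise ValueError("text contains non-ASCII characters")
    return sum(map(ord, text))


print(ascii_sum("abcdefgh"))  # 804
```

Whether to raise on non-ASCII input or simply sum the code points is a design choice; drop the check if the latter is what you want.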


You can use an intermediate bytearray to speed things up:

>>> sum(bytearray("abcdefgh"))
804

This is not 17 times faster than the generator expression: it still creates an intermediate bytearray, and sum still has to iterate over Python integer objects. On my machine, though, it does speed up summing an 8-character string from 2 μs to about 700 ns. If a timing in this ballpark is still too slow for your use case, you should probably write the speed-critical parts of your application in C anyway.
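The session above is Python 2, where a str is already a byte string. On Python 3, where str is Unicode, bytearray needs an explicit encoding. A minimal sketch, assuming the input is meant to be pure ASCII (`ascii_sum_bytes` is a hypothetical name):

```python
def ascii_sum_bytes(text):
    """Sum the byte values of an ASCII-only string."""
    # bytearray(text, "ascii") raises UnicodeEncodeError for non-ASCII characters.
    # Iterating over a bytearray yields ints directly, so no ord() calls are needed.
    return sum(bytearray(text, "ascii"))


print(ascii_sum_bytes("abcdefgh"))  # 804
```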

If your strings are sufficiently large, and if you can use numpy, you can avoid creating temporary copies by directly referring to the string's buffer using numpy.frombuffer:

>>> import numpy as np
>>> np.frombuffer("abcdefgh", "uint8").sum()
804

For smaller strings this is slower than a temporary bytearray because of the overhead of numpy's view creation machinery. However, for sufficiently large strings, the frombuffer approach starts to pay off, and it always creates less garbage. On my machine the cutoff point is a string size of about 200 characters.
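On Python 3, `np.frombuffer` needs a bytes-like object rather than a str, so a rough equivalent has to encode first. The sketch below assumes ASCII-only input and that numpy is installed (the import is guarded so the snippet still loads without it); `ascii_sum_numpy` is a made-up name. Be aware that the `encode` call itself makes a copy on Python 3, so the zero-copy benefit only fully applies if you already hold the data as bytes.

```python
try:
    import numpy as np  # third-party package, installed separately
except ImportError:
    np = None


def ascii_sum_numpy(text):
    """Sum byte values through a numpy view over the encoded bytes."""
    data = text.encode("ascii")  # copies on Python 3; raises on non-ASCII input
    # frombuffer creates a read-only uint8 view over `data` without copying it.
    # sum() on a uint8 array accumulates in a wider integer type by default.
    return int(np.frombuffer(data, dtype=np.uint8).sum())


if np is not None:
    print(ascii_sum_numpy("abcdefgh"))  # 804
```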

Also, see Guido van Rossum's classic essay "Python Patterns - An Optimization Anecdote". While some of its specific techniques may by now be obsolete, the general lesson of how to think about Python optimization is still quite relevant.


You can time the different approaches with the timeit module:

$ python -m timeit -s 's = "a" * 20' 'sum(ord(ch) for ch in s)' 
100000 loops, best of 3: 3.85 usec per loop
$ python -m timeit -s 's = "a" * 20' 'sum(bytearray(s))'
1000000 loops, best of 3: 1.05 usec per loop
$ python -m timeit -s 'from numpy import frombuffer; s = "a" * 20' \
                      'frombuffer(s, "uint8").sum()' 
100000 loops, best of 3: 4.8 usec per loop
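The same kind of comparison can also be scripted with `timeit.repeat`. This is only a sketch for Python 3: the `candidates` dictionary and the loop counts are arbitrary choices, and wrapping each statement in a lambda adds a little call overhead to every measurement, so treat the numbers as relative rather than absolute.

```python
import timeit

s = "a" * 20

# Each candidate computes the same sum in a different way.
candidates = {
    "generator": lambda: sum(ord(ch) for ch in s),
    "map": lambda: sum(map(ord, s)),
    "bytearray": lambda: sum(bytearray(s, "ascii")),
}

number = 10000  # calls per timing run
results = {}
for name, func in candidates.items():
    # Take the best of 3 runs, as the timeit command line does.
    best = min(timeit.repeat(func, number=number, repeat=3))
    results[name] = best / number
    print(f"{name:10s} {results[name] * 1e6:.3f} usec per loop")
```

Actual timings depend on the machine and the Python version, so rerun this on your own setup before drawing conclusions.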

You can speed it up a bit (by roughly 40%, though still nowhere near as fast as native C) by using map, which avoids creating the generator.

Instead of:

sum(ord(c) for c in string)

Do:

sum(map(ord, string))

Timings:

>>> import timeit
>>> timeit.timeit(stmt="sum(map(ord, 'abcdefgh'))")
# TP: 1.5709713941578798
# JC: 1.425781011581421
>>> timeit.timeit(stmt="sum(ord(c) for c in 'abcdefgh')")
# TP: 1.7807035140629637
# JC: 1.9981679916381836