Removing all non-numeric characters from string in Python

>>> import re
>>> re.sub("[^0-9]", "", "sdkjh987978asd098as0980a98sd")
'987978098098098'
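
If you apply this to many strings, one option is to compile the pattern once and reuse it. This is only a sketch: the name strip_non_digits is made up for illustration and does not come from the answer above.

```python
import re

# Compile once and reuse; [^0-9] matches any character that is not an ASCII digit.
_NON_DIGITS = re.compile(r"[^0-9]")

def strip_non_digits(text):
    """Hypothetical helper: return text with every non-digit character removed."""
    return _NON_DIGITS.sub("", text)

print(strip_non_digits("sdkjh987978asd098as0980a98sd"))  # 987978098098098
```

Because the character class is spelled out as 0-9, digits from other scripts are removed as well.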

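This works for both strings and unicode objects in Python 2. In Python 3, iterating over a bytes object yields integers rather than one-byte strings, so bytes need a separate branch:

# python <3.0
def only_numerics(seq):
    # filter() returns the same type it was given (str or unicode)
    return filter(type(seq).isdigit, seq)

# python ≥3.0
def only_numerics(seq):
    if isinstance(seq, str):
        return ''.join(filter(str.isdigit, seq))
    # each item of a bytes object is an int; keep the ASCII codes for 0-9
    return bytes(b for b in seq if 48 <= b <= 57)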

Not sure if this is the most efficient way, but:

>>> ''.join(c for c in "abc123def456" if c.isdigit())
'123456'

The ''.join part joins the resulting characters with nothing in between. The rest is a generator expression, which (as you can probably guess) keeps only the characters of the string for which isdigit() returns True.
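
One thing to watch with isdigit(): on a Python 3 str it also accepts digit characters outside 0-9, such as superscripts and digits from other scripts. A small sketch of the difference, with hypothetical function names chosen only for this example:

```python
import string

def digits_unicode(text):
    # str.isdigit() is True for any Unicode digit, not only 0-9.
    return ''.join(c for c in text if c.isdigit())

def digits_ascii(text):
    # string.digits is exactly '0123456789'.
    return ''.join(c for c in text if c in string.digits)

# '\u00b2' is SUPERSCRIPT TWO, '\u0663' is ARABIC-INDIC DIGIT THREE.
sample = "a1\u00b2b\u06633"

print(digits_ascii(sample))                        # 13
print(digits_unicode(sample) == "1\u00b2\u06633")  # True
```

If the result is going to be passed to int(), the ASCII-only version is probably the safer choice, since int() rejects characters like the superscript two.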


@Ned Batchelder and @newacct provided the right answer, but ...

In case your string contains thousands separators (,) and a decimal point (.), and you want to keep the decimal point:

>>> import re
>>> re.sub(r"[^\d.]", "", "$1,999,888.77")
'1999888.77'
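
If you then want a number rather than a string, something like the following might do. It is a sketch under assumptions: parse_amount is a hypothetical name, and the input is assumed to use ',' as the thousands separator and '.' as the decimal point.

```python
import re
from decimal import Decimal

def parse_amount(text):
    """Hypothetical helper: keep ASCII digits and '.', then parse as Decimal.

    Assumes ',' is a thousands separator and '.' is the decimal point.
    """
    cleaned = re.sub(r"[^0-9.]", "", text)
    return Decimal(cleaned)

print(parse_amount("$1,999,888.77"))  # 1999888.77
```

Input that leaves more than one '.' or no digits at all after cleaning raises decimal.InvalidOperation, so you may want to catch that.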
