How to extract the bits of larger numeric Numpy data types

This is the solution I use:

def unpackbits(x, num_bits):
    if np.issubdtype(x.dtype, np.floating):
        raise ValueError("numpy data type needs to be int-like")
    xshape = list(x.shape)
    x = x.reshape([-1, 1])
    mask = 2**np.arange(num_bits, dtype=x.dtype).reshape([1, num_bits])
    return (x & mask).astype(bool).astype(int).reshape(xshape + [num_bits])

This is a completely vectorized solution that works with any dimension ndarray and can unpack however many bits you want.


You can do this with view and unpackbits

Input:

unpackbits(arange(2, dtype=uint16).view(uint8))

Output:

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0]

For a = arange(int(1e6), dtype=uint16) this is pretty fast at around 7 ms on my machine

%%timeit
unpackbits(a.view(uint8))

100 loops, best of 3: 7.03 ms per loop

As for endianness, you'll have to look at http://docs.scipy.org/doc/numpy/user/basics.byteswapping.html and apply the suggestions there depending on your needs.


I have not found any function for this too, but maybe using Python's builtin struct.unpack can help make the custom function faster than shifting and anding longer uint (note that I am using uint64).

>>> import struct
>>> N = np.uint64(2 + 2**10 + 2**18 + 2**26)
>>> struct.unpack('>BBBBBBBB', N)
(2, 4, 4, 4, 0, 0, 0, 0)

The idea is to convert those to uint8, use unpackbits, concatenate the result. Or, depending on your application, it may be more convenient to use structured arrays.

There is also built-in bin() function, which produces string of 0s and 1s, but I am not sure how fast it is and it requires postprocessing too.

Tags:

Python

Numpy