Is it possible to convert a really large int to a string quickly in Python?

You wrote in the comments that you want to get the length of the integer in decimal format. You don't need to convert the integer to a string for that; you can use the common (base-10) logarithm instead:

import math
math.ceil(math.log(a, 10))
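One caveat with the ceil form: for an exact power of 10 it undercounts (1000 has 4 digits, but its base-10 logarithm is 3). A minimal sketch of a more defensive variant, assuming a is a positive int that fits in memory; the name decimal_length is made up for this illustration:

```python
import math


def decimal_length(n):
    """Number of decimal digits of a positive int n, without calling str()."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # Float-based estimate; it can be off by one close to a power of 10.
    estimate = int(math.log10(n)) + 1
    # Correct the estimate with exact integer comparisons.
    if 10 ** (estimate - 1) > n:
        estimate -= 1
    elif 10 ** estimate <= n:
        estimate += 1
    return estimate


print(decimal_length(999))   # 3
print(decimal_length(1000))  # 4
```

The integer comparison costs one big power of 10, which should still be much cheaper than a full str() conversion.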

Moreover, if you know that:

a = plaintextOrd**bigNumber

then math.log(a, 10) is equal to math.log(plaintextOrd, 10) * bigNumber, which shouldn't take more than a few milliseconds to calculate:

>>> plaintextOrd = 12345
>>> bigNumber = 67890
>>> a = plaintextOrd**bigNumber
>>> len(str(a))
277772
>>> import math
>>> math.ceil(math.log(a, 10))
277772
>>> math.ceil(math.log(plaintextOrd, 10) * bigNumber)
277772

It should work even if a wouldn't fit on your hard drive:

>>> math.ceil(math.log(123456789, 10) * 123456789012345678901234567890)
998952457326621672529828249600

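As mentioned by @kaya3, standard Python floats carry only about 15 to 17 significant decimal digits, so they aren't precise enough to describe the exact length of such a large number; only the leading digits of the result above are meaningful.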
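To get a feel for how coarse the float result is, you can look at the gap between neighbouring floats at that magnitude. This is only an illustrative sketch; it assumes Python 3.9 or newer, where math.ulp is available:

```python
import math

# Same float-based estimate as in the transcript above.
estimate = math.log(123456789, 10) * 123456789012345678901234567890

# Distance from this float to the next representable float.
gap = math.ulp(estimate)

print(f"estimate: {estimate:.0f}")
print(f"gap between adjacent floats: {gap:.0f}")
```

At this magnitude the gap is roughly 1.4e14, so the last 14 or so digits of the float estimate carry no information about the true length.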

You could use mpmath (arbitrary-precision floating-point arithmetic) to get results with the desired precision:

>>> from mpmath import mp
>>> mp.dps = 1000
>>> mp.ceil(mp.log(123456789, 10) * mp.mpf('123456789012345678901234567890'))
mpf('998952457326621684655868656199.0')

Some quick notes on the "I need it for this function".

  • You don't need the first/second logic:
    • [:a] == [a*0:a*(0+1)]
    • [a:a+a] == [a*1:a*(1+1)]

So we have

    new = []
    for i in range(parts):
        new.append(string[a*i:a*(i+1)])

or just new = [string[a*i:a*(i+1)] for i in range(parts)].

Note that you have silently discarded the last len(string) % parts characters.
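A small sketch that makes the discarded tail visible. It assumes a is len(string) // parts, as the slices suggest; the name split_equal is hypothetical:

```python
def split_equal(string, parts):
    """Split string into `parts` chunks of equal size, dropping any leftover."""
    a = len(string) // parts  # assumed chunk size
    return [string[a * i:a * (i + 1)] for i in range(parts)]


text = '0123456789a'
chunks = split_equal(text, 3)
dropped = len(text) - sum(len(chunk) for chunk in chunks)

print(chunks)   # ['012', '345', '678']
print(dropped)  # 2, i.e. len(text) % 3
```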

In your second loop, you shadow i with for i in i, which happens to work but is awkward and dangerous. The loop can be replaced with string2 = ''.join(new), which in turn is just string2 = string[:len(string) - len(string) % parts]. (The shorter string[:-(len(string) % parts)] would return an empty string whenever the remainder is 0.)

You then check whether the two strings have the same length and, if not, append the leftover characters to the last element of the list. This is a little surprising, e.g. you would have

>>> divideStringIntoParts(3, '0123456789a')
['012', '345', '6789a']

whereas most algorithms would produce something that favors even distributions, and earlier elements, e.g.:

>>> divideStringIntoParts(3, '0123456789a')
['0123', '4567', '89a']
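One possible way to get such an even split, with the extra characters going to the earlier chunks, is sketched below. The function name split_evenly is hypothetical and this is not the code from the question:

```python
def split_evenly(string, parts):
    """Split string into `parts` chunks whose sizes differ by at most one."""
    base, extra = divmod(len(string), parts)
    result = []
    start = 0
    for i in range(parts):
        # The first `extra` chunks get one additional character each.
        size = base + (1 if i < extra else 0)
        result.append(string[start:start + size])
        start += size
    return result


print(split_evenly('0123456789a', 3))  # ['0123', '4567', '89a']
```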

Regardless of this, we see that you don't really care about the value of the string at all here, just how many digits it has. Thus you could rewrite your function as follows.

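import math


def divide_number_into_parts(number, parts):
    '''
    >>> divide_number_into_parts(12345678901, 3)
    [123, 456, 78901]
    '''
    # Digit count via the base-10 logarithm (float-based, so it can be
    # off by one for values extremely close to a power of 10).
    total_digits = math.ceil(math.log(number + 1, 10))
    part_digits = total_digits // parts
    extra_digits = total_digits % parts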

    remaining = number
    results = []
    for i in range(parts):
        to_take = part_digits
        if i == 0:
            to_take += extra_digits
        digits, remaining = take_digits(remaining, to_take)
        results.append(digits)
    # Reverse results, since we go from the end to the beginning
    return results[::-1]


def take_digits(number, digits):
    '''
    Removes the last <digits> digits from number.
    Returns those digits along with the remainder, e.g.:
    >>> take_digits(12345, 2)
    (45, 123)
    '''
    mod = 10 ** digits
    return number % mod, number // mod

This should be very fast, since it avoids strings altogether. You can change it to strings at the end if you'd like, which may or may not benefit from the other answers here, depending on your chunk sizes.
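If you do convert the parts to strings at the end, note that integer chunks lose their leading zeros, so each piece has to be padded back to its width. A minimal sketch, assuming the digit count is already known and parts <= total_digits; the name parts_to_strings is made up for this example:

```python
def parts_to_strings(number, total_digits, parts):
    """Split number into `parts` digit strings; the last one takes the extras."""
    part_digits, extra_digits = divmod(total_digits, parts)
    widths = [part_digits] * parts
    widths[-1] += extra_digits
    chunks = []
    remaining = number
    # Peel digits off the end, so walk the widths from last to first.
    for width in reversed(widths):
        remaining, digits = divmod(remaining, 10 ** width)
        # zfill restores leading zeros that the integer chunk dropped.
        chunks.append(str(digits).zfill(width))
    return chunks[::-1]


print(parts_to_strings(12345678901, 11, 3))  # ['123', '456', '78901']
print(parts_to_strings(1002003, 7, 3))       # ['10', '02', '003']
```

Each str() call here only sees one chunk, so the cost depends on the chunk size rather than on the full number.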


GMPY2 provides an int-to-str conversion that is faster than the built-in str().

Source of Example Below

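import sys
import time
from gmpy2 import mpz

# Python 3.11+ limits int <-> str conversions to 4300 digits by default;
# lift the limit so that str(x) below does not raise ValueError.
if hasattr(sys, 'set_int_max_str_digits'):
    sys.set_int_max_str_digits(0)

# Test number (Large)
x = 123456789**12345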

# int to str using Python str()
start = time.time()
python_str = str(x)
end = time.time()

print('str conversion time {0:.4f} seconds'.format(end - start))

# int to str using GMPY2 module
start = time.time()
r = mpz(x)
gmpy2_str = r.digits()
end = time.time()

print('GMPY2 conversion time {0:.4f} seconds'.format(end - start))
print('Length of 123456789**12345 is: {:,}'.format(len(python_str)))
print('str result == GMPY2 result {}'.format(python_str==gmpy2_str))

Results (GMPY2 was about 12 times faster in this test)

str conversion time 0.3820 seconds
GMPY2 conversion time 0.0310 seconds
Length of 123456789**12345 is: 99,890
str result == GMPY2 result True
