Find the Smoothest Number

CJam - 13

q~,>{mfW=}$0=

Try it at http://cjam.aditsu.net/

Example input: 2001 2014
Example output: 2002

Explanation:

q~ reads and evaluates the input, pushing the 2 numbers on the stack (say min and max)
, makes an array [0 1 ... max-1]
> slices the array starting at min, resulting in [min ... max-1]
{…}$ sorts the array using the block to calculate the sorting key
mf gets an array with all the prime factors of a number, in order
W= gets the last element of the array (W=-1), thus obtaining the largest prime factor to be used as a sorting key
0= gets the first element of the (sorted) array
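
For readers who do not know CJam, the same steps can be sketched in Python. This is an illustrative translation, not the answer's code: the names largest_prime_factor and smoothest are made up here, and the range is half-open to mirror the `,>` slice described above.

```python
def largest_prime_factor(n: int) -> int:
    """Return the largest prime factor of n (n itself when n < 2)."""
    largest = n
    p = 2
    while p * p <= n:
        while n % p == 0:
            largest = p
            n //= p
        p += 1
    # Anything left above 1 is a prime larger than every factor removed so far.
    return n if n > 1 else largest


def smoothest(lo: int, hi: int) -> int:
    # Half-open range [lo, hi), like the array produced by `,` and `>`.
    # min() returns the first minimal element, like sorting by key and taking index 0.
    return min(range(lo, hi), key=largest_prime_factor)


print(smoothest(2001, 2014))  # 2002 (= 2 * 7 * 11 * 13)
```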


Regex (PCRE flavour), 129 bytes (originally .NET, 183 bytes)

Don't try this at home!

This is not really a contender for the win. But Eric Tressler suggested solving this problem with nothing but a regex, and I couldn't resist giving it a go. This might be possible in PCRE as well (and even shorter, see below), but I chose .NET because my solution needs arbitrary-length lookbehinds. Here we go:

(?<=^(1+),.*)(?=\1)(?=((11+)(?=.*(?=\3$)(?!(11+?)\4+$))(?=\3+$)|(?!(11+)\5+$)1+))(?!.+(?=\1)(?:(?!\2)|(?=((11+)(?=.*(?=\7$)(?!(11+?)\8+$))(?=\7+$)|(?!(11+)\9+$)1+)).*(?=\2$)(?=\6)))1+

The input is encoded as an inclusive comma-separated range, where both numbers are given in unary notation using 1s. The match will be the trailing S 1s where S is the smoothest number in the range. Ties are broken in favour of the smallest number.

So the second example from the question would be the following string (match underlined):

111111111,1111111111111111
                 =========

It is based on the (by now rather well-known) prime-checking regex, variations of which are embedded in there a whopping 6 times.
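
As a reminder of how that prime check works, here is a minimal Python sketch of the commonly cited form of the trick. The answer embeds variations of it as negative lookaheads such as (?!(11+?)\4+$); the names NOT_PRIME and is_prime_unary are only illustrative.

```python
import re

# Matches a unary string whose length is NOT prime:
#   1?          -> length 0 or 1
#   (11+?)\1+   -> some block of two or more 1s repeated at least twice (composite)
NOT_PRIME = re.compile(r"1?|(11+?)\1+")


def is_prime_unary(n: int) -> bool:
    """True if n is prime, tested by trying to split n ones into equal blocks."""
    return NOT_PRIME.fullmatch("1" * n) is None


print([n for n in range(20) if is_prime_unary(n)])
# [2, 3, 5, 7, 11, 13, 17, 19]
```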

Here is a version using free-spacing and comments for those who want to know what's going on.

# Note that the beginning of the match we're looking for is somewhere
# in the second part of the input.
(?<=^(1+),.*)          # Pick up the minimum range MIN in group 1
(?=\1)                 # Make sure there are at least MIN 1s ahead

                       # Now there will be N 1s ahead of the cursor
                       # where MIN <= N <= MAX.


(?=(                   # Find the largest prime factor of this number and
                       # store it in group 2.
  (11+)                # Capture a potential prime factor P in group 3
  (?=                  # Check that it's prime
    .*(?=\3$)          # Move to a position where there are exactly 
                       # P 1s ahead
    (?!(11+?)\4+$)     # Check that the remaining 1s are not composite
  )
  (?=\3+$)             # Now check that P is a divisor of N.
|                      # This does not work for prime N, so we need a 
                       # separate check
  (?!(11+)\5+$)        # Make sure that N is prime.
  1+                   # Match N
))

(?!                    # Now we need to make sure that there is not 
                       # another (smaller) number M with a smaller 
                       # largest prime factor

  .+                   # Backtrack through all remaining positions
  (?=\1)               # Make sure there are still MIN 1s ahead

  (?:
    (?!\2)             # If M is itself less than P we fail 
                       # unconditionally.
  |                    # Else we compare the largest prime factors.
    (?=(               # This is the same as above, but it puts the
                       # prime factor Q in group 6.
      (11+)
      (?=
        .*(?=\7$)
        (?!(11+?)\8+$)
      )
      (?=\7+$)
    |
      (?!(11+)\9+$)
      1+
    ))
    .*(?=\2$)          # Move to a position where there are exactly 
                       # P 1s ahead
    (?=\6)             # Try to still match Q (which means that Q is
                       # less than P)
  )
)
1+                     # Grab all digits for the match

You can test it online over here. Don't try too large inputs though, I make no guarantees about the performance of this monster.
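
The largest-prime-factor subexpression uses only lookaheads and backreferences, so it can be tried on its own outside .NET. The sketch below assumes Python's re module handles these constructs the same way; the group numbers are shifted (the result lands in group 1 here, not group 2), and LPF and lpf_by_regex are names invented for this example. Keep n small, since the backtracking cost grows quickly.

```python
import re

LPF = re.compile(
    r"""
    (?=(                            # group 1: largest prime factor of N
      (11+)                         # group 2: candidate factor P, longest first
      (?=.*(?=\2$)(?!(11+?)\3+$))   # P is prime
      (?=\2+$)                      # the remaining N - P ones are a multiple of P
    |
      (?!(11+)\4+$)                 # fallback: N itself is not composite
      1+
    ))
    """,
    re.VERBOSE,
)


def lpf_by_regex(n: int) -> int:
    """Largest prime factor of n (n >= 1), read off as the length of group 1."""
    return len(LPF.match("1" * n).group(1))


print([lpf_by_regex(n) for n in (1, 2, 7, 12, 16, 30, 49)])
# [1, 2, 7, 3, 2, 5, 7]
```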

Edit:

I ended up porting this to PCRE (which only requires two steps), and shortening the regex by almost a third. Here is the new version:

^(1+),.*?\K(?=\1)(?=((11+)(?=.*(?=\3$)(?!(11+?)\4+$))(?=\3+$)|(?!(11+)\5+$)1+))(?!.+(?=\1)(?:(?!\2)|(?=((?2))).*(?=\2$)(?=\6)))1+

This is essentially the same, with two changes:

  • PCRE does not support arbitrary-length lookbehind (which I used to get the MIN into group 1). However, PCRE does support \K which resets the beginning of the match to the current cursor position. Hence (?<=^(1+),.*) becomes ^(1+),.*?\K, which already saves two bytes.
  • The real savings come from PCRE's recursion feature. I'm not actually using recursion, but you can use (?n) to match group n again, similar to a subroutine call. Since the original regex contained the code for finding a number's largest prime factor twice, I was able to replace the whole bulk of the second one with a simple (?2).

Regex (PCRE flavour), 66 (65) bytes

Inspired by seeing that both Martin Ender and jaytea, two regex geniuses, wrote regex solutions to this code golf, I wrote my own from scratch. The famous prime-checking regex does not appear anywhere in my solution.

Do not read this if you don't want some unary regex magic spoiled for you. If you do want to take a shot at figuring out this magic yourself, I highly recommend starting by solving some problems in ECMAScript regex:

  1. Match prime numbers (if you aren't already familiar with doing this in regex)
  2. Match powers of 2 (if you haven't already done so). Or just work your way through Regex Golf, which includes Prime and Powers. Make sure to do both the Classic and Teukon problem sets.
  3. Find the shortest way to match powers of N where N is some constant (i.e. specified in the regex, not the input) which can be composite (but is not required to be). For example, match powers of 6.

  4. Find a way of matching Nth powers, where N is some constant >=2. For example, match perfect squares. (For a warmup, match prime powers.)

  5. Match correct multiplication statements. Match triangular numbers.

  6. Match Fibonacci numbers (if you're as crazy as I am), or if you want to stick to something shorter, match correct statements of exponentiation (for a warmup, return as a match the logarithm in base 2 of a power of 2 – bonus, do the same for any number, rounding it however you like), or factorial numbers (for a warmup, match primorial numbers).

  7. Match abundant numbers (if you're as crazy as I am)

  8. Calculate an irrational number to requested precision (e.g. divide the input by the square root of 2, returning the rounded result as a match)

(The regex engine I wrote may be of help, as it is very fast at unary math regexes and includes a unary numerical mode which can test ranges of natural numbers (but also has a strings mode which can evaluate non-unary regexes, or unary with delimiters). By default it is ECMAScript compatible, but has optional extensions (which can selectively add subsets of PCRE, or even molecular lookahead, something that no other regex engine has).)

Otherwise, read on, and also read this GitHub Gist (warning, many spoilers) which chronicles the journey of pushing ECMAScript regex to tackle natural number functions of increasing difficulty (starting with teukon's set of puzzles, not all of them mathematical, which sparked this journey).

As with the other regex solutions to this problem, the input is given as two numbers in bijective unary, separated by a comma, representing an inclusive range. Only one number is returned. The regex could be modified to return all of the numbers that share the same smallest largest prime factor, as separate matches, but that would require variable-length lookbehind and either putting \K in a lookahead or returning the result as a capture instead of a match.

The technique used here of repeated implicit division by smallest prime factor is identical to that used in the Match strings whose length is a fourth power answer I posted a while back.

With no further ado:

((.+).*),(?!.*(?=\1)(((?=(..+)(\5+$))\6)*)(?!\2)).*(?=\1)\K(?3)\2$

You can try it out here.

And the free-spacing version, with comments:

                        # No ^ anchor needed, because this algorithm always returns a
                        # match for valid input (in which the first number is less than
                        # or equal to the second number), and even in /g mode only one
                        # match can be returned. You can add an anchor to make it reject
                        # invalid ranges.

((.+).*),               # \1 = low end of range; \2 = conjectured number that is the
                        # smallest number in the set of the largest prime factor of each
                        # number in the range; note, it is only in subsequent tests that
                        # this is implicitly confined to being prime.
                        # We shall do the rest of our work inside the "high end of range"
                        # number.

(?!                     # Assert that there is no number in the range whose largest prime
                        # factor is smaller than \2.
  .*(?=\1)              # Cycle tail through all numbers in the range, starting with \1.

  (                     # Subroutine (?3):
                        # Find the largest prime factor of tail, and leave it in tail.
                        # It will both be evaluated here as-is, and later as an atomic
                        # subroutine call. As used here, it is not wrapped in an atomic
                        # group. Thus after the return from group 3, backtracking back
                        # into it can increase the value of tail – but this won't mess
                        # with the final result, because only making tail smaller could
                        # change a non-match into a match.

    (                   # Repeatedly divide tail by its smallest prime factor, leaving
                        # only the largest prime factor at the end.

      (?=(..+)(\5+$))   # \6 = tool to make tail = \5 = largest nontrivial factor of
                        # current tail, which is implicitly the result of dividing it
                        # by its smallest prime factor.
      \6                # tail = \5
    )*
  )
  (?!\2)                # matches iff tail < \2
)

# now, pick a number in the range whose largest prime factor is \2
.*(?=\1)                # Cycle tail through all numbers in the range, starting with \1.
\K                      # Set us up to return tail as the match.
(?3)                    # tail = largest prime factor of tail
\2$                     # Match iff tail == \2, then return the number whose largest
                        # prime factor is \2 as the match.
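
In ordinary arithmetic, the commented regex above amounts to roughly the following. This is a sketch with made-up names (largest_proper_divisor, lpf_by_division, smoothest), written for clarity rather than speed; the regex reaches the same result through backtracking instead of explicit loops.

```python
def largest_proper_divisor(n: int) -> int:
    """Largest d with 2 <= d < n that divides n, or 0 if there is none."""
    for d in range(n // 2, 1, -1):
        if n % d == 0:
            return d
    return 0


def lpf_by_division(n: int) -> int:
    # Replacing n by its largest proper divisor is the same as dividing n by
    # its smallest prime factor; repeat until only the largest prime is left.
    while d := largest_proper_divisor(n):
        n = d
    return n


def smoothest(lo: int, hi: int) -> int:
    # Inclusive range. `best` plays the role of \2: the smallest value among
    # the largest prime factors of the numbers in the range.
    best = min(lpf_by_division(n) for n in range(lo, hi + 1))
    # Return the smallest number in the range whose largest prime factor is `best`.
    return next(n for n in range(lo, hi + 1) if lpf_by_division(n) == best)


print(smoothest(5, 10), smoothest(9, 15), smoothest(2001, 2014))  # 8 9 2002
```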

The algorithm can be easily ported to ECMAScript by replacing the subroutine call with a copy of the subroutine, and returning the match as a capture group instead of using \K. The result is 80 bytes in length:

((x+)x*),(?!.*(?=\1)((?=(xx+)(\4+$))\5)*(?!\2)).*(?=\1)(((?=(xx+)(\8+$))\9)*\2$)

Try it online!
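
Because the 80-byte version sticks to lookaheads and backreferences, it can also be exercised from Python. The sketch below assumes Python's re module agrees with ECMAScript for this particular pattern (every backreference is set before it is used); SMOOTHEST and smoothest_by_regex are illustrative names. The answer is the length of capture group 6.

```python
import re

SMOOTHEST = re.compile(
    r"((x+)x*),(?!.*(?=\1)((?=(xx+)(\4+$))\5)*(?!\2)).*(?=\1)(((?=(xx+)(\8+$))\9)*\2$)"
)


def smoothest_by_regex(lo: int, hi: int) -> int:
    """Encode the inclusive range lo..hi in unary and decode the captured result."""
    m = SMOOTHEST.search("x" * lo + "," + "x" * hi)
    return len(m.group(6))


print(smoothest_by_regex(5, 10))   # 8
print(smoothest_by_regex(9, 15))   # 9
print(smoothest_by_regex(9, 16))   # 16
```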

Note that ((.+).*) can be changed to ((.+)+), dropping the size by 1 byte (from 66 to 65 bytes) with no loss of correct functionality, but the regex then becomes exponentially slower.

Try it online! (79 byte ECMAScript exponential-slowdown version)