Most efficient numerical selectBetween

Varying samples

(Updated to include internal function)

For varying samples, I think the best non-compiled method is to use Pick, as you did. You can build the selector list slightly faster by using Subtract, and you can reduce the total number of operations by 1 as follows:

sb2[samp_, {min_, max_}] := Pick[
    samp,
    UnitStep[Subtract[samp, min] Subtract[samp, max]],
    0
]

In comparison to your answer, sb2 has 2 subtractions, 1 multiplication and 1 UnitStep call, while selectBetween has 2 subtractions, 1 addition and 2 UnitStep calls.
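The definition of selectBetween is not repeated in this answer. A form consistent with the operation count above (2 subtractions, 1 addition, 2 UnitStep calls) might look like the sketch below. This is an assumption made for illustration, not necessarily the exact code from the other answer.

```mathematica
(* Hypothetical reconstruction, not taken from the original post.
   Each UnitStep is 1 when its argument is >= 0, so the sum is 2
   exactly when min <= x <= max. *)
selectBetween[samp_, {min_, max_}] := Pick[
    samp,
    UnitStep[Subtract[samp, min]] + UnitStep[Subtract[max, samp]],
    2
]
```

If selectBetween does have this form, it keeps values equal to min or max, whereas sb2 drops them: at an endpoint the product is 0, UnitStep returns 1, and the element is not picked. With random reals the two rarely differ in practice.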

Comparison:

samp = RandomReal[10, 10^6];

r1 = selectBetween[samp, {.001, .01}]; //RepeatedTiming
r2 = sb2[samp, {.001, .01}]; //RepeatedTiming

r1===r2

{0.00690, Null}

{0.0059, Null}

True

sb2 is slightly faster.

Varying sample addendum 1

Here is a different selector that is slightly faster:

sb3[samp_, {min_, max_}] := Pick[
    samp,
    Unitize @ Clip[samp, {min, max}, {0,0}],
    1
]

Comparison:

samp = RandomReal[10, 10^6];

r1 = selectBetween[samp, {.001, .01}]; //RepeatedTiming
r2 = sb3[samp, {.001, .01}]; //RepeatedTiming

r1===r2

{0.0069, Null}

{0.0055, Null}

True

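Note that the selector would need to be modified if the interval could contain 0: Clip maps out-of-range values to 0, so a sample value of exactly 0 inside the interval would look the same as an out-of-range value and would be dropped.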
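One possible modification, as a minimal sketch: compare each element with its clipped value instead of clipping to 0. The name sb3b is hypothetical, and the extra subtraction means it is not claimed to match the timing of sb3.

```mathematica
(* Hypothetical variant: Clip[x, {min, max}] leaves x unchanged when
   min <= x <= max, so x - Clip[x, {min, max}] is 0 exactly for
   in-range values, including 0 itself. *)
sb3b[samp_, {min_, max_}] := Pick[
    samp,
    Unitize @ Subtract[samp, Clip[samp, {min, max}]],
    0
]

sb3b[{-2., -0.5, 0., 0.5, 2.}, {-1., 1.}]
(* {-0.5, 0., 0.5} *)
```

Like sb3, this keeps values equal to min or max.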

Varying sample addendum 2

If you don't mind using an undocumented internal function, you could try:

samp = RandomReal[10, 10^6];

r1 = selectBetween[samp, {.001, .01}]; //RepeatedTiming
r2 = Random`Utilities`SelectWithinRange[samp, {.001, .01}]; //RepeatedTiming

r1 === r2

{0.00688, Null}

{0.0019, Null}

True

Fixed sample

If the sample stays the same, and you are selecting different ranges of data, then the following will be much faster:

nf = Nearest[samp]; //AbsoluteTiming

r3 = nf[(.01+.001)/2, {All, (.01-.001)/2}];//RepeatedTiming

Sort @ r1 === Sort @ r3

{0.153673, Null}

{0.000071, Null}

True

Creating the NearestFunction is slow (about 0.15 s here), but it only needs to be done once; after that, each range query is extremely fast (about 0.00007 s here).
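To reuse the same NearestFunction for several ranges, the center/radius conversion can be wrapped in a small helper. The sketch below assumes samp is defined as in the comparisons above; the name selectWithNearest is hypothetical.

```mathematica
(* Hypothetical helper: turn {min, max} into the center and radius
   form used in the query above. *)
selectWithNearest[nf_NearestFunction, {min_, max_}] :=
    nf[(min + max)/2, {All, (max - min)/2}]

nf = Nearest[samp];                          (* build once *)

ra = selectWithNearest[nf, {.001, .01}];     (* reuse for each range *)
rb = selectWithNearest[nf, {2.5, 2.6}];
```

The results are not returned in the order they appear in samp, which is why the comparison above sorts both lists before testing them for equality.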


BoolEval was made exactly for these kinds of problems. Suppose you want to select elements from array which are between lo and hi. Then use:

Needs["BoolEval`"]
BoolPick[array, lo < array < hi]

Simple, concise, and as fast as it gets for an unsorted list (with documented functions). Using <= instead of < is also possible, and it is handled correctly.