Fastest way to get last significant bit position in a ulong (C#)?

public static UInt64 CountTrailingZeros(UInt64 input)
{
    if (input == 0) return 64;

    UInt64 n = 0;

    if ((input & 0xFFFFFFFF) == 0) { n = 32; input = input >> 32; }
    if ((input & 0xFFFF) == 0) { n = n + 16; input = input >> 16; }
    if ((input & 0xFF) == 0) { n = n + 8; input = input >> 8; }
    if ((input & 0xF) == 0) { n = n + 4; input = input >> 4; }
    if ((input & 3) == 0) { n = n + 2; input = input >> 2; }
    if ((input & 1) == 0) { ++n; }

    return n;
}

I changed the answer of Michael D. O'Connor to match Your question.


I have measured perfomance of all answers.

The winner is not present here classic De Bruijn sequence approach.

    private const ulong DeBruijnSequence = 0x37E84A99DAE458F;

    private static readonly int[] MultiplyDeBruijnBitPosition =
    {
        0, 1, 17, 2, 18, 50, 3, 57,
        47, 19, 22, 51, 29, 4, 33, 58,
        15, 48, 20, 27, 25, 23, 52, 41,
        54, 30, 38, 5, 43, 34, 59, 8,
        63, 16, 49, 56, 46, 21, 28, 32,
        14, 26, 24, 40, 53, 37, 42, 7,
        62, 55, 45, 31, 13, 39, 36, 6,
        61, 44, 12, 35, 60, 11, 10, 9,
    };

    /// <summary>
    /// Search the mask data from least significant bit (LSB) to the most significant bit (MSB) for a set bit (1)
    /// using De Bruijn sequence approach. Warning: Will return zero for b = 0.
    /// </summary>
    /// <param name="b">Target number.</param>
    /// <returns>Zero-based position of LSB (from right to left).</returns>
    private static int BitScanForward(ulong b)
    {
        Debug.Assert(b > 0, "Target number should not be zero");
        return MultiplyDeBruijnBitPosition[((ulong)((long)b & -(long)b) * DeBruijnSequence) >> 58];
    }

The fastest way is to inject Bit Scan Forward (bsf) Bit Instruction to assembly after JIT compiler instead of BitScanForward body, but this requires much more efforts.


With .NET Core 3.0 introducing hardware intrinsics, the fastest solution should be

ulong value = 18;
ulong result = System.Runtime.Intrinsics.X86.Bmi1.X64.TrailingZeroCount(value);

Alternatively, the new System.Numerics.Bitoperations methods also use hardware intrinsics:

int result2 = System.Numerics.BitOperations.TrailingZeroCount(value);

public static UInt64 CountLeadingZeros(UInt64 input)
{
    if (input == 0) return 64;

    UInt64 n = 1;

    if ((input >> 32) == 0) { n = n + 32; input = input << 32; }
    if ((input >> 48) == 0) { n = n + 16; input = input << 16; }
    if ((input >> 56) == 0) { n = n + 8; input = input << 8; }
    if ((input >> 60) == 0) { n = n + 4; input = input << 4; }
    if ((input >> 62) == 0) { n = n + 2; input = input << 2; }
    n = n - (input >> 63);

    return n;
}

I'll bet this'll be faster. From here.