Find the "Bittiest" Number

x86 Machine Language, 18 bytes

31 D2 AD F3 0F B8 F8 39 FA 77 03 87 FA 93 E2 F2 93 C3 

The above bytes define a function that accepts the address of the array in the esi register and the number of elements in the array in the ecx register, and returns the "bittiest" number in the array in the eax register.

Note that this is a custom calling convention that accepts arguments in the ecx and esi registers, but it is otherwise much like a C function that takes the length of the array and a pointer to the array as its two arguments. This custom calling convention treats all registers as caller-save, including ebx.

The implementation of this function pulls some dirty tricks, which assume that the array has at least 1 element, as provided for in the challenge. It also assumes that the direction flag (DF) is clear (0), which is standard in all calling conventions that I'm aware of.

In ungolfed assembly-language mnemonics:

; ecx = length of array
; esi = address of first element in array
Find:
    31 D2          xor    edx, edx                ; start with max bit count set to 0
Next:
    AD             lods   eax, DWORD PTR [esi]    ; load the next value from the array, and
                                                  ;   increment ptr by element size
    F3 0F B8 F8    popcnt edi, eax                ; count # of set bits in value
    39 FA          cmp    edx, edi                ; if # of set bits in value is less than
    77 03          ja     SHORT Skip              ;   the running maximum, skip next 2 insns
    87 FA          xchg   edx, edi                ; save current # of set bits (for comparisons)
    93             xchg   eax, ebx                ; save current array value (for comparisons)
Skip:
    E2 F2          loop   SHORT Next              ; decrement element count, looping until it is 0
    93             xchg   eax, ebx                ; move running maximum value to eax
    C3             ret                            ; return, with result in eax

The key feature of this code is, of course, the x86 popcnt instruction, which counts the number of set bits in an integer. It iterates through the input array, keeping track of both the value of the maximum element and the number of set bits that it contains. It checks each value in the array to see if its number of set bits is higher than any value it has seen before. If so, it updates the tracking values; if not, it skips this step.

The popcnt instruction is a large (4-byte) instruction, but there's nothing that can be done to avoid that. However, the very short (1-byte) lods instruction has been used to load values from the array while simultaneously incrementing the pointer, the short (2-byte) loop instruction has been used for loop control (automatically decrementing the element counter and looping as long as there are more elements remaining to go through), and the very short (1-byte) xchg instruction has been used throughout.

An extra xchg had to be used at the end in order to enable use of the lods instruction, which always loads into the eax register, but that trade-off is more than worth it.

Try it online!

My first attempt was a 20-byte function. So far, 18 bytes is the best I have been able to come up with. I'm curious to see if anyone else can beat this score!

The only route of improvement that I see would be if a LOOPA instruction existed. Unfortunately, it doesn't—the only condition codes supported by LOOP are E/Z and NE/NZ. But maybe someone else can stretch their mind further than me!


JavaScript (ES6),  49 48 47  45 bytes

Saved 2 bytes thanks to @user81655

a=>a.sort(g=(p,q)=>!p|-!q||g(p&p-1,q&q-1))[0]

Try it online!

How?

We .sort() the input list with a recursive function that, given p and q, clears the least significant bit set in each variable until one of them is equal to 0 (or both of them simultaneously). This allows to order the list from most to least bits set. We then return the first entry, i.e. the "bittiest" one.

Commented

a =>                 // a[] = input list
  a.sort(            // sort a[] ...
    g = (p, q) =>    // ... using the recursive function g:
      !p | -!q       //     -> +1 if p = 0 and q ≠ 0,
                     //     or -1 if q = 0,
      ||             //     or  0 if p ≠ 0 and q ≠ 0, in which case ...
        g(           //     ... we do a recursive call:
          p & p - 1, //       clear the least significant bit set in p
          q & q - 1  //       clear the least significant bit set in q
        )            //     end of recursive call
  )[0]               // end of sort(); return the first entry

Jelly, 13 12 8 bytes

%Ø%B§µÞṪ

Try it online!

Explanation

     µÞ  | sort input by
%Ø%      | modulo by 2^32 (Ø% is a quick for 2^32)
   B     | converted to binary
    §    | sum
       Ṫ | get the last

Edit: thank you all for the kind response to my first question! I think I've fixed it now, it seems to work for all test cases.

Original code

2*31
B%¢S€iṀị