Why does GCC create a shared object instead of an executable binary according to file?

What am I doing wrong?

Nothing.

It sounds like your GCC is configured to build -pie binaries by default. These binaries really are shared libraries (of type ET_DYN), except they run just like a normal executable would.

So you should just run your binary, and (if it works) not worry about it.

Or you could link your binary with gcc -no-pie ... and that should produce a non-PIE executable of type ET_EXEC, for which file will say ELF 64-bit LSB executable.


file 5.36 says it clearly

file 5.36 prints explicitly whether the executable is PIE or not, as shown at: https://unix.stackexchange.com/questions/89211/how-to-test-whether-a-linux-binary-was-compiled-as-position-independent-code/435038#435038

For example, a PIE executable shows as:

main.out: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, not stripped

and a non-PIE one as:

main.out: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped

The feature was introduced in 5.33, but at that point it only checked the executable permission bits (chmod +x). Before 5.33, file just printed shared object for PIE executables.

In 5.34, it was meant to start checking the more specialized DF_1_PIE ELF metadata, but due to a bug in the implementation at commit 9109a696f3289ba00eaa222fd432755ec4287e28 it actually broke things and showed GCC PIE executables as shared objects.

The bug was fixed in 5.36 at commit 03084b161cf888b5286dbbcd964c31ccad4f64d9.

The bug is present in particular in Ubuntu 18.10 which has file 5.34.

It does not manifest itself when linking assembly code with ld -pie because of a coincidence.

Source code breakdown is shown in the "file 5.36 source code analysis" section of this answer.

The Linux kernel 5.0 determines if ASLR can be used based on ET_DYN

The root cause of the file "confusion" is that both PIE executables and shared libraries are position independent and can be placed at randomized memory locations.

At fs/binfmt_elf.c the kernel only accepts those two types of ELF files:

/* First of all, some simple consistency checks */
if (interp_elf_ex->e_type != ET_EXEC &&
        interp_elf_ex->e_type != ET_DYN)
        goto out;

Then, only for ET_DYN does it set load_bias to something other than zero. The load_bias is then what determines the ELF offset, see: How is the address of the text section of a PIE executable determined in Linux?

/*
 * If we are loading ET_EXEC or we have already performed
 * the ET_DYN load_addr calculations, proceed normally.
 */
if (loc->elf_ex.e_type == ET_EXEC || load_addr_set) {
        elf_flags |= elf_fixed;
} else if (loc->elf_ex.e_type == ET_DYN) {
        /*
         * This logic is run once for the first LOAD Program
         * Header for ET_DYN binaries to calculate the
         * randomization (load_bias) for all the LOAD
         * Program Headers, and to calculate the entire
         * size of the ELF mapping (total_size). (Note that
         * load_addr_set is set to true later once the
         * initial mapping is performed.)
         *
         * There are effectively two types of ET_DYN
         * binaries: programs (i.e. PIE: ET_DYN with INTERP)
         * and loaders (ET_DYN without INTERP, since they
         * _are_ the ELF interpreter). The loaders must
         * be loaded away from programs since the program
         * may otherwise collide with the loader (especially
         * for ET_EXEC which does not have a randomized
         * position). For example to handle invocations of
         * "./ld.so someprog" to test out a new version of
         * the loader, the subsequent program that the
         * loader loads must avoid the loader itself, so
         * they cannot share the same load range. Sufficient
         * room for the brk must be allocated with the
         * loader as well, since brk must be available with
         * the loader.
         *
         * Therefore, programs are loaded offset from
         * ELF_ET_DYN_BASE and loaders are loaded into the
         * independently randomized mmap region (0 load_bias
         * without MAP_FIXED).
         */
        if (elf_interpreter) {
                load_bias = ELF_ET_DYN_BASE;
                if (current->flags & PF_RANDOMIZE)
                        load_bias += arch_mmap_rnd();
                elf_flags |= elf_fixed;
        } else
                load_bias = 0;

I confirm this experimentally at: What is the -fPIE option for position-independent executables in gcc and ld?
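The long kernel excerpt above boils down to a three-way choice. Here is a minimal sketch of that choice as a plain C function. It is a simplified model, not kernel code: model_load_bias is a hypothetical name, dyn_base stands in for ELF_ET_DYN_BASE and rnd for the return value of arch_mmap_rnd(), both passed as parameters because their real values are architecture specific.

```c
#include <stdbool.h>
#include <elf.h>   /* ET_EXEC, ET_DYN */

/*
 * Simplified model of the load_bias branch quoted above.
 * dyn_base: stand-in for ELF_ET_DYN_BASE (assumption: passed by caller)
 * rnd:      stand-in for arch_mmap_rnd()  (assumption: passed by caller)
 */
unsigned long model_load_bias(unsigned e_type, bool has_interp,
                              bool randomize, unsigned long dyn_base,
                              unsigned long rnd)
{
    if (e_type != ET_DYN)
        return 0;   /* ET_EXEC: mapped at its link-time addresses */
    if (!has_interp)
        return 0;   /* the loader itself: left to the mmap region */
    /* PIE program: offset from the base, plus randomization if enabled */
    return randomize ? dyn_base + rnd : dyn_base;
}
```

Only the ET_DYN with INTERP case ever returns a non-zero bias, which is why the kernel needs ET_DYN to randomize the load address of a program.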

file 5.36 behaviour breakdown

After studying how file works from its source, we conclude that:

  • if Elf32_Ehdr.e_type == ET_EXEC
    • print executable
  • else if Elf32_Ehdr.e_type == ET_DYN
    • if DT_FLAGS_1 dynamic section entry is present
      • if DF_1_PIE is set in DT_FLAGS_1:
        • print pie executable
      • else
        • print shared object
    • else
      • if file is executable by user, group or others
        • print pie executable
      • else
        • print shared object
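The decision tree above can be sketched as one small C function. This is an illustration only, not the code from file: describe_elf is a hypothetical helper, and mode is assumed to hold the permission bits of the file on disk.

```c
#include <stdbool.h>
#include <stdint.h>
#include <elf.h>   /* ET_EXEC, ET_DYN */

#ifndef DF_1_PIE
#define DF_1_PIE 0x08000000   /* older <elf.h> versions may lack it */
#endif

/*
 * Hypothetical helper mirroring the decision tree above.
 * has_flags_1: whether a DT_FLAGS_1 dynamic entry exists
 * flags_1:     its value (ignored if has_flags_1 is false)
 * mode:        permission bits of the file, e.g. 0755
 */
const char *describe_elf(uint16_t e_type, bool has_flags_1,
                         uint64_t flags_1, unsigned mode)
{
    if (e_type == ET_EXEC)
        return "executable";
    if (e_type != ET_DYN)
        return "other";   /* types not covered by the tree above */
    if (has_flags_1)
        return (flags_1 & DF_1_PIE) ? "pie executable" : "shared object";
    return (mode & 0111) ? "pie executable" : "shared object";
}
```

Note that the permission bits only matter when DT_FLAGS_1 is absent, which is the case for the gcc -shared rows in the table that follows.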

And here are some experiments that confirm that:

Executable generation        ELF type  DT_FLAGS_1  DF_1_PIE  chmod +x       file 5.36
---------------------------  --------  ----------  --------  -------------- --------------
gcc -fpie -pie               ET_DYN    y           y         y              pie executable
gcc -fno-pie -no-pie         ET_EXEC   n           n         y              executable
gcc -shared                  ET_DYN    n           n         y              pie executable
gcc -shared                  ET_DYN    n           n         n              shared object
ld                           ET_EXEC   n           n         y              executable
ld -pie --dynamic-linker     ET_DYN    y           y         y              pie executable
ld -pie --no-dynamic-linker  ET_DYN    y           y         y              pie executable

Tested in Ubuntu 18.10, GCC 8.2.0, Binutils 2.31.1.

The full test example for each type of experiment is described at:

  • gcc -pie and gcc -no-pie: What is the -fPIE option for position-independent executables in gcc and ld?

    Keep in mind that -pie is set on by default since Ubuntu 17.10, related: 32-bit absolute addresses no longer allowed in x86-64 Linux?

  • gcc -shared (.so shared library): https://github.com/cirosantilli/cpp-cheat/tree/b80ccb4a842db52d719a16d3716b02b684ebbf11/shared_library/basic

  • ld experiments: How to create a statically linked position independent executable ELF in Linux?

ELF type and DF_1_PIE are determined respectively with:

readelf --file-header main.out | grep Type
readelf --dynamic     main.out | grep FLAGS_1
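For readers who want to see what the second command has to look up, here is a rough sketch that finds DT_FLAGS_1 in an ELF image held in memory. It is not how readelf is implemented: find_flags_1 is a hypothetical name, and the sketch assumes a 64-bit ELF whose byte order matches the host.

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Sketch: look up DT_FLAGS_1 in an ELF64 image held in memory.
 * Assumes the image and the host share the same byte order.
 * Returns true and stores the value in *flags if the entry exists.
 */
bool find_flags_1(const unsigned char *img, size_t len, uint64_t *flags)
{
    Elf64_Ehdr eh;
    if (len < sizeof eh)
        return false;
    memcpy(&eh, img, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        return false;

    /* Scan the program headers for the PT_DYNAMIC segment. */
    for (size_t i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        size_t off = eh.e_phoff + i * (size_t)eh.e_phentsize;
        if (off > len || len - off < sizeof ph)
            return false;
        memcpy(&ph, img + off, sizeof ph);
        if (ph.p_type != PT_DYNAMIC)
            continue;

        /* Walk the dynamic entries until DT_NULL or the segment end. */
        for (uint64_t d = 0; d + sizeof(Elf64_Dyn) <= ph.p_filesz;
             d += sizeof(Elf64_Dyn)) {
            Elf64_Dyn dyn;
            uint64_t pos = ph.p_offset + d;
            if (pos > len || len - pos < sizeof dyn)
                return false;
            memcpy(&dyn, img + pos, sizeof dyn);
            if (dyn.d_tag == DT_NULL)
                break;
            if (dyn.d_tag == DT_FLAGS_1) {
                *flags = dyn.d_un.d_val;
                return true;
            }
        }
    }
    return false;
}
```

To try it on a real binary, read the whole file into a buffer (for example with fread) and pass the buffer and its size. A false return corresponds to the "n" entries in the DT_FLAGS_1 column above.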

file 5.36 source code analysis

The key file to analyse is magic/Magdir/elf.

This magic format determines file types depending only on the values of bytes at fixed positions.

The format itself is documented at:

man 5 magic

So at this point you will want to have the following documents handy:

  • http://www.sco.com/developers/devspecs/gabi41.pdf ELF standard at the ELF header section
  • http://www.cirosantilli.com/elf-hello-world/#elf-header my ELF file format introduction and breakdown

Towards the end of the file, we see:

0       string          \177ELF         ELF
!:strength *2
>4      byte            0               invalid class
>4      byte            1               32-bit
>4      byte            2               64-bit
>5      byte            0               invalid byte order
>5      byte            1               LSB
>>0     use             elf-le
>5      byte            2               MSB
>>0     use             \^elf-le

\177ELF are the 4 magic bytes at the start of every ELF file. \177 is the octal for 0x7F.

Then by comparing with the Elf32_Ehdr struct from the standard, we see that byte 4 (the 5th byte, the first one after the magic identifier), determines the ELF class:

e_ident[EI_CLASS]

and some of its possible values are:

ELFCLASS32 1
ELFCLASS64 2

In file source then, we have:

1 32-bit
2 64-bit

and 32-bit and 64-bit are the strings that file outputs to stdout!
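To make the byte-position idea concrete, here is a minimal C sketch of what these first rules amount to: compare the 4 magic bytes, then map byte 4 to a word. The function name elf_class_word is made up for illustration, and file itself interprets the magic rules at run time rather than hard-coding them like this.

```c
#include <stddef.h>
#include <string.h>

/*
 * Sketch of the first magic rules: check the 4 magic bytes, then map
 * byte 4 (EI_CLASS) to the word that file prints.
 * Returns NULL if the buffer does not start with the ELF magic.
 */
const char *elf_class_word(const unsigned char *buf, size_t len)
{
    if (len < 5 || memcmp(buf, "\177ELF", 4) != 0)
        return NULL;
    switch (buf[4]) {
    case 0:  return "invalid class";
    case 1:  return "32-bit";   /* ELFCLASS32 */
    case 2:  return "64-bit";   /* ELFCLASS64 */
    default: return "";         /* no rule in the excerpt matches */
    }
}
```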

So now we search for shared object in that file, and we are led to:

0       name            elf-le
>16     leshort         0               no file type,
!:mime  application/octet-stream
>16     leshort         1               relocatable,
!:mime  application/x-object
>16     leshort         2               executable,
!:mime  application/x-executable
>16     leshort         3               ${x?pie executable:shared object},

So elf-le is a named set of rules, which gets pulled in by the use elf-le line in the previous excerpt.

Byte 16 is exactly the ELF type:

Elf32_Ehdr.e_type

and some of its values are:

ET_EXEC 2
ET_DYN  3

Therefore, ET_EXEC always gets printed as executable.
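The >16 leshort rules read a 16-bit little-endian value at offset 16. A minimal sketch of that read in C, with read_leshort as a hypothetical helper name:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Read a 16-bit little-endian value at offset off, as the leshort
 * rule type does. The caller must ensure off + 2 <= buffer length.
 */
uint16_t read_leshort(const unsigned char *buf, size_t off)
{
    return (uint16_t)(buf[off] | (buf[off + 1] << 8));
}
```

Calling read_leshort(header, 16) on the first bytes of a little-endian ELF file yields e_type, that is 2 for ET_EXEC and 3 for ET_DYN.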

ET_DYN however has two possibilities depending on ${x:

  • pie executable
  • shared object

${x asks: is the file executable by either user, group or other? If yes, show pie executable, else shared object.

This expansion is done in the varexpand function in src/softmagic.c:

static int
varexpand(struct magic_set *ms, char *buf, size_t len, const char *str)
{
    [...]
            case 'x':
                    if (ms->mode & 0111) {
                            ptr = t;
                            l = et - t;
                    } else {
                            ptr = e;
                            l = ee - e;
                    }
                    break;
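Since the excerpt above is truncated, here is a self-contained sketch of the same idea. It is a simplified stand-in, not the real varexpand: expand_x is a hypothetical function that handles exactly one ${x?yes:no} item and copies the rest of the string verbatim.

```c
#include <stdio.h>
#include <string.h>

/*
 * Simplified stand-in for varexpand: expands one "${x?yes:no}" item
 * depending on the execute bits in mode.
 * Returns 0 on success, -1 on malformed input or if out is too small.
 */
int expand_x(char *out, size_t outlen, const char *str, unsigned mode)
{
    int n;
    const char *start = strstr(str, "${x?");

    if (start == NULL) {
        /* Nothing to expand: copy the input as-is. */
        n = snprintf(out, outlen, "%s", str);
        return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
    }

    const char *yes = start + 4;                 /* text after "${x?" */
    const char *colon = strchr(yes, ':');
    const char *end = colon ? strchr(colon, '}') : NULL;
    if (end == NULL)
        return -1;

    /* Any of the user, group or other execute bits selects "yes". */
    const char *pick = (mode & 0111) ? yes : colon + 1;
    int picklen = (int)((mode & 0111) ? colon - yes : end - (colon + 1));

    n = snprintf(out, outlen, "%.*s%.*s%s",
                 (int)(start - str), str, picklen, pick, end + 1);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

With the string from the magic file, a mode of 0755 selects pie executable, and a mode of 0644 selects shared object.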

There is, however, one more hack! In the dodynamic function of src/readelf.c, if the DT_FLAGS_1 entry of the dynamic section (PT_DYNAMIC) is present, then the permissions in ms->mode are overridden by the presence or absence of the DF_1_PIE flag:

case DT_FLAGS_1:
        if (xdh_val & DF_1_PIE)
                ms->mode |= 0111;
        else
                ms->mode &= ~0111;
        break;

The bug in 5.34 is that the initial code was written as:

    if (xdh_val == DF_1_PIE)

which means that if another flag was set, which GCC does by default due to DF_1_NOW, the executable showed as shared object.
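The difference between the two comparisons is easy to reproduce in isolation. The sketch below uses made-up function and macro names. The two constants are the values of DF_1_NOW and DF_1_PIE, spelled out so that the example does not depend on the version of elf.h.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLAG_NOW 0x00000001u   /* value of DF_1_NOW */
#define FLAG_PIE 0x08000000u   /* value of DF_1_PIE */

/* Equality test as in the 5.34 code: true only if PIE is the sole flag. */
bool is_pie_534(uint64_t flags_1)
{
    return flags_1 == FLAG_PIE;
}

/* Bit test as in the fixed code: true whenever the PIE bit is set. */
bool is_pie_536(uint64_t flags_1)
{
    return (flags_1 & FLAG_PIE) != 0;
}
```

For flags_1 equal to FLAG_NOW | FLAG_PIE, is_pie_534 returns false while is_pie_536 returns true, which matches the misdetection described above.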

The DT_FLAGS_1 flags entry is not described in the ELF standard so it must be a Binutils extension.

That flag has no uses in the Linux kernel 5.0 or glibc 2.27, so it seems to be purely informative, indicating whether a file is PIE or not.