Why and how are some shared libraries runnable, as though they are executables?

That library has a main() function or equivalent entry point, and was compiled in such a way that it is useful both as an executable and as a shared object.

Here's one suggestion about how to do this, although it does not work for me.

Here's another, from an answer to a similar question on S.O., which I'll shamelessly plagiarize, tweak, and explain a bit.

First, source for our example library, test.c:

#include <stdio.h>

void sayHello (char *tag) {
    printf("%s: Hello!\n", tag);
}

int main (int argc, char *argv[]) {
    sayHello(argv[0]);
    return 0;
}

Compile that:

gcc -fPIC -pie -o libtest.so test.c -Wl,-E

Here, we compile position-independent code (-fPIC), as a shared library requires, but tell the linker to produce a position-independent executable (-pie) and to export its symbols in the dynamic symbol table (-Wl,-E), so that it can be usefully linked against.

And, although file will say it's a shared object, it does work as an executable:

> ./libtest.so 
./libtest.so: Hello!

Now we need to see if it can really be dynamically linked. An example program, program.c:

#include <stdio.h>

extern void sayHello (char*);

int main (int argc, char *argv[]) {
    puts("Test program.");
    sayHello(argv[0]);
    return 0;
}

Using extern saves us having to create a header. Now compile that:

gcc program.c -L. -ltest

Before we can execute it, we need to add the path of libtest.so for the dynamic loader:

export LD_LIBRARY_PATH=./

Now:

> ./a.out
Test program.
./a.out: Hello!

And ldd a.out will show the linkage to libtest.so.

Note that I doubt this is how glibc is actually compiled, since this approach is probably not as portable as glibc itself (see man gcc with regard to the -fPIC and -pie switches), but it demonstrates the basic mechanism. For the real details you'd have to look at the source makefile.


Let's look for an answer in a random glibc repo on GitHub. This version provides a "banner" in the file version.c.

In the same file there are two interesting points: the __libc_print_version function, which prints the text to stdout, and the __libc_main (void) symbol, which is documented as the entry point. So this is the function that gets called when the library is run.

So how does the linker or compiler know exactly that this is the entry point function?

Let's dive into the makefile. In the linker flags there is an interesting one:

# Give libc.so an entry point and make it directly runnable itself.
LDFLAGS-c.so += -e __libc_main

So this is the linker flag that sets the entry point for the library. When building a library, you can pass the linker the -e function_name flag to make the result behave like an executable. What does it really do? Let's look into the manual (somewhat outdated but still valid):

The linker command language includes a command specifically for defining the first executable instruction in an output file (its entry point). Its argument is a symbol name:

ENTRY(symbol)

Like symbol assignments, the ENTRY command may be placed either as an independent command in the command file, or among the section definitions within the SECTIONS command--whatever makes the most sense for your layout.

ENTRY is only one of several ways of choosing the entry point. You may indicate it in any of the following ways (shown in descending order of priority: methods higher in the list override methods lower down).

the `-e' entry command-line option;
the ENTRY(symbol) command in a linker control script;
the value of the symbol start, if present;
the address of the first byte of the .text section, if present;
the address 0.

For example, you can use these rules to generate an entry point with an assignment statement: if no symbol start is defined within your input files, you can simply define it, assigning it an appropriate value---

start = 0x2020;

The example shows an absolute address, but you can use any expression. For example, if your input object files use some other symbol-name convention for the entry point, you can just assign the value of whatever symbol contains the start address to start:

start = other_symbol ;

(the current documentation can be found here)

The ld linker does create an executable with an entry-point function if you provide the command-line option -e (the most popular solution), define a function symbol named start, or assign an address to the start symbol yourself.
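
As a rough sketch of the -e approach (this is not glibc's actual code), a small library could look like the one below. Everything named here is an assumption made for illustration: the file name runnable.c, the functions banner_text and lib_entry, the RUNNABLE_SO macro, and above all the loader path, which is the usual one for x86-64 Linux with glibc and differs on other platforms. The .interp section is set by hand because, as far as I know, ld does not add a program interpreter to a shared library on its own, and without one the library's calls into libc are never resolved when it is run directly.

```c
/* runnable.c: a shared library that can also be executed directly.
 *
 * Hypothetical build and run commands (all names are illustrative):
 *   gcc -shared -fPIC -DRUNNABLE_SO -o librunnable.so runnable.c -Wl,-e,lib_entry
 *   ./librunnable.so
 */
#include <string.h>
#include <unistd.h>

#ifdef RUNNABLE_SO
/* ASSUMPTION: dynamic loader path for x86-64 Linux with glibc.
 * Placing the string in .interp tells the kernel which loader to start. */
const char interp_path[] __attribute__((section(".interp"))) =
    "/lib64/ld-linux-x86-64.so.2";
#endif

/* Ordinary library function, usable by programs that link against us. */
const char *banner_text(void) {
    return "librunnable: example banner\n";
}

/* Entry point named on the linker command line with -e.
 * No C runtime startup code runs before it, so there is no argc/argv
 * and no caller to return to: the function must finish with _exit(). */
void lib_entry(void) {
    const char *msg = banner_text();
    /* write() rather than printf(): _exit() does not flush stdio buffers. */
    ssize_t ignored = write(STDOUT_FILENO, msg, strlen(msg));
    (void)ignored;
    _exit(0);
}
```

Without -DRUNNABLE_SO and the -e option, the same source builds as a plain shared library, so the runnable behavior stays an opt-in detail of the build.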

However, please note that this is not guaranteed to work with other linkers (I do not know if LLVM's lld has the same flag). I am also not aware of any use for this other than providing information about the SO file.