Why does global variable definition in C header file work?

This relies on so called "common symbols" which are an extension to standard C's notion of tentative definitions (https://port70.net/~nsz/c/c11/n1570.html#6.9.2p2), except most UNIX linkers make it work across translation units too (and many even with shared dynamic libaries)

AFAIK, the feature has existed since pretty much forever and it had something to do with fortran compatibility/similarity.

It works by the compiler placing giving uninitialized (tentative) globals a special "common" category (shown in the nm utility as "C", which stands for "common").

Example of data symbol categories:

  #!/bin/sh -eu
(
cat <<EOF
int common_symbol; //C
int zero_init_symbol = 0; //B
int data_init_symbol = 4; //D
const int const_symbol = 4; //R
EOF
) | gcc -xc - -c -o data_symbol_types.o
nm data_symbol_types.o

Output:

0000000000000004 C common_symbol
0000000000000000 R const_symbol
0000000000000000 D data_init_symbol
0000000000000000 B zero_init_symbol

Whenever a linker sees multiple redefinitions for a particular symbol, it usually generates linkers errors.

But when those redefinitions are in the common category, the linker will merge them into one. Also, if there are N-1 common definitions for a particular symbol and one non-tentative definition (in the R,D, or B category), then all the definitions are merged into the one nontentative definition and also no error is generated.

In other cases you get symbol redefinition errors.

Although common symbols are widely supported, they aren't technically standard C and relying on them is theoretically undefined behavior (even though in practice it often works).

clang and tinycc, as far as I've noticed, do not generate common symbols (there you should get a redefinition error). On gcc, common symbol generation can be disabled with -fno-common.

(Ian Lance Taylor's serios on linkers has more info on common symbols and it also mentions how linkers even allow merging differently sized common symbols, using the largest size for the final object: https://www.airs.com/blog/archives/42 . I believe this weird trick was once used by libc's to some effect)

Tags:

C