How are system commands like ls created?

You can determine the nature of an executable in Unix using the file command and the type command.

type

You use type to determine an executable's location on disk like so:

$ type -a ls
ls is /usr/bin/ls
ls is /bin/ls

So I now know that ls lives in 2 locations on my system: /usr/bin/ls and /bin/ls. Looking at those executables, I can see they appear to be identical:

$ ls -l /usr/bin/ls /bin/ls
-rwxr-xr-x. 1 root root 120232 Jan 20 05:11 /bin/ls
-rwxr-xr-x. 1 root root 120232 Jan 20 05:11 /usr/bin/ls

NOTE: You can confirm they're identical beyond their sizes by using cmp or diff.

With diff:
$ diff -s /usr/bin/ls /bin/ls
Files /usr/bin/ls and /bin/ls are identical
With cmp (no output means the files match):
$ cmp /usr/bin/ls /bin/ls
$ 
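
Because cmp stays silent when files match, a script has to rely on its exit status instead. The following is only a rough sketch of that idea: the scratch files and the same_contents helper are made up for illustration and stand in for the two copies of ls.

```shell
# Scratch files stand in for the two copies of ls (names are illustrative).
tmpdir=$(mktemp -d)
printf 'same bytes\n' > "$tmpdir/a"
cp "$tmpdir/a" "$tmpdir/b"
printf 'other bytes\n' > "$tmpdir/c"

# Hypothetical helper: cmp -s prints nothing and reports through its
# exit status (0 = identical, 1 = different).
same_contents() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}

same_contents "$tmpdir/a" "$tmpdir/b"   # prints "identical"
same_contents "$tmpdir/a" "$tmpdir/c"   # prints "different"
```

On a real system you would pass /usr/bin/ls and /bin/ls to the helper instead of the scratch files.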

Using file

If I query them using the file command:

$ file /usr/bin/ls /bin/ls
/usr/bin/ls: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=0x303f40e1c9349c4ec83e1f99c511640d48e3670f, stripped
/bin/ls:     ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=0x303f40e1c9349c4ec83e1f99c511640d48e3670f, stripped

So these are real compiled binaries (machine code) rather than scripts. file cannot tell you which language they were written in, but ls is written in C. If they were shell scripts, file would typically report something like this:

$ file somescript.bash 
somescript.bash: POSIX shell script, ASCII text executable
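
file makes this distinction largely by looking at the first bytes of the file: ELF binaries start with the byte 0x7f followed by the letters ELF, while scripts start with #!. Here is a minimal sketch of that check, assuming head -c and od are available. The classify helper and the scratch files are hypothetical, and the real file command consults a much larger database of signatures.

```shell
# Hypothetical helper: classify a file by its first bytes, roughly the
# way file(1) starts out.
classify() {
    # od -An -tx1 prints the bytes as hex without an address column;
    # tr strips the spaces and newline that od adds.
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        7f454c46) echo "ELF" ;;      # 0x7f 'E' 'L' 'F'
        2321*)    echo "script" ;;   # '#!'
        *)        echo "other" ;;
    esac
}

# Scratch files for illustration: a tiny script and a file that carries
# only the ELF magic bytes (not a runnable program).
demo=$(mktemp -d)
printf '#!/bin/sh\necho hello\n' > "$demo/script.sh"
printf '\177ELF' > "$demo/fake_elf"

classify "$demo/script.sh"   # prints "script"
classify "$demo/fake_elf"    # prints "ELF"
```

Running classify on /bin/ls should print ELF on a Linux system like the one shown above.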

What's ELF?

ELF is a file format. It is what a compiler toolchain such as gcc produces when it compiles and links C/C++ programs such as ls.

In computing, the Executable and Linkable Format (ELF, formerly called Extensible Linking Format) is a common standard file format for executables, object code, shared libraries, and core dumps.

It typically will have one of the following extensions in the filename: none, .o, .so, .elf, .prx, .puff, .bin


It is a binary executable (compiled into machine code, like most of the system). Shell scripts are more like "glue" to join parts together to quickly and flexibly create solutions out of existing stuff. That's the power of *nix.

To see how a command like ls is made, you need the source code (C, and sometimes C++, are the most common languages on *nix), not just the compiled executable. As it is open source, you can get the code for everything from online repositories (the core utilities usually come from the GNU project). However, it's a bit tricky if you don't know how to use git or another version control system.

Here is the ls.c file, if it helps: http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/ls.c