How does Linux deal with shell scripts?

If you use strace you can see how a shell script is executed when it's run.

Example

Say I have this shell script.

$ cat hello_ul.bash 
#!/bin/bash

echo "Hello Unix & Linux!"

Running it using strace:

$ strace -s 2000 -o strace.log ./hello_ul.bash
Hello Unix & Linux!
$

Taking a look inside the strace.log file reveals the following.

...
open("./hello_ul.bash", O_RDONLY)       = 3
ioctl(3, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7fff0b6e3330) = -1 ENOTTY (Inappropriate ioctl for device)
lseek(3, 0, SEEK_CUR)                   = 0
read(3, "#!/bin/bash\n\necho \"Hello Unix & Linux!\"\n", 80) = 40
lseek(3, 0, SEEK_SET)                   = 0
getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=4*1024}) = 0
fcntl(255, F_GETFD)                     = -1 EBADF (Bad file descriptor)
dup2(3, 255)                            = 255
close(3)     
...

Once the file's been read in, it's then executed:

...
read(255, "#!/bin/bash\n\necho \"Hello Unix & Linux!\"\n", 40) = 40
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc0b38ba000
write(1, "Hello Unix & Linux!\n", 20)   = 20
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
read(255, "", 40)                       = 0
exit_group(0)                           = ?

In the above we can see that the entire script appears to be read in as a single entity and then executed. So, at least in Bash's case, it would "appear" that the file is read in full before it is executed. So you'd think you could edit the script while it's running?

NOTE: Don't, though! Read on to understand why you shouldn't mess with a running script file.

What about other interpreters?

But your question is slightly off. It's not necessarily Linux that loads the contents of the file; it's the interpreter. Whether the file is loaded in its entirety, in blocks, or a line at a time depends on how the interpreter is implemented.

So why can't we edit the file?

If you use a much larger script, however, you'll notice that the above test is a bit misleading. In fact, most interpreters load their files in blocks. This is pretty standard among Unix tools: they load a block of a file, process it, and then load another block. You can see this behavior in this U&L Q&A that I wrote up a while ago regarding grep, titled: How much text does grep/egrep consume each time?

Example

Say we make the following shell script.

$ ( 
    echo '#!/bin/bash'; 
    for i in {1..100000}; do printf "%s\n" "echo \"$i\""; done 
  ) > ascript.bash;
$ chmod +x ascript.bash

Resulting in this file:

$ ll ascript.bash 
-rwxrwxr-x. 1 saml saml 1288907 Mar 23 18:59 ascript.bash

Which contains the following type of content:

$ head -3 ascript.bash ; echo "..."; tail -3 ascript.bash 
#!/bin/bash
echo "1"
echo "2"
...
echo "99998"
echo "99999"
echo "100000"

Now when you run this using the same technique above with strace:

$ strace -s 2000 -o strace_ascript.log ./ascript.bash
...    
read(255, "#!/bin/bash\necho \"1\"\necho \"2\"\necho \"3\"\necho \"4\"\necho \"5\"\necho \"6\"\necho \"7\"\necho \"8\"\necho \"9\"\necho \"10\"\necho 
...
...
\"181\"\necho \"182\"\necho \"183\"\necho \"184\"\necho \"185\"\necho \"186\"\necho \"187\"\necho \"188\"\necho \"189\"\necho \"190\"\necho \""..., 8192) = 8192

You'll notice that the file is read in 8 KB increments, so Bash and other shells will likely not load a file in its entirety; rather, they read it in blocks.

References

  • The #! magic, details about the shebang/hash-bang mechanism on various Unix flavours

This is more shell dependent than OS dependent.

Depending on the version, ksh reads the script on demand in blocks of 8 KB or 64 KB.

bash reads the script line by line. However, since lines can be of arbitrary length, each read requests 8176 bytes starting from the beginning of the next line to parse.

This is for simple constructions, i.e. a sequence of plain commands.

If structured shell commands are used (a case the accepted answer fails to consider), such as a for/do/done loop, a case/esac switch, a here document, a subshell enclosed in parentheses, a function definition, etc., or any combination of these, shell interpreters read up to the end of the construct first to make sure there is no syntax error.
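A minimal sketch of the difference between plain commands and a structured command, assuming bash is on the PATH. The file names (plain.sh, loop.sh) and the stray "fi" used to provoke a syntax error are illustrative choices, not part of the original answer. In the plain script the first command should run before the parser reaches the bad line; in the loop script the error is found while the loop is still being parsed, so the body should never run.

```shell
#!/bin/bash
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)

# Case 1: plain commands. The echo runs before bash parses the stray "fi".
printf '%s\n' 'echo "plain: before error"' 'fi' > "$workdir/plain.sh"

# Case 2: the same stray "fi" inside a for loop. bash parses the whole
# loop before executing any of it, so the error is hit during parsing.
printf '%s\n' 'for i in 1 2; do' \
              '  echo "loop: iteration $i"' \
              '  fi' \
              'done' > "$workdir/loop.sh"

# Capture stdout only; the syntax error messages go to stderr.
plain_out=$(bash "$workdir/plain.sh" 2>/dev/null)
loop_out=$(bash "$workdir/loop.sh" 2>/dev/null)

printf 'plain script printed: [%s]\n' "$plain_out"
printf 'loop script printed:  [%s]\n' "$loop_out"

rm -rf "$workdir"
```

Other shells may report the error differently, but the same parse-before-execute behavior for compound commands is what this sketch is meant to show.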

This is somewhat inefficient, as the same code can be read again and again a large number of times, but it is mitigated by the fact that this content is normally cached.

Whatever the shell interpreter, it is very unwise to modify a shell script while it is being executed: the shell is free to read any portion of the script again, and this can lead to unexpected syntax errors if the file and the shell get out of sync.
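To see why editing a running script is risky, here is a small sketch in which a script appends a line to itself while it runs. The file name selfedit.sh is hypothetical, and the behavior shown is what I would expect from bash specifically; other shells may buffer differently.

```shell
#!/bin/bash
# Hypothetical scratch directory and script name, for illustration only.
workdir=$(mktemp -d)
script="$workdir/selfedit.sh"

# The second line of the script appends a new command to the script file.
cat > "$script" <<'EOF'
echo "original line"
echo 'echo "appended while running"' >> "$0"
EOF

# bash keeps reading from the file after each command, so it is expected
# to pick up and execute the line that was appended during the run.
selfedit_out=$(bash "$script")
printf '%s\n' "$selfedit_out"

rm -rf "$workdir"
```

An append happens to land where the shell reads next, so it looks harmless. An edit that shifts earlier bytes around is what leaves the shell's file offset pointing into the middle of a command.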

Note too that bash might crash with a segmentation violation when it is unable to store an overly large script construct that ksh93 can read flawlessly.


That depends on how the interpreter running the script works. All the kernel does is notice that the file to execute starts with #!, run the rest of that line as a program, and pass it the script's path as an argument. If the interpreter listed there reads the file line by line (as interactive shells do with what you type), that is what you get (although multi-line loop structures are read and kept around for repeating). If the interpreter slurps the file into memory and processes it (perhaps compiling it to an intermediate representation, as Perl and Python do), the file is read in full before execution starts.
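The kernel's part can be sketched by naming a program that is not a shell at all on the #! line. The example below assumes cat is installed and that the temporary directory allows executing files; the file name selfprint is hypothetical. Since the kernel runs cat with the script's path as its argument, the "script" should simply print itself.

```shell
#!/bin/bash
# Hypothetical scratch directory and file name, for illustration only.
workdir=$(mktemp -d)
demo="$workdir/selfprint"

# Use whatever absolute path cat has on this system as the interpreter.
cat_path=$(command -v cat)
printf '%s\n' "#!$cat_path" 'this line is data, not a command' > "$demo"
chmod +x "$demo"

# Executing the file makes the kernel run: <cat_path> <path-to-demo>
shebang_output=$("$demo")
printf '%s\n' "$shebang_output"

rm -rf "$workdir"
```

Nothing in the file is interpreted as a command here, which shows that what happens to the contents is entirely up to the program named after #!.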

If you delete the file in the meantime, it isn't actually deleted until the interpreter closes it. As always, a file goes away only when the last reference to it, be it a directory entry or a process keeping it open, disappears.
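The same rule can be sketched without an interpreter involved, using an ordinary data file held open on a file descriptor. The file name data.txt and the choice of descriptor 3 are arbitrary; an interpreter holding its script open behaves the same way.

```shell
#!/bin/bash
# Hypothetical scratch directory and file name, for illustration only.
workdir=$(mktemp -d)
data="$workdir/data.txt"
echo "contents survive the unlink" > "$data"

exec 3< "$data"   # keep the file open for reading on descriptor 3
rm -- "$data"     # remove its only directory entry

# The name is gone from the directory...
if [ -e "$data" ]; then still_listed=yes; else still_listed=no; fi

# ...but the data is still reachable through the open descriptor.
read_back=$(cat <&3)

exec 3<&-         # closing the last reference lets the kernel free the file
printf 'listed: %s, read back: %s\n' "$still_listed" "$read_back"

rm -rf "$workdir"
```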