How does bash file redirection to standard in differ from shell (`sh`) on Linux?

A similar script, without sudo, gives similar results:

$ cat
#!/bin/bash
sed -e 's/^/--/'
whoami

$ bash <

$ dash <

With bash, the rest of the script goes as input to sed; with dash, the shell interprets it.
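The whole experiment can be reproduced with a throwaway file (the path /tmp/demo.sh is my choice; any regular file works):

```shell
# Recreate the script above under an assumed path, then feed it
# to each shell on standard input.
cat > /tmp/demo.sh <<'EOF'
#!/bin/bash
sed -e 's/^/--/'
whoami
EOF

bash < /tmp/demo.sh   # sed sees the rest of the script: prints --whoami
dash < /tmp/demo.sh   # dash runs whoami itself: prints your username
```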

Running strace on those: dash reads a block of the script (8 kB here, more than enough to hold the whole script) and then spawns sed:

read(0, "#!/bin/bash\nsed -e 's/^/--/'\nwho"..., 8192) = 36
stat("/bin/sed", {st_mode=S_IFREG|0755, st_size=73416, ...}) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|...

Which means that the file handle is at the end of the file, and sed will not see any input; the remaining part is buffered within dash. (If the script were longer than the 8 kB block size, the remainder would be read by sed.)
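To see the buffered remainder spill over to sed, the script just has to outgrow the read block. A sketch, padding well past 8 kB so the exact block size doesn't matter:

```shell
# Build a script much longer than any plausible read block, so the
# final echo lands beyond what dash buffers; sed then reads it as data.
{
  printf 'sed -e "s/^/--/"\n'
  i=0
  while [ "$i" -lt 5000 ]; do printf '# padding %05d\n' "$i"; i=$((i+1)); done
  printf 'echo tail-line\n'
} > /tmp/long.sh

dash < /tmp/long.sh | tail -n 1   # last line sed prints: --echo tail-line
```

sed starts reading mid-way through the padding, wherever dash's first block ended, and the trailing `echo` is printed as sed output rather than executed.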

Bash, on the other hand, seeks back to the end of the last command:

read(0, "#!/bin/bash\nsed -e 's/^/--/'\nwho"..., 36) = 36
stat("/bin/sed", {st_mode=S_IFREG|0755, st_size=73416, ...}) = 0
lseek(0, -7, SEEK_CUR)                  = 29
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|...
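The numbers in the trace add up, assuming the three-line script above: the first two lines are 12 + 17 = 29 bytes, and `whoami` plus its newline is the 7 bytes bash seeks back over:

```shell
# Byte accounting for the strace output above.
printf '#!/bin/bash\n' | wc -c        # 12
printf "sed -e 's/^/--/'\n" | wc -c   # 17 (12 + 17 = 29, where lseek lands)
printf 'whoami\n' | wc -c             # 7  (29 + 7 = 36, the full read)
```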

If the input comes from a pipe, as here:

$ cat | bash

rewinding cannot be done, as pipes and sockets are not seekable. In this case, Bash falls back to reading the input one character at a time to avoid overreading (see fd_to_buffered_stream() in input.c). Doing a full system call for each byte is not very efficient in principle. In practice, I don't think the reads add much overhead compared to, e.g., the fact that most things the shell does involve spawning whole new processes.
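The byte-at-a-time fallback keeps the boundary exact even without seeking: fed the same two-line script over a pipe, bash still hands sed exactly the remainder:

```shell
# The script arrives over a pipe, so bash cannot lseek back; it reads
# the script one byte at a time instead, and sed gets exactly the rest.
printf 'sed -e "s/^/--/"\nwhoami\n' | bash   # prints --whoami
```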

A similar situation is this:

echo -e 'foo\nbar\ndoo' | bash -c 'read a; head -1'

The subshell has to make sure read only reads up to the first newline, so that head sees the next line. (This works with dash too.)
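Run it and only the middle line comes back (printf is used here instead of echo -e for portability):

```shell
# read consumes input only up to the first newline, so head sees
# the second line; the third goes unread when head stops.
printf 'foo\nbar\ndoo\n' | bash -c 'read a; head -n 1'   # prints bar
```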

In other words, Bash goes to additional lengths to support reading the same source for both the script itself and the commands executed from it; dash doesn't. The zsh and ksh93 packaged in Debian side with Bash on this.

The shell is reading the script from standard input. Inside the script, you run a command which also wants to read standard input. Which input is going to go where? You can't tell reliably.

The way shells work is that they read a chunk of source code, parse it, and if they find a complete command, run the command, then proceed with the remainder of the chunk and the remainder of the file. If the chunk doesn't contain a complete command (with a terminating character at the end — I think all shells read up to the end of a line), the shell reads another chunk, and so on.

If a command in the script tries to read from the same file descriptor that the shell is reading the script from, then the command will find whatever comes after the last chunk that it read. This location is unpredictable: it depends on what chunk size the shell picked, and that can depend not only on the shell and its version but on the machine configuration, available memory, etc.

Bash seeks to the end of a command's source code in the script before executing the command. This is not something that you can count on, not only because other shells don't do it, but also because it only works if the shell is reading from a regular file. If the shell is reading from a pipe (e.g. when the script is fed over `ssh`), data that's been read can't be unread.

If you want to pass input to a command in the script, you need to do so explicitly, typically with a here document. (A here document is usually the most convenient method for multi-line input, but any method will do.) The code you wrote works only in a few shells, and only if the script is passed to the shell's standard input from a regular file. If you expected that the second whoami would be passed as input to sudo …, think again, keeping in mind that most of the time a script is not passed on the shell's standard input.
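For instance (using sed rather than sudo here, so it runs unprivileged), attaching the input with a here document:

```shell
# The command's input is attached explicitly, so it no longer depends
# on where the script itself is being read from.
sed -e 's/^/--/' <<'EOF'
whoami
EOF
```

This prints the literal text `--whoami`; the quoted delimiter (`'EOF'`) also prevents any expansion inside the document.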

sudo su -l root <<'EOF'
whoami
EOF

Note that this decade, you can use sudo -i. Running sudo su is a hack from the past.