Rule for invoking subshell in Bash?

The parentheses always start a subshell. What's happening is that bash detects that sleep 5 is the last command executed by that subshell, so it calls exec instead of fork+exec. The sleep command replaces the subshell in the same process.

In other words, the base case is:

  1. ( … ) creates a subshell. The original process calls fork and wait. In the subprocess, which is a subshell:
    1. sleep is an external command which requires a subprocess of the subprocess. The subshell calls fork and wait. In the subsubprocess:
      1. The subsubprocess executes the external command → exec.
      2. Eventually the command terminates → exit.
    2. wait completes in the subshell.
  2. wait completes in the original process.

The optimization is:

  1. ( … ) creates a subshell. The original process calls fork and wait. In the subprocess, which is a subshell until it calls exec:
    1. sleep is an external command, and it's the last thing this process needs to do.
    2. The subprocess executes the external command → exec.
    3. Eventually the command terminates → exit.
  2. wait completes in the original process.
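You can reproduce the optimized sequence by hand by writing exec yourself. The following is only a sketch: the temporary file and the variable names are mine, and sh -c stands in for sleep so that the exec'd program can report its own PID. With an explicit exec, the program takes over the subshell's process, so both PIDs should be identical.

```bash
#!/usr/bin/env bash
# Sketch: explicit exec inside a subshell, mirroring the optimized sequence.
# The temp file and variable names are illustrative, not from the question.
tmp=$(mktemp)   # scratch file that collects the two PIDs

(
    echo "$BASHPID"           # PID of the subshell created by ( ... )
    exec sh -c 'echo "$$"'    # exec replaces the subshell; sh prints its own PID
) > "$tmp"

# First line: subshell PID. Second line: PID of the exec'd sh.
{ read -r subshell_pid; read -r exec_pid; } < "$tmp"
rm -f "$tmp"

echo "subshell PID:        $subshell_pid"
echo "PID after exec (sh): $exec_pid"
```

The optimization described above amounts to bash inserting that exec for you when it can tell that nothing else remains to be done in the subshell.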

When you add something else after the call to sleep, the subshell needs to be kept around, so this optimization can't happen.

When you add something else before the call to sleep, the optimization could be made (and ksh does it), but bash doesn't do it (it's very conservative with this optimization).
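One way to observe which of these cases your bash optimizes is to ask the external command for its parent PID. This is a rough sketch under a few assumptions: sh is available and sets $PPID (POSIX shells do), the temporary file and variable names are mine, and sh -c replaces sleep so that the command can print something. If bash exec'd the command in place of the subshell, its parent is the forking shell itself; otherwise its parent is the intermediate subshell.

```bash
#!/usr/bin/env bash
# Sketch: detect whether bash exec'd the command or kept the subshell around.
# Names and the temp file are illustrative; results may vary by bash version.
self=$BASHPID     # PID of the shell that forks the subshells below
tmp=$(mktemp)

# Case 1: the external command is the only command in the subshell.
( sh -c 'echo "$PPID"' ) > "$tmp"
read -r only_ppid < "$tmp"

# Case 2: a command follows, so the subshell has to stay alive.
( sh -c 'echo "$PPID"'; : ) > "$tmp"
read -r followed_ppid < "$tmp"

# Case 3: a command precedes; whether this is optimized depends on the shell.
( :; sh -c 'echo "$PPID"' ) > "$tmp"
read -r preceded_ppid < "$tmp"
rm -f "$tmp"

echo "forking shell:               $self"
echo "parent when command is only: $only_ppid"
echo "parent when command is 1st:  $followed_ppid"
echo "parent when command is last: $preceded_ppid"
```

A parent PID equal to the forking shell's PID means the subshell was replaced by exec. In case 2 the two always differ, because the subshell still has work to do after the command returns.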


From the Advanced Bash-Scripting Guide:

"In general, an external command in a script forks off a subprocess, whereas a Bash builtin does not. For this reason, builtins execute more quickly and use fewer system resources than their external command equivalents."

And a little further down:

"A command list embedded between parentheses runs as a subshell."

Examples:

[root@talara test]# echo $BASHPID
10792
[root@talara test]# (echo $BASHPID)
4087
[root@talara test]# (echo $BASHPID)
4088
[root@talara test]# (echo $BASHPID)
4089

Example using OP's code (with shorter sleeps because I am impatient):

echo $BASHPID

sleep 2
(
    echo $BASHPID
    sleep 2
    echo $BASHPID
)

The output:

[root@talara test]# bash sub_bash
6606
6608
6608

An additional note to @Gilles's answer.

As Gilles said, the parentheses always start a subshell.

However, of the two PID values you can read inside such a subshell, one changes on every run and the other repeats:

$ (echo "$BASHPID and $$"; sleep 1)
2033 and 31679
$ (echo "$BASHPID and $$"; sleep 1)
2040 and 31679
$ (echo "$BASHPID and $$"; sleep 1)
2047 and 31679

As you can see, $$ stays the same while $BASHPID changes. That is expected; the following command jumps to the relevant entry in man bash:

$ LESS=+/'^ *BASHPID' man bash

BASHPID
Expands to the process ID of the current bash process. This differs from $$ under certain circumstances, such as subshells that do not require bash to be re-initialized.

That is, if the shell is not re-initialized, $$ stays the same.

Or with this:

$ LESS=+/'^ *Special Parameters' man bash

Special Parameters
$ Expands to the process ID of the shell. In a () subshell, it expands to the process ID of the current shell, not the subshell.

So $$ is the process ID of the shell that was originally invoked, not of the subshell.
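To check this in a script rather than by eye, you can capture both values outside and inside a subshell and compare them. A minimal sketch, assuming bash 4 or later for BASHPID; the temporary file and variable names are mine:

```bash
#!/usr/bin/env bash
# Sketch: compare $$ and $BASHPID inside and outside a ( ... ) subshell.
# Names and the temp file are illustrative.
outer_dollar=$$          # PID of the shell that was originally invoked
outer_bashpid=$BASHPID   # PID of the process evaluating this line
tmp=$(mktemp)

# echo is a builtin, so the subshell itself evaluates both expansions.
( echo "$$ $BASHPID" ) > "$tmp"
read -r inner_dollar inner_bashpid < "$tmp"
rm -f "$tmp"

echo "outside: \$\$=$outer_dollar BASHPID=$outer_bashpid"
echo "inside:  \$\$=$inner_dollar BASHPID=$inner_bashpid"
```

The $$ values match on both lines, while BASHPID inside the parentheses differs from the one outside, which is the reliable way to tell that you are in a subshell.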