John Gallagher

A Bash Puzzle

April 2022

Quick quiz: do these two snippets of Bash do the same thing?

Snippet 1:

someCommand
otherCommand

Snippet 2:

if true; then
    someCommand
    otherCommand
fi

At first glance the answer seems to be 'yes'. The condition always evaluates to 'true', so the commands inside the if statement always execute, just as they would if they were not inside an if statement.

The correct answer, though, is that it depends!

One way that these two snippets could behave differently is if one of the two commands did some sort of introspection. Say, for example, it found the file containing its own source code and checked whether that code contained the string "if true; then", and behaved differently based on whether or not that string was present. This would be a pretty strange thing to do in a normal program. Indeed, I've never come across this sort of program in the wild, and that's not what this post is about.

However, I have encountered a case where code I worked on for a living behaved differently depending on whether or not it was wrapped in an if statement whose condition always evaluated to 'true'. How could this be? Well, let's say someCommand is something that consumes stdin, e.g. cat >/dev/null. For otherCommand we could choose anything; let's pick echo "hello", just so that we can tell if it executes. Now let's run both snippets:

Snippet 1:

$ bash <<-EOF
cat >/dev/null
echo "hello"
EOF
$

Snippet 2:

$ bash <<-EOF
if true; then
    cat >/dev/null
    echo "hello"
fi
EOF
hello
$

The second snippet printed 'hello', which is what we probably expected. But how could the first snippet not have printed 'hello'? How could it matter whether a couple of lines of code are inside or outside of an if true block?

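To understand this behavior, there are two key things to note. First, when Bash reads its code from stdin, it reads and executes one command at a time rather than reading the whole script up front. Second, a subprocess inherits stdin from its parent process unless that stdin is redirected.

In particular, this is what happens when the first snippet executes: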
  1. The bash process hasn't been provided with a source file or the -c option, so it starts reading commands from stdin. It parses as far as the end of the first full command (cat >/dev/null).
  2. The bash process executes the command it just parsed, starting a cat subprocess. Since we didn't use any stdin redirection, the cat process inherits stdin from the bash parent process.
  3. The cat subprocess reads from stdin, since we didn't specify any files for it to read from. When the subprocess reads from stdin, it reads from the same pipe as the parent process, the pipe containing the contents of the heredoc. It consumes the next bytes in the pipe, which are echo "hello", and writes them to /dev/null. Having reached the end of the contents in the pipe, the cat process exits.
  4. The bash process tries to read the next command from stdin, but finds nothing left in the pipe, and so exits.

Essentially, the subprocess is consuming the code that we intended Bash to read.
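
A quick way to check this explanation is to let cat write to stdout instead of /dev/null, so that whatever it consumes becomes visible. The following is a minimal sketch of that idea, a variation on Snippet 1 rather than anything from the original utility. The output variable is a hypothetical name used only for illustration.

```shell
# Variation on Snippet 1: cat writes to stdout, so whatever it reads from
# the shared stdin shows up in the output. The quoted 'EOF' keeps the outer
# shell from expanding anything inside the heredoc.
output=$(bash <<'EOF'
cat
echo "hello"
EOF
)

# Print what the inner bash process produced.
printf '%s\n' "$output"
```

If cat really is consuming the rest of the script, this should print the literal line echo "hello" rather than hello, since that line was copied by cat and never executed by Bash.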

Why doesn't this happen when you add the if true, as in the second snippet?

When passed code via stdin, Bash reads, parses, and executes one statement at a time, a behavior that is useful when Bash is invoked interactively. An if statement is a single compound command, so when Bash encounters one it reads all the way to the corresponding fi before executing anything. Thus, the bash process has already read the entirety of the input before the cat process can steal it; cat finds nothing left to read on stdin and exits right away. If there were more statements after the end of the if statement, though, cat would consume them and they would fail to execute:

$ bash <<-EOF
if true; then
    cat >/dev/null
    echo "hello"
fi
echo "hello2"
EOF
hello
$
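
The same parsing rule can, in principle, be used on purpose: if the whole script is a single compound command, Bash has to read all of it before running any of it. The sketch below wraps the commands in a brace group to illustrate this. It is one possible approach under that assumption, not what the original utility did, and output is again a hypothetical variable name.

```shell
# Wrap the entire script in one brace group so that Bash parses everything
# up to the closing brace before it executes the first command.
output=$(bash <<'EOF'
{
    cat >/dev/null
    echo "hello"
    echo "hello2"
}
EOF
)

# Both echo commands should have run, because cat had nothing left to steal.
printf '%s\n' "$output"
```

Here cat still inherits stdin, but by the time it runs the script has already been read, so it sees end-of-file and both echo commands execute. Anything placed after the closing brace would be exposed again.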

Side Note

This could all be avoided by putting the commands in a file, or by using the -c option to pass the code as a command line argument. Why would you ever run Bash code by passing it via stdin? In our case, this was part of a utility that, among other things, would run code on various remote machines:

ssh someuser@somemachine bash <<-EOF
    ... some code ...
EOF

This was more convenient than copying a file onto the remote machine before executing it, and simpler than -c with regard to escaping. Things went wrong when, occasionally, something deep in the process tree created by executing the code in the heredoc would read from stdin. For a while I was very puzzled that the commands at the end of the block of code would fail to execute, even though they were clearly right there in the execution path.

Takeaway

If you have to pipe commands to Bash, make sure none of the commands consume the stdin of the parent bash process. If a command does need to read from stdin, redirect its stdin from the appropriate source. If you know the command doesn't need to read anything but will try anyway, redirect its stdin from /dev/null.
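
As a minimal sketch of the /dev/null approach, here is Snippet 1 again with the stdin of cat redirected. The output variable is a hypothetical name used only to capture the result.

```shell
# Snippet 1 with cat's stdin redirected from /dev/null. cat sees
# end-of-file immediately and leaves the rest of the script for Bash.
output=$(bash <<'EOF'
cat >/dev/null </dev/null
echo "hello"
EOF
)

# The echo line is now executed by Bash instead of being eaten by cat.
printf '%s\n' "$output"
```

With the redirect in place, this should print hello, matching the behavior of the if true version.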