John Gallagher
Quick quiz: do these two snippets of Bash do the same thing?
Snippet 1:
someCommand
otherCommand
Snippet 2:
if true; then
someCommand
otherCommand
fi
At first glance it might seem like the answer should be 'yes': those two snippets do the
same thing. Since the conditional always evaluates to 'true', the commands in the if
statement are always executed, as they would be if they were not inside an if statement.
The correct answer, though, is that it depends!
One way that these two snippets could behave differently is if one of the two commands
did some sort of introspection. Say, for example, it found the file containing its
own source code and checked whether that code contained the string "if true; then", and
behaved differently based on whether or not that string was present. This would be a
pretty strange thing to do in a normal program. Indeed, I've never come across this
sort of program in the wild, and that's not what this post is about.
However, I have encountered a case where code I worked on for a living behaved differently
depending on whether or not it was wrapped in an if statement whose condition
always evaluated to 'true'. How could this be? Well, let's say someCommand is
something that consumes stdin, e.g. cat >/dev/null. For otherCommand
we could choose anything; let's pick echo "hello", just so that we can tell if it
executes. Now let's run both snippets:
Snippet 1:
$ bash <<-EOF
cat >/dev/null
echo "hello"
EOF
$
Snippet 2:
$ bash <<-EOF
if true; then
cat >/dev/null
echo "hello"
fi
EOF
hello
$
The second snippet printed 'hello', which is what we probably expected. But how could the first
snippet not have printed 'hello'? How could it matter whether a couple of lines of code are inside
or outside of an if true block?
To understand this behavior, there are two key things to note:
- The code in the heredoc is passed to the bash process via stdin
- The cat subprocess inherits stdin from the bash parent process

Step by step, the first snippet runs like this:

1. The bash process hasn't been provided with a source file or the -c option,
   so it starts reading commands from stdin. It parses as far as the end of the first full
   command (cat >/dev/null).
2. The bash process executes the command it just parsed, starting a cat subprocess. Since we
   didn't use any stdin redirection, the cat process inherits stdin from the bash
   parent process.
3. The cat subprocess reads from stdin, since we didn't specify any files for
   it to read from. When the subprocess reads from stdin, it reads from the same pipe
   as the parent process, the pipe containing the contents of the heredoc. It consumes the
   next bytes in the pipe, which are echo "hello", and writes them to /dev/null. Having
   reached the end of the contents in the pipe, the cat process exits.
4. The bash process tries to read the next command from stdin, but finds nothing left in
   the pipe, and so exits.
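One way to watch this happen is to let cat write to stdout instead of /dev/null, so that the bytes it takes become visible. The following is a minimal sketch, assuming the script reaches bash over a pipe as in the steps above; building the script with printf and capturing it in a variable named out are choices made for illustration only.

```shell
# Feed a two-line script to bash over a pipe. The first line runs cat
# with no arguments, so cat reads whatever is left on the shared stdin.
out=$(printf 'cat\necho "hello"\n' | bash)

# The second line of the script shows up as cat's output: it was copied
# as plain text rather than executed as a command.
printf 'bash printed: %s\n' "$out"
```

The captured output is the literal text of the second script line, quotes included, which is the sign that cat consumed it rather than bash.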
Why doesn't this happen when you add the if true, as in the second snippet?
When passed code via stdin, Bash reads, parses, and executes one statement at a time, a
behavior that is useful when Bash is invoked interactively. When it encounters an
if statement, it reads to the end of the statement, the corresponding fi. Thus, the bash
process ends up reading the entirety of the
input before the cat process can steal it. If there were more statements after the end of the
if statement, though, they would fail to execute:
$ bash <<-EOF
if true; then
cat >/dev/null
echo "hello"
fi
echo "hello2"
EOF
hello
$
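The same parsing rule hints at a way to protect a whole script: any compound command is read in full before it runs, and a brace group is a compound command too. The sketch below wraps every line in braces; it is an illustration under the same pipe assumption, not something the utility described in this post did, and the variable names are made up for the example.

```shell
# Wrap the whole script in a brace group. Bash has to read through the
# closing brace before it can execute anything, so by the time cat starts
# there is no script text left on stdin for it to consume.
out=$(printf '{\ncat >/dev/null\necho "hello"\necho "hello2"\n}\n' | bash)

printf '%s\n' "$out"
```

Both echo commands run here, because cat finds the pipe already drained and exits as soon as it hits end of input.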
This could all be avoided by putting the commands in a file, or by using the -c option
to pass the code as a command line argument. Why would you ever run Bash code by passing it
via stdin? In our case, this was part of a utility that, among other things, would run code on
various remote machines:
ssh someuser@somemachine bash <<-EOF
... some code ...
EOF
This was more convenient than copying a file onto the remote before executing, and simpler
than -c with regard to escaping.
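For comparison, here is a small local sketch of the -c route, without ssh. With -c the code travels as a command line argument, so stdin carries only data and a command that drains it cannot eat the rest of the script. The input text 'some data' and the variable name out are placeholders for the example.

```shell
# The script is passed as an argument to -c; stdin carries unrelated data.
# cat drains that data, and the echo that follows still runs.
out=$(printf 'some data\n' | bash -c 'cat >/dev/null; echo "hello"')

printf '%s\n' "$out"
```

The cost shows up once this is sent through ssh, where the remote shell parses the command string a second time and the quoting has to survive both passes.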
Things went wrong when, occasionally, something deep in the process tree created by executing
the code in the heredoc would read from stdin. As a result, I was very puzzled for a while
when the commands at the end of the block of code failed to execute, even though they were
clearly right there in the execution path.
If you have to pipe commands to Bash, make sure none of the commands consume the stdin
of the parent bash process. If a command does need to read from stdin, redirect its input
from the appropriate source; if you know the command doesn't need to read anything but
will try anyway, redirect its stdin with </dev/null.
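Applied to the running example, the fix might look like the sketch below. It assumes, as before, that the script is piped to bash; the only change from the failing snippet is the extra </dev/null on the cat line.

```shell
# Same two-line script as the failing snippet, except that cat's stdin is
# redirected from /dev/null, so it cannot touch the pipe that bash is
# reading its commands from.
out=$(printf 'cat </dev/null >/dev/null\necho "hello"\n' | bash)

printf '%s\n' "$out"
```

With the redirect in place cat sees end of input immediately, and bash goes on to read and run the echo line.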