The ExtensibleExceptional Shell
Es is the exceptional shell.
That is to say, es is a shell which features an exception mechanism, on which it relies for a number of uses, and getting a feel for that exception mechanism is a core part of proficiency in working with es.
Personally, when I first happened upon es, I thought the idea of an exception mechanism in a shell seemed a bit strange.
To my mind, it was overkill to have such a “real” programming-language concept in a shell.
What I didn’t realize at the time was that exceptions make it possible to unify what would otherwise be several separate, less powerful mechanisms for non-structured control flow.
How exceptions work
An exception in es is a list.
The list’s first element is its type.
There are a few exception types which have particular meaning to the shell: break, eof, error, exit, retry, return, and signal.
Depending on the type of exception, there are conventions for what the remaining list elements mean.
For example, the error exception’s second element contains the name of the command which produced the error, and the remaining elements comprise an error message which can be used to understand the problem.
Raising exceptions in es is done with the throw function, or may be done by built-ins like %parse or the many parts of the shell which may need to raise an error.
Catching them is done with the catch command, which takes as arguments a catcher, which must be a lambda, and a body, which is typically a code fragment.
The body is evaluated, and if an exception is raised then the catcher is evaluated with the raised exception as its arguments.
An exception can be re-raised from a catcher, to be caught by the catcher which wraps it.
If the retry exception is raised from a catcher, then instead of sending it “up the chain”, catch re-runs its body (more on retry later).
If an exception is raised and not caught by any catcher, then it will reach the top level and cause the currently-running shell to exit, possibly printing a diagnostic message while doing so.
fn-while = $&noreturn @ cond body {
    catch @ e value {
        if {~ $e break} {
            result $value
        } {
            throw $e $value
        }
    } {
        let (result = <=true)
            forever {
                if {!$cond} {
                    throw break $result
                } {
                    result = <=$body
                }
            }
    }
}
A version of the while function, which catches the break exception and re-raises any others.
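The same catch-and-re-raise pattern can be sketched for a different exception type. The helper below is hypothetical (report-errors is not part of es); it handles only error exceptions and passes everything else up the chain, on the assumption that the caller wants errors turned into a false result.

```es
# report-errors is a hypothetical helper, not part of es.
# It runs $cmd; an error exception is reported on standard error and
# turned into a false result, while any other exception is re-raised.
fn report-errors cmd {
    catch @ e rest {
        if {~ $e error} {
            echo >[1=2] caught error: $rest
            result 1
        } {
            throw $e $rest
        }
    } {
        $cmd
    }
}

report-errors {throw error demo 'something went wrong'}
```

Because report-errors is defined with fn rather than $&noreturn, a return raised inside $cmd is re-raised by the catcher and then caught by report-errors itself.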
catch is not the only mechanism in es for catching exceptions.
Certain exceptions are automatically caught by different parts of the shell (more on those particulars in the descriptions of specific exceptions), but there is one other general exception-handling mechanism which is important to mention:
unwind-protect
The unwind-protect function is es’ cleanup mechanism.
It takes two arguments: a body and a cleanup command.
An example of its use is in a definition of the %readfrom function:
fn-%readfrom = $&noreturn @ var input cmd {
    local ($var = /tmp/es.$var.$pid)
        unwind-protect {
            $input > $$var
            # text of $cmd is command file
            $cmd
        } {
            rm -f $$var
        }
}
A version of %readfrom.
Note here that $&noreturn is used, like in the previous example.
This function, which is what the <{cmd} input/output substitution syntax desugars into, takes a variable name and two commands as input.
It sets the variable name to a temporary file, runs the $input command with its output redirected to the file, and then runs the body command $cmd.
The problem to be solved is that this temporary file needs to be removed when the command finishes, but something might happen during $cmd which raises an exception and interrupts the process.
unwind-protect, therefore, is used; its cleanup argument contains that removal, which runs whether or not an exception is raised from the body.
After the cleanup is run, what happens next depends on how the body exits.
If an exception was raised, then that same exception is re-raised.
Otherwise, after finishing, unwind-protect returns the result of the body.
This last point is part of what makes unwind-protect useful; rather than needing to explicitly cache the body’s return value, it is done automatically.
It’s a little bit magic (meant negatively), but rather predictable and almost always implicitly the right behavior.
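Both outcomes described above can be seen in a small sketch. The exception source demo and the result words from-body and from-cleanup are placeholders chosen for illustration.

```es
# The cleanup runs even though the body raises an exception, and the
# same exception then continues on to the enclosing catch.
catch @ e rest {
    echo caught $e
} {
    unwind-protect {
        echo body
        throw error demo 'failed'
    } {
        echo cleanup
    }
}

# With no exception, unwind-protect returns the body's result; the
# cleanup's own result is discarded.
echo <={unwind-protect {result from-body} {result from-cleanup}}
```

The first command should print body, cleanup, and caught error in that order, and the second should print from-body.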
Control flow exceptions
eof
The eof exception is raised by the %parse function when the shell reaches the end of its input, and it is caught by the REPL functions %interactive-loop and %batch-loop, which it causes to return.
It does not have any extra data associated with it.
The best thing to do with this exception is to ignore it: re-raise it if it is caught, and allow it to terminate whatever loop function it is meant for.
The existence of this exception implies that imitating the ignoreeof setting from csh should be possible.
This isn’t the case right now, as any EOF received by %parse causes it to forever raise eof; enabling a form of ignoreeof is a follow-up for a later version of the shell.
break $value
The break exception exists to do what the break statement does in many languages: terminate the loop it’s in.
It is caught by for and while (but not forever), and in both cases its argument list is used as the return value of the loop.
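As a minimal sketch of break carrying a value out of a loop, consider the hypothetical helper below (first-match is not part of es). It assumes the behavior described above, where for catches break and uses its arguments as the loop's return value.

```es
# first-match is a hypothetical helper: it returns the first element of
# $list that equals $wanted, using break to leave the for loop early.
fn first-match wanted list {
    for (item = $list) {
        if {~ $item $wanted} {
            break $item
        }
    }
}

echo <={first-match b a b c}
```

If nothing matches, the loop simply runs to completion and the function returns whatever the last iteration returned, so a real helper would probably want an explicit fallback result.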
One problem with exceptions, which often appears in the use of break, is their dynamic scope.
This can be illustrated with a simple example:
fn thrice cmd {
    for (i = 1 2 3) {
        $cmd
    }
}
while {true} {
    thrice {
        echo running cmd
        throw break
    }
}
There is one break exception raised here, but two loops that it could interrupt.
If exceptions had lexical scope, then the while loop, which lexically encloses the throw call, would be interrupted.
However, exceptions have dynamic scope, so the for loop inside thrice is the one that is interrupted, and the while loop keeps running.
This is a silly, contrived example, but in real code the behavior can be confusing and unpleasant when it comes as a surprise, as dynamic scope often does, so it is worth being aware of.
return $value
The return exception is similar to break, except instead of loops, it causes functions to exit with the raised exception’s $value.
This includes named functions as well as anonymous lambda expressions, but critically, not regular code fragments.
So, these both catch the return raised in their bodies properly:
fn func {
    echo printed
    return value
    echo not printed
}
@ {
    echo printed
    return value
    echo not printed
}
But this doesn’t, and the exception will be caught by its enclosing function (or the shell’s top level):
{
    echo printed
    return value
    echo not printed
}
Sometimes, it is useful for a function to accept arguments but not catch return.
This is true for while, %not, %and, %or, and many other cases where functions are used as syntax or other not-traditionally-a-function uses.
In these cases, the $&noreturn primitive is useful to define a function that doesn’t catch return:
fn-%not = $&noreturn @ cmd {
    if {$cmd} {false} {true}
}
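The difference can be made visible with two otherwise identical wrappers. All four function names below are made up for this sketch; only fn, $&noreturn, return, and result come from es.

```es
# run-plain is an ordinary function, so it catches return itself.
fn run-plain cmd {$cmd}

# run-noreturn lets return pass through to whoever called it.
fn-run-noreturn = $&noreturn @ cmd {$cmd}

fn caller-plain {
    run-plain {return early}     # return is caught by run-plain
    result late
}

fn caller-noreturn {
    run-noreturn {return early}  # return is caught by caller-noreturn
    result late
}

echo <=caller-plain
echo <=caller-noreturn
```

The first echo should print late, because the return only exits run-plain; the second should print early, because the return exits caller-noreturn.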
retry
The retry exception is unique: when it is raised from a catcher, the catch command running that catcher re-runs its body instead of passing the exception along.
It is used primarily in the %interactive-loop function.
In the following example, try-something will be run repeatedly until it completes without raising an exception.
catch @ {
    throw retry
} {
    try-something
}
Here is where I will begin editorializing: retry should be removed from es.
As seen in the while example earlier, it is common practice for a catcher to re-raise exceptions, and retry almost never interacts correctly with that.
Functions built around exception handling, such as unwind-protect, have to be significantly more complicated in their implementations in order to handle retry sensibly.
These pitfalls would be justifiable if retry were critically useful, but it isn’t: throw retry is rarely used outside of REPL functions, and it is actually never necessary, as the same behavior can be achieved by placing the handler inside a loop!
In short: retry is unnecessary and difficult, since it very occasionally turns catch into an ersatz looping construct, when es already has perfectly good constructs for loops that would cause far less difficulty.
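To make the "handler inside a loop" alternative concrete, here is a rough sketch. The try-something defined here is a stand-in that fails on its first two calls, and the attempts and done variables are likewise inventions for the example.

```es
# try-something is a stand-in: it fails on its first two calls.
attempts =
fn try-something {
    attempts = $attempts x
    if {~ $#attempts 1 2} {
        throw error try-something 'not ready yet'
    }
    echo succeeded on attempt $#attempts
}

# The handler sits inside an ordinary loop; no retry exception is involved.
let (done = false) {
    while {!$done} {
        catch @ e rest {
            echo >[1=2] attempt failed: $e $rest
        } {
            try-something
            done = true
        }
    }
}
```

Like the retry version, this catcher swallows every exception; a more careful version would re-raise anything it does not specifically intend to retry on.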
The signal exception
The signal exception is the way that received signals are modeled in es.
Like any other exception, a signal exception can be handled using the catch command.
signal exceptions have one additional element specifying which signal caused the exception to be raised.
There is an extra step required to raise signal exceptions: the $signals variable.
This variable contains a list of signal names along with potential prefixes which configure the behavior to perform when that signal is received.
It might look something like:
; echo $signals
sigurg .sigint -sigtstp /sigquit /sigterm
What these elements mean is as follows:
sigurg: If a SIGURG is received, throw it as an exception.
.sigint: If a SIGINT is received, echo a newline and throw it as an exception. This . prefix is only usable for sigint and exists to make interactive sessions look a bit prettier.
-sigtstp: Ignore (via SIG_IGN; see your signal(3)) the SIGTSTP signal in this and any child processes (assuming they don’t reset their own handlers, of course).
/sigquit and /sigterm: Ignore the SIGQUIT and SIGTERM signals in the current process, but perform the default behavior in any children.
Any signal not in $signals is handled in the default manner (see your signal(7) for details).
So, to catch SIGUSR1 in a block of es script would look like:
signals = $signals sigusr1
catch @ e type rest {
    if {~ $e signal && ~ $type sigusr1} {
        echo caught sigusr1
    } {
        throw $e $type $rest
    }
} {
    do-something-slow
}
This pattern works fairly well for scripts, especially when the same constructs are handling signals at the same time as other exceptions, but it does fall down for users who want to change how their currently-running interactive loop handles signals.
Improving that situation requires making the %interactive-loop function more flexible, which may happen at some point in the future.
One thing to note about signals is that they will not be delivered as an exception until a catcher finishes running.
To quote the CHANGES file from when this behavior was introduced,
in 0.79 and 0.8, a signal coming in while %interactive-loop was in
its exception-catching routine would cause the shell to exit.
(this is a new twist on the old signal comes in while signal handler
is running problem.) this was “fixed” by preventing any delivery
of signals while the handler of a catch routine was running. i’m
not sure that this is a good thing. signals now should be delivered
immediately after the catcher finishes running, which means right
before the body starts again in the retry case.
One more thing to note about signal: es has special behavior to handle a signal exception which reaches the top level.
When other exceptions cause the shell to exit, they can influence its exit status, but there is no natural way for a process to exit as if it was killed by a signal.
This caused an awkward asymmetry.
If the shell was configured to throw on a certain signal, but didn’t actually catch the resulting exception, it wouldn’t exit with the proper signal-containing exit status:
; echo <={es -c 'kill -TERM $pid'}
terminated
sigterm
; echo <={es -c 'signals = sigterm; kill -TERM $pid'}
uncaught exception: signal sigterm
1
These days, this asymmetry has been resolved: if a signal exception reaches the top level, then the shell will remove its handling for the signal and attempt to kill itself with the signal.
If that doesn’t work, the shell will just exit with the status 1, but it is more common in practical use for this to succeed.
; echo <={es -c 'kill -TERM $pid'}
terminated
sigterm
; echo <={es -c 'signals = sigterm; kill -TERM $pid'}
terminated
sigterm
The error and exit exceptions
error $type $message
error is the standard way that the shell indicates problems with its built-in commands.
Its first argument is typically called type, but contains a word that indicates where the error originated; its second is a string describing the error.
The easiest way to summon these exceptions is to call a built-in with missing arguments.
; catch @ e type msg {echo e\t$e; echo type\t$type; echo msg\t$msg} {catch}
e error
type $&catch
msg usage: catch catcher body
By default, the interactive loop will catch error exceptions and print their messages to standard error.
exit $value
The exit exception is the mechanism by which the exit built-in command works.
It is not caught by anything and causes the shell to exit with the exception’s $value argument, if it is a number 0–255, or a simple 0 or 1 corresponding with whether $value is true.
exit was added as an exception relatively late, so that commands like unwind-protect can perform unwinding when the shell exits, much like the EXIT trap in other shells.
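A small sketch of that unwinding, assuming a version of es in which exit is implemented as an exception as described here. The command runs in a child shell so that the exit does not end the current session.

```es
# The cleanup block runs before the child shell actually exits, and the
# child's exit status is still the value given to exit.
echo <={es -c 'unwind-protect {echo working; exit 3} {echo cleaning up}'}
```

The child should print working and then cleaning up, after which the outer echo reports the child's exit status.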
There is, however, one remaining way (outside of bugs) that the shell can exit immediately, without throwing any exception or giving scripts any opportunity to clean up state, and that is the topic of the next section.
es -e and the false exception
Like many other shells, es has a flag -e which causes the shell to exit if any commands return a false status.
It is implemented in a very typical way for a shell: the behavior is disabled in certain contexts, like if tests or the body of a <=; it causes interactive sessions to exit; and it works by immediately calling exit(3), preventing the kind of cleanup that things like the exit exception allow.
These are all pretty poor behaviors.
The way the -e behavior is simply turned off in some contexts is particularly tricky, and it is the reason many people recommend avoiding the -e flag entirely; the lack of any ability to recover on exit is also a deal breaker for most people who might otherwise use it.
The thing is, though, these are all symptoms of the fact that -e is just a very mediocre version of exceptions: a safety mechanism by which errors automatically terminate execution unless explicitly handled.
If we take the code that calls exit(3) on a false result and replace it with a snippet that raises a false exception, then these problems could be very effectively resolved.
There is one particular reason I think that creating a new exception, false, would be a better idea than trying to raise an error on these false results.
Unlike with errors, es doesn’t actually “know” if a false result is a problem, or something normal and expected.
Normal and expected false results, after all, are core to how the if command works, as well as the <= construct.
Because of that, in order to allow scripts to typically work either with or without -e without having to rewrite them, the false exception should be automatically caught and handled by both if and <=.
This would allow unwinding to work, or error-handling to be performed (like, you know, an exception system), and would have fewer pitfalls to use than other shells’ version of the feature.