A (re-)introduction to the extensible shell
Es is the extensible shell.
The best (if slightly out-of-date) introductions to the shell are the original es paper, presented at Usenix 1993, and the es man page. Here I'll provide a shorter, incomplete introduction, a look at the state of the shell after three decades, and some thoughts on what interesting work is left to be done.
The current version is hosted on the GitHub repository.
What is es?
Es is a Unix shell first developed in the early 1990s by Byron Rakitzis and Paul Haahr, based directly on Rakitzis' earlier port of the shell rc from Plan 9 to Unix.
As the paper puts it,
[w]hile rc was an experiment in adding modern syntax to Bourne shell semantics, es is an exploration of new semantics combined with rc-influenced syntax: es has lexically scoped variables, first-class functions, and an exception mechanism, which are concepts borrowed from modern programming languages such as Scheme and ML.
Simple commands closely resemble those of other shells, with pipes, redirections, $variables, and all that. The redirection syntax in particular resembles that of rc.
make -npq >[2] /dev/null | grep '.*:'
Also like rc, es has list-typed variables, no rescanning, and no double quotes.
These together make variables significantly more straightforward to use than in POSIX-compatible shells.
; args = -l 'Long Document.pdf'
; ls $args
-rw-r--r-- 1 jpco jpco 12345 Aug 31 15:44 'Long Document.pdf'
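To see that $args really is a two-element list rather than a string waiting to be re-split, it can be inspected directly. The following is a small sketch of a session, assuming the same args assignment as above; $#args counts the elements and $args(2) selects the second one.

```es
; args = -l 'Long Document.pdf'
; # $#var expands to the number of elements in the list
; echo $#args
2
; # $var(n) picks out the n-th element; the embedded space survives intact
; echo $args(2)
Long Document.pdf
```

Because the value is never rescanned, there is no need for the defensive quoting that a POSIX shell would call for here.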
From Scheme, es draws features like first-class functions and lexical bindings.
fn map cmd args {
    for (i = $args)
        $cmd $i
}
map @ i { cd $i; rm -f * } /tmp /var/tmp
In this example, @ i { cd $i; rm -f * } is a lambda expression (an inline function) which takes one argument, i, cds to it, and then rm -fs everything in the directory.
Note that both map and the lambda expression use variables named i, but this works just fine, since function arguments and variables defined in for expressions are both lexically bound, meaning that they're only defined within the code the definitions wrap (the body of the function and the body of the for, respectively).
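Lexical binding is easiest to appreciate next to its dynamic counterpart. The sketch below contrasts let, which binds lexically, with local, which es provides for dynamic binding. The function names lexical and dynamic and the values foo, bar, and baz are purely illustrative.

```es
; x = foo
; # let binds lexically: the function defined inside captures x = bar
; let (x = bar) { echo $x; fn lexical { echo $x } }
bar
; lexical
bar
; # local binds dynamically: x is baz only while the block is running
; local (x = baz) { echo $x; fn dynamic { echo $x } }
baz
; dynamic
foo
```

The function defined under let keeps seeing bar after the let has finished, while the one defined under local sees whatever x holds at the time it is called.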
Nearly everything in es is a function under the hood, and functions are just variables whose names start with fn-.
; echo {command > file}
{%create 1 <={%one file} {command}}
; echo $fn-%create
%openfile w
; # this is not very useful
; fn-%create = echo
; command > file
1 file {command}
This lets users redefine swaths of shell behavior.
For example, the %write-history function is called to write a command to history after reading it.
To make the shell avoid writing duplicate commands to history, one can write:
let (write = $fn-%write-history; last-cmd = ())
fn %write-history cmd {
    if {!~ $cmd $last-cmd} {
        $write $cmd
        last-cmd = $cmd
    }
}
We can go through this example line-by-line.
- let (write = $fn-%write-history; last-cmd = ())
This binds the current definition of %write-history to write, and () (the empty list) to last-cmd.
- fn %write-history cmd {
This creates a new definition of %write-history, with the bindings defined in the previous line.
Because of the let line, the old definition of %write-history is bound to write within this function.
This is a very common idiom in es, used for “spoofing” functions: extending their definitions to match user preferences.
The let also bound the last-cmd variable; it is initially empty, but because this binding is created outside the function, the value persists across calls.
- if {!~ $cmd $last-cmd} {
This compares $cmd against $last-cmd.
If they differ, then...
- $write $cmd
We call the prior definition of %write-history with the new $cmd.
- last-cmd = $cmd
Then, we set last-cmd to $cmd.
Because last-cmd persists across function calls, this effectively saves this command for future calls.
This is essentially the same behavior as HISTCONTROL=ignoredups in Bash, and it's reasonable to note that the Bash version is quite a bit more concise than the es one.
However, the es method has benefits “at scale”, when considering shell features in aggregate.
Many shells add behaviors by adding special variables and options with specific valid values or special “sub-languages” to configure specific behaviors.
When configuring many parts of the shell, this adds up to a huge set of things to remember and a culture of “special tricks” available in each shell.
Es takes a different approach instead, exposing the core behaviors of the shell in a way that allows users to customize them in arbitrary ways.
This combination of features creates a shell which is highly customizable and very scriptable, without carrying a huge bag of features to do so.
The difference between the two approaches is perhaps best illustrated by the length of each shell's manual:
; man es | wc -l
1695
; man zshall | wc -l
29739
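The let-and-fn idiom used for %write-history above carries over unchanged to other functions. As a minimal sketch, and not something taken from the es distribution, here is the same pattern wrapped around cd so that every directory change also prints the new working directory; the extra pwd call is a hypothetical customization chosen only for illustration.

```es
# Keep the current definition of cd reachable as $cd inside the new one.
let (cd = $fn-cd)
fn cd {
    # $* holds every argument given to the new cd; pass them on unchanged.
    $cd $*
    # Hypothetical extra behavior: report where we ended up.
    pwd
}
```

No new option or special variable is involved: the customization is ordinary es code layered over the existing function.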
What's been happening with es?
Es was mostly developed over the course of 1992-1995.
The major development went through the release of version 0.84; 0.88 was released after a brief hiatus from the original authors, and then both of them got busy with life and work.
After that, maintainership passed through a couple of hands, eventually reaching the current maintainer, James Haggerty, but development has largely focused on keeping es functional and hosted as OSes and build systems evolved over the decades.
This left the shell as an incomplete experiment: Paul and Byron never got around to implementing a good number of the features they had planned, and their near- to medium-term plans certainly didn't sum up to everything the shell could be made to do.
However, at its core, es has a simple and powerful design which removes a huge amount of the friction of shell scripting.
Its ethos of providing fewer and more powerful language and runtime mechanisms makes it relatively easy to know top to bottom.
It is, genuinely, an extremely elegant piece of software that I am very glad to use every day.
Es futures
So, where is es now, and where is it headed?
Well, there are a few major directions I would like to see the shell go in:
- More usage!
The current es community is small, and I would love to see it grow.
Packaging it for more OSes and Linux distros will help, as would more writing about the shell and more documentation online.
Quite a bit of existing knowledge about the shell is wrapped up inside the old mailing list or the source code, and users can't reasonably be expected to dig around GitHub or years' worth of old emails to understand a piece of software well enough to use it effectively.
Better tooling would be helpful as well: syntax highlighting for editors, maybe even some kind of LSP integration, and a revived (and documented) esdebug script.
- “Modernization”: Recent work has been focused in part on updating the portability of es to a more current state, fixing it on modern Unices.
Now that that's mostly complete, I would like to add job control (or, at least, the ability to have job control; it's a whole thing) and programmable input, which would allow for more of those fancy interactive features present in other shells.
- More extensibility: while es is already flexible, that flexibility is something that hasn't been fully exploited or developed.
Much of this is pretty experimental, so there aren't a lot of concrete designs.
Paul and Byron were looking at extensible parsing so that much of the shell's fancy syntactic sugar (redirections, pipes, backgrounding, backquotes) could be configured and changed within the shell.
The startup sequence of the shell is currently hard-coded, but it could instead be written in es script itself.
Finally, the primitives backing many es commands do not necessarily have to be linked in with the shell statically.
Dynamic loading of libraries has been standardized and has good support across Unices, so with the right runtime support, the shell could have a default set of broad Unix primitives, augmented with loadable primitives for OS-specific behavior, such as GUI scripting in Haiku or Capsicum support in BSDs.
Adding these up, along with some smaller changes to do with the environment and signal delivery, could create a shell that could be essentially librarified in the same way as Tcl.
In fact, this was a long-term goal of the original authors.
- Runtime improvements: es was never optimized in either runtime or memory to a meaningful degree, as it has always been largely experimental.
There are a few potential directions for improvement: switching to a bytecode interpreter or something like it, optimizing the memory use of shell data structures, adding tail-call optimization, adding some continuation-like mechanism, or some combination of those.
AST flattening seems like a promising direction in particular.
The GC could also use some re-evaluation.
Another practical benefit of this work would be removing the current heavy reliance on setjmp(3)/longjmp(3), which would make the shell much friendlier to integration with non-C code.
To write
- Job control and the extensible shell
- Effective es
- The shell-forward desktop
- Better documentation of the internals: GC, exceptions, that kind of thing.