Giving the extensible shell extensible input
The way that es reads input from scripts or from an interactive terminal session has historically been badly lacking, compared to other shells.
Whereas interactive features tend to be a major (or the only) attractive feature of many shells (those shells often being limited to POSIX compatibility in the language itself), in es the support for interactivity is meager.
To some degree, this is intentional; es, coming from rc, tries to be relatively minimalist by default.
Unlike, say, fish, the default behavior when dropping into es for the first time is probably always going to be a simple prompt:
;
Defaults aside, though, es is intended to be extensible.
Even when the default behavior is lean or even simplistic, the shell ought to make it possible for a user to get a more advanced interactive setup akin to fish if that’s what they want.
The major reasons for the lack of extensibility in es today are all due to accidents of implementation.
How input works
Like other shells, es can be given shell input in a few forms: a raw string (es -c); a named file (es script.es, or . script.es); or via the shell’s standard input, either non-interactively (curl | es) or interactively at a terminal.
When started, the shell creates a new Input object corresponding with the file or string the shell is meant to read and evaluate.
It decides whether the shell should be “interactive”, either because the -i flag was given or because the input is standard input and standard input is a TTY.
Based on this, the shell calls either %interactive-loop or %batch-loop.
These two loop functions are important (and complicated) enough that they deserve their own page of documentation, but at their core each of them is a loop containing a bit of code that looks something like this:
let (cmd = <={%parse $prompt}) {
	if {!~ $#cmd 0} {
		$cmd
	}
}
This is a loop of essentially two steps:
- %parse reads shell input and parses it, returning a parsed command which is bound to $cmd
- If $cmd isn’t empty (as is the case with, say, an empty line), then it is evaluated
It makes plenty of sense that %parse parses its input, of course.
But why is %parse also responsible for reading shell input?
Why does it take $prompt as arguments?
Can that reading behavior be changed at all?
It’s that exact behavior that we would want to get at, if we want to make es capable of fancy interactivity à la fish or zsh.
But if we look at the definition of %parse, we get:
; whatis %parse
$&parse
It’s all locked away within the $&parse primitive.
$&parse does too much
Given what the $&parse primitive actually does, it would more accurately be called $&readfromshellinputandparse.
$&parse does do what it claims to (parse unstructured input to produce a shell syntax tree), but to actually get at that input, it reads from the shell’s Input using either special buffered read logic (which can’t be used anywhere else) or readline(3) (which also can’t be called any other way in es).
On top of that, the way es has historically been implemented, no es script can be invoked while $&parse is running.
This last bit, which is the biggest problem of all, comes down to issues with memory management.
Es needs to track every bit of shell state for the sake of its garbage collector, but while the yacc-generated parser is running, some references to the shell’s data held by the parser are unknown to es until parsing is complete. That means the GC can’t run correctly during parsing, which in turn means that normal es code (which requires the GC) can’t run during parsing.
This stinks!
So much happens inside of $&parse, and so much of that is made up of mysterious internal mechanisms (the Input) which are barely even visible to a user, that it is essentially only usable within the context of these REPL functions.
(Exercise for the reader: figure out a way to wrap %parse such that it can be fed an arbitrary string.)
And, because it has all of readline buried inside of it, there’s nothing like a $&readline primitive that users can call, either—and no way at all to use readline to get any input that won’t then be modified by the parser.
In a shell which has been praised for its orthogonality, this is a glaring deficiency.
As an exercise, let’s forget the existing implementation and imagine what a hypothetical $&parse should do.
At first blush, we want a $&parse which
- takes a string argument containing unstructured es code, and
- returns a structured AST.
However, this runs into a problem: for most languages, including es, it’s actually impossible to know beforehand how much input the parser will need before it’s done, because it depends directly on the syntactic structure of the input.
Consider the following contrived es snippet as an example:
{
	echo <=%read
	echo <=%read
} << EOF
hello world
goodbye world
EOF
echo hello world
echo goodbye world
The first seven lines must be parsed together, but the remaining two must each be parsed separately.
But other than by actually parsing, it’s hard to know this; you have to track the {s and }s, and, worse, you have to know when the heredoc starts and ends.
You could, potentially, do a sort of pre-parsing pass to figure out how to delimit commands, but it isn’t worth it.
The best way to handle this is to simply feed input to $&parse line-by-line and allow it to tell you when it needs more input and when it’s done.
With that in mind, how would we change our imagined design for $&parse?
There are two tactics that parsers use for this.
A push-style parser is one where you get some input, call the parser with that one line of input, and check based on the parser’s return value if that was enough input to produce something, or if it wants more.
In es this kind of setup might look like:
let (line = (); cmd = ()) {
	while {~ $cmd ()} {
		line = <=%read
		cmd = <={%parse $line}
	}
	$fn-%dispatch $cmd
}
Unfortunately, with this design, there’s a serious catch: now you have to worry about the state of the parser between calls to $&parse.
Because each %parse call can potentially leave the parser in a partially-done state, in order to have reliably correct behavior, you need some way to say either “keep working on what you’ve already got, parser” or “I’m starting fresh! Throw out all your state!”
This could be some kind of %parse-reset function, or it could be some kind of “parser handle”, where the shell tracks multiple parsers’ state with some ID, and the user can supply that ID to indicate which parse run they want to use.
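To illustrate, a push-style API with parser handles might look something like this (entirely hypothetical; none of these functions exist):

```es
# Hypothetical push-style API: %parse-new, %parse-push, and
# %parse-free are invented names for the sake of illustration.
let (p = <={%parse-new}; cmd = ()) {
	while {~ $#cmd 0} {
		# feed one line at a time to the parser behind handle $p
		cmd = <={%parse-push $p <=%read}
	}
	%parse-free $p
	$cmd
}
```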
This starts to get legitimately complicated, and it can all be avoided by using the other style of parser: pull-style.
This is how the internal yacc-generated parser in es works.
As a user, you just call the parser once, and when you do, you supply it some mechanism to request more input whenever it needs to.
In yacc, this mechanism is a function named yylex().
In es, since we have fancy things like lambda expressions, we can give %parse its reader command directly as an argument.
This might look like:
let (cmd = <={%parse %read}) {
	if {!~ $#cmd 0} {
		result = <={$fn-%dispatch $cmd}
	}
}
The change between the original loop and what we have here is very small—we’ve just gone from the original {%parse $prompt} to {%parse %read}.
(We’ve lost $prompt in this change, but that will be addressed in a second.)
In this case, %read is that reader command that we give to %parse to call whenever it wants to read more shell input.
$&parse manages the Input, and redirects the standard input of %read to the shell input when calling it.
This reader command will be called at least once, and potentially many more times, until the parser receives enough input to produce a completely-parsed command.
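To make that contract concrete, here is a sketch of a reader command that traces each request for input while delegating to %read (purely illustrative):

```es
# $&parse calls the reader once per line of input it needs;
# returning the empty list signals end-of-input, like $&read.
let (cmd = <={%parse @ {
	echo >[1=2] parser wants another line
	%read
}}) {
	if {!~ $#cmd 0} {
		$cmd
	}
}
```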
This setup fixes the problems with state that the push-style parser would have, as there would be no more incomplete-parser state to track: every call to %parse would start with a fresh parser, and would run until the parser is done.
Rebuilding around this new $&parse
So, going with a pull-style $&parse and giving it a reader command that it uses to read from the shell’s Input seems like the way to go, in terms of simplifying $&parse.
But pulling out all the built-in reading logic that $&parse had and replacing it with %read loses all the interactive features we had come to expect, like readline, history, prompting, and all that.
So, let’s add that stuff back in.
Let’s reorient the design here so that instead of changing our loop functions, we’re instead producing a %parse function which works the same as before on top of our new, smaller-scoped $&parse primitive.
We have no way to call readline, so we will have to add one: call it $&readline.
We’ll talk about this new primitive in more detail later, but for now it suffices to say that $&readline should take one optional argument, a prompt, and return either a line of input or the empty list, with the same semantics as $&read.
(Conveniently, these are also essentially the exact calling semantics of the readline(3) function.)
To make things consistent for folks who do and do not include readline in their es, we define a %read-line function which wraps $&readline if present and otherwise is @ prompt {echo -n $prompt; %read}.
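A minimal sketch of that shim, assuming $&primitives can be used to test whether readline support was compiled in (the real initial.es may differ):

```es
# Pick an implementation of %read-line at startup.
if {~ <=$&primitives readline} {
	fn %read-line prompt { $&readline $prompt }
} {
	# no readline: print the prompt by hand and read plainly
	fn %read-line prompt { echo -n $prompt; %read }
}
```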
We already have a hook function for writing to history, %write-history, so we also need to add a call to that.
Putting these things together, with the appropriate scaffolding, creates a %parse which looks like:
fn %parse prompt {
	if %is-interactive {
		let (in = (); p = $prompt(1))
		unwind-protect {
			$&parse {
				let (r = <={%read-line $p}) {
					in = $in $r
					p = $prompt(2)
					result $r
				}
			}
		} {
			if {!~ $#fn-%write-history 0 && !~ $#in 0} {
				%write-history <={%flatten \n $in}
			}
		}
	} {
		$&parse	# fall back to built-in read
	}
}
The new definition of %parse.
The interactive logic is wrapped in an %is-interactive check, for obvious reasons.
We also buffer the input that gets read and write it to shell history, and there is also a little logic for picking the right prompt at the right time.
Note that a couple of specific behaviors become visible here, now that they’re in es script, and they can be changed:
- we use the first prompt before the first line, and the second prompt before each subsequent line
- we buffer a whole command before writing it to history, rather than writing each line individually
- we write inputs to history even when they cause $&parse to throw an exception, such as with syntax errors
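Because all of this is now es script, any of it can be rewired. For instance, a sketch of writing each line to history as it is read, by wrapping %read-line (illustrative only; a real version would also drop the buffered write in %parse):

```es
# Wrap %read-line so every line is written to history immediately.
let (old = $fn-%read-line) {
	fn %read-line prompt {
		let (l = <={$old $prompt}) {
			if {!~ $#l 0} {
				%write-history $l
			}
			result $l
		}
	}
}
```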
So now we have a new $&parse that has less built-in behavior, and a new %parse which wraps it to maintain the same behavior that it had before.
Reimplementing $&parse
This is the part of the page where I reveal that this change has already been made to es; as of quite recently, $&parse has been reimplemented to take a reader command, and %parse has been rewritten to wrap this new $&parse.
But, at the beginning here, I wrote that es was bound by its implementation to have a $&parse that could not invoke any es script, so what changed?
I’ll describe the problem in more detail before going over the solution.
Historically, heap-allocated memory in es was divided into two spaces: it was either in “ealloc space”, the set of references allocated by malloc() or realloc() which need to be manually free()d, or in “GC space”, the particular set of references tracked by the garbage collector and automatically freed when appropriate.
Just about anything actually visible to an es user would be in GC space, including the parse tree constructed and eventually returned by $&parse.
But the parser code, being produced by a POSIX yacc, didn’t have any idea about the root list for es’ GC.
So while the parser was running and constructing the parse tree, there was a good chance that parse tree was live, but untracked, in GC space.
This meant that if a GC run ever occurred, those live references were at risk of being invalidated.
Fixing this took a couple different attempts.
My first angle was to fix the parser so that any live references it was holding would be in the GC’s root list.
I tried implementing this with a hand-written parser with the appropriate Ref() macro calls scattered throughout the parser code, but I abandoned it when the result turned out to be very slow, probably because of all the new root-tracking code in the parser.
My next attempt was to rewrite the parser using the lemon parser generator from the SQLite project.
Lemon is a pretty great parser generator which has two features in particular that I was interested in: it allows a lot of control over how the parser code is generated, and it generates a push-style parser.
Together, these meant that while the shell was reading input, all the state held by the parser could be encapsulated in a single Parser object which could be wired up so the entire parser could be GC’d.
This ended up running into some of the same problems as the first attempt.
In general, I was finding that swapping out the entire es parser was a major change, and had the potential to introduce a lot of bugs, both in terms of memory management and in terms of the parsing itself.
It was starting to feel absurd to do all this when I wasn’t actually trying to change the parser at all.
So I shelved the idea for a while.
What brought me back to the project was the realization that I didn’t have to change the parser in order to change how it performs memory management; I could change the memory management system instead.
So, to keep the parser’s untracked memory references safe during parsing, I moved them “out of the way”.
I added a third kind of memory, which I called pspace, for “parser space”.
Pspace is a variation on GC space, with one critical difference: it is only ever collected once, where exactly one pointer and its referents are copied out to some other space (typically GC space), at which point the pspace is destroyed.
The point of this is that while the shell can’t know which references are live during parsing, it certainly knows what’s live once parsing has finished—the parse tree.
This allowed me to get a proof-of-concept going of this new $&parse, at which point I immediately ran into a new problem.
I had forgotten that, because code coming from places like the environment is stored in string form, merely running es script often (and at unpredictable times) requires parsing it.
Being able to run the parser in the middle of another parse run, therefore, is necessary.
This was yet another hurdle for the shell’s yacc-generated parser, and this time, unfortunately, there was no clear workaround: a portable (POSIX) yacc-generated parser simply must rely on static (global) variables, which means that having two parsers running concurrently in a single thread is not going to work.
However, while digging around, I found that yacc parser generators these days seem to be a total duopoly: you can have bison, or you can have byacc.
Unlike with C compilers, I couldn’t find any alternatives to try out at all.
So, I cheesed it—es can still be built with either bison or byacc, but it requires at least one non-portable extension which both of them support.
To make up for this, I decided to also put the generated parser into the repo, so that anybody just building the shell won’t need a yacc at all.
The extension in question is the line
%define api.pure full
which moves the statically-allocated variables involved in parsing—particularly yylval—into the parameter lists of the parser functions.
After this, the rest of the concurrent parsing work was a matter of moving other static variables into the stack.
I defined a new Parser struct which absorbed all of this state, as well as the members of Input which only live over the course of a single parse.
Finally, at this point, the changes to actually implement the $&parse described above could begin.
Doing this, after everything else, was actually reasonably simple.
There is a function fill() which is called to, well, fill the input buffer used by the lexical analyzer.
This function would previously call either readline or read (not the primitive; custom, internal logic of the Input).
Changing it to call a command was a matter of saving the command given to $&parse in the Parser, and then looking it up and calling it at fill() time.
The major effort had to do with decisions about small behavioral corner-cases with the EOF and NUL characters.
Historically, the EOF character had a fairly traumatic impact on an Input. If one was encountered, the Input would be put into what could be called “EOF mode”, where it would only ever again return EOF characters.
I suspect this was fairly robust for the rc-like behavior intended by the authors, but it was noted a long time ago that it prevented the shell from being capable of something like csh’s ignoreeof.
For those using the new-style $&parse, it can also lead to surprising behavior, where an EOF generated by the shell’s input file descriptor can lead to a reader command later not even being called—even if the reader command doesn’t read from that file descriptor.
So that has now been changed; more discretion is given to the reader command.
Another way that the reader command has more discretion than before is if there is no input file descriptor at all.
Previously, that would immediately trigger an EOF, but now the reader command is called, with its standard input redirected to /dev/null so that it has the chance to produce an EOF or not on its own.
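For example, a reader command that never touches its standard input can still drive the parser, or decline to (a sketch; names and content illustrative):

```es
# Synthesize one line of input, then signal EOF with the empty list.
let (fed = ()) {
	$&parse @ {
		if {~ $fed ()} {
			fed = true
			result 'echo synthesized input'
		} {
			result ()
		}
	}
}
```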
As far as NUL characters go, that is largely a matter of the $&read primitive.
The $&read and $&readline primitives
Es gained the $&read primitive relatively late in its Paul Haahr-implemented era, with the 0.9-alpha1 release.
As with some other features added during this period, it didn’t get much of a chance to be “battle tested”, which left it relatively unoptimized, and with some small bugs: in particular, it was completely incapable—to the point of crashing the shell—of reading NUL bytes.
Because of this, $&read needed some improvements before it could become the primary mechanism for reading shell input.
The first change was for performance: Historically, $&read would only read one byte at a time, so as to avoid over-reading from a non-seekable file descriptor such as a pipe.
This meant that just reading an eight-kilobyte script (the size of my own .esrc at the time of writing) would require eight thousand read(2) calls, which is a pretty significant inefficiency even with modern kernels and libcs which do their best to cache things.
This was fixed, for seekable file descriptors, by reading into a probably-larger-than-one-line buffer and seeking to the correct character before returning.
The second change was for NUL bytes.
The crash when $&read encountered a NUL was fixed a year or two ago, but because the goal at the time was to merely fix a crashing bug, it was only replaced with an exception.
This was still insufficient to replace the existing, intentional, tested behavior for shell input, which was to skip any NUL bytes and print a warning to standard error.
To make this work in a reasonably simple way, I changed $&read so that if it reads a line which contains a NUL character, it splits its return value on that NUL.
This should actually work quite well with things like GNU’s find -print0, but also allows skipping the NUL byte or throwing an exception on one.
However, it does create a problem: NUL bytes in input are rare enough that the crash went unfixed for decades, so a %read which almost always returns a list of length zero or one, but rarely returns more, is a lurking source of bugs.
To fix this, %read is written by default to join any lines that have more than one element, effectively adding behavior to skip NULs.
This is reasonably consistent with other shells’ read builtins, and still allows users to opt into other forms of handling NULs when context demands it.
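For example, a sketch of consuming NUL-delimited output with the raw primitive (illustrative; the exact list contents depend on the input):

```es
# find -print0 emits one long NUL-separated 'line'; $&read splits
# on the NULs, returning one list element per path.
find . -print0 | let (paths = <=$&read) {
	echo $#paths paths found
}
```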
The freshly-added $&readline primitive also required some design decisions.
In particular, the major question is this: when working with a standard input which isn’t a terminal, what is the best behavior?
Should $&readline still call readline(3), with all the noisy prompting and echoing that implies, or should it configure readline to behave more like $&read, or should it just fall back to calling $&read directly?
My expectation is that a user who wants to use readline typically does not want to have the extra mess of readline unless they are reading from a TTY.
This is consistent with the behavior of the shell itself when it implicitly decides whether to be interactive.
Unfortunately, no mechanism currently exists in es to allow a user to test if a file descriptor is a TTY, and I have hesitated to add even more behavior to the shell for the sake of this change.
Because of this, $&readline falls back to $&read if its standard input is not a TTY.
In the future, this should probably be changed so that $&readline is more straightforwardly a wrapper around readline(3).
This decision also introduces a small backwards-incompatibility in the shell.
Previously, the following behavior would occur if input was given to es -i on standard input:
; echo 'echo hello world' | es -i
; echo hello world
hello world
; ;
What’s happening here is that, even though this es’ standard input is not a TTY, because -i has been explicitly provided and its input is the shell’s standard input, readline is called; the extra output here is its prompt and the input it reads being echoed to its standard output.
With the $&readline primitive, that logic has changed.
Instead of calling readline when the shell is interactive and the input is the shell’s input, readline is called when the shell is interactive and the input is a TTY.
This only really causes a change in the case given above:
; echo 'echo hello world' | es -i
hello world
;
Different shells have different behavior in this case, so the old behavior is clearly not a universal requirement, and the TTY-based behavior is easier to explain and reason about without getting into details of shell internals.
However, it is a change, and a departure from rc’s precedent, so it is worth calling out.
Performance
Despite some of the considerations above, the major downside of the implementation thus far is its impact on shell performance.
This impact is to some degree inevitable, since migrating any behavior from C to es is going to worsen performance.
However, most of it isn’t inherent to the change here, but is instead fixable in the long term.
Some of the changes are internal. For example, a number of buffers and other bits of state that were previously retained across parser calls are now allocated and freed each time, so with certain workloads, allocations have increased quite a bit.
Some of the changes are external: previously the shell would read scripts and other non-interactive input in multiple-kilobyte chunks, but now files are read line-by-line, which means that in order to read a script, calls to read(2) and lseek(2) are an order of magnitude more numerous.
Fixing this would require creating a read-in-chunks mechanism in es.
Some experimentation thus far suggests that this is the major factor, and that adding this read-in-chunks mechanism would make most of the performance impact disappear.
Implications and further work
At the beginning, I described a bit about how the shell internally creates an Input object and calls either %batch-loop or %interactive-loop.
The new $&parse, just like the old one, uses the file descriptor established in this internally-generated Input, now redirecting it to the standard input of the reader command.
While the reader command has been made external, this other logic—opening an input, picking a REPL function to use, and redirecting that input—is still internal, invisible, and unchangeable.
In particular, we still can’t just use $&parse as a normal function:
; # can you explain the following behavior?
; echo 'hello world' | echo <={$&parse $&read}
;
Exposing these built-in behaviors and making shell input less magic would be a good path forward.
An improvement on the current state of the art would be some mechanism for file descriptors (or, more generally, “handles”) to have lexical scope.
This would provide a more robust way to isolate state between parts of the shell runtime, which would be ideal.
I have been playing around with the idea of “variable-bound file handles”, a mechanism for opening an internal file descriptor which is bound directly to a variable, not exposing a user-visible file descriptor.
As I imagine it, these handles would be reference-counted and automatically closed when the last variable reference is dropped.
In addition, some metadata could be stored about the handle, such as its filename and, potentially, the current line number; this metadata could effectively eat much of what remains in the Input struct.
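As a purely hypothetical sketch, using invented names, such a handle might be used like this:

```es
# Hypothetical: %open and $&read-from do not exist. The handle in
# $h would be closed automatically once the binding is dropped.
let (h = <={%open r /etc/hostname}) {
	echo <={$&read-from $h}
}
```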
This work helps make $&parse into a more orthogonal, normal primitive, and it does the same for the readline library.
Previously, readline was relatively deeply integrated into the input files, and at one point support for the alternative editline library was dropped, presumably because supporting its integration was obnoxious.
Now, everything to do with readline is nearly fully isolated into a single file which defines a few primitives which can be hooked into the shell, plus a few bits of initial.es which do that hooking.
Formalizing and extending this structure for readline, and making it into a repeatable pattern for other optional libraries, seems like a promising path forward for the shell.
This deserves its own page, but the idea would be a mechanism to specify a C file which defines primitives and an es script which wraps those primitives in functions to integrate into the shell at build time.
This could reduce the stakes of contributing to upstream es by allowing people to offer code that is only included if opted-into at build time.
From this, further sophistication could be developed, such as dynamic runtime primitives using dlopen(3) and namespacing of primitives to better express notions such as “which version of $&parse are you using?”
And then, of course, there are extensibility improvements that have become possible for $&readline itself.
Now that it is possible to run es script while parsing, we can add custom key bindings and a completion hook (or multiple completion hooks, depending on the design).
Programmable completion logic would be useful in its own right, but custom key bindings should also allow for integration with things like the funny shell history server atuin, so that iterating through or searching shell history would consult the remote server.
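As one entirely hypothetical sketch of where this could go:

```es
# Hypothetical completion hook: nothing like this exists yet.
fn %complete prefix {
	# offer file names matching the prefix as completions
	result $prefix^*
}
```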
These extensions to $&readline represent the most immediate and significant usability improvement for users of the shell who aren’t deep into implementation.