Serving a web site from a shell script is fun and easy
This web site is served from an es script.
That's a pretty unusual choice among websites, so it might be worth a little explanation of why I did it, and why I haven't gotten annoyed with the choice.
We'll also go over some of the details of the script itself, at least in the form it takes as of this time of writing. The server code is here, and the git repository, including all the page sources and other content, is hosted here.
A tour of the server
The server loop
Right at the beginning of the server is the most important part.
It's what makes the whole thing go.
After a bit of messing about to set $server-port depending on whether the shell was started in a Docker container, we have:
if {~ $NCAT_SUBSHELL_MODE ()} {
	local (NCAT_SUBSHELL_MODE = yes)
		forever {ncat -k -l -p $server-port -e $0 || exit 3}
}
This bit of code is controlled by the $NCAT_SUBSHELL_MODE variable, which is unset when the script is originally invoked.
When that's the case, the script sets $NCAT_SUBSHELL_MODE to yes (the value doesn't really matter, since the shell only ever checks whether there is any value at all) and runs an infinite loop of ncat commands.
This ncat is the part that handles the actual TCP networking.
Sorry if you thought the shell would natively handle that—es isn't quite capable of something like that yet.
In particular, this command is written for the ncat distributed as part of the nmap project; certain other versions of netcat lack the -e argument this server depends on.
The exact ncat invocation looks like this:
ncat -k -l -p $server-port -e $0
The -k and -l flags are what make ncat run as a TCP server.
The -p $server-port configures the port on which to listen.
This is set to 8080 when running within Docker, since that's the standard, and 8181 when running outside a Docker container, since that's more likely to be free on an arbitrary host.
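The port-picking logic itself isn't reproduced above, but it boils down to something like the following sketch, which assumes (based on how the router uses it later) that $IN_DOCKER holds the command true or false:
# roughly how $server-port gets chosen (illustrative, not the exact code)
if {$IN_DOCKER} {
	server-port = 8080	# inside the container: the conventional port
} {
	server-port = 8181	# on an arbitrary host: more likely to be free
}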
The last flag, -e, configures ncat to execute a command when a request is received. The command can read its standard input to look at the request, and can write to standard output to specify the response.
ncat is actually pretty clever about this: it wires the child process's standard output straight to the connection, so the response is streamed out as the child writes it rather than buffered in memory first.
The argument we give to -e is the script itself, stored in $0.
When ncat runs the script as that subcommand, $NCAT_SUBSHELL_MODE is already set (it's inherited from the parent shell's environment), so we skip the server loop and instead move on to the rest of the script, which handles the individual requests.
Request handling
While ncat does all the hard work of handling TCP networking, it doesn't actually do anything about HTTP, so that has to be implemented in the script.
So the first thing we do is define a respond function, which takes a numeric status code (code), a MIME type (type), and optional flags to control things like caching or compression, and uses those arguments to print the headers of the reply.
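The real respond isn't reproduced here, but its shape is roughly the following sketch; the exact headers, the missing reason phrase, and the flag names are illustrative rather than taken from serve.es:
# a stripped-down guess at respond: status line, content type, any headers
# implied by the flags, then the blank line that ends the headers
fn respond code type flags {
	echo HTTP/1.1 $code	# the real function presumably adds a reason phrase
	echo Content-Type: $type
	if {~ $flags cache} {
		echo Cache-Control: 'max-age=3600'	# made-up value
	}
	echo	# end of headers; the body follows on stdout
}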
Then we define a couple of helper functions to serve a “page”, our name for the custom templated HTML files with smatterings of es code thrown in.
As an example, we can look at the actual source of the server source code page:
<; cat tmpl/header.html >
<title>jpco.io | The script that served this title</title>
<meta name=description content="The script that served this description">
<; build-nav /server.html >
<main>
<p>
This is the script that served this request, written in <a href=/es><i>es</i></a>.
<pre id=main-block>
<; sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' < serve.es >
</pre>
</main>
Using this extremely basic templating system we give ourselves access to the shell within the page, and we use that to add the page header and navigation bar and to print the server script itself.
The templates, such as they are, are the most obviously deficient part of this whole setup, but are good enough to serve what this site actually needs, which isn't much.
The build-page function reads these templatized files line by line and, for each line, either prints it verbatim or runs the embedded command, as appropriate.
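I won't paste build-page here, but the idea is roughly the following sketch; the exact line format for embedded commands is inferred from the template above rather than copied from serve.es:
# roughly the idea behind build-page: lines wrapped in <; ... > are run as
# es code, everything else is passed through untouched
fn build-page {
	let (line = ()) {
		while {!~ <={line = <=%read} ()} {
			if {~ $line '<; '^*^' >'} {
				eval <={~~ $line '<; '^*^' >'}	# run the embedded command
			} {
				echo $line	# plain HTML: print as-is
			}
		}
	}
}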
The function build-nav, which is called from the page, is defined in the server script since nearly every page uses it, and simply prints some HTML which formats the path argument given to it.
Then the serve-page function simply wraps up respond and build-page into a single convenient function call.
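serve-page, then, is more or less this (with the 200 status and text/html type being my assumption about the defaults):
# glue: emit a successful HTML response whose body is the built page
fn serve-page file {
	respond 200 text/html
	build-page < $file
}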
After these helper functions are defined, we get to the business of handling the request.
We read the method, the path (which we call reqpath to avoid colliding with the normal $path), and the HTTP version:
(method reqpath version) = <={%split ' ' <={~~ <=%read *\r}}
We have to handle the \r\ns in the request explicitly, which is annoying, but not too much of a problem.
Fortunately, ncat inserts \rs as necessary so we don't have to think about them when writing the responses.
After the first line, and a bit of handling for query strings, we move on to reading the headers.
# TODO: it would be nice if we made all the header names lowercase
let (header = ())
	while {!~ <={header = <=%read} \r} {
		let ((n v) = <={~~ $header *': '*\r})
			if {!~ $#n 0} {
				head-$n = $v
			}
	}
Here we read in headers and save the header values within variables of the form head-$name, so that $head-Host contains something like jpco.io.
The TODO here refers to the fact that HTTP header names are case-insensitive, so the headers Host and host (or, technically validly, hOsT) should be handled uniformly, but es variable names are case-sensitive, so right now we don't handle that well.
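One easy way to close that gap, which the script doesn't currently do, would be to normalize the name before building the variable, something like:
# hypothetical fix for the TODO: lowercase the header name before storing it
n = `{echo $n | tr A-Z a-z}	# Host, host, and hOsT all become host
head-$n = $v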
Routing
After reading in the request, we have everything we need to serve the response.
The whole router is just one big if statement.
if (
	# redirect www.jpco.io to jpco.io
	{~ $head-host www.jpco.io || ~ $head-Host www.jpco.io} {
		destination = https://jpco.io$reqpath
		respond 301 text/plain
		echo Redirecting to $destination ...
	}
	# draft built pages; only serve these locally
	# before "real" pages so we can draft changes too
	{!$IN_DOCKER && access -f draft/$reqpath^.es} {
		serve-page draft/$reqpath^.es
	}
	{!$IN_DOCKER && access -f draft/$reqpath/index.html.es} {
		serve-page draft/$reqpath/index.html.es
	}
	# built pages. don't cache these
	{access -f page/$reqpath^.es} {
		serve-page page/$reqpath^.es
	}
	{access -f page/$reqpath/index.html.es} {
		serve-page page/$reqpath/index.html.es
	}
	# static files
	{access -f static/$reqpath} {
		serve static/$reqpath cache
	}
	# 404
	{
		respond 404 text/html
		build-page < page/404.html.es $reqpath
	}
)
Here's where it all comes together.
- If the request is coming to www.jpco.io, redirect it to jpco.io.
- If we're in “dev mode” and not in a Docker container, serve the request as a page if it matches a file in the draft directory.
- If the request matches a file in the page directory, serve it as a page.
- If the request matches a static resource, serve it verbatim; a rough sketch of the serve helper follows just after this list.
- Otherwise, serve the 404 page with a 404 code, since we didn't find anything.
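The serve helper for static files isn't spelled out in this post either. Given that file is one of the few binary dependencies packaged into the image, it presumably amounts to something like the following sketch (not the actual code from serve.es):
# a guessed shape for serve: sniff the MIME type with file(1), print headers
# via respond (forwarding flags like cache), then stream the file as the body
fn serve f flags {
	respond 200 `{file -b --mime-type $f} $flags
	cat $f
}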
How it's run
When I'm working on changes to the site, I can run this script as
; ./serve.es
and it works great.
Pages are always served live, so all I have to do is save the page I'm working on and reload.
The server script itself is also always run fresh for each request thanks to ncat -e $0, so unless I'm changing that little server loop, I don't even need to re-run the script when I make a change.
In “prod”, I package up the contents of the repo from HEAD as well as a fresh es built from HEAD and the couple of binary dependencies (ncat, man, file) into a Docker container and serve it from Google Cloud Run.
Cloud Run takes care of details around HTTPS for me, which is nice.
Building and deploying a new version is done with a command like the following:
@ image {gcloud builds submit --tag $image . && gcloud run deploy --platform managed --image=$image} gcr.io/jpco-cloud/web:0.76
I won't go into the Dockerfile here since it's extremely basic, but it's in the repository for this site if anybody really wants to take a look.
Okay… but why?
This is obviously not a very good general web server.
It's relatively slow in the first place compared to something in a so-called blazingly-fast language, and I imagine it scales pretty poorly.
But none of that actually matters.
I didn't write this server to serve any web site; I wrote it to serve this web site. And this web site is really pretty dead simple and doesn't get much traffic at all, so I don't care about complicated server-side logic, templating, or the degree to which the fast is blazing.
What I really want is exactly what this server gives me.
I want a really convenient environment to write new pages in without bothering with any sort of recompilation flow.
I want a router that is extremely simple but more flexible than files in directories.
And I want all of it without some kind of goofy toolchain, framework, or runtime dependencies that do more to get in my way than help me serve this extremely simple site.
I'm not a web developer, so whenever I'm not actively working on this site I'm not really thinking about web technologies at all; using fancy special-purpose tools is a net increase to my cognitive load, not the other way around.
Admittedly, there's also some aesthetic joy to it.
I prefer a website that's pretty bare, and I like to stay “close to the metal” of HTTP.
I like to have that little bit of extra control, since I'm not doing anything particularly fancy or high-stakes.
And I also just like to be able to say that I'm serving my personal web site from a shell script.