Ashd is a modular HTTP server based on a multi-program architecture. Whereas most other HTTP servers are monolithic programs with, perhaps, loadable modules, Ashd is composed of several different programs, each of which handles requests in different ways, passing requests to each other over a simple protocol (not unlike Unix pipelines). The design of Ashd brings with it a number of nice properties, the following being the most noteworthy ones.
Each program (even htparser, as long as one does not count its, quite optional, SSL implementation) is implemented in fewer than 1,000 lines of C code (and most are considerably smaller than that), allowing them to be easily studied and understood.
The userplex program ensures that serving of user home directories (/~user/ URLs, if you will) only happens by code that is actually logged in as the user in question; and the htparser program, being the only program which speaks directly with the clients, can run perfectly well as a non-user (like nobody) and be chroot'ed into an empty directory.
There is no need for the dirplex program, which only handles service from physical directories, to care about virtual directories, virtual hosts, HTTP protocol parameters or authentication; just as there is no need for the patplex pattern matcher to know about file types or directory hierarchies. Each program's configuration file format can be kept as simple as possible, and indeed most programs lack configuration files entirely and are configured simply with command-line options.
Ashd can be said to be rather mature by now. It has been tested on moderately busy sites (see the Performance section below for an example); no crashes or other signs of instability have been observed over years of continuous operation, and it has not displayed any problems with any particular user-agents. It does lack a few features present in other HTTP servers, but nothing that I, for one, have experienced as a problem; and it also supports a few features not always present in other servers (such as chunked request-bodies).
Though the server as a whole is called "Ashd", there
is no actual program by that name. The htparser
program
of Ashd implements a minimal HTTP server. It speaks HTTP (1.0 and 1.1)
with clients, but it does not know the first thing about actually
handling the requests it receives. Rather, having started a handler
program as specified on the command-line when started, it packages the
requests up and passes them (with Unix socket file-descriptor passing)
to that handler program. That handler program may choose to only look
at part of the URL and pass the request on to other handler programs
based on what it sees. In that way, the handler programs form a
tree-like structure, corresponding roughly to the URL space of the
server. In order to do that, the packaged request which is passed
between the handler programs contains the part of the URL which
remains to be parsed, referred to as the "rest string" or the "point"
(in deference to Emacs parlance).
For an actual, technical description of the architecture and protocols, see the ashd(7) manpage.
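
The hand-off between programs relies on passing open file descriptors across Unix sockets. The following is a minimal sketch of that general mechanism only, not of Ashd's actual wire format (which ashd(7) defines). The function names pass_request and receive_request, and the choice to send the rest string as the message payload, are assumptions made for illustration. It needs Python 3.9 or later on a Unix system.

```python
import os
import socket

def pass_request(control: socket.socket, rest: str, client_fd: int) -> None:
    """Send a rest string together with an open descriptor (SCM_RIGHTS)."""
    socket.send_fds(control, [rest.encode()], [client_fd])

def receive_request(control: socket.socket) -> tuple[str, int]:
    """Receive a rest string and the descriptor that travelled with it."""
    data, fds, _flags, _addr = socket.recv_fds(control, 4096, 1)
    return data.decode(), fds[0]

if __name__ == "__main__":
    # A socketpair stands in for the control socket between two handlers.
    upstream, downstream = socket.socketpair()
    # A pipe stands in for the client connection.
    read_end, write_end = os.pipe()

    pass_request(upstream, "~fredrik/ashd/index", write_end)
    rest, fd = receive_request(downstream)

    # The receiver can now write to the "client" through its own descriptor.
    os.write(fd, b"response bytes")
    print(rest, os.read(read_end, 100))
```

The receiving side gets its own descriptor referring to the same open connection, which is what lets the last handler in the chain answer the client without the data being copied through every program in between.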
As a concrete example, here is how the request
to /~fredrik/ashd/index
is handled by this particular
server.
The request is first received by htparser. It sets the rest string to ~fredrik/ashd/index and passes the request to the patplex process that it was instructed to start by way of command-line argument.
The patplex program, instructed by its configuration file, recognizes the initial tilde of the rest string, strips it off, and passes the request to the userplex program. If userplex is not already running, it starts it, passing the control socket (over which requests are passed) on its standard input, with command-line arguments as specified in the patplex configuration file.
Receiving the request with the rest string fredrik/ashd/index, the userplex program strips off the rest string up to the first slash, treating the stripped-off part as a username, fredrik. Having done some tests (as configurable with command-line options) to determine that the username is valid, it checks to see if it has a request handler already running for that user. If not, it forks off, logs in as the user in question and starts a request handler. The request handler can be explicitly provided by the user by creating an executable file named ~/.ashd/handler, but is otherwise started as specified on userplex's command line; normally, and in this case, an instance of the dirplex program.
The dirplex program receives the request with the rest string set to ashd/index. Having been instructed (by way of command-line arguments) to handle the physical directory ~/htpub, it starts chipping off slash-separated elements of the rest string. Starting with the ashd element, it finds a directory under htpub with that name, and interprets the next element, index, relative to it. Finding no entry by that exact name, it looks more thoroughly and finds index.html instead. Having found the physical file ~/htpub/ashd/index.html, it does pattern matching on that physical filename according to its configuration, finding that it should fork out the sendfile program to handle the request.
The sendfile program handles the request by sending the file contents exactly as they are back to htparser over the socket passed between the various handler programs, and then, not being a persistent program, exits. The only thing sendfile does with the rest string is to check that it is now empty. htparser itself takes care of any chunking or other transfer encoding that might be necessary for HTTP keep-alive.
The closest thing to a screenshot, the following text dump is an example of how an Ashd process tree might look.

    $ ps -AH lS
    F   UID   PID  PPID PRI NI    VSZ   RSS WCHAN STAT TTY  TIME COMMAND
    1 65534  2216     1  20  0  24628   908 ?     Ss   ?    1:54 /usr/local/bin/htparser -Sf -p /var/run/ashd.pid -u nobody -r /var/tmp plain -- errlogger -n ashd patplex /usr/local/etc/ashd/rootpat
    0     0  2215     1  20  0   3904   512 ?     Ss   ?    0:00 errlogger -n ashd patplex /usr/local/etc/ashd/rootpat
    0     0  2225  2215  20  0   4012   552 ?     S    ?    0:03   patplex /usr/local/etc/ashd/rootpat
    4     0  2495  2225  20  0 129380   680 ?     S    ?    0:00     sudo -u www-data accesslog /var/log/http/access.log dirplex /srv/www/htdocs
    4    33  2496  2495  20  0   3928   412 ?     S    ?    0:03       accesslog /var/log/http/access.log dirplex /srv/www/htdocs
    0    33  2497  2496  20  0   3944   644 ?     S    ?   57:35         dirplex /srv/www/htdocs
    0    33  2518  2497  20  0 266024 17404 ?     S    ?    2:10           /usr/bin/python /usr/local/bin/ashd-wsgi ashd.wsgidir
    0    33  4032  2497  20  0   4140   620 ?     S    ?    0:00           callfcgi multifscgi 5 php-cgi
    0    33  4033  4032  20  0   3900   364 ?     S    ?    0:00             multifscgi 5 php-cgi
    0    33  4034  4033  20  0 247204  2332 ?     S    ?    0:01               php-cgi
    0    33  4035  4033  20  0 247204  2400 ?     S    ?    0:01               php-cgi
    0    33  4036  4033  20  0 248508   568 ?     S    ?    0:01               php-cgi
    0    33  4037  4033  20  0 247204  2340 ?     S    ?    0:01               php-cgi
    0    33  4038  4033  20  0 248240  3084 ?     S    ?    0:01               php-cgi
    0    33  1080  2497  20  0   3932   488 ?     S    ?    0:00           callcgi GET /gitweb/?p=ashd.git;a=blame;f=src/htparser.c;hb=HEAD
    0    33  1081  1080  20  0 143944 11136 ?     S    ?    0:00             /usr/bin/perl gitweb/index.cgi gitweb/index.cgi
    0    33  1088  1081  20  0   9780  1344 ?     D    ?    0:00               /usr/bin/git --git-dir=/srv/git/r/ashd.git blame -p HEAD -- src/htparser.c
    0     0  3297  2225  20  0  12344   584 ?     S    ?    0:00     userplex -g users -d public_html dirplex -c apache-compat public_html
    4   504  3298  3297  20  0   3944   636 ?     Ss   ?    0:00       dirplex -c apache-compat public_html
    4   500  3344  3297  20  0   3928   552 -     Ss   ?    0:01       accesslog -a /home/fredrik/.ashd/log/access dirplex htpub
    0   500  3419  3344  20  0   3944   664 -     S    ?    0:01         dirplex htpub
    0   500  3420  3419  20  0 238960  5252 -     Sl   ?    2:08           /usr/bin/python3 /usr/local/bin/ashd-wsgi3 -m /home/fredrik/.ashd/sockets/pdm3 ashd.wsgidir
    0   500  4044  3419  20  0 119412  1672 -     S    ?    0:14           psendfile
    0   500  2159  3419  20  0   3932   464 -     S    ?    0:00           htextauth -s ./auth -- dirplex -c ./sub.cf /home/pub
    0   500  2160  2159  20  0   3944   524 -     S    ?    0:03             dirplex -c ./sub.cf /home/pub
    0   500 31056  3419  20  0   4140   496 -     S    ?    0:00           callfcgi php-cgi
    0   500 31057 31056  20  0 247456   732 -     S    ?    0:00             php-cgi
    4   506  3586  3297  20  0   3944   664 ?     Ss   ?    0:03       dirplex -c apache-compat public_html
    0   506  3830  3586  20  0   7728  1732 ?     S    ?    0:00         callfcgi php-cgi
    0   506 15184  3830  20  0 247464  5772 ?     S    ?    0:00           php-cgi
    4   505  4045  3297  20  0   3944   600 ?     Ss   ?    0:00       dirplex -c apache-compat public_html
    4   507  6376  3297  20  0   3944   496 ?     Ss   ?    0:00       dirplex -c apache-compat public_html
    4   510  9476  3297  20  0   3944   632 ?     Ss   ?    0:00       dirplex -c apache-compat public_html
    4  1000 12610  3297  20  0   3944   480 ?     Ss   ?    0:00       dirplex -c apache-compat public_html
    4   513 24954  3297  20  0   3944   524 ?     Ss   ?    0:00       dirplex -c apache-compat public_html
    0   513 24955 24954  20  0   4140   520 ?     S    ?    0:00         callfcgi php-cgi
    0   513 24956 24955  20  0 249788   800 ?     S    ?    0:00           php-cgi
    4   515 27761  3297  20  0   3944   472 ?     Ss   ?    0:00       dirplex -c apache-compat public_html
    4   502 18758  3297  20  0   3944   524 ?     Ss   ?    0:00       dirplex -c apache-compat public_html
    $

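
Returning to the /~fredrik/ashd/index walkthrough, the way each program consumes its share of the rest string can be modeled in a few lines. This is only a toy sketch: the real programs are separate processes talking over sockets, whereas here they are plain functions named after them. The FILES and DIRS sets are a hypothetical stand-in for the filesystem, the /home/<user>/htpub layout is an assumption, and trying only an .html suffix is a simplification of dirplex's more thorough lookup.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the filesystem, so the sketch runs anywhere.
FILES = {"/home/fredrik/htpub/ashd/index.html"}
DIRS = {"/home/fredrik/htpub", "/home/fredrik/htpub/ashd"}

@dataclass
class Request:
    url: str
    rest: str                                  # part of the URL not yet consumed
    trail: list = field(default_factory=list)  # handlers visited, for illustration

def htparser(url):
    """Front end: set the initial rest string and hand the request on."""
    req = Request(url, url.lstrip("/"))
    req.trail.append("htparser")
    return patplex(req), req.trail

def patplex(req):
    req.trail.append("patplex")
    if req.rest.startswith("~"):
        req.rest = req.rest[1:]                # strip the tilde
        return userplex(req)
    return None                                # other patterns omitted

def userplex(req):
    req.trail.append("userplex")
    user, _, req.rest = req.rest.partition("/")  # chip off the username
    return dirplex(req, f"/home/{user}/htpub")

def dirplex(req, root):
    req.trail.append("dirplex")
    path = root
    while req.rest:
        element, _, req.rest = req.rest.partition("/")
        candidate = f"{path}/{element}"
        if candidate in DIRS:
            path = candidate                   # descend into the directory
        elif candidate in FILES:
            return sendfile(req, candidate)
        elif candidate + ".html" in FILES:     # no exact match: look further
            return sendfile(req, candidate + ".html")
        else:
            return None
    return None                                # directory listings omitted

def sendfile(req, filename):
    req.trail.append("sendfile")
    return filename if not req.rest else None  # rest string must be empty

if __name__ == "__main__":
    print(htparser("/~fredrik/ashd/index"))
```

Each function only looks at the front of the rest string and knows nothing about what came before it, which mirrors how the handler programs stay ignorant of each other's concerns.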
The Ashd programs of primary interest are the following:
htparser: htparser is the program that listens for TCP connections and speaks HTTP with the clients.
dirplex: dirplex is the program used for serving files from actual directories, in a manner akin to how most other HTTP servers work. In order to do that, dirplex maps URLs into existing physical files, and then performs various kinds of pattern-matching against the names of those physical files to determine the program to call to actually serve them.
patplex: patplex can be used to implement such things as virtual directories or virtual hosts.
sendfile: Called by dirplex for serving ordinary files. It handles caching using the Last-Modified and related headers. It also handles MIME-type detection if a specific MIME-type was not specified.
callcgi: Runs CGI scripts.
userplex: Serves /~user/ URLs. When a request is made for the directory of a specific user, it makes sure that the request handler runs as the user in question. This functionality was actually what prompted me to begin writing Ashd as a whole, since I was severely annoyed by the fact that Apache serves user directories as the www-data (or similar) user. Serving a user directory properly as its owner ensures that all dynamic content can access all the relevant files it may need, that any files it creates or modifies are properly owned by the right user, that no other users need access to one's home directory, and that one user cannot violate the "web space" of other users just by running PHP scripts to do that. It also relieves the web server of various weird security considerations which come from trusting users with running code as another user.
Outside the main cast, there are also the htls, accesslog, htextauth, callscgi, callfcgi, httimed, httrcall, htpipe, errlogger, psendfile and multifscgi programs.
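
To illustrate the kind of work described for sendfile (caching via Last-Modified and MIME-type detection), here is a rough sketch of a conditional-GET decision. It is not sendfile's actual code, which is written in C. The function name file_response is hypothetical, and treating If-Modified-Since as the relevant request header is an assumption, since the text only says "Last-Modified and related headers".

```python
import mimetypes
from email.utils import formatdate, parsedate_to_datetime

def file_response(filename, mtime, if_modified_since=None):
    """Return (status, headers) for a static file.

    mtime is the file's modification time in seconds since the epoch;
    if_modified_since is the raw request header value, if the client sent one.
    """
    headers = {"Last-Modified": formatdate(mtime, usegmt=True)}
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since).timestamp()
        except (TypeError, ValueError):
            since = None                  # unparsable date: ignore the header
        if since is not None and int(mtime) <= since:
            return 304, headers           # client's copy is still current
    # Guess the type from the file name when none was given explicitly.
    ctype, _encoding = mimetypes.guess_type(filename)
    headers["Content-Type"] = ctype or "application/octet-stream"
    return 200, headers

if __name__ == "__main__":
    status, headers = file_response("index.html", 1_400_000_000)
    print(status, headers)
    print(file_response("index.html", 1_400_000_000, headers["Last-Modified"])[0])
```

HTTP dates have one-second resolution, which is why the modification time is truncated to whole seconds before the comparison.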
There is also a Python module, which comes with
the ashd-wsgi
and scgi-wsgi
programs for
serving WSGI scripts and an undocumented program for serving files
with server-side includes. It also contains rather general
(documented) modules for writing custom Ashd handlers very
conveniently. There are versions of the Python module and programs for
both Python 2 and Python 3. The Python 2 module has been verified to
work with Jython.
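
For readers unfamiliar with WSGI, the following is a minimal sketch of a standard WSGI callable of the kind such programs serve. It follows the generic WSGI interface (PEP 3333) and uses nothing specific to Ashd. How ashd-wsgi locates the callable inside a module is not described here, so the name application is only the common convention and should be treated as an assumption; consult the program's manual page for the actual entry point.

```python
def application(environ, start_response):
    """A standard WSGI callable: takes the request environment, returns body chunks."""
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from {path}\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]

if __name__ == "__main__":
    # Call the application directly with a fake environment to see its output.
    def show(status, headers):
        print(status, headers)
    print(b"".join(application({"PATH_INFO": "/ashd"}, show)))
```

For a quick local check without Ashd, the standard library's wsgiref.simple_server.make_server can serve the same callable.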
Ashd is primarily documented in the same manual pages that this
page links to. For a practical introduction,
read the
accompanying INSTALL file and/or see the simple configuration
examples that are included in the
examples
directory of the source tree.
The latest release of Ashd is 0.13. Download it here.
The latest release of the Python module is 0.5. Download the Python 2 version here, or the Python 3 version here.
The latest source code is available through Git at <git://git.dolda2000.com/ashd>, also viewable through Gitweb.
Ashd has, at least to my knowledge, not been extensively benchmarked, so its performance characteristics are not well known. It should also be noted that optimization has not been a priority when writing it, with precedence given to brevity and clarity. (Which, on the other hand, means that if optimization should at some point be necessary, there should be much low-hanging fruit to pick.)
The closest thing I have done to benchmarking Ashd is running it to serve the moderately busy site havenandhearth.com, where most of the traffic consists of static files. There is dynamic content as well, but it receives far less traffic. On this site, Ashd serves on average about 1.5 million requests per day on about 100 simultaneous HTTP connections, with temporary peaks of slightly above 100 requests per second on 1000-1500 simultaneous connections. A good portion (I would estimate it at about 20%) of the traffic happens via HTTPS. Under these circumstances, the programs involved in the most common requests consume CPU time as follows.
Program | Average CPU usage |
---|---|
htparser | 0.53% |
patplex | 0.041% |
dirplex | 0.12% |
sendfile | 1.3% |
accesslog | 0.036% |
Total | 2.0% |
The above measurements are calculated from the cumulative CPU time used by the respective programs after having run for several weeks. By comparison, the PHP engine running the site's discussion forum, which receives about 100,000 requests per day, uses 5.4% CPU. The CPU is an Intel Core i7-920.
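
As a rough back-of-the-envelope reading of these figures, the snippet below converts them into an average request rate and an approximate CPU cost per request. It assumes that the percentages are shares of a single core and that all 1.5 million daily requests pass through the listed programs; neither is stated above, so the results are only indicative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

requests_per_day = 1_500_000          # all requests served by Ashd
php_requests_per_day = 100_000        # discussion forum requests

# Average CPU usage in percent, taken from the table above.
cpu_share = {
    "htparser": 0.53,
    "patplex": 0.041,
    "dirplex": 0.12,
    "sendfile": 1.3,
    "accesslog": 0.036,
}
php_cpu_share = 5.4

average_rate = requests_per_day / SECONDS_PER_DAY   # about 17.4 requests/s
total_share = sum(cpu_share.values())               # 2.027, shown as 2.0% in the table

def cpu_ms_per_request(share_percent, per_day):
    """CPU milliseconds spent per request, assuming share of one core."""
    cpu_seconds_per_day = share_percent / 100 * SECONDS_PER_DAY
    return cpu_seconds_per_day / per_day * 1000

static_cost_ms = cpu_ms_per_request(total_share, requests_per_day)
php_cost_ms = cpu_ms_per_request(php_cpu_share, php_requests_per_day)

if __name__ == "__main__":
    print(f"average rate: {average_rate:.1f} requests/s")
    print(f"Ashd programs: {static_cost_ms:.2f} ms CPU per request")
    print(f"PHP engine:    {php_cost_ms:.1f} ms CPU per request")
```

Under those assumptions the listed Ashd programs spend on the order of a millisecond of CPU time per request, while the PHP engine spends a few tens of milliseconds, and the stated peak of about 100 requests per second is roughly six times the daily average.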
In this context, it should be noted that the multi-process
architecture of Ashd makes it inherently parallel to some degree,
despite the individual programs being single-threaded. It is probably
to be expected that htparser
will be the first
bottleneck, particularly because of its single-threaded nature.