Commit | Line | Data |
---|---|---|
e7e3e593 FT |
1 | dirplex(1) |
2 | ========== | |
3 | ||
4 | NAME | |
5 | ---- | |
6 | dirplex - Physical directory handler for ashd(7) | |
7 | ||
8 | SYNOPSIS | |
9 | -------- | |
10 | *dirplex* [*-hN*] [*-c* 'CONFIG'] 'DIR' | |
11 | ||
12 | DESCRIPTION | |
13 | ----------- | |
14 | ||
15 | The *dirplex* handler maps URLs into physical files or directories, | |
16 | and, having found a matching file or directory, it performs various | |
17 | kinds of pattern-matching against its physical name to determine what | |
18 | handler to call in order to serve the request. The mapping procedure | |
19 | and pattern matching are described below. | |
20 | ||
21 | Having found a handler to serve a file or directory with, *dirplex* | |
22 | adds the `X-Ash-File` header to the request with a path to the | |
23 | physical file, before passing the request on to the handler. | |
24 | ||
25 | *dirplex* is a persistent handler, as defined in *ashd*(7). | |
26 | ||
27 | OPTIONS | |
28 | ------- | |
29 | ||
30 | *-h*:: | |
31 | ||
1406acb5 | 32 | Print a brief help message to standard output and exit. |
e7e3e593 FT |
33 | |
34 | *-N*:: | |
35 | ||
36 | Do not read the global configuration file `dirplex.rc`. | |
37 | ||
38 | *-c* 'CONFIG':: | |
39 | ||
40 | Read an extra configuration file. If 'CONFIG' contains any | |
41 | slashes, it is opened by that exact name. Otherwise, it is | |
42 | searched for in the same way as the global configuration file | |
43 | (see CONFIGURATION below). | |
44 | ||
45 | URL-TO-FILE MAPPING | |
46 | ------------------- | |
47 | ||
48 | Mapping URLs into physical files is an iterative procedure, each step | |
49 | looking in one single physical directory, starting with 'DIR'. For | |
50 | each step, a path element is stripped off the beginning of the rest | |
51 | string and examined, the path element being either the leading part of | |
5ba4cb3a FT |
52 | the rest string up until (but not including) the first slash, or the |
53 | entire rest string if it contains no slashes. If the rest string is | |
54 | empty, the directory being examined is considered the result of the | |
55 | mapping. Otherwise, any escape sequences in the path element under | |
56 | consideration are unescaped before examining it. | |
e7e3e593 | 57 | |
b70b2d4f | 58 | If the path element names a directory in the current directory, the |
5ba4cb3a FT |
59 | procedure continues in that directory, unless there is nothing left of |
60 | the rest string, in which case *dirplex* responds with an HTTP 301 | |
61 | redirect to the same URL, but ending with a slash. Otherwise, the | |
62 | remaining rest string begins with a slash, which is stripped off | |
63 | before continuing. If the path element names a file, that file is | |
64 | considered the result of the mapping (even if the rest string has not | |
65 | been exhausted yet). | |
e7e3e593 FT |
66 | |
67 | If the path element does not name anything in the directory under | |
68 | consideration, but contains no dots, then the directory is searched | |
69 | for a file whose name before the first dot matches the path | |
70 | element. If there is such a file, it is considered the result of the | |
71 | mapping. | |
72 | ||
73 | If the result of the mapping procedure is a directory, it is checked | |
dc387345 | 74 | for the presence of a file named by the *index-file* configuration |
e7e3e593 FT |
75 | directive (see CONFIGURATION below). If there is such a file, it is |
76 | considered the final result instead of the directory itself. If the | |
77 | index file name contains no dots and there is no exact match, then, | |
78 | again, the directory is searched for a file whose name before the | |
79 | first dot matches the index file name. | |
80 | ||
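As an illustration of the procedure above, consider a hypothetical
served directory containing the files `docs/index.html` and
`docs/guide.html` (the names are made up for this example). With the
directive shown, the comments trace how a few URLs would be expected
to map. The trace covers the mapping step only; a found file still
has to be matched by a *match* stanza before it is served.

--------
index-file index.html

# URL                     expected result of the mapping
# /docs                   301 redirect to /docs/
# /docs/                  docs/index.html (found through index-file)
# /docs/guide.html        docs/guide.html (exact name)
# /docs/guide             docs/guide.html (name before the first dot matches)
# /docs/guide.html/more   docs/guide.html (rest string not yet exhausted)
# /docs/missing           no match; see 404 RESPONSES
--------
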
b70b2d4f FT |
81 | See also 404 RESPONSES below. |
82 | ||
e7e3e593 FT |
83 | CONFIGURATION |
84 | ------------- | |
85 | ||
86 | Configuration in *dirplex* comes from several sources. When *dirplex* | |
87 | starts, unless the *-N* option is given, it tries to find a global | |
f9a65eb2 FT |
88 | configuration file named `dirplex.rc`. It looks in `$HOME/.ashd/etc`, |
89 | and then in all directories named by the *PATH* environment variable, | |
90 | appended with `../etc/ashd`. For example, if *PATH* is | |
91 | `/usr/local/bin:/bin:/usr/bin`, the directories `$HOME/.ashd/etc`, | |
92 | `/usr/local/etc/ashd`, `/etc/ashd` and `/usr/etc/ashd` are searched | |
93 | for `dirplex.rc`, in that order. Only the first file found is used, | |
94 | should there exist several. | |
e7e3e593 FT |
95 | |
96 | If the *-c* option is given to *dirplex*, it too specifies a | |
97 | configuration file to load. If the name given contains any slashes, it | |
98 | is opened by that exact name. Otherwise, it is searched for in the | |
99 | same manner as the global configuration file. | |
100 | ||
101 | In addition, all directories traversed by *dirplex* when mapping a URL | |
102 | into a physical file may contain a file called `.htrc`, which may | |
103 | specify extra configuration options for all files in and beneath that | |
104 | directory. | |
105 | ||
106 | `.htrc` files are checked periodically and reread if changed. The | |
107 | global configuration file and any file named by the *-c* option, | |
108 | however, are never reexamined. | |
109 | ||
110 | When using the configuration files for deciding what to do with a | |
111 | found file, they are examined in order of their "distance" from that | |
112 | file. `.htrc` files found in the directory or directories containing | |
113 | the file are considered "closest" to the file under consideration, | |
114 | followed by any configuration file named by the *-c* option, followed | |
115 | by the global configuration file. | |
116 | ||
117 | Each configuration file is a sequence of configuration stanzas, each | |
118 | stanza being an unindented starting line, followed by zero or more | |
119 | indented follow-up lines adding options to the stanza. The starting | |
120 | line of a stanza is referred to as a "configuration directive" | |
121 | below. Each line is a sequence of whitespace-separated words. A word | |
122 | may contain whitespace if such whitespace is escaped, either by | |
123 | enclosing the word in double quotes, or by escaping individual | |
124 | whitespace characters with a preceding backslash. Backslash quoting | |
125 | may also be used to treat double quotes or another backslash literally | |
126 | as part of the word. Empty lines are ignored, and lines whose first | |
127 | character after leading whitespace is a hash character (`#`) are | |
128 | treated as comments and ignored. | |
129 | ||
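A small sketch of these lexical rules follows. The handler name, the
program path and its arguments are hypothetical and were chosen only
to show the quoting forms.

--------
# This line is a comment; the empty line below is ignored as well.

fchild report
    exec /opt/my\ tools/report "--title=Monthly report" say\"hi\"
--------

Under the rules above, the indented follow-up line should split into
four words: `exec`, `/opt/my tools/report`, `--title=Monthly report`
and `say"hi"`.
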
fda48525 | 130 | The following configuration directives are recognized: |
e7e3e593 | 131 | |
aa7e4406 FT |
132 | *include* ['FILENAME'...]:: |
133 | ||
16c2bec3 | 134 | Read the named files and act as if their contents stood in |
aa7e4406 FT |
135 | place of the *include* stanza. A 'FILENAME' may be a glob |
136 | pattern, in which case all matching files are used, sorted by | |
137 | their filenames. If a 'FILENAME' is a relative path, it is | |
138 | treated relative to the directory containing the file from | |
139 | which the *include* stanza was read, even if the inclusion has | |
140 | been nested. Inclusions may be nested to any level. | |
141 | ||
e7e3e593 FT |
142 | *index-file* ['FILENAME'...]:: |
143 | ||
144 | The given 'FILENAMEs' are used for finding index files (see | |
145 | URL-TO-FILE MAPPING above). Specifying *index-file* overrides | |
146 | entirely any previous specification in a more distant | |
147 | configuration file, rather than adding to it. Zero 'FILENAMEs' | |
148 | may be given to turn off index file searching completely. The | |
149 | *index-file* directive accepts no follow-up lines. | |
150 | ||
a19b6c77 FT |
151 | *dot-allow* ['PATTERN'...]:: |
152 | ||
153 | As described under 404 RESPONSES, a path element beginning | |
154 | with a dot character is rejected by default, but the | |
155 | *dot-allow* directive allows certain dot-files or -directories | |
156 | to be selectively allowed. Each 'PATTERN' is an ordinary glob | |
157 | pattern, the matching of which allows access to a given path | |
158 | element. When checking for access to dot-files or | |
159 | -directories, only the *dot-allow* directive "closest" to the | |
160 | file under consideration is used. It should be noted that the | |
161 | default configuration file for *dirplex* contains a | |
162 | *dot-allow* directive for the `.well-known` directory. | |
163 | ||
e7e3e593 FT |
164 | *child* 'NAME':: |
165 | ||
166 | Declares a named, persistent request handler (see *ashd*(7) | |
167 | for a more detailed description of persistent handlers). It | |
168 | must contain exactly one follow-up line, *exec* 'PROGRAM' | |
169 | ['ARGS'...], specifying the program to execute and the | |
170 | arguments to pass it. If given in a `.htrc` file, the program | |
171 | will be started in the same directory as the `.htrc` file | |
172 | itself. The *child* stanza itself serves as the identity of | |
173 | the forked process -- only one child process will be forked | |
174 | per stanza, and if that child process exits, it will be | |
175 | restarted the next time the stanza would be used. If a `.htrc` | |
176 | file containing *child* stanzas is reloaded, any currently | |
177 | running children are reused for *child* stanzas in the new | |
178 | file with matching names (even if the *exec* line has | |
179 | changed). | |
180 | ||
181 | *fchild* 'NAME':: | |
182 | ||
183 | Declares a named, transient request handler (see *ashd*(7) for | |
16c2bec3 | 184 | a more detailed description of transient handlers). It must |
67223ca4 | 185 | contain exactly one follow-up line, *exec* 'PROGRAM' |
e7e3e593 FT |
186 | ['ARGS'...], specifying the program to execute and the |
187 | arguments to pass it. In addition to the specified arguments, | |
188 | the HTTP method, raw URL and the rest string will be appended | |
9f974c1f FT |
189 | as described in *ashd*(7). If given in a `.htrc` file, the |
190 | program will be started in the same directory as the `.htrc` | |
191 | file itself. | |
e7e3e593 | 192 | |
5ff7def2 | 193 | *match* ['TYPE']:: |
e7e3e593 FT |
194 | |
195 | Specifies a filename pattern-matching rule. The | |
196 | pattern-matching procedure and the follow-up lines accepted by | |
197 | this stanza are described below, under MATCHING. | |
198 | ||
54490135 | 199 | *capture* 'HANDLER' ['FLAGS']:: |
e7e3e593 FT |
200 | |
201 | Only meaningful in `.htrc` files. If a *capture* directive is | |
202 | specified, then the URL-to-file mapping procedure as described | |
203 | above is aborted as soon as the directory containing the | |
204 | `.htrc` file is encountered. The request is passed, with any | |
205 | remaining rest string, to the specified 'HANDLER', which must | |
fda48525 | 206 | be a named request handler specified either in the same |
e7e3e593 | 207 | `.htrc` file or elsewhere. The *capture* directive accepts no |
16c2bec3 | 208 | follow-up lines. Note that the `X-Ash-File` header is not |
257cd4a2 FT |
209 | added to requests passed via *capture* directives. Normally, |
210 | *capture* directives will be ignored if they appear in the | |
211 | root directory that *dirplex* serves, but not if 'FLAGS' | |
212 | contain the character `D`. | |
54490135 | 213 | |
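The directives above could be combined as in the following sketch.
The `conf.d` and `uploads` directories and the `.thumbs` pattern are
hypothetical; the *child* stanza is borrowed from EXAMPLES below. The
comments indicate which file each fragment is assumed to live in.

--------
# In the global dirplex.rc: read extra files, sorted by file name,
# from conf.d/ next to this file.
include conf.d/*.rc

index-file index.html index

# In uploads/.htrc: no index file searching in or beneath uploads/.
index-file

# Also in uploads/.htrc: this becomes the closest dot-allow directive
# for files in or beneath uploads/, so .well-known is repeated to keep
# it reachable there.
dot-allow .well-known .thumbs

# In the .htrc of the served root directory: pass every request to one
# handler. Without the D flag, capture would be ignored in the root.
child foo
    exec callscgi scgi-wsgi -p . foo

capture foo D
--------
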
e7e3e593 FT |
214 | MATCHING |
215 | -------- | |
216 | ||
217 | When a file or directory has been found by the mapping procedure (see | |
218 | URL-TO-FILE MAPPING above), the name of the physical file is examined | |
219 | to determine a request handler to pass the request to. Note that only | |
220 | the physical file name is ever considered; any logical request | |
221 | parameters such as the request URL or the rest string are entirely | |
222 | ignored. | |
223 | ||
224 | To match a file, any *match* stanzas specified by any `.htrc` file or | |
225 | in the global configuration files are searched in order of their | |
5ff7def2 FT |
226 | "distance" (see CONFIGURATION above) from the actual file. Which |
227 | *match* stanzas are considered depends on the type of the file being | |
228 | matched: if an ordinary file is being matched, only *match* stanzas | |
229 | without any 'TYPE' parameter are considered, while if it is a | |
859d7a3d | 230 | directory, only those with the 'TYPE' parameter specified as |
5ff7def2 FT |
231 | *directory* are considered. 'TYPE' can also take the value *notfound*, |
232 | described below under 404 RESPONSES. | |
e7e3e593 FT |
233 | |
234 | A *match* stanza must contain at least one follow-up line specifying | |
235 | match rules. All rules must match for the stanza as a whole to match. | |
236 | The following rules are recognized: | |
237 | ||
238 | *filename* 'PATTERN'...:: | |
239 | ||
240 | Matches if the name of the file under consideration matches | |
241 | any of the 'PATTERNs'. A 'PATTERN' is an ordinary glob | |
242 | pattern, such as `*.php`. See *fnmatch*(3) for more | |
243 | information. | |
244 | ||
245 | *pathname* 'PATTERN'...:: | |
246 | ||
edcd094e | 247 | Matches if the entire path of the file under consideration |
e7e3e593 FT |
248 | matches any of the 'PATTERNs'. A 'PATTERN' is an ordinary glob |
249 | pattern, except that slashes are not matched by wildcards. See | |
edcd094e FT |
250 | *fnmatch*(3) for more information. If a *pathname* rule is |
251 | specified in a `.htrc` file, the path will be examined as | |
252 | relative to the directory containing the `.htrc` file, rather | |
253 | than to the root directory being served. | |
e7e3e593 FT |
254 | |
255 | *default*:: | |
256 | ||
257 | Matches if and only if no *match* stanza without a *default* | |
16c2bec3 | 258 | rule matches (in any configuration file). |
e7e3e593 | 259 | |
7711283c FT |
260 | *local*:: |
261 | ||
2f942860 FT |
262 | Valid only in `.htrc` files, *local* matches if and only if |
263 | the file under consideration resides in the same directory as | |
264 | the containing `.htrc` file. | |
7711283c | 265 | |
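As a sketch of how the rules combine, the stanzas below could appear
in a `.htrc` file. A complete stanza also needs an action line, so a
*handler* line is included in each. The `img` directory and the file
patterns are hypothetical.

--------
fchild send
    exec sendfile

# Both rules must match: *.txt files residing directly in the
# directory of this .htrc file.
match
    filename *.txt
    local
    handler send

# The path is taken relative to the directory of this .htrc file, and
# the wildcard does not match slashes: img/a.png matches, while
# img/old/a.png does not.
match
    pathname img/*.png
    handler send

# Considered only when no stanza without a default rule matches.
match
    default
    handler send
--------
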
e7e3e593 FT |
266 | In addition to the rules, a *match* stanza must contain exactly one |
267 | follow-up line specifying the action to take if it matches. The | |
268 | following actions are recognized: | |
269 | ||
270 | *handler* 'HANDLER':: | |
271 | ||
272 | 'HANDLER' must be a named handler (see CONFIGURATION | |
273 | above). The named handler is searched for not only in the same | |
274 | configuration file as the *match* stanza, but in all | |
275 | configuration files that are valid for the file under | |
276 | consideration, in order of distance. As such, a more deeply | |
277 | nested `.htrc` file may override the specified handler without | |
278 | having to specify any new *match* stanzas. | |
279 | ||
280 | *fork* 'PROGRAM' ['ARGS'...]:: | |
281 | ||
282 | Run a transient handler for this file, as if it were specified | |
283 | by a *fchild* stanza. This action exists mostly for | |
284 | convenience. | |
285 | ||
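The lookup order for named handlers can be sketched as follows,
reusing the PHP declarations from EXAMPLES below. The `app` directory
is hypothetical.

--------
# In the global dirplex.rc:
fchild php
    exec callcgi -p php-cgi

match
    filename *.php
    handler php

# In app/.htrc: only the handler is declared again. For files in or
# beneath app/, this closer declaration of php is expected to be found
# first, so the global match stanza leads to FastCGI there.
child php
    exec callfcgi multifscgi 5 php-cgi
--------
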
8cc893f5 FT |
286 | A *match* stanza may also contain any number of the following |
287 | optional directives: | |
77a840e5 FT |
288 | |
289 | *set* 'HEADER' 'VALUE':: | |
290 | ||
291 | If the *match* stanza is selected as the match for a file, the | |
292 | named HTTP 'HEADER' in the request is set to 'VALUE' before | |
293 | passing the request on to the specified handler. | |
294 | ||
8cc893f5 FT |
295 | *xset* 'HEADER' 'VALUE':: |
296 | ||
fda48525 | 297 | *xset* does exactly the same thing as *set*, except that |
8cc893f5 FT |
298 | 'HEADER' is automatically prepended with the `X-Ash-` |
299 | prefix. The intention is only to make configuration files | |
300 | look nicer in this very common case. | |
301 | ||
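To illustrate the relation between the two directives, the stanzas
below are meant as alternative spellings of the same configuration,
of which only one would be used. The `*.css` pattern is a
hypothetical choice.

--------
fchild send
    exec sendfile

# With xset, the header name is given without its prefix.
match
    filename *.css
    xset content-type text/css
    handler send

# The same stanza written with set and the full header name.
match
    filename *.css
    set X-Ash-content-type text/css
    handler send
--------
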
b70b2d4f FT |
302 | 404 RESPONSES |
303 | ------------- | |
304 | ||
16c2bec3 | 305 | An HTTP 404 response is sent to the client if |
b70b2d4f | 306 | |
16c2bec3 FT |
307 | * The mapping procedure fails to find a matching physical file; |
308 | * A path element is encountered during mapping which, after URL | |
309 | unescaping, either begins with a dot or contains slashes; | |
310 | * The mapping procedure finds a file which is neither a directory nor | |
ca69b584 | 311 | a regular file (nor a symbolic link to any of the same); |
16c2bec3 FT |
312 | * An empty, non-final path element is encountered during mapping; or |
313 | * The mapping procedure results in a file which is not matched by any | |
b70b2d4f FT |
314 | *match* stanza. |
315 | ||
5ff7def2 FT |
316 | By default, *dirplex* will send a built-in 404 response, but there are |
317 | two ways to customize the response: | |
318 | ||
319 | First, *match* stanzas with the type *notfound* will be matched | |
320 | against any request that would result in a 404 error. The filename for | |
321 | such matching is that of the last successfully found component, which | |
322 | may be a directory, for example in case a name component could not be | |
323 | found in the real filesystem; or a file, for example in case a file | |
324 | was found, but not matched by any *match* stanzas. | |
325 | ||
326 | Otherwise, any request that would result in a 404 response but is | |
327 | matched by no *notfound* stanza is instead passed to a default handler | |
328 | named `.notfound`, which is handled internally in *dirplex* by | |
329 | default, but may be overridden just as any other handler may be in a | |
330 | `.htrc` file or by global configuration. Note, however, that any | |
331 | request not matched by a *notfound* stanza will not have the | |
332 | `X-Ash-File` header added to it. | |
b70b2d4f FT |
333 | |
334 | The built-in `.notfound` handler can also be used in *match* or | |
16c2bec3 FT |
335 | *capture* stanzas (for example, to restrict access to certain files or |
336 | directories). | |
e7e3e593 FT |
337 | |
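Both customizations might look as in the following sketch. The
handler name `errpage`, the program `my-404-page` and the `*.inc`
pattern are hypothetical, and the sketch assumes that the *default*
rule behaves in *notfound* stanzas as it does in other *match*
stanzas.

--------
# my-404-page stands for some program that produces the error page.
fchild errpage
    exec my-404-page

match notfound
    default
    handler errpage

# Hide include files by answering with the .notfound handler.
match
    filename *.inc
    handler .notfound
--------
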
338 | EXAMPLES | |
339 | -------- | |
340 | ||
341 | The *sendfile*(1) program can be used to serve HTML files as follows. | |
342 | ||
343 | -------- | |
eb968b93 FT |
344 | fchild send |
345 | exec sendfile | |
346 | ||
e7e3e593 | 347 | match |
fda48525 | 348 | filename *.html *.htm |
eb968b93 FT |
349 | xset content-type text/html |
350 | handler send | |
e7e3e593 FT |
351 | -------- |
352 | ||
353 | Assuming the PHP CGI interpreter is installed on the system, PHP | |
354 | scripts can be used with the following configuration, using the | |
355 | *callcgi*(1) program. | |
356 | ||
357 | -------- | |
16c2bec3 FT |
358 | # To use plain CGI, which uses more resources per handled request, |
359 | # but fewer static resources: | |
e7e3e593 FT |
360 | fchild php |
361 | exec callcgi -p php-cgi | |
16c2bec3 FT |
362 | |
363 | # To use FastCGI, which keeps PHP running at all times, but uses less | |
364 | # resources per handled request: | |
365 | child php | |
366 | exec callfcgi multifscgi 5 php-cgi | |
367 | ||
e7e3e593 FT |
368 | match |
369 | filename *.php | |
370 | handler php | |
371 | -------- | |
372 | ||
373 | If there is a directory without an index file, a file listing can be | |
374 | automatically generated by the *htls*(1) program as follows. | |
375 | ||
376 | -------- | |
377 | match directory | |
378 | default | |
379 | fork htls | |
380 | -------- | |
381 | ||
16c2bec3 FT |
382 | The following configuration can be placed in a `.htrc` file in order |
383 | to dedicate the directory containing that file to some external SCGI | |
384 | script engine. Note that *callscgi*, and therefore the script engine | |
fda48525 FT |
385 | itself, is started in the same directory, so that arbitrary code |
386 | modules or data files can be put directly in that directory and be | |
387 | easily found. | |
e7e3e593 FT |
388 | |
389 | -------- | |
390 | child foo | |
391 | exec callscgi scgi-wsgi -p . foo | |
392 | ||
393 | capture foo | |
394 | -------- | |
395 | ||
396 | AUTHOR | |
397 | ------ | |
398 | Fredrik Tolf <fredrik@dolda2000.com> | |
399 | ||
400 | SEE ALSO | |
401 | -------- | |
402 | *ashd*(7) |