# This file contains examples and an explanation for the RULESFILE / RULE
# feature.
#
# Rules for Lynx are experimental.  They provide a rudimentary capability
# for URL rejection and substitution based on string matching.
# Most users and most installations will not need this feature; it is here
# in case you find it useful.  Note that this may change or go away in
# future releases of Lynx; if you find it useful, consider describing your
# use of it in a message to <lynx-dev@nongnu.org>.
#
# Syntax:
# =======
# Summary of common forms:
#
#   Fail           URL1
#   Map            URL1  URL2      [CONDITION]
#   Pass           URL1  [URL2]    [CONDITION]
#   Redirect       URL1  URL2      [CONDITION]
#   RedirectPerm   URL1  URL2      [CONDITION]
#   UseProxy       URL1  PROXYURL  [CONDITION]
#   UseProxy       URL1  "none"    [CONDITION]
#
#   Alert          URL1  MESSAGE   [CONDITION]
#   AlwaysAlert    URL1  MESSAGE   [CONDITION]
#   UserMsg        URL1  MESSAGE   [CONDITION]
#   InfoMsg        URL1  MESSAGE   [CONDITION]
#   Progress       URL1  MESSAGE   [CONDITION]
#
# As you may have guessed, comments are introduced by a '#' character.
# Rules have the general form
#   Operator  Operand1  [Operand2]  [CONDITION]
# with words separated by whitespace.  Words containing space can be quoted
# with "double quotes".  Although normally this should not be necessary
# for URLs, it has to be used for MESSAGE Operands in Alert etc.
# See below for an explanation of the optional CONDITION.
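#
# As an illustrative sketch of the general form (the host name below is a
# placeholder, not a real site), a rule with a quoted MESSAGE operand and a
# CONDITION could be written as
#   Alert  http://old.example.com/*  "This site has moved"  if redirected
# Here "Alert" is the operator, the URL pattern is Operand1, the quoted text
# is Operand2, and "if redirected" is the CONDITION.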
#
# Recognized operators are
#
#   Fail  URL1
# Reject access to this URL, stop processing further rules.
#
#   Map   URL1  URL2
# Change the current URL to URL2, then continue processing.
#
#   Pass  URL1  [URL2]
# Accept this URL and stop processing further rules; if URL2
# is given, apply this as the last mapping.
# See the next item for reasons why you generally don't want to "pass"
# a changed URL.
#
#   RedirectTemp       URL1  URL2
#   RedirectPerm       URL1  URL2
#   Redirect [STATUS]  URL1  URL2
# Stop processing further rules and redirect to URL2, just as if lynx had
# received an HTTP redirection with URL2 as the new location.  This means that
# URL2 is subject to any applicable permission checking; if it passes, a new
# request will be issued (which may result in a new round of rules checking,
# with a new "current URL") or the new URL might be taken from the cache, and,
# after successful loading, lynx' idea of what the loaded document's URL is
# will be fully updated.  All this does not happen if you just "pass" a changed
# URL (or let it fall through), so this is generally the preferred way for
# substituting URLs.
# If the RedirectPerm variant is used, or if the optional word is supplied and
# is either "permanent" or "301", act as if lynx had received a permanent
# redirection (with HTTP status 301).  In most cases this will not make a
# noticeable difference.  Lynx may cache the location in a special way for 301
# redirections, so that the redirection is followed immediately the next time
# the same original URL is accessed, without re-checking of rules.  Therefore
# the permanent variant should never be used if the desired outcome of rules
# processing depends on variable conditions (see CONDITIONS below) or on
# setting a special flag (see next item).
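#
# For illustration only (host names are placeholders), the following three
# rules should be equivalent according to the description above, each acting
# like a permanent (301) redirection:
#   RedirectPerm        http://old.example.com/*  http://new.example.com/*
#   Redirect permanent  http://old.example.com/*  http://new.example.com/*
#   Redirect 301        http://old.example.com/*  http://new.example.com/*
# Without the optional word, a plain "Redirect" is presumably treated as a
# non-permanent redirection, like RedirectTemp.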
#
#   PermitRedirection  URL1
# Mark following redirection as permitted, and continue processing.  Some
# redirection locations are normally not allowed, because permitting them in a
# response from an arbitrary remote server would open a security hole, and
# others are not allowed if certain restriction options are in effect.  Among
# redirection locations normally always forbidden are lynxprog:  and lynxexec:
# schemes.  With "default" anonymous restrictions in effect, many URL schemes
# are disallowed if the user would not be allowed to use them with 'g'oto.
# This rule allows overriding the permission checking if rules processing ends
# with a Redirect (including the RedirectPerm or RedirectTemp forms).  It is
# ignored otherwise; in particular, it does not influence acceptance if rules
# processing ends with a "Pass" and a real redirection is received in the
# subsequent HTTP request.  If redirections are chained, it only applies to the
# redirection that ends the same rules cycle.  Note that the new URL is still
# subject to other permission checks that are not specific to redirections; but
# using this rule may still weaken the expected effect of -anonymous,
# -validate, -realm, and other restriction options, including TRUSTED_EXEC and
# similar in lynx.cfg, so be careful where you redirect to if restrictions are
# important!
#
#   UseProxy  URL1  PROXYURL
# Stop processing further rules, and force access through the proxy given by
# PROXYURL.  PROXYURL should have the same form as required for foo_proxy
# environment variables and lynx.cfg options, i.e., (unless you are trying to
# do something unusual) "http://some.proxy-server.dom:port/".  This rule
# overrides any use of a proxy (or external gateway) that might otherwise apply
# because of environment variables or lynx.cfg options; it also overrides any
# "no_proxy" settings.
#
#   UseProxy  URL1  none
# Mark request as NOT using any proxy (or external gateway), and continue
# processing(!).  For a request marked this way, any subsequent UseProxy
# rule with a PROXYURL will be ignored, and any use of a proxy (or external
# gateway) that might otherwise apply because of environment variables or
# lynx.cfg options will be overridden.  Note that the marking will not
# survive a Redirect rule (since that will result, if successful, in a
# new request).
#
#   Alert         URL1  MESSAGE
#   AlwaysAlert   URL1  MESSAGE
#   UserMsg       URL1  MESSAGE
#   InfoMsg       URL1  MESSAGE
#   Progress      URL1  MESSAGE
# These produce various kinds of statusline messages immediately when the
# rule is applied; they differ in whether a pause is enforced and in its
# duration.  AlwaysAlert shows the message text even in non-interactive mode
# (-dump, -source, etc.).  Rule processing continues after the message is
# shown.  As usual, these rules only apply if URL1 matches.  MESSAGE is
# the text to be displayed; it can contain one occurrence of "%s", which
# will be replaced by the current URL.  Literal '%' characters should be
# doubled as "%%".
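#
# A hedged example of the "%s" and "%%" conventions (the URL pattern and
# message texts are made up for illustration):
#   InfoMsg  http://example.com/*  "Rules matched %s"
#   UserMsg  http://example.com/*  "Mirror is 100%% in sync"
# For a request for "http://example.com/dir/doc.html", the first message
# would be expected to read: Rules matched http://example.com/dir/doc.html
# and the second: Mirror is 100% in sync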
#
# Rules are processed sequentially first to last for each request; a rule
# applies if the current URL matches URL1.  The current URL is initially the
# URL for the resource the user is trying to access, but may change as the
# result of applied Map rules.  Case-sensitive (!) string comparison is used;
# in addition URL1 can contain one '*' which is interpreted as a wildcard
# matching 0 or more characters.  So if for example
# "http://example.com/dir/doc.html" is requested, it would match any of
# the following:
#   Pass  http:*
#   Pass  http://example.com/*.html
#   Pass  http://example.com/*
#   Pass  http://example*
#   Pass  http://*/doc.html
# but not:
#   Pass  http://example/*
#   Pass  http://Example.COM/dir/doc.html
#   Pass  http://Example.COM/*
#
# If a URL2 is given and also contains a '*', that character will be
# replaced by whatever matched in URL1.  Processing stops with the
# first matching "Fail" or "Pass" or when the end of the rules is reached.
# If the end is reached without a "Fail" or "Pass", the URL is allowed
# (equivalent to a final "Pass *").
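#
# A worked example of the substitution, as a sketch (mirror.example.org is
# a hypothetical host).  Given the rules
#   Map   http://example.com/*.html  http://mirror.example.org/*.html
#   Pass  http://mirror.example.org/*
#   Fail  http:*
# a request for "http://example.com/dir/doc.html" matches the Map rule with
# '*' standing for "dir/doc", so the current URL should become
# "http://mirror.example.org/dir/doc.html".  Processing continues, the Pass
# rule matches the changed URL, and processing stops there; the Fail rule is
# never reached for this request.  A request for "http://example.com/pic.gif"
# matches neither of the first two rules and is rejected by the Fail rule.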
#
# The requested URL will have been transformed to Lynx' normal
# representation.  This means that local file resources should be
# expected in the form "file://localhost/<path using slash separators>",
# not in the machine's native representation for filenames.
#
# Anyone with experience configuring the venerable CERN httpd server will
# recognize some of the syntax - in fact, the code implementing rules goes
# back to a common ancestor.  But note the differences: all URLs and URL-
# patterns here have to be given as absolute URLs, even for local files.
# (Absolute URLs don't imply proxying.)
#
# CONDITIONS
# ----------
# All rules mentioned can be followed by an optional CONDITION, which can
# be used to further restrict when the rule should be applied (in addition
# to the match on URL1).  A CONDITION takes one of the forms
#   "if"     CONDITIONFLAG
#   "unless" CONDITIONFLAG
# and currently two condition flags are recognized:
#   "userspecified"   (or abbreviated "userspec")
#   "redirected"
# To explain these, first some terms need to be defined.
#
# A user action (like following a link, or entering a 'g'oto URL) can either be
# rejected immediately (for example, because of restrictions in effect, or
# because of invalid input), or can generate a "request".  For the purpose of
# this discussion, a "request" is the sequence of processing done by lynx,
# which might ultimately lead to an actual network request and loading and
# display of data; a request can also result in rejection (for example, some
# restrictions are checked at this stage), or in a redirection.  A redirection
# in turn can be rejected (which makes the request fail), or can automatically
# generate a new request.  A "request chain" is the sequence of one or more
# requests triggered by the same user event that are chained together by
# redirections.
# For each request, some URL schemes are handled (or rejected) specially (see
# Limitation 1 below); the others are passed to the generic access code.  Rules
# processing occurs at the beginning of the generic access code, before a
# request is dispatched to the scheme-specific protocol module (but after
# checking whether the request can be satisfied by re-displaying an already
# cached document).
# With these definitions, the meaning of the possible CONDITIONFLAGS is:
#
#   if redirected
# The rule applies if the current request results from a redirection;
# whether that was a real HTTP redirection or one generated by a rule
# in the previous request makes no difference.  In other words, the
# condition is true if the current request is not the first one in the
# request chain.
#
#   if userspecified
# The rule applies if the initial URL of the request chain was specified
# by the user.  Lynx marks a request as "user specified" for URLs that
# come from 'g'oto prompts, as well as for following links in a bookmark
# or Jump file and some other special (lynx-generated) pages that may
# contain URLs that were typed in by the user.
# Note that this is not a property of the request, but of the whole request
# chain (based on where the first request's URL came from).  The current
# URL may differ from what the user typed
# - because of initial fixups, including conversion of Guess-URLs and file
#   paths to full URLs,
# - because of Map rules applied, and/or
# - because of a previous redirection.
# So to make reasonably sure a suspicious or potentially dangerous URL has
# been entered by the user, i.e. is not a link or external redirection
# location that cannot be trusted, a combination of "userspecified" and
# "redirected" flags should be used, for example
#   Fail URL1 unless userspecified
#   Fail URL1 if redirected
#   ...
#
# CAVEAT
# ======
# First, to squash any false expectations, an example for what NOT TO DO.
# It might be expected that a rule like
#   Fail  file://localhost/etc/passwd		# <- DON'T RELY ON THIS
# could be used to prevent access to the file "/etc/passwd".  This might
# fool a naive user, but the more sophisticated user could still gain
# access, by experimenting with other forms like (@@@ untested)
# "file://<machine's domain name>/etc/passwd" or "/etc//passwd"
# or "/etc/p%61sswd" or "/etc/passwd?" or "/etc/passwd#X" and so on.
# There are many URL forms for accessing the same resource, and Lynx
# just doesn't guarantee that URLs for the same resource will look the
# same way.
#
# The same reservation applies to any attempts to block access to unwanted
# sites and so on.  This isn't the right place for implementing it.
# (Lynx has a number of mechanisms documented elsewhere to restrict access,
# see the INSTALLATION file, lynx.cfg, lynx -help, lynx -restrictions.)
#
# Some more useful applications:
#
# 1. Disabling URLs by access scheme
# ----------------------------------
#   Fail  gopher:*
#   Fail  finger:*
#   Fail  lynxcgi:*
#   Fail  LYNXIMGMAP:*
# This should work (but no guarantees) because Lynx canonicalizes
# the case of recognized access schemes and does not interpret
# %-escaping in the scheme part (@@@ always?)
#
# Note that for many access schemes Lynx already has mechanisms to
# restrict access (see lynx.cfg, -help, -restrictions, etc.), others
# have to be specifically enabled.  Those mechanisms should be used
# in preference.
# Note especially Limitation 1 below.
# "Fail" rules can be used for the remaining cases, or in addition by the
# more paranoid.  Note that disabling "file:*" will also make many
# of the special pages generated by lynx as temporary files (INFO,
# history, ...) inaccessible; on the other hand it doesn't prevent
# _writing_ of various temp files - probably not what you want.
#
# You could also direct access for a scheme to a brief text explaining
# why it's not available:
#   Redirect news:*   http://localhost/texts/newsserver-is-broken.html
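#
# As a further sketch, a message rule placed before the "Fail" can tell the
# user why access was refused (the message text is only an example):
#   Alert  gopher:*  "gopher: access is disabled by local rules"
#   Fail   gopher:*
# The Alert must come first, since "Fail" stops processing of further rules.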
#
# 2. Preventing accidental access
# -------------------------------
# If there is a page or site you don't want to access for whatever
# reason (say there's a link to it that crashes Lynx [don't forget to
# report a bug], or if that starts sending you a 5 MB file you don't
# want, or you just don't like the people...), you can prevent yourself
# from accidentally accessing it:
#    Fail  http://bad.site.com/*
#
# 3. Compressed files
# -------------------
# You have downloaded a bunch of HTML documents, and compressed them
# to save space.  Then you discover that links between the files don't
# work, because they all use the names of the uncompressed files.  The
# following kind of rule will allow you to navigate, invisibly accessing
# the compressed files:
#   Map file://localhost/somedir/*.html file://localhost/somedir/*.html.gz
# or, perhaps better:
#   Redirect file://localhost/somedir/*.html file://localhost/somedir/*.html.gz
#
# 4. Use local copies
# -------------------
# You have downloaded a tree of HTML documents, but there are many links
# between them that still point to the remote location.  You want to access
# the local copies instead, after all that's why you downloaded them.  You
# could start editing the HTML, but the following might be simpler:
#  Map http://remote.com/docs/*.html file://localhost/home/me/docs/*.html
# Or even combine this with compressing the files:
#  Map http://remote.com/docs/*.html file://localhost/home/me/docs/*.html.gz
#
# Again, replacing the "Map" with "Redirect" is probably better - it will
# allow you to see the _real_ location on the lynx INFO screen or in the
# HISTORY list, will avoid duplicates in the cache if the same document is
# loaded with two different URLs, and may allow you to 'e'dit the local
# copy from within lynx if you feel like it.
#
# 5. Broken links etc.
# --------------------
# A user has moved from http://www.siteA.com/~jdoe to http://siteB.org/john,
# or http://www.provider.com/company/ has moved to their own server
# http://www.company.com, but there are still links to the old location
# all over the place; they now are broken or lead to a stupid "this page
# has moved, please update your bookmarks. Refresh in 5 seconds" page
# which you're tired of seeing.  This will not fix your bookmarks, and
# it will let you see the outdated URLs for longer (Limitation 3 below),
# but for a quick fix:
#   Redirect   http://www.siteA.com/~jdoe/*      http://siteB.org/john/*
#   Redirect   http://www.provider.com/company/* http://www.company.com/*
#
# You could use "Map" instead of "Redirect", but this would let you see the
# outdated URLs for longer and even bookmark them, and you are likely to
# create invalid links if not all documents from a site are mapped
# (Limitation 3).
#
# 6. DNS troubles
# ---------------
# A special case of broken links.  If a site is inaccessible because the
# name cannot be resolved (your or their name server is broken, or the
# name registry once again made a mistake, or they really didn't pay in
# time...) but you still somehow know the address; or if name lookups are
# just too slow:
#   Map   http://www.somesite.com/*  http://10.1.2.3/*
# (You could do the equivalent more cleanly by adding an entry to the hosts
# file, if you have access to it.)
#
# Or, if a name resolves to several addresses of which one is down, and the
# DNS hasn't caught up:
#   Map   http://www.w3.org/*    http://www12.w3.org/*
#
# Note that this can break access to some name-based virtually hosted sites.
#
# In this case use of "Map" is probably preferred over "Redirect", as long
# as the URL on the left side contains the real and preferred hostname or
# the problem is only temporary.
#
# 7. Avoid redirections
# ---------------------
# Some sites have a habit of providing links that don't go to the destination
# directly but always force redirection via some intermediate URL.  The
# delay imposed by this, especially for users with slower connections and
# for overloaded servers, can be avoided if the intermediate URLs always
# follow some simple pattern: we can then anticipate the redirect that will
# inevitably follow and generate it internally.  For example,
#   Redirect http://lwn.net/cgi-bin/vr/*    http://*
#
# Warning: The page authors may not like this circumvention.  Often the
# redirection is wanted by them to track access, sometimes in connection
# with cookies.  Some sites may employ mechanisms that defeat the shortcut.
# It is your responsibility to decide whether use of this feature is
# acceptable.  (But note that the same effect can be achieved anyway for
# any link by editing the URL, e.g. with the ELGOTO ('E') key in Lynx, so
# a shortcut like this does not create some new kind of intrusion.)
#
# 8. Detailed proxy selection
# ---------------------------
# Basic use for this one should be obvious, if you have a need for it.
# It simply allows selecting use (or non-use) of proxies on a more detailed
# level than the traditional <scheme>_proxy and no_proxy variables, as well
# as using different proxies for different sites.
# For example, to request access through an anonymizing proxy for all pages
# on a "suspicious" site:
#   UseProxy  http://suspicious.site/*  http://anonymizing.proxy.dom/
# (as long as all URLs really have a matching form, not some alternative
# like <http://suspicious.site:80/> or <http://SuSpIcIoUs.site/>!)
#
# To access some site through a local squid proxy, running on the same host
# as lynx, except for some image types (say because you rarely access images
# with lynx anyway, and if you do, you don't want them cached by the proxy):
#   UseProxy  http://some.site/*.gif  none
#   UseProxy  http://some.site/*.jpg  none
#   UseProxy  http://some.site/*      http://localhost:3128/
# Note that order is important here.
#
# To exempt a local address from all proxying:
#   UseProxy  http://local.site/*  none
#
# Note however that for some purposes the "no_proxy" setting may be better
# suited than "UseProxy ... none", because of its different matching logic
# (see comments in lynx.cfg).
#
# 9. Invent your own scheme
# -------------------------
# Suppose you want to teach lynx to handle a completely new URL scheme.
# If what's required for the new scheme is already available in lynx in
# _some_ way, this may be possible with some inventive use of rules.
# As an example, let's assume you want to introduce a simple "man:" scheme
# for showing manual pages, so (for a Unix-like system, at least) "man:lynx"
# would display the same help information as the "man lynx" command and so
# on (we ignore section numbers etc. for simplicity here).
# First, since lynx doesn't know anything about a "man:" scheme, it will
# normally reject any such URLs at an early stage.  However, a trick exists
# to bypass that hurdle: define a man_proxy environment variable *outside of
# lynx, before starting lynx* (it won't work in lynx.cfg); the actual value
# is unimportant and won't actually be used.  For example, in your shell:
#   export man_proxy=X
#
# If you already have some kind of HTTP-accessible man gateway available,
# the task then probably just amounts to transforming the URL into the right
# form.  For one such gateway (in this case, a CGI script running on the
# local machine), the rule
#   Redirect man:* http://localhost/cgi-bin/dwww?type=runman&location=*/
# or, alternatively,
#   UseProxy man:* none
#   Map      man:* http://localhost/cgi-bin/dwww?type=runman&location=*/
# does it; for other setups the right-hand side just has to be modified
# appropriately.  The "UseProxy" is to make sure the bogus man_proxy gets
# ignored.
#
# If no CGI-like access is available, you might want to invoke your system's
# man command directly for a man: URL.  Here is some discussion of how this
# could be done, and why ultimately you may not want to do it; this is also
# an opportunity to show examples for how some of the rules and conditions
# can be used that haven't been discussed in detail elsewhere.
# Lynx provides the lynxexec: (and the similar lynxprog:) scheme for running
# (nearly) arbitrary commands locally.  At the heart of employing it for
# man: would be a rule like this:
#   Redirect          man:*  "lynxexec:/usr/bin/man *"
# (It is a peculiarity of this scheme that the literal space and quoting
# are necessary here.  Also note that Map cannot be used here instead of
# Redirect, since lynxexec, as a special kind of URL, needs to be handled
# "early" in a request.)
# Of course, execution of arbitrary commands is a potentially dangerous
# thing.  lynxexec has to be specifically enabled at compile time and in
# lynx.cfg (or with command line options), and there are various levels
# of control, too much to go into here.  It is assumed in the following that
# lynxexec has been enabled to the degree necessary (allow /usr/bin/man
# execution) but hopefully not too much.
# What needs to be prevented is that allowing local execution of the man
# command might unintentionally open up unwanted execution of other commands,
# possibly by some trick that could be exploited.  For example, redirecting
# man:* as above, the URL "man:lynx;rm -r *" could result in the command
# "man lynx;rm -r *" executed by the system, with obvious disastrous results.
# (This particular example won't actually work, for several reasons; but
# for the purpose of discussion let's assume it did; there may be similar
# ones that do.)
# Because of such dangers, redirection to a lynxexec: is normally never
# accepted by lynx.  We need at least a PermitRedirection rule to override
# this protective limitation:
#   PermitRedirection man:*
#   Redirect          man:*  "lynxexec:/usr/bin/man *"
# But now we have potentially opened up local execution more than is
# acceptable via the man: scheme, so this needs to be examined.
# There are two aspects to security here: (1) restricting the user, and (2)
# protecting the user.  The first could also be phrased as protecting the
# system from the user; the second as preventing lynx (and the system) from
# doing things the user doesn't really want.  Aspect (1) is very important
# for setups providing anonymous guest accounts and similarly restricted
# environments.  (Otherwise shell access is normally allowed, and trying to
# protect the system in lynx would be rather pointless.)  As far as access
# to some URLs is concerned, the difference can be characterized in terms of
# which sources of URLs are trusted enough to allow access: for (1), only
# links occurring in a limited number of documents are trusted enough for
# some (or all) URLs, user input at 'g'oto prompts and the like is not (if
# not completely disabled).  For (2) and assuming a user with normal shell
# privileges, the user may be trusted enough to accept any URL explicitly
# entered, but URLs from arbitrary external sources are not - someone might
# try to use them to trick the user (by following an innocent-looking link)
# or lynx (by following a redirection) into doing something undesirable.
#
# In the following we are concerned with (2); it is assumed that providers
# of anonymous accounts would not want to follow this path, and would have
# no need for additional schemes that imply local execution anyway.  (For
# one thing, with the man example they would have to carefully check that
# users cannot break out of the man command to a local shell prompt.)
#
# Getting back to the example, it was already mentioned that lynx does not
# allow redirections to lynxexec.  In fact this continues to be disallowed
# for real redirection received from HTTP servers.  But we have introduced
# a new man: scheme, and the lynx code that does the redirection checking
# doesn't know anything about special considerations for man: URLs, so
# an external HTTP server might send a redirection message with "Location:
# man:<something>", which lynx would allow, and which would in turn be
# redirected by our rule to "lynxexec:/usr/bin/man <something>".  Unless
# we are 100% sure that either this can never happen or that the lynxexec
# URL resulting from this can have no harmful effect, this needs to be
# prevented.  It can be done by checking for the "redirected" condition,
# either by putting something like (the first line is of course optional)
#   Alert  man:*  "Redirection to man: not allowed" if redirected
#   Fail   man:*                                    if redirected
# somewhere before the Redirect rule, or, reversing the logic, by adding
# a condition to the redirection rules, i.e. they become
#   PermitRedirection man:*                             unless redirected
#   Redirect          man:*  "lynxexec:/usr/bin/man *"  unless redirected
# (actually, putting the condition on either one of the rules would be
# sufficient).  The second variant assumes that the attempted access to
# man: via redirection will ultimately fail because there is no other way
# to handle such URLs.
#
# The above should take care of rejecting man: URLs from redirections, but
# what about regular links in HTML (like <A HREF="man:...">)?  As long as
# it can be assumed that the user will always inspect each and every link
# before following it, and never follow a link that can have harmful effect,
# no further restrictions are necessary.  But this is a very big assumption,
# unrealistic except perhaps in some single-user setups where the user
# is identical with the rule writer.  So normally most links have to be
# regarded as suspect, and only URLs entered by the user can be accepted:
#   Alert  man:*  "Redirection to man: not allowed" if redirected
#   Fail   man:*                                    if redirected
#   Alert  man:*  "Link to man: not allowed"        unless userspecified
#   Fail   man:*                                    unless userspecified
#
# With these restrictions we have limited the ways our new man: scheme can
# be used rather severely, to the point where its usefulness is questionable.
# In addition to 'g'oto prompts, it may work in Jump files; also, should
# links to man:<something> appear in HTML text, the user could retype them
# manually or use the ELGOTO ('E') command with some trivial editing (like
# adding a space) to "confirm" the URL.  Even if the precautions outlined
# above are followed: THIS TEXT DOES NOT IMPLY ANY PROMISE THAT, BY FOLLOWING
# THE EXAMPLES, LYNX WILL BE SAFE.  On the other hand, some of the precautions
# *may* not be necessary: it is possible that careful use of TRUSTED_EXEC
# options in lynx.cfg could offer enough protection while making the new
# scheme more useful.
#
# If all this seems a bit too scary, that's intentional; it should be noted
# that these considerations are not in general necessary for "harmless" URL
# schemes, but appropriate for this "extreme" example.  One last remark
# regarding the hypothetical man scheme: instead of implementing it through
# "lynxexec:" or "lynxprog:", it would be somewhat safer to use "lynxcgi:"
# instead if it is supported.  A simple lynxcgi script would have to write
# the man page to stdout (either converted to text/html or as plain text,
# preceded by an appropriate Content-Type header line), and all necessary
# checking for special shell characters would be done within the script -
# lynx does not use the system() function to run the script.
#
# Other Limitations
# =================
# First, see CAVEAT above.  There are other limitations:
#
# 1. Applicable URL schemes
# -------------------------
# Rules processing does not apply to all URL schemes.  Some are
# handled differently from the generic access code, therefore rules
# for such URLs will never be "seen".  This limitation applies at
# least to lynxexec:, lynxprog:, mailto:, LYNXHIST:, LYNXMESSAGES:,
# LYNXCFG:, and LYNXCOMPILEOPTS: URLs.  You shouldn't be tempted
# to try to redirect most of these schemes anyway, but this also
# makes it impossible to disable them with "Fail" rules.
#
# Also, a scheme has to be known to Lynx in order to get as far as
# applying rules - you cannot just define your own new foobar: scheme
# and then map it to something here, but see Application 9, above,
# for a workaround.
#
# 2. No re-checking
# -----------------
# When a URL is mapped to a different one, the new URL is not checked
# again for compliance with most restrictions established by -anonymous,
# -restrictions, lynx.cfg and so on.  This can be regarded as a feature:
# it allows specific exceptions.  Of course it means that users for
# whom any restrictions must be enforced cannot have write access to a
# personal rules file, but that should be obvious anyway!
# This limitation does not apply if "Redirect" is used; in that case
# the new URL will always be re-examined.
#
# 3. Mappings are invisible
# -------------------------
# Changing the URL with "Map" or "Pass" rules will in general not be
# visible to the user, because it happens at a late stage of processing
# a request (similar to directing a request through a proxy).  One
# can think of two kinds of URL for every resource: a "Document URL" as
# the user sees it (on INFO page, history list, status line, etc.), and
# a "physical URL" used for the actual access.  Rules change only the
# physical URL.  This is different from the effect of HTTP redirection.
# Often this is bad, sometimes it may be desirable.
#
# Changing the URL can create broken links if a document has relative URLs,
# since they are taken to be relative to the "Document URL" (if no BASE tag
# is present) when the HTML is parsed.
574#
575# This limitation does not apply if "Redirect" is used - the new location
576# will be visible to the user, and will be used by lynx for resolving
577# relative URLs within the document.
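#
# As an illustration (hypothetical hosts and paths, not a tested
# configuration), with
#   Pass  http://a.example/docs/i.html  http://b.example/pub/i.html
# the document is fetched from b.example, but a relative link
# "next.html" in it (with no BASE tag) resolves against the Document URL
# to http://a.example/docs/next.html, not to
# http://b.example/pub/next.html.  That request is not covered by the
# rule above, so the link may be broken.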
#
# 4. Interaction with proxying
# ----------------------------
# Rules processing is done after most other access checks, but before
# proxy (and gateway) settings are examined.  A "Fail" rule works
# as expected, but when the URL has been mapped to a different one,
# the subsequent proxy checking can get confused.  If it decides that
# access is through a proxy or gateway, it will generally use the
# original URL to construct the "physical" URL, effectively overriding
# the mapping rules.  If the mapping is to a different access scheme
# or hostname, proxy checking could also be fooled into using a proxy
# when it shouldn't, not using one when it should, or (if different
# proxies are used for different schemes) using the wrong proxy.  So
# "just don't do that"; in some cases setting the no_proxy variable
# will help.  Example 3 happens to work nicely if there is an http_proxy
# but no ftp_proxy.
#
# This limitation does not come into play if a "UseProxy" rule is applied,
# in either of its two forms: with a PROXYURL, proxying is fully under
# the control of the rules author, and with "none", subsequent proxy
# and gateway checking is completely disabled.  It is therefore a good
# idea to combine any "Map" and "Pass" rules that might result in passing
# the changed URL with explicit "UseProxy" rules, if the rules file is
# expected to be used together with proxying; or else always use "Redirect"
# instead of simple passing.
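#
# A sketch of such a combination (hypothetical hosts; it assumes that
# "UseProxy" is matched against the URL as already changed by the
# preceding "Map", and that a "*" in URL2 is replaced by whatever the
# "*" in URL1 matched):
#
#   Map       http://www.example.com/*     http://mirror.example.org/*
#   UseProxy  http://mirror.example.org/*  none
#
# The intent is that the mapped URL is accessed directly, with no
# later proxy or gateway check left to override the mapping.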
#
# 5. Case-sensitive matching
# --------------------------
# The matching logic is generic string-based.  It doesn't know anything
# about URL syntax, and so it cannot know in which parts of a URL case
# matters and where it doesn't.  As a result, all comparisons are
# case-sensitive.  If (a limited number of) case variations of a URL
# need to be dealt with, several rules can be used instead of one.
# In particular, this makes "UseProxy ... none" in some ways more limited
# than a no_proxy setting.
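#
# For example, to reject a hypothetical host in the capitalizations
# most likely to occur, one rule per variant would be needed:
#
#   Fail  http://www.example.com/*
#   Fail  http://WWW.EXAMPLE.COM/*
#   Fail  http://Www.Example.Com/*
#
# Any spelling not listed (e.g. "wWw.example.com") would still get
# through.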
#
# 6. Redirection differences
# --------------------------
# For some URLs lynx never checks after a request whether a redirection
# occurs; that makes the "Redirect" rule useless for such URLs (in
# addition to those mentioned under limitation 1).  Among them are some
# gopher types, telnet: and similar in most situations, newspost: and
# similar, lynxcgi:, and some other private types.  Trying to redirect
# these will make access fail.  You probably don't want to change such
# URLs anyway, but if you feel you must, try using "Map" and "Pass"
# instead.
#
# The -noredir command line option only applies to real HTTP redirection
# responses; Redirect rules are still applied.  Also, for certain other
# command line options (-mime_header, -head) and command keys (HEAD) lynx
# shows the redirection message (or part of it) in case of a real HTTP
# redirection, instead of following the redirection.  Here, too, a Redirect
# rule remains effective (there is no redirection message to show, after all).
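#
# As a sketch of the first point (hypothetical hosts, untested): a rule
# like
#   Redirect  telnet://old.example.com/  telnet://new.example.com/
# would be expected to make access fail, so one might instead try
#   Pass      telnet://old.example.com/  telnet://new.example.com/
# keeping in mind limitations 2 to 4 above, which apply to "Pass".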
#
# 7. URLs required
# ----------------
# Full absolute URLs (modulo possible "*" matching wildcards) are required
# in rules.  Strings like "www.somewhere.com" or "/some/dir/some.file" or
# "www.somewhere.com/some/dir/some.file" are not URLs.  Lynx may accept
# them as user input, as abbreviated forms for URLs; but by the time the
# rules get checked, those have been converted to full URLs, if they can
# be recognized.  This also means that rules cannot influence which strings
# typed at a 'g'oto prompt are recognized as URLs - rules processing kicks
# in later.
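#
# For illustration (hypothetical host): the first rule below would not
# be expected to match anything, since its operand is not a full URL;
# the second uses the required form.
#
#   Fail  www.example.com/private/*
#   Fail  http://www.example.com/private/*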