Apache2 http request processing
API Phases
Apache processes a HTTP request in several phases. A hook for each of these phases is provided by the Apache API. mod_rewrite uses two of these hooks: the URL-to-filename translation hook (used after the HTTP request has been read, but before any authorization starts) and the Fixup hook (triggered after the authorization phases, and after the per-directory config files (.htaccess) have been read, but before the content handler is activated).
Once a request comes in, and Apache has determined the appropriate server (or virtual server), the rewrite engine starts the URL-to-filename translation, processing the mod_rewrite directives from the per-server configuration. A few steps later, when the final data directories are found, the per-directory configuration directives of mod_rewrite are triggered in the Fixup phase.
Ruleset Processing
When mod_rewrite is triggered during these two API phases, it reads the relevant rulesets from its configuration structure (which was either created on startup, for per-server context, or during the directory traversal for per-directory context). The URL rewriting engine is started with the appropriate ruleset (one or more rules together with their conditions), and its operation is exactly the same for both configuration contexts. Only the final result processing is different.
The order of rules in the ruleset is important because the rewrite engine processes them in a particular (not always obvious) order, as follows: The rewrite engine loops through the rulesets (each ruleset being made up of RewriteRule directives, with or without RewriteConds), rule by rule. When a particular rule is matched, mod_rewrite also checks the corresponding conditions (RewriteCond directives). For historical reasons the conditions are given first, making the control flow a little bit long-winded.
As above, first the URL is matched against the Pattern of a rule. If it does not match, mod_rewrite immediately stops processing that rule, and goes on to the next rule. If the Pattern matches, mod_rewrite checks for rule conditions. If none are present, the URL will be replaced with a new string, constructed from the Substitution string, and mod_rewrite goes on to the next rule.
If RewriteConds exist, an inner loop is started, processing them in the order that they are listed. Conditions are not matched against the current URL directly. A TestString is constructed by expanding variables, back-references, map lookups, etc., against which the CondPattern is matched. If the pattern fails to match one of the conditions, the complete set of rule and associated conditions fails. If the pattern matches a given condition, then matching continues to the next condition, until no more conditions are available. If all conditions match, processing is continued with the substitution of the Substitution string for the URL.
Regex Back-Reference Availability
Using parentheses in Pattern or in one of the CondPatterns causes back-references to be internally created. These can later be referenced using the strings $N and %N, for creating the Substitution and TestString strings.
Quoting Special Characters
Special characters in TestString and Substitution strings can be escaped (that is, treated as normal characters without their usual special meaning) by prefixing them with a backslash ('\') character. In other words, you can include an actual dollar-sign character in a Substitution string by using '\$'; this keeps mod_rewrite from trying to treat it as a backreference.
Environment Variables
This module keeps track of two additional (non-standard) CGI/SSI environment variables named SCRIPT_URL and SCRIPT_URI. These contain the logical Web-view to the current resource, while the standard CGI/SSI variables SCRIPT_NAME and SCRIPT_FILENAME contain the physical System-view.
Notice: These variables hold the URI/URL as they were initially requested, that is, before any rewriting. This is important to note because the rewriting process is primarily used to rewrite logical URLs to physical pathnames.
Example
SCRIPT_NAME=/sw/lib/w3s/tree/global/u/rse/.www/index.html
SCRIPT_FILENAME=/u/rse/.www/index.html
SCRIPT_URL=/u/rse/
SCRIPT_URI=http://en1.engelschall.com/u/rse/
Practical Solutions
RewriteBase Directive
Description: Sets the base URL for per-directory rewrites
Syntax: RewriteBase URL-path
Default: See usage for information.
Context: directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite
The RewriteBase directive explicitly sets the base URL for per-directory rewrites. As you will see below, RewriteRule can be used in per-directory config files (.htaccess). In such a case, it will act locally, stripping the local directory prefix before processing, and applying rewrite rules only to the remainder. When processing is complete, the prefix is automatically added back to the path. The default setting is; RewriteBase physical-directory-path
When a substitution occurs for a new URL, this module has to re-inject the URL into the server processing. To be able to do this it needs to know what the corresponding URL-prefix or URL-base is. By default this prefix is the corresponding filepath itself. However, for most websites, URLs are NOT directly related to physical filename paths, so this assumption will often be wrong! Therefore, you can use the RewriteBase directive to specify the correct URL-prefix.
If your webserver's URLs are not directly related to physical file paths, you will need to use RewriteBase in every .htaccess file where you want to use RewriteRule directives.
For example, assume the following per-directory config file:
#
# /abc/def/.htaccess -- per-dir config file for directory /abc/def
# Remember: /abc/def is the physical path of /xyz, i.e., the server
# has a 'Alias /xyz /abc/def' directive e.g.
#
RewriteEngine On
# let the server know that we were reached via /xyz and not
# via the physical path prefix /abc/def
RewriteBase /xyz
# now the rewriting rules
RewriteRule ^oldstuff\.html$ newstuff.html
In the above example, a request to /xyz/oldstuff.html gets correctly rewritten to the physical file /abc/def/newstuff.html.
For Apache Hackers
The following list gives detailed information about the internal processing steps:
Request:
/xyz/oldstuff.html
Internal Processing:
/xyz/oldstuff.html -> /abc/def/oldstuff.html (per-server Alias)
/abc/def/oldstuff.html -> /abc/def/newstuff.html (per-dir RewriteRule)
/abc/def/newstuff.html -> /xyz/newstuff.html (per-dir RewriteBase)
/xyz/newstuff.html -> /abc/def/newstuff.html (per-server Alias)
Result:
/abc/def/newstuff.html
This seems very complicated, but is in fact correct Apache internal processing. Because the per-directory rewriting comes late in the process, the rewritten request has to be re-injected into the Apache kernel, as if it were a new request. (See mod_rewrite technical details.) This is not the serious overhead it may seem to be - this re-injection is completely internal to the Apache server (and the same procedure is used by many other operations within Apache).
RewriteCond Directive
Description: Defines a condition under which rewriting will take place
Syntax: RewriteCond TestString CondPattern
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite
The RewriteCond directive defines a rule condition. One or more RewriteCond can precede a RewriteRule directive. The following rule is then only used if both the current state of the URI matches its pattern, and if these conditions are met.
TestString is a string which can contain the following expanded constructs in addition to plain text:
RewriteRule backreferences: These are backreferences of the form $N (0 <= N <= 9), which provide access to the grouped parts (in parentheses) of the pattern, from the RewriteRule which is subject to the current set of RewriteCond conditions.
RewriteCond backreferences: These are backreferences of the form %N (1 <= N <= 9), which provide access to the grouped parts (again, in parentheses) of the pattern, from the last matched RewriteCond in the current set of conditions.
RewriteMap expansions: These are expansions of the form ${mapname:key|default}. See the documentation for RewriteMap for more details.
Server-Variables: These are variables of the form %{ NAME_OF_VARIABLE } where NAME_OF_VARIABLE can be a string taken from the following list:
HTTP headers:
HTTP_USER_AGENT
HTTP_REFERER
HTTP_COOKIE
HTTP_FORWARDED
HTTP_HOST
HTTP_PROXY_CONNECTION
HTTP_ACCEPT
connection & request:
REMOTE_ADDR
REMOTE_HOST
REMOTE_PORT
REMOTE_USER
REMOTE_IDENT
REQUEST_METHOD
SCRIPT_FILENAME
PATH_INFO
QUERY_STRING
AUTH_TYPE
server internals:
DOCUMENT_ROOT
SERVER_ADMIN
SERVER_NAME
SERVER_ADDR
SERVER_PORT
SERVER_PROTOCOL
SERVER_SOFTWARE
system stuff:
TIME_YEAR
TIME_MON
TIME_DAY
TIME_HOUR
TIME_MIN
TIME_SEC
TIME_WDAY
TIME
specials:
API_VERSION
THE_REQUEST
REQUEST_URI
REQUEST_FILENAME
IS_SUBREQ
HTTPS
These variables all correspond to the similarly named HTTP MIME-headers, C variables of the Apache server or struct tm fields of the Unix system. Most are documented elsewhere in the Manual or in the CGI specification. Those that are special to mod_rewrite include those below.
IS_SUBREQ
Will contain the text "true" if the request currently being processed is a sub-request, "false" otherwise. Sub-requests may be generated by modules that need to resolve additional files or URIs in order to complete their tasks.
API_VERSION
This is the version of the Apache module API (the internal interface between server and module) in the current httpd build, as defined in include/ap_mmn.h. The module API version corresponds to the version of Apache in use (in the release version of Apache 1.3.14, for instance, it is 19990320:10), but is mainly of interest to module authors.
THE_REQUEST
The full HTTP request line sent by the browser to the server (e.g., "GET /index.html HTTP/1.1"). This does not include any additional headers sent by the browser.
REQUEST_URI
The resource requested in the HTTP request line. (In the example above, this would be "/index.html".)
REQUEST_FILENAME
The full local filesystem path to the file or script matching the request.
HTTPS
Will contain the text "on" if the connection is using SSL/TLS, or "off" otherwise. (This variable can be safely used regardless of whether or not mod_ssl is loaded).
Other things you should be aware of:
The variables SCRIPT_FILENAME and REQUEST_FILENAME contain the same value - the value of the filename field of the internal request_rec structure of the Apache server. The first name is the commonly known CGI variable name while the second is the appropriate counterpart of REQUEST_URI (which contains the value of the uri field of request_rec).
%{ENV:variable}, where variable can be any environment variable, is also available. This is looked-up via internal Apache structures and (if not found there) via getenv() from the Apache server process.
%{SSL:variable}, where variable is the name of an SSL environment variable, can be used whether or not mod_ssl is loaded, but will always expand to the empty string if it is not. Example: %{SSL:SSL_CIPHER_USEKEYSIZE} may expand to 128.
%{HTTP:header}, where header can be any HTTP MIME-header name, can always be used to obtain the value of a header sent in the HTTP request. Example: %{HTTP:Proxy-Connection} is the value of the HTTP header ``Proxy-Connection:''.
%{LA-U:variable} can be used for look-aheads which perform an internal (URL-based) sub-request to determine the final value of variable. This can be used to access variable for rewriting which is not available at the current stage, but will be set in a later phase.
For instance, to rewrite according to the REMOTE_USER variable from within the per-server context (httpd.conf file) you must use %{LA-U:REMOTE_USER} - this variable is set by the authorization phases, which come after the URL translation phase (during which mod_rewrite operates).
On the other hand, because mod_rewrite implements its per-directory context (.htaccess file) via the Fixup phase of the API and because the authorization phases come before this phase, you just can use %{REMOTE_USER} in that context.
%{LA-F:variable} can be used to perform an internal (filename-based) sub-request, to determine the final value of variable. Most of the time, this is the same as LA-U above.
CondPattern is the condition pattern, a regular expression which is applied to the current instance of the TestString. TestString is first evaluated, before being matched against CondPattern.
Remember: CondPattern is a perl compatible regular expression with some additions:
You can prefix the pattern string with a '!' character (exclamation mark) to specify a non-matching pattern.
There are some special variants of CondPatterns. Instead of real regular expression strings you can also use one of the following:
'<CondPattern' (lexicographically precedes)
Treats the CondPattern as a plain string and compares it lexicographically to TestString. True if TestString lexicographically precedes CondPattern.
'>CondPattern' (lexicographically follows)
Treats the CondPattern as a plain string and compares it lexicographically to TestString. True if TestString lexicographically follows CondPattern.
'=CondPattern' (lexicographically equal)
Treats the CondPattern as a plain string and compares it lexicographically to TestString. True if TestString is lexicographically equal to CondPattern (the two strings are exactly equal, character for character). If CondPattern is "" (two quotation marks) this compares TestString to the empty string.
'-d' (is directory)
Treats the TestString as a pathname and tests whether or not it exists, and is a directory.
'-f' (is regular file)
Treats the TestString as a pathname and tests whether or not it exists, and is a regular file.
'-s' (is regular file, with size)
Treats the TestString as a pathname and tests whether or not it exists, and is a regular file with size greater than zero.
'-l' (is symbolic link)
Treats the TestString as a pathname and tests whether or not it exists, and is a symbolic link.
'-F' (is existing file, via subrequest)
Checks whether or not TestString is a valid file, accessible via all the server's currently-configured access controls for that path. This uses an internal subrequest to do the check, so use it with care - it can impact your server's performance!
'-U' (is existing URL, via subrequest)
Checks whether or not TestString is a valid URL, accessible via all the server's currently-configured access controls for that path. This uses an internal subrequest to do the check, so use it with care - it can impact your server's performance!
Note: All of these tests can also be prefixed by an exclamation mark ('!') to negate their meaning.
You can also set special flags for CondPattern by appending [flags] as the third argument to the RewriteCond directive, where flags is a comma-separated list of any of the following flags:
'nocase|NC' (no case)
This makes the test case-insensitive - differences between 'A-Z' and 'a-z' are ignored, both in the expanded TestString and the CondPattern. This flag is effective only for comparisons between TestString and CondPattern. It has no effect on filesystem and subrequest checks.
'ornext|OR' (or next condition)
Use this to combine rule conditions with a local OR instead of the implicit AND. Typical example:
RewriteCond %{REMOTE_HOST} =host1 [OR]
RewriteCond %{REMOTE_HOST} =host2 [OR]
RewriteCond %{REMOTE_HOST} =host3
RewriteRule ...some special stuff for any of these hosts...
Without this flag you would have to write the condition/rule pair three times.
Example:
To rewrite the Homepage of a site according to the ``User-Agent:'' header of the request, you can use the following:
RewriteCond %{HTTP_USER_AGENT} ^Mozilla
RewriteRule ^/$ /homepage.max.html [L]
RewriteCond %{HTTP_USER_AGENT} ^Lynx
RewriteRule ^/$ /homepage.min.html [L]
RewriteRule ^/$ /homepage.std.html [L]
Explanation: If you use a browser which identifies itself as 'Mozilla' (including Netscape Navigator, Mozilla etc), then you get the max homepage (which could include frames, or other special features). If you use the Lynx browser (which is terminal-based), then you get the min homepage (which could be a version designed for easy, text-only browsing). If neither of these conditions apply (you use any other browser, or your browser identifies itself as something non-standard), you get the std (standard) homepage.
RewriteEngine Directive
Description: Enables or disables runtime rewriting engine
Syntax: RewriteEngine on|off
Default: RewriteEngine off
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite
The RewriteEngine directive enables or disables the runtime rewriting engine. If it is set to off this module does no runtime processing at all. It does not even update the SCRIPT_URx environment variables.
Use this directive to disable the module instead of commenting out all the RewriteRule directives!
Note that, by default, rewrite configurations are not inherited. This means that you need to have a RewriteEngine on directive for each virtual host in which you wish to use it.
RewriteMap directives of the type prg are not started during server initialization if they're defined in a context that does not have RewriteEngine set to on
RewriteLock Directive
Description: Sets the name of the lock file used for RewriteMap synchronization
Syntax: RewriteLock file-path
Context: server config
Status: Extension
Module: mod_rewrite
This directive sets the filename for a synchronization lockfile which mod_rewrite needs to communicate with RewriteMap programs. Set this lockfile to a local path (not on a NFS-mounted device) when you want to use a rewriting map-program. It is not required for other types of rewriting maps.
RewriteLog Directive
Description: Sets the name of the file used for logging rewrite engine processing
Syntax: RewriteLog file-path
Context: server config, virtual host
Status: Extension
Module: mod_rewrite
The RewriteLog directive sets the name of the file to which the server logs any rewriting actions it performs. If the name does not begin with a slash ('/') then it is assumed to be relative to the Server Root. The directive should occur only once per server config.
To disable the logging of rewriting actions it is not recommended to set Filename to /dev/null, because although the rewriting engine does not then output to a logfile it still creates the logfile output internally. This will slow down the server with no advantage to the administrator! To disable logging either remove or comment out the RewriteLog directive or use RewriteLogLevel 0!
Security
See the Apache Security Tips document for details on how your security could be compromised if the directory where logfiles are stored is writable by anyone other than the user that starts the server.
Example
RewriteLog "/usr/local/var/apache/logs/rewrite.log"
RewriteLogLevel Directive
Description: Sets the verbosity of the log file used by the rewrite engine
Syntax: RewriteLogLevel Level
Default: RewriteLogLevel 0
Context: server config, virtual host
Status: Extension
Module: mod_rewrite
The RewriteLogLevel directive sets the verbosity level of the rewriting logfile. The default level 0 means no logging, while 9 or more means that practically all actions are logged.
To disable the logging of rewriting actions simply set Level to 0. This disables all rewrite action logs.
Using a high value for Level will slow down your Apache server dramatically! Use the rewriting logfile at a Level greater than 2 only for debugging!
Example
RewriteMap Directive
Description: Defines a mapping function for key-lookup
Syntax: RewriteMap MapName MapType:MapSource
Context: server config, virtual host
Status: Extension
Module: mod_rewrite
Compatibility: The choice of different dbm types is available in Apache 2.0.41 and later
The RewriteMap directive defines a Rewriting Map which can be used inside rule substitution strings by the mapping-functions to insert/substitute fields through a key lookup. The source of this lookup can be of various types.
The MapName is the name of the map and will be used to specify a mapping-function for the substitution strings of a rewriting rule via one of the following constructs:
${ MapName : LookupKey }
${ MapName : LookupKey | DefaultValue }
When such a construct occurs, the map MapName is consulted and the key LookupKey is looked-up. If the key is found, the map-function construct is substituted by SubstValue. If the key is not found then it is substituted by DefaultValue or by the empty string if no DefaultValue was specified.
For example, you might define a RewriteMap as:
RewriteMap examplemap txt:/path/to/file/map.txt
You would then be able to use this map in a RewriteRule as follows:
RewriteRule ^/ex/(.*) ${examplemap:$1}
The following combinations for MapType and MapSource can be used:
Standard Plain Text
MapType: txt, MapSource: Unix filesystem path to valid regular file
This is the standard rewriting map feature where the MapSource is a plain ASCII file containing either blank lines, comment lines (starting with a '#' character) or pairs like the following - one per line.
MatchingKey SubstValue
Example
##
## map.txt -- rewriting map
##
Ralf.S.Engelschall rse # Bastard Operator From Hell
Mr.Joe.Average joe # Mr. Average
RewriteMap real-to-user txt:/path/to/file/map.txt
Randomized Plain Text
MapType: rnd, MapSource: Unix filesystem path to valid regular file
This is identical to the Standard Plain Text variant above but with a special post-processing feature: After looking up a value it is parsed according to contained ``|'' characters which have the meaning of ``or''. In other words they indicate a set of alternatives from which the actual returned value is chosen randomly. For example, you might use the following map file and directives to provide a random load balancing between several back-end server, via a reverse-proxy. Images are sent to one of the servers in the 'static' pool, while everything else is sent to one of the 'dynamic' pool.
Example:
Rewrite map file
##
## map.txt -- rewriting map
##
static www1|www2|www3|www4
dynamic www5|www6
Configuration directives
RewriteMap servers rnd:/path/to/file/map.txt
RewriteRule ^/(.*\.(png|gif|jpg)) http://${servers:static}/$1 [NC,P,L]
RewriteRule ^/(.*) http://${servers:dynamic}/$1 [P,L]
Hash File
MapType: dbm[=type], MapSource: Unix filesystem path to valid regular file
Here the source is a binary format DBM file containing the same contents as a Plain Text format file, but in a special representation which is optimized for really fast lookups. The type can be sdbm, gdbm, ndbm, or db depending on compile-time settings. If the type is ommitted, the compile-time default will be chosen. You can create such a file with any DBM tool or with the following Perl script. Be sure to adjust it to create the appropriate type of DBM. The example creates an NDBM file.
#!/path/to/bin/perl
##
## txt2dbm -- convert txt map to dbm format
##
use NDBM_File;
use Fcntl;
($txtmap, $dbmmap) = @ARGV;
open(TXT, "<$txtmap") or die "Couldn't open $txtmap!\n";
tie (%DB, 'NDBM_File', $dbmmap,O_RDWR|O_TRUNC|O_CREAT, 0644)
or die "Couldn't create $dbmmap!\n";
while (<TXT>) {
next if (/^\s*#/ or /^\s*$/);
$DB{$1} = $2 if (/^\s*(\S+)\s+(\S+)/);
}
untie %DB;
close(TXT);
$ txt2dbm map.txt map.db
Internal Function
MapType: int, MapSource: Internal Apache function
Here, the source is an internal Apache function. Currently you cannot create your own, but the following functions already exist:
toupper:
Converts the key to all upper case.
tolower:
Converts the key to all lower case.
escape:
Translates special characters in the key to hex-encodings.
unescape:
Translates hex-encodings in the key back to special characters.
External Rewriting Program
MapType: prg, MapSource: Unix filesystem path to valid regular file
Here the source is a program, not a map file. To create it you can use a language of your choice, but the result has to be an executable program (either object-code or a script with the magic cookie trick '#!/path/to/interpreter' as the first line).
This program is started once, when the Apache server is started, and then communicates with the rewriting engine via its stdin and stdout file-handles. For each map-function lookup it will receive the key to lookup as a newline-terminated string on stdin. It then has to give back the looked-up value as a newline-terminated string on stdout or the four-character string ``NULL'' if it fails (i.e., there is no corresponding value for the given key). A trivial program which will implement a 1:1 map (i.e., key == value) could be:
External rewriting programs are not started if they're defined in a context that does not have RewriteEngine set to on
.
#!/usr/bin/perl
$| = 1;
while (<STDIN>) {
# ...put here any transformations or lookups...
print $_;
}
But be very careful:
``Keep it simple, stupid'' (KISS). If this program hangs, it will cause Apache to hang when trying to use the relevant rewrite rule.
A common mistake is to use buffered I/O on stdout. Avoid this, as it will cause a deadloop! ``$|=1'' is used above, to prevent this.
The RewriteLock directive can be used to define a lockfile which mod_rewrite can use to synchronize communication with the mapping program. By default no such synchronization takes place.
The RewriteMap directive can occur more than once. For each mapping-function use one RewriteMap directive to declare its rewriting mapfile. While you cannot declare a map in per-directory context it is of course possible to use this map in per-directory context.
Note
For plain text and DBM format files the looked-up keys are cached in-core until the mtime of the mapfile changes or the server does a restart. This way you can have map-functions in rules which are used for every request. This is no problem, because the external lookup only happens once!
top
RewriteOptions Directive
Description: Sets some special options for the rewrite engine
Syntax: RewriteOptions Options
Default: RewriteOptions MaxRedirects=10
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite
Compatibility: MaxRedirects is available in Apache 2.0.45 and later
The RewriteOptions directive sets some special options for the current per-server or per-directory configuration. The Option strings can be one of the following:
inherit
This forces the current configuration to inherit the configuration of the parent. In per-virtual-server context this means that the maps, conditions and rules of the main server are inherited. In per-directory context this means that conditions and rules of the parent directory's .htaccess configuration are inherited.
MaxRedirects=number
In order to prevent endless loops of internal redirects issued by per-directory RewriteRules, mod_rewrite aborts the request after reaching a maximum number of such redirects and responds with an 500 Internal Server Error. If you really need more internal redirects than 10 per request, you may increase the default to the desired value.
AllowAnyURI
When RewriteRule is used in VirtualHost or server context with version 2.0.65 or later of httpd, mod_rewrite will only process the rewrite rules if the request URI is a URL-path. This avoids some security issues where particular rules could allow "surprising" pattern expansions (see CVE-2011-3368 and CVE-2011-4317). To lift the restriction on matching a URL-path, the AllowAnyURI option can be enabled, and mod_rewrite will apply the rule set to any request URI string, regardless of whether that string matches the URL-path grammar required by the HTTP specification.
!!! Security Warning !!!
Enabling this option will make the server vulnerable to security issues if used with rewrite rules which are not carefully authored. It is strongly recommended that this option is not used. In particular, beware of input strings containing the '@' character which could change the interpretation of the transformed URI, as per the above CVE names.
MergeBase
With this option, the value of RewriteBase is copied from where it's explicitly defined into any sub-directory or sub-location that doesn't define its own RewriteBase. This flag is available for Apache HTTP Server 2.0.65 and later.
RewriteRule Directive
Description: Defines rules for the rewriting engine
Syntax: RewriteRule Pattern Substitution
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite
Compatibility: The cookie-flag is available in Apache 2.0.40 and later.
The RewriteRule directive is the real rewriting workhorse. The directive can occur more than once, with each instance defining a single rewrite rule. The order in which these rules are defined is important - this is the order in which they will be applied at run-time.
Pattern is a perl compatible regular expression, which is applied to the current URL. ``Current'' means the value of the URL when this rule is applied. This may not be the originally requested URL, which may already have matched a previous rule, and have been altered.
Some hints on the syntax of regular expressions:
Text:
. Any single character
[chars] Character class: Any character of the class ``chars''
[^chars] Character class: Not a character of the class ``chars''
text1|text2 Alternative: text1 or text2
Quantifiers:
? 0 or 1 occurrences of the preceding text
* 0 or N occurrences of the preceding text (N > 0)
+ 1 or N occurrences of the preceding text (N > 1)
Grouping:
(text) Grouping of text
(used either to set the borders of an alternative as above, or
to make backreferences, where the Nth group can
be referred to on the RHS of a RewriteRule as $N)
Anchors:
^ Start-of-line anchor
$ End-of-line anchor
Escaping:
\char escape the given char
(for instance, to specify the chars ".[]()" etc.)
For more information about regular expressions, have a look at the perl regular expression manpage ("perldoc perlre"). If you are interested in more detailed information about regular expressions and their variants (POSIX regex etc.) the following book is dedicated to this topic:
Mastering Regular Expressions, 2nd Edition
Jeffrey E.F. Friedl
O'Reilly & Associates, Inc. 2002
ISBN 0-596-00289-0
In mod_rewrite, the NOT character ('!') is also available as a possible pattern prefix. This enables you to negate a pattern; to say, for instance: ``if the current URL does NOT match this pattern''. This can be used for exceptional cases, where it is easier to match the negative pattern, or as a last default rule.
Note
When using the NOT character to negate a pattern, you cannot include grouped wildcard parts in that pattern. This is because, when the pattern does NOT match (ie, the negation matches), there are no contents for the groups. Thus, if negated patterns are used, you cannot use $N in the substitution string!
The substitution of a rewrite rule is the string which is substituted for (or replaces) the original URL which Pattern matched.
In addition to plain text, it can include
back-references ($N) to the RewriteRule pattern
back-references (%N) to the last matched RewriteCond pattern
server-variables as in rule condition test-strings (%{VARNAME})
mapping-function calls (${mapname:key|default})
Back-references are identifiers of the form $N (N=0..9), which will be replaced by the contents of the Nth group of the matched Pattern. The server-variables are the same as for the TestString of a RewriteCond directive. The mapping-functions come from the RewriteMap directive and are explained there. These three types of variables are expanded in the order above.
As already mentioned, all rewrite rules are applied to the Substitution (in the order in which they are defined in the config file). The URL is completely replaced by the Substitution and the rewriting process continues until all rules have been applied, or it is explicitly terminated by a L flag - see below.
There is a special substitution string named '-' which means: NO substitution! This is useful in providing rewriting rules which only match URLs but do not substitute anything for them. It is commonly used in conjunction with the C (chain) flag, in order to apply more than one pattern before substitution occurs.
Additionally you can set special flags for Substitution by appending [flags] as the third argument to the RewriteRule directive. Flags is a comma-separated list of any of the following flags:
'chain|C' (chained with next rule)
This flag chains the current rule with the next rule (which itself can be chained with the following rule, and so on). This has the following effect: if a rule matches, then processing continues as usual - the flag has no effect. If the rule does not match, then all following chained rules are skipped. For instance, it can be used to remove the ``.www'' part, inside a per-directory rule set, when you let an external redirect happen (where the ``.www'' part should not occur!).
'cookie|CO=NAME:VAL:domain[:lifetime[:path]]' (set cookie)
This sets a cookie in the client's browser. The cookie's name is specified by NAME and the value is VAL. The domain field is the domain of the cookie, such as '.apache.org', the optional lifetime is the lifetime of the cookie in minutes, and the optional path is the path of the cookie
'env|E=VAR:VAL' (set environment variable)
This forces an environment variable named VAR to be set to the value VAL, where VAL can contain regexp backreferences ($N and %N) which will be expanded. You can use this flag more than once, to set more than one variable. The variables can later be dereferenced in many situations, most commonly from within XSSI (via <!--#echo var="VAR"-->) or CGI ($ENV{'VAR'}). You can also dereference the variable in a later RewriteCond pattern, using %{ENV:VAR}. Use this to strip information from URLs, while maintaining a record of that information.
'forbidden|F' (force URL to be forbidden)
This forces the current URL to be forbidden - it immediately sends back a HTTP response of 403 (FORBIDDEN). Use this flag in conjunction with appropriate RewriteConds to conditionally block some URLs.
'gone|G' (force URL to be gone)
This forces the current URL to be gone - it immediately sends back a HTTP response of 410 (GONE). Use this flag to mark pages which no longer exist as gone.
'last|L' (last rule)
Stop the rewriting process here and don't apply any more rewrite rules. This corresponds to the Perl last command or the break command in C. Use this flag to prevent the currently rewritten URL from being rewritten further by following rules. For example, use it to rewrite the root-path URL ('/') to a real one, e.g., '/e/www/'.
'next|N' (next round)
Re-run the rewriting process (starting again with the first rewriting rule). This time, the URL to match is no longer the original URL, but rather the URL returned by the last rewriting rule. This corresponds to the Perl next command or the continue command in C. Use this flag to restart the rewriting process - to immediately go to the top of the loop. Be careful not to create an infinite loop!
'nocase|NC' (no case)
This makes the Pattern case-insensitive, ignoring difference between 'A-Z' and 'a-z' when Pattern is matched against the current URL.
'noescape|NE' (no URI escaping of output)
This flag prevents mod_rewrite from applying the usual URI escaping rules to the result of a rewrite. Ordinarily, special characters (such as '%', '$', ';', and so on) will be escaped into their hexcode equivalents ('%25', '%24', and '%3B', respectively); this flag prevents this from happening. This allows percent symbols to appear in the output, as in
RewriteRule /foo/(.*) /bar?arg=P1\%3d$1 [R,NE]
which would turn '/foo/zed' into a safe request for '/bar?arg=P1=zed'.
'nosubreq|NS' ( not for internal sub-requests)
This flag forces the rewrite engine to skip a rewrite rule if the current request is an internal sub-request. For instance, sub-requests occur internally in Apache when mod_include tries to find out information about possible directory default files (index.xxx). On sub-requests it is not always useful, and can even cause errors, if the complete set of rules are applied. Use this flag to exclude some rules.
To decide whether or not to use this rule: if you prefix URLs with CGI-scripts, to force them to be processed by the CGI-script, it's likely that you will run into problems (or significant overhead) on sub-requests. In these cases, use this flag.
'proxy|P' (force proxy)
This flag forces the substitution part to be internally sent as a proxy request and immediately (rewrite processing stops here) put through the proxy module. You must make sure that the substitution string is a valid URI (typically starting with http://hostname) which can be handled by the Apache proxy module. If not, you will get an error from the proxy module. Use this flag to achieve a more powerful implementation of the ProxyPass directive, to map remote content into the namespace of the local server.
Note: mod_proxy must be enabled in order to use this flag.
'passthrough|PT' (pass through to next handler)
This flag forces the rewrite engine to set the uri field of the internal request_rec structure to the value of the filename field. This flag is just a hack to enable post-processing of the output of RewriteRule directives, using Alias, ScriptAlias, Redirect, and other directives from various URI-to-filename translators. For example, to rewrite /abc to /def using mod_rewrite, and then /def to /ghi using mod_alias:
RewriteRule ^/abc(.*) /def$1 [PT]
Alias /def /ghi
If you omit the PT flag, mod_rewrite will rewrite uri=/abc/... to filename=/def/... as a full API-compliant URI-to-filename translator should do. Then mod_alias will try to
do a URI-to-filename transition, which will fail.
Note: You must use this flag if you want to mix directives from different modules which allow URL-to-filename translators. The typical example is the use of mod_alias and mod_rewrite.
'qsappend|QSA' (query string append)
This flag forces the rewrite engine to append a query string part of the substitution string to the existing string, instead of replacing it. Use this when you want to add more data to the query string via a rewrite rule.
'redirect|R [=code]' (force redirect)
Prefix Substitution with http://thishost[:thisport]/ (which makes the new URL a URI) to force a external redirection. If no code is given, a HTTP response of 302 (MOVED TEMPORARILY) will be returned. If you want to use other response codes in the range 300-400, simply specify the appropriate number or use one of the following symbolic names: temp (default), permanent, seeother. Use this for rules to canonicalize the URL and return it to the client - to translate ``/~'' into ``/u/'', or to always append a slash to /u/user, etc.
Note: When you use this flag, make sure that the substitution field is a valid URL! Otherwise, you will be redirecting to an invalid location. Remember that this flag on its own will only prepend http://thishost[:thisport]/ to the URL, and rewriting will continue. Usually, you will want to stop rewriting at this point, and redirect immediately. To stop rewriting, you should add the 'L' flag.
'skip|S=num' (skip next rule(s))
This flag forces the rewriting engine to skip the next num rules in sequence, if the current rule matches. Use this to make pseudo if-then-else constructs: The last rule of the then-clause becomes skip=N, where N is the number of rules in the else-clause. (This is not the same as the 'chain|C' flag!)
'type|T=MIME-type' (force MIME type)
Force the MIME-type of the target file to be MIME-type. This can be used to set up the content-type based on some conditions. For example, the following snippet allows .php files to be displayed by mod_php if they are called with the .phps extension:
RewriteRule ^(.+\.php)s$ $1 [T=application/x-httpd-php-source]
Home directory expansion
When the substitution string begins with a string resembling "/~user" (via explicit text or backreferences), mod_rewrite performs home directory expansion independent of the presence or configuration of mod_userdir.
This expansion does not occur when the PT flag is used on the RewriteRule directive.
Note: Enabling rewrites in per-directory context
To enable the rewriting engine for per-directory configuration files, you need to set ``RewriteEngine On'' in these files and ``Options FollowSymLinks'' must be enabled. If your administrator has disabled override of FollowSymLinks for a user's directory, then you cannot use the rewriting engine. This restriction is needed for security reasons.
Note: Pattern matching in per-directory context
Never forget that Pattern is applied to a complete URL in per-server configuration files. However, in per-directory configuration files, the per-directory prefix (which always is the same for a specific directory) is automatically removed for the pattern matching and automatically added after the substitution has been done. This feature is essential for many sorts of rewriting - without this, you would always have to match the parent directory which is not always possible.
There is one exception: If a substitution string starts with ``http://'', then the directory prefix will not be added, and an external redirect or proxy throughput (if flag P is used) is forced!
Note: Substitution of Absolute URLs
When you prefix a substitution field with http://thishost[:thisport], mod_rewrite will automatically strip that out. This auto-reduction on URLs with an implicit external redirect is most useful in combination with a mapping-function which generates the hostname part.
Remember: An unconditional external redirect to your own server will not work with the prefix http://thishost because of this feature. To achieve such a self-redirect, you have to use the R-flag.
Note: Query String
The Pattern will not be matched against the query string. Instead, you must use a RewriteCond with the %{QUERY_STRING} variable. You can, however, create URLs in the substitution string, containing a query string part. Simply use a question mark inside the substitution string, to indicate that the following text should be re-injected into the query string. When you want to erase an existing query string, end the substitution string with just a question mark. To combine a new query string with an old one, use the [QSA] flag.
Here are all possible substitution combinations and their meanings:
Inside per-server configuration (httpd.conf)
for request ``GET /somepath/pathinfo'':
Given Rule Resulting Substitution
---------------------------------------------- ----------------------------------
^/somepath(.*) otherpath$1 invalid, not supported
^/somepath(.*) otherpath$1 [R] invalid, not supported
^/somepath(.*) otherpath$1 [P] invalid, not supported
---------------------------------------------- ----------------------------------
^/somepath(.*) /otherpath$1 /otherpath/pathinfo
^/somepath(.*) /otherpath$1 [R] http://thishost/otherpath/pathinfo
via external redirection
^/somepath(.*) /otherpath$1 [P] doesn't make sense, not supported
---------------------------------------------- ----------------------------------
^/somepath(.*) http://thishost/otherpath$1 /otherpath/pathinfo
^/somepath(.*) http://thishost/otherpath$1 [R] http://thishost/otherpath/pathinfo
via external redirection
^/somepath(.*) http://thishost/otherpath$1 [P] doesn't make sense, not supported
---------------------------------------------- ----------------------------------
^/somepath(.*) http://otherhost/otherpath$1 http://otherhost/otherpath/pathinfo
via external redirection
^/somepath(.*) http://otherhost/otherpath$1 [R] http://otherhost/otherpath/pathinfo
via external redirection
(the [R] flag is redundant)
^/somepath(.*) http://otherhost/otherpath$1 [P] http://otherhost/otherpath/pathinfo
via internal proxy
Inside per-directory configuration for /somepath
(/physical/path/to/somepath/.htacccess, with RewriteBase /somepath)
for request ``GET /somepath/localpath/pathinfo'':
Given Rule Resulting Substitution
---------------------------------------------- ----------------------------------
^localpath(.*) otherpath$1 /somepath/otherpath/pathinfo
^localpath(.*) otherpath$1 [R] http://thishost/somepath/otherpath/pathinfo
via external redirection
^localpath(.*) otherpath$1 [P] doesn't make sense, not supported
---------------------------------------------- ----------------------------------
^localpath(.*) /otherpath$1 /otherpath/pathinfo
^localpath(.*) /otherpath$1 [R] http://thishost/otherpath/pathinfo
via external redirection
^localpath(.*) /otherpath$1 [P] doesn't make sense, not supported
---------------------------------------------- ----------------------------------
^localpath(.*) http://thishost/otherpath$1 /otherpath/pathinfo
^localpath(.*) http://thishost/otherpath$1 [R] http://thishost/otherpath/pathinfo
via external redirection
^localpath(.*) http://thishost/otherpath$1 [P] doesn't make sense, not supported
---------------------------------------------- ----------------------------------
^localpath(.*) http://otherhost/otherpath$1 http://otherhost/otherpath/pathinfo
via external redirection
^localpath(.*) http://otherhost/otherpath$1 [R] http://otherhost/otherpath/pathinfo
via external redirection
(the [R] flag is redundant)
^localpath(.*) http://otherhost/otherpath$1 [P] http://otherhost/otherpath/pathinfo
via internal proxy