Regular Expressions
Text identifiers:
. any single character
[abc] a, b, or c
[^abc] neither a, nor b, nor c
abc|def abc or def
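A minimal sketch of the identifiers above, using Python's re module (an assumption; these notes do not name a regex engine). The helper name matches and the sample strings are hypothetical.

```python
import re

def matches(pattern, text):
    """Return True when the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

dot_ok = matches(r".", "x")            # any single character
class_ok = matches(r"[abc]", "b")      # b is one of a, b, c
class_no = matches(r"[abc]", "d")      # d is not in the class
negated_ok = matches(r"[^abc]", "d")   # d is none of a, b, c
negated_no = matches(r"[^abc]", "a")   # a is excluded by the class
alt_ok = matches(r"abc|def", "def")    # either alternative is accepted

print(dot_ok, class_ok, class_no, negated_ok, negated_no, alt_ok)
```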
Quantifiers:
? 0 or 1 occurrences of the preceding text identifier
* 0 or more occurrences of the preceding text identifier
+ 1 or more occurrences of the preceding text identifier
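The three quantifiers can be compared side by side. This is a sketch in Python's re module (an assumed engine); the helper matches and the test strings are made up for illustration.

```python
import re

def matches(pattern, text):
    """Return True when the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

samples = ("a", "ab", "abbb")

# The quantifier applies to the "b" that precedes it.
optional = [matches(r"ab?", s) for s in samples]      # zero or one b
zero_or_more = [matches(r"ab*", s) for s in samples]  # zero or more b
one_or_more = [matches(r"ab+", s) for s in samples]   # at least one b

print(optional, zero_or_more, one_or_more)
```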
Grouping:
(text identifiers) parentheses group several text identifiers so that they are treated as a single atomic unit.
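To see what "a single atomic unit" means in practice, compare a quantifier applied to a group with one applied to a single character. A sketch in Python's re module (assumed engine), with hypothetical sample strings.

```python
import re

def matches(pattern, text):
    """Return True when the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

# With parentheses, + repeats the whole unit "ab".
grouped_abab = matches(r"(ab)+", "abab")
grouped_abbb = matches(r"(ab)+", "abbb")

# Without parentheses, + repeats only the "b".
plain_abab = matches(r"ab+", "abab")
plain_abbb = matches(r"ab+", "abbb")

print(grouped_abab, grouped_abbb, plain_abab, plain_abbb)
```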
Anchors:
^ start of line
$ end of line
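Anchors match a position rather than a character. A short sketch with Python's re.search (assumed engine; the strings are made up), which looks for the pattern anywhere in the text unless an anchor pins it down.

```python
import re

starts = re.search(r"^foo", "foobar") is not None         # "foo" is at the start
not_start = re.search(r"^foo", "a foobar") is not None    # "foo" is not at the start
ends = re.search(r"bar$", "foobar") is not None           # "bar" is at the end
not_end = re.search(r"bar$", "foobar!") is not None       # "!" follows "bar"
whole = re.search(r"^foobar$", "foobar") is not None      # both anchors: exact string

print(starts, not_start, ends, not_end, whole)
```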
Escape:
\ escapes the character that follows
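The effect of escaping shows most clearly with the dot. A sketch in Python's re module (assumed engine); re.escape is the standard-library helper that adds the backslashes for you.

```python
import re

def matches(pattern, text):
    """Return True when the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

# Unescaped, the dot stands for any character.
loose_dot = matches(r"a.c", "abc")
# Escaped, it stands only for a literal dot.
strict_dot_abc = matches(r"a\.c", "abc")
strict_dot_literal = matches(r"a\.c", "a.c")

# re.escape builds the escaped form of a literal string.
escaped = re.escape("domain.com")

print(loose_dot, strict_dot_abc, strict_dot_literal, escaped)
```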
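Negation:
a pattern can be “negated” by prefixing it with an exclamation mark (!). The prefix belongs to the rewrite-rule syntax rather than to the regular expression itself: the rule then applies when the pattern does not match.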
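Python's re module has no ! prefix, so the behaviour can only be imitated. The sketch below uses a hypothetical helper, rule_matches, that strips a leading ! and inverts the result; it is an illustration of the idea, not how any rewrite engine is implemented.

```python
import re

def rule_matches(pattern, text):
    """Hypothetical helper: a leading '!' inverts the outcome of the match."""
    negate = pattern.startswith("!")
    if negate:
        pattern = pattern[1:]          # drop the '!' before compiling the regex
    found = re.search(pattern, text) is not None
    return (not found) if negate else found

plain_hit = rule_matches(r"\.php$", "index.php")       # pattern matches
negated_hit = rule_matches(r"!\.php$", "index.html")   # pattern does not match, so the negated rule applies
negated_miss = rule_matches(r"!\.php$", "index.php")   # pattern matches, so the negated rule does not apply

print(plain_hit, negated_hit, negated_miss)
```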
OTHER
The expressions ([a-z]+) and ([0-9]+) each mark a variable portion of the URL: the first can contain any sequence of lowercase letters, the second any sequence of digits. These variables are then used to perform the rewrite.
The $ symbol followed by a number ($1 and $2 in our case), used on the right-hand side of our rule, recalls those variables by position from the left-hand side.
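The same positional recall can be tried out with Python's re.sub, where the groups are written \1 and \2 instead of $1 and $2. The pattern, the target index.php?section=...&id=... and the sample path news/42 are all assumptions made up for this sketch; the rule these notes refer to is not shown here.

```python
import re

# Left-hand side: two capture groups, letters then digits.
pattern = r"^([a-z]+)/([0-9]+)$"
# Right-hand side: \1 and \2 play the role of $1 and $2.
replacement = r"index.php?section=\1&id=\2"

rewritten = re.sub(pattern, replacement, "news/42")

# A path that does not fit the pattern is left untouched.
unchanged = re.sub(pattern, replacement, "news/latest")

# The captured values can also be inspected directly.
groups = re.match(pattern, "news/42").groups()

print(rewritten, unchanged, groups)
```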
----------------------------------------------
[] specifies a character class, in which any character within the brackets will be a match. e.g., [xyz] will match either an x, y, or z.
[]+ specifies a character class repeated one or more times, so any combination of the characters within the brackets will be a match. e.g., [xyz]+ will match one or more x’s, y’s, z’s, or any combination of these characters.
[^] specifies not within a character class. e.g., [^xyz] will match any character that is neither x, y, nor z.
[a-z] a dash (-) between two characters within a character class ([]) denotes the range of characters between them. e.g., [a-zA-Z] matches all lowercase and uppercase letters from a to z.
a{n} specifies an exact number, n, of the preceding character. e.g., x{3} matches exactly three x’s.
a{n,} specifies n or more of the preceding character. e.g., x{3,} matches three or more x’s.
a{n,m} specifies a range of numbers, between n and m, of the preceding character. e.g., x{3,7} matches three, four, five, six, or seven x’s.
() used to group characters together, thereby considering them as a single unit. e.g., (perishable)?press will match press, with or without the perishable prefix.
^ denotes the beginning of a regex (regex = regular expression) test string. i.e., the match must begin with the character that follows.
$ denotes the end of a regex test string. i.e., the match must end with the preceding character.
? declares as optional the preceding character. e.g., monzas? will match monza or monzas, while mon(za)? will match either mon or monza. i.e., x? matches zero or one of x.
! declares negation when placed in front of a pattern. e.g., “!string” matches any input that the pattern “string” does not match.
. a dot (or period) indicates any single arbitrary character.
- used in place of the substitution, instructs the rule “not to” rewrite the URL, as in “...domain.com.* - [F]”.
+ matches one or more of the preceding character. e.g., G+ matches one or more G’s, while “.+” will match one or more characters of any kind.
* matches zero or more of the preceding character. e.g., use “.*” as a wildcard.
| declares a logical “or” operator. for example, (x|y) matches x or y.
\ escapes special characters ( ^ $ ! . * | ). e.g., use “\.” to indicate/escape a literal dot.
\. indicates a literal dot (escaped).
/* zero or more slashes (a lone / is just a literal slash).
.* zero or more arbitrary characters.
^$ defines an empty string.
^.*$ the standard pattern for matching everything.
[^/.] defines one character that is neither a slash nor a dot.
[^/.]+ defines one or more characters, none of which is a slash or a dot.
http:// this is a literal statement — in this case, the literal character string, “http://”.
^domain.* defines a string that begins with the term “domain”, which may then be followed by any number of any characters.
^domain\.com$ defines the exact string “domain.com”.
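Several entries in the list above can be checked quickly in Python's re module (an assumed engine; the notes themselves do not name one). The helper matches and sample strings such as blog/post.html and domainXcom are hypothetical.

```python
import re

def matches(pattern, text):
    """Return True when the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

# Counted quantifiers: which run lengths of "x" does each pattern accept?
runs = ["x" * n for n in range(1, 10)]
exact = [len(s) for s in runs if matches(r"x{3}", s)]
at_least = [len(s) for s in runs if matches(r"x{3,}", s)]
between = [len(s) for s in runs if matches(r"x{3,7}", s)]

# Optional group: the prefix may be present or absent.
optional = [matches(r"(perishable)?press", s) for s in ("press", "perishablepress")]

# [^/.]+ picks out runs of characters that are neither slash nor dot.
segments = re.findall(r"[^/.]+", "blog/post.html")

# An unescaped dot accepts any character; an escaped dot accepts only a dot.
loose = re.search(r"^domain.com$", "domainXcom") is not None
strict = re.search(r"^domain\.com$", "domainXcom") is not None
exact_host = re.search(r"^domain\.com$", "domain.com") is not None

# ^domain.* requires "domain" at the very start of the string.
prefix = [re.search(r"^domain.*", s) is not None
          for s in ("domain.com/path", "my-domain.com")]

print(exact, at_least, between, optional, segments, loose, strict, exact_host, prefix)
```

The loose/strict pair is the reason the dot in a host name is normally escaped: without the backslash, the pattern also accepts strings that merely have some character in that position.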