How to control the overall mode of a regular expression.
Individual Characters
This page describes how to specify which characters should match at an arbitrary position within a search text.
Character
Description
Example
The wildcard
. "dot"
The dot token matches any character except return and newline. The dot can also be made to match return and newline by enabling the (?s) option; (?-s) disables it again. Both options are described on the Modes page.
. matches x
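The dot behaviour can be sketched with Python's `re` module, used here only as a stand-in for the flavour this page describes. One assumption to keep in mind: by default Python's dot excludes only newline, so its handling of return differs from the description above. The helper name `dot_matches` is hypothetical.

```python
import re

def dot_matches(text, dotall=False):
    """Return True if a single dot token matches the whole of `text`."""
    # (?s) at the start of the pattern switches on the "dot matches all" mode.
    pattern = r"(?s)." if dotall else r"."
    return re.fullmatch(pattern, text) is not None

print(dot_matches("x"))                # True: an ordinary character
print(dot_matches("\n"))               # False: newline is excluded by default
print(dot_matches("\n", dotall=True))  # True: (?s) lets the dot match newline
```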
Ways to specify a particular test character
Any character except [\^$.|?*+(){
Aside from those "special characters" listed, any character may be used to represent itself as a token.
a matches a
\ "backslash" followed by any of [^$.|?*+(){
A backslash may be placed before any "special character" so that the character represents itself as a token. This scheme is referred to as an "escape".
\+ matches +
\xFF where FF are 2 hexadecimal digits
Backslash x followed by two hexadecimal digits may be used as a token that matches the 8-bit character with that code.
\x21 matches !
\n, \r and \t
Backslash r, n and t may be used as tokens to match the ASCII return, newline and tab characters respectively.
\r\n matches a DOS/Windows CRLF line break.
\\
Two consecutive backslashes may be used as a token to match a single backslash.
\\ matches \
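The tokens in this group can be tried out in a short sketch, again assuming Python's `re` module as a stand-in for the flavour described here. The names `samples`, `all_match` and `compiles` are hypothetical helpers. Raw strings (`r"..."`) are used so that Python itself does not consume the backslashes before the regular expression engine sees them.

```python
import re

# Each pair is (pattern, sample text that the pattern should match in full).
samples = [
    (r"a", "a"),        # an ordinary character represents itself
    (r"\+", "+"),       # an escaped special character
    (r"\x21", "!"),     # two hexadecimal digits: 0x21 is "!"
    (r"\r\n", "\r\n"),  # return followed by newline (CRLF)
    (r"\\", "\\"),      # an escaped backslash matches one backslash
]

def all_match(pairs):
    """Return one boolean per pair: does the pattern match the whole text?"""
    return [re.fullmatch(pattern, text) is not None for pattern, text in pairs]

def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(all_match(samples))  # [True, True, True, True, True]
print(compiles(r"+"))      # False: an unescaped + has nothing to repeat
print(compiles(r"\+"))     # True: the escape turns + into a literal token
```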
Ways to specify a specific range of test characters, valid in a specific position
\d, \w and \s
Backslash d, w and s may be used as tokens to match any digit, word character or whitespace character respectively.
[\d\s] matches a character that is a digit or whitespace
\D, \W and \S
Backslash D, W and S are tokens that match any character that is not a digit, not a word character or not a whitespace character respectively.
\D matches a character that is not a digit
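As a rough illustration of these shorthand tokens, the following sketch lists which characters of a sample text each one picks out. It assumes Python's `re` module; the exact membership of the digit, word and whitespace sets can vary between flavours, so the sample text (a hypothetical value) sticks to plain ASCII.

```python
import re

text = "a1 b2!"

digits_or_space = re.findall(r"[\d\s]", text)  # digit or whitespace
non_digits = re.findall(r"\D", text)           # anything that is not a digit
word_chars = re.findall(r"\w", text)           # word characters
non_word = re.findall(r"\W", text)             # anything that is not a word character

print(digits_or_space)  # ['1', ' ', '2']
print(non_digits)       # ['a', ' ', 'b', '!']
print(word_chars)       # ['a', '1', 'b', '2']
print(non_word)         # [' ', '!']
```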
Ways to specify a custom range of test characters, valid in a specific position
[ "opening square bracket"
Begins the description of a range of characters that acts as a single token and matches at a single location in the search text. Since the range can only be composed of characters, a different syntax is used inside the range delimiters.
Any character except ^-]\
Inside the range delimiters, each additional character is added as an alternative that may match at that single location in the search text.
[xyz] matches x, y or z
\ "backslash" followed by any of ^-]\
The escape mechanism may also be used inside the range delimiters: a backslash placed before any of these characters makes that character represent itself.
[\^\]] matches ^ or ]
- "minus" except immediately after the opening [
Minus allows the specification of an implicit subrange. Such a subrange begins at the ASCII value of the character that precedes the minus sign and ends at the ASCII value of the character that follows it. Where the minus sign immediately follows the opening range delimiter, it represents a literal minus sign.
[a-zA-Z0-9] matches any letter or digit
^ "caret" immediately after the opening [
The caret allows the range to be inverted such that the range as a whole represents any character not specified within. Where the caret is not placed immediately after the opening range delimiter, it represents itself as the caret.
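The range rules in this group can be checked against one sample string. This is a minimal sketch that assumes Python's `re` module behaves like the flavour described here for these cases. The helper `chars_matching` and the sample text are hypothetical.

```python
import re

def chars_matching(char_range, text):
    """Return every character of `text` that the given range matches."""
    return re.findall(char_range, text)

sample = "x-9^]q"

print(chars_matching(r"[xyz]", sample))        # ['x']
print(chars_matching(r"[\^\]]", sample))       # ['^', ']'] via escapes
print(chars_matching(r"[a-zA-Z0-9]", sample))  # ['x', '9', 'q'] via subranges
print(chars_matching(r"[-x]", sample))         # ['x', '-'] leading minus is literal
print(chars_matching(r"[^a-z]", sample))       # ['-', '9', '^', ']'] inverted range
print(chars_matching(r"[x^]", sample))         # ['x', '^'] caret not first is literal
```

The last two calls show the two roles of the caret: immediately after the opening bracket it inverts the range, and anywhere else it stands for itself.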