Question:
What are the most common regular expression rules and operators?
Answer:
To match a metacharacter literally, escape it with a backslash. The metacharacters are: . $ ^ { [ ( | ) ] } * + ? \
For example, '\.' matches a literal dot (period), while '.' matches any character.
Any other character will match itself.
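As a small illustration of the escaping rule, the sketch below builds a pattern from arbitrary text by putting a backslash in front of every metacharacter. The helper name `escapeRegExp` and the sample values are made up for this example; it is not a built-in function.

```javascript
// Hypothetical helper (not a built-in): prefix every metacharacter with a backslash.
function escapeRegExp(text) {
  return text.replace(/[.$^{[(|)\]}*+?\\]/g, "\\$&");
}

var price = "3.50";
var loose = new RegExp("^" + price + "$");                 // '.' matches any character
var strict = new RegExp("^" + escapeRegExp(price) + "$");  // '.' matches only a dot

console.log(loose.test("3x50"));   // true
console.log(strict.test("3x50"));  // false
console.log(strict.test("3.50"));  // true
```

In the replacement string, `$&` stands for the matched text, so each metacharacter is re-emitted with a backslash in front of it.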
Anchors
Char Sequence | Description |
---|---|
^ | Matches start of string or line |
$ | Matches end of string or line |
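\A | Matches start of string, regardless of multi-line mode (not supported in js) |
\Z | Matches end of string, regardless of multi-line mode (not supported in js) |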
\b | Matches word boundary; boundary between a \w and a \W character. |
\B | Matches Not word boundary; match must not occur on a \b boundary. |
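A short sketch of how the anchors behave in JavaScript; the sample text and variable names are assumptions chosen for illustration. The m (multi-line) flag is what lets ^ match after a line break.

```javascript
var text = "cat\nconcat cat";

// Without the m flag, ^ only matches at the very start of the string.
var startOnly = text.match(/^c\w+/g);     // [ 'cat' ]
// With the m flag, ^ also matches right after each line break.
var eachLine = text.match(/^c\w+/gm);     // [ 'cat', 'concat' ]
// \b on both sides: the "cat" inside "concat" is skipped.
var wholeWords = text.match(/\bcat\b/g);  // [ 'cat', 'cat' ]
// \B in front: only the "cat" glued to a preceding word character matches.
var insideWord = text.match(/\Bcat/g);    // [ 'cat' ]

console.log(startOnly, eachLine, wholeWords, insideWord);
```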
Character Classes etc
Char Sequence | Description |
---|---|
. | Matches any single character (except new line) |
[aoxz] | Matches any one of the characters listed between the brackets |
[^aoxz] | Matches any one character not listed between the brackets |
[0-9] | Matches any digit from 0 to 9; hyphen (-) allows for contiguous character ranges. |
[A-Z] | Matches any character from uppercase A to uppercase Z |
[a-z] | Matches any character from lowercase a to lowercase z |
[A-z] | Matches any character from uppercase A to lowercase z in ASCII order; this also includes [ \ ] ^ _ and `, so use [A-Za-z] to match letters only |
(one|two) | Matches alternatives; has lowest precedence of all operators. |
\d | Matches any digit |
\D | Matches any non-digit character |
\s | Matches any whitespace character |
\S | Matches any non-whitespace character |
\w | Matches any word character [a-zA-Z_0-9]. |
\W | Matches any non-word character |
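The following sketch exercises a few of the character classes above in JavaScript. The sample string and variable names are assumptions made for the example.

```javascript
var sample = "Room_42 [B]";

var digits = sample.match(/\d+/g);              // [ '42' ]
var words = sample.match(/\w+/g);               // [ 'Room_42', 'B' ]
var noVowels = sample.replace(/[aeiou]/g, "");  // "Rm_42 [B]"
var isColor = /^(red|green)$/.test("green");    // true: alternation inside a group

// [A-z] is wider than it looks: the underscore sits between Z and a in ASCII.
var looseRange = /^[A-z]$/.test("_");      // true
var strictRange = /^[A-Za-z]$/.test("_");  // false

console.log(digits, words, noVowels, isColor, looseRange, strictRange);
```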
Special Escapes
Char Sequence | Description |
---|---|
\0 | Matches a NUL character |
\n | Matches a new line character |
\f | Matches a form feed character |
\r | Matches a carriage return character |
\t | Matches a tab character |
\v | Matches a vertical tab character |
\011 | Matches the character specified by the octal number 011 (a tab); octal digits run from 0 to 7 only |
\x20 | Matches the character specified by a hexadecimal number 20 |
\u00E0 | Matches a specific Unicode code point (here à) |
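A brief sketch of some of these escapes used inside JavaScript patterns. The sample strings and variable names are assumptions for illustration.

```javascript
var columns = "name\tvalue".split(/\t/);         // split on a tab character
var hasSpace = /\x20/.test("two words");         // \x20 is the space character
var hasAGrave = /\u00E0/.test("voil\u00E0");     // \u00E0 is the letter a with grave accent
var lines = "one\r\ntwo\nthree".split(/\r?\n/);  // optional \r handles Windows and Unix line endings

console.log(columns, hasSpace, hasAGrave, lines);
```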
Quantifiers
Char Sequence | Description |
---|---|
* | = {0,} Matches zero or more occurrences of the previous expression; greedy by default |
+ | = {1,} Matches one or more occurrences of the previous expression; greedy by default |
? | = {0,1} Matches zero or one occurrence of the previous expression |
{n} | Matches exactly n occurrences of the previous expression |
{n,} | Matches at least n occurrences of the previous expression; greedy by default |
{min,max} | Matches at least min and at most max occurrences of the previous expression |
#? | where # is any of the quantifiers above: makes the greedy quantifier lazy, so it matches as few characters as possible |
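To make the greedy versus lazy distinction concrete, here is a minimal JavaScript sketch. The sample markup and variable names are assumptions for this example.

```javascript
var html = "<b>one</b> <b>two</b>";

// Greedy: .* grabs as much as possible, so the match runs to the last </b>.
var greedy = html.match(/<b>.*<\/b>/)[0];   // "<b>one</b> <b>two</b>"
// Lazy: .*? grabs as little as possible, so the match stops at the first </b>.
var lazy = html.match(/<b>.*?<\/b>/)[0];    // "<b>one</b>"

var exactlyThree = /^\d{3}$/.test("123");      // true
var twoToFour = /^\d{2,4}$/.test("12345");     // false: five digits is too many
var optionalU = /^colou?r$/.test("color");     // true: the u may be absent

console.log(greedy, lazy, exactlyThree, twoToFour, optionalU);
```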
Grouping, Backreferences and capturing substrings
Char Sequence | Description |
---|---|
(bla) | Captures the matched subexpression and adds it to the backreference array. Can be accessed by number, for example $1 for the first group. |
(?:foo)+bar | Non-capturing group: matches one or more occurrences of foo, followed directly by bar. The foo group is not added to the backreference array. |
a(?=b) | Positive lookahead assertion. Matches a if followed directly by b |
a(?!b) | Negative lookahead assertion. Matches a if it is not followed directly by b |
(?<=a)b | Positive lookbehind assertion. Matches b if preceded directly by a (supported in js since ES2018, not in older engines) |
(?<!a)b | Negative lookbehind assertion. Matches b if it is not preceded directly by a (supported in js since ES2018, not in older engines) |
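(?(if)then|else) | Conditional regular expression: matches then if the if condition holds, otherwise else (not supported in js) |
(?# comment) | Inline comment, ignored by the regex engine (not supported in js) |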
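The sketch below tries out capturing groups, a non-capturing group, a backreference, and both lookaheads in JavaScript. The sample strings and variable names are assumptions for illustration.

```javascript
// Capturing groups: turn "last, first" into "first last" via $1 and $2.
var swapped = "Doe, John".replace(/(\w+), (\w+)/, "$2 $1");  // "John Doe"

// Non-capturing group: (?:foo)+ repeats "foo", but only (bar) becomes group 1.
var m = "foofoobar".match(/(?:foo)+(bar)/);  // m[0] is "foofoobar", m[1] is "bar"

// Backreference inside the pattern: \1 must repeat whatever group 1 matched.
var hasDouble = /(\w)\1/.test("hello");      // true, because of "ll"

// Positive lookahead: digits only when "px" follows; "px" is not part of the match.
var px = "12px 7em".match(/\d+(?=px)/g);     // [ '12' ]

// Negative lookahead: digits not followed by "px". The \d alternative stops the
// engine from backtracking to a partial number such as the "1" in "12px".
var notPx = "12px 7em".match(/\d+(?!\d|px)/g);  // [ '7' ]

console.log(swapped, m[1], hasDouble, px, notPx);
```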
Pattern Modifiers
Char Sequence | Description |
---|---|
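g | global (find all matches, not only the first) |
i | case-insensitive |
s | single line (. also matches a new line) |
m | multi-line (^ and $ also match at line breaks) |
x | comments and white-space allowed inside the pattern (not supported in js) |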
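A minimal sketch of the modifiers as JavaScript flags. The sample text and variable names are assumptions, and the s flag assumes an ES2018 or newer engine.

```javascript
var text = "Xx\nxX";

var first = text.replace(/x/, "_");     // "X_\nxX": case-sensitive, first match only
var all = text.replace(/x/gi, "_");     // "__\n__": g replaces every match, i ignores case
var lineStarts = text.match(/^x/gim);   // [ 'X', 'x' ]: m lets ^ match at each line start

var withoutS = /x.x/.test(text);        // false: . does not match the line break
var withS = /x.x/s.test(text);          // true: with s, . also matches the line break

console.log(first, all, lineStarts, withoutS, withS);
```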
JavaScript example using a regular expression literal:
Test if a given string matches exactly a given pattern (case-insensitive /i/)
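A minimal sketch of such a test, assuming the sample string "Hello" and the pattern shown (both chosen for illustration). The ^ and $ anchors are what force the whole string to match rather than just a part of it.

```javascript
var str = "Hello";
var pat = /^hello$/i;            // i: ignore case; ^ and $: match the whole string
var result = pat.test(str);      // true
var partial = pat.test("Hello world");  // false: extra text after "Hello"

console.log(result, partial);
```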
JavaScript example using the RegExp object:
Replace first 'xx' in string (case-insensitive /i/)
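A minimal sketch of this replacement, assuming the same sample string as the later examples. The pattern is built with the RegExp constructor, and leaving out the g flag is what limits the replacement to the first match.

```javascript
var str = "aaXXbbXX";
var pat = new RegExp("xx", "i");       // i: ignore case; no g flag, so first match only
var result = str.replace(pat, "__");

console.log(result);  // aa__bbXX
```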
JavaScript example using a regular expression literal with flags:
Replace all occurrences of 'xx' in string (case-insensitive and global /gi/)
var str = "aaXXbbXX";
var pat = /XX/gi;                      // g: replace every match, i: ignore case
var result = str.replace(pat, "__");
document.write(result);                // aa__bbXX
JavaScript example using a pattern with a group reference:
Replace all occurrences of 'xx' that are preceded by at least one 'a'
var str = "oaaaXXbabXX";
var pat = /(a+)XX/gi;                  // (a+) captures the run of 'a' before XX
var result = str.replace(pat, "$1__"); // $1 puts the captured 'a' characters back
document.write(result);                // oaaa__babXX
Advanced Regular Expression Site: www.regular-expressions.info
A good Regular Expression Test site: regextester.com/
Another Regular Expression Cheat Sheet (.NET): RegExLib.com/Cheatsheet