Question:
What are the most common regular expression rules and operators?
Answer:
To match a metacharacter literally, escape it with a backslash. The metacharacters are: . $ ^ { [ ( | ) ] } * + ? \
For example, '\.' matches a literal dot, while '.' matches any character.
Any other character matches itself.
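
The escaping rule above can be sketched in JavaScript as follows. The helper name `escapeRegExp` is an assumption made for illustration and is not a built-in function; it simply prefixes each metacharacter with a backslash.

```javascript
// A literal dot must be escaped; an unescaped dot matches any character.
const literalDot = /a\.b/;
const anyChar = /a.b/;

console.log(literalDot.test("a.b")); // true
console.log(literalDot.test("axb")); // false
console.log(anyChar.test("axb"));    // true

// Hypothetical helper: escape every metacharacter in arbitrary text
// so the text can be embedded in a pattern and matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const escaped = escapeRegExp("1+1=2?");
console.log(escaped); // 1\+1=2\?
console.log(new RegExp("^" + escaped + "$").test("1+1=2?")); // true
```

Escaping user input this way is one common approach when a search term must be treated as plain text rather than as a pattern.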
Anchors
| Char Sequence | Description |
|---|---|
| ^ | Matches start of string or line |
| $ | Matches end of string or line |
| \A | Matches start of string |
| \Z | Matches end of string |
| \b | Matches a word boundary: the position between a \w and a \W character. |
| \B | Matches a non-word boundary: any position that is not a \b boundary. |
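
A short JavaScript sketch of the anchors in the table. The sample text is made up for illustration. JavaScript has no \A or \Z, so the sketch uses ^ and $ only, and the m flag switches them from string boundaries to line boundaries.

```javascript
const text = "cat concat\ncatalog";

// Without the m flag, ^ matches only at the start of the string.
const startOnly = text.match(/^cat/g);

// With the m flag, ^ also matches right after each line break.
const eachLine = text.match(/^cat/gm);

// \b on both sides: only the standalone word "cat" matches.
const wholeWord = text.match(/\bcat\b/g);

// \B: the match must not start at a word boundary,
// so only the "cat" inside "concat" matches.
const insideWord = text.match(/\Bcat/g);

console.log(startOnly.length);  // 1
console.log(eachLine.length);   // 2
console.log(wholeWord.length);  // 1
console.log(insideWord.length); // 1
```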
Character Classes etc
| Char Sequence | Description |
|---|---|
| . | Matches any single character (except new line) |
| [aoxz] | Matches any one of the characters listed between the brackets |
| [^aoxz] | Matches any one character not listed between the brackets |
| [0-9] | Matches any digit from 0 to 9; hyphen (-) allows for contiguous character ranges. |
| [A-Z] | Matches any character from uppercase A to uppercase Z |
| [a-z] | Matches any character from lowercase a to lowercase z |
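| [A-z] | Matches any character from uppercase A to lowercase z; this range also includes the punctuation characters [ \ ] ^ _ and `, so use [A-Za-z] to match letters only |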
| (one|two) | Matches alternatives; has lowest precedence of all operators. |
| \d | Matches any digit |
| \D | Matches any non-digit character |
| \s | Matches any whitespace character |
| \S | Matches any non-whitespace character |
| \w | Matches any word character [a-zA-Z_0-9]. |
| \W | Matches any non-word character |
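
The character classes above can be tried out with a small JavaScript sketch. The sample string and variable names are assumptions chosen for illustration.

```javascript
const sample = "Order #42: 3 x Widget_A @ 9.50";

// \d+ matches runs of digits.
const digits = sample.match(/\d+/g);
console.log(digits); // ["42", "3", "9", "50"]

// \w+ matches runs of letters, digits and underscores.
const words = sample.match(/\w+/g);
console.log(words); // ["Order", "42", "3", "x", "Widget_A", "9", "50"]

// A negated class: anything that is neither a word character nor whitespace.
const punctuation = sample.match(/[^\w\s]/g);
console.log(punctuation); // ["#", ":", "@", "."]

// [A-z] also covers the punctuation between "Z" and "a", such as "_".
console.log(/[A-z]/.test("_"));    // true
console.log(/[A-Za-z]/.test("_")); // false

// Alternation inside a group, anchored to the whole string.
console.log(/^(one|two)$/.test("two"));   // true
console.log(/^(one|two)$/.test("three")); // false
```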
Special Escapes
| Char Sequence | Description |
|---|---|
| \0 | Matches a NUL character |
| \n | Matches a new line character |
| \f | Matches a form feed character |
| \r | Matches a carriage return character |
| \t | Matches a tab character |
| \v | Matches a vertical tab character |
| \011 | Matches the character specified by the octal number 011 (a tab); octal digits run from 0 to 7 only |
| \x20 | Matches the character specified by a hexadecimal number 20 |
| \u00E0 | Matches a specific Unicode code point (here à) |
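
A brief JavaScript sketch of a few of these escapes. The test strings are made up for illustration, and the results assume a standard JavaScript engine.

```javascript
// \x20 is the space character (hexadecimal 20).
const hasSpace = /\x20/.test("a b");

// \u00E0 is the Unicode code point for "à".
const hasAGrave = /\u00E0/.test("voil\u00E0");

// \t matches a tab, \0 matches a NUL character.
const hasTab = /\t/.test("a\tb");
const hasNul = /\0/.test("a\0b");

// \r?\n splits on both Windows (\r\n) and Unix (\n) line endings.
const parts = "line1\r\nline2\nline3".split(/\r?\n/);

console.log(hasSpace, hasAGrave, hasTab, hasNul); // true true true true
console.log(parts); // ["line1", "line2", "line3"]
```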
Quantifiers
| Char Sequence | Description |
|---|---|
| * | Same as {0,}: matches zero or more occurrences of the preceding expression; greedy by default |
| + | Same as {1,}: matches one or more occurrences of the preceding expression; greedy by default |
| ? | Same as {0,1}: matches zero or one occurrence of the preceding expression |
| {n} | Matches exactly n occurrences of the preceding expression |
| {n,} | Matches at least n occurrences of the preceding expression; greedy by default |
| {min,max} | Matches at least min and at most max occurrences of the preceding expression |
| #? | Where # is any quantifier: makes the greedy quantifier lazy, so it matches as few occurrences as possible (for example *?, +? or ??) |
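
The difference between greedy and lazy quantifiers is easiest to see side by side. The following JavaScript sketch uses made-up sample strings.

```javascript
const html = "<b>one</b> and <b>two</b>";

// Greedy: .* grabs as much as possible, so the match runs to the last </b>.
const greedy = html.match(/<b>.*<\/b>/)[0];
console.log(greedy); // <b>one</b> and <b>two</b>

// Lazy: .*? grabs as little as possible, so the match stops at the first </b>.
const lazy = html.match(/<b>.*?<\/b>/)[0];
console.log(lazy); // <b>one</b>

// {2,3}: a letter followed by two or three digits, as a whole word.
const bounded = "a1 b22 c333 d4444".match(/\b[a-z]\d{2,3}\b/g);
console.log(bounded); // ["b22", "c333"]

// ?: the "u" is optional.
console.log(/^colou?r$/.test("color"));  // true
console.log(/^colou?r$/.test("colour")); // true
```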
Grouping, Backreferences and capturing substrings
| Char Sequence | Description |
|---|---|
| (bla) | Captures the matched subexpression and adds it to the backreference array. It can be accessed by number, for example $1 for the first group. |
| (?:foo)+bar | Non-capturing group: matches one or more repetitions of foo, followed directly by bar. The foo group is not added to the backreference array. |
| a(?=b) | Positive lookahead assertion. Matches a if followed directly by b |
| a(?!b) | Negative lookahead assertion. Matches a if it is not followed directly by b |
| (?<=a)b | Positive lookbehind assertion. Matches b if preceded directly by a (supported in JavaScript since ES2018, but not in older engines) |
| (?<!a)b | Negative lookbehind assertion. Matches b if it is not preceded directly by a (supported in JavaScript since ES2018, but not in older engines) |
| (?(if)then|else) | Conditional regular expressions (not supported in JavaScript) |
| (?# comment) | Inline comments (not supported in JavaScript) |
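
A JavaScript sketch of capturing groups, non-capturing groups and lookaround assertions. The sample strings are made up, and the lookbehind line assumes an ES2018-capable engine. Conditionals and inline comments are left out because JavaScript does not support them.

```javascript
// Capturing groups: swap "last, first" using $1 and $2 in the replacement.
const swapped = "Doe, John".replace(/(\w+), (\w+)/, "$2 $1");
console.log(swapped); // John Doe

// Non-capturing group: (?:foo) is not numbered, so (bar) is group 1.
const nonCapturing = "foofoobar".match(/(?:foo)+(bar)/);
console.log(nonCapturing[0], nonCapturing[1]); // foofoobar bar

// Backreference inside the pattern: \1 repeats whatever group 1 matched.
const doubled = "hello".match(/(\w)\1/)[0];
console.log(doubled); // ll

// Positive lookahead: numbers directly followed by "%".
const percents = "10% 20$ 30%".match(/\d+(?=%)/g);
console.log(percents); // ["10", "30"]

// Negative lookahead: whole numbers not directly followed by "%".
const notPercents = "10% 20$ 30%".match(/\b\d+\b(?!%)/g);
console.log(notPercents); // ["20"]

// Positive lookbehind (ES2018+): numbers directly preceded by "$".
const dollars = "$15 and 25".match(/(?<=\$)\d+/g);
console.log(dollars); // ["15"]
```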
Pattern Modifiers
| Char Sequence | Description |
|---|---|
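| g | global: find all matches, not only the first |
| i | case-insensitive |
| s | single line: . also matches new line characters |
| m | multi-line: ^ and $ match at the start and end of each line |
| x | extended: comments and white-space allowed inside the pattern |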
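
The effect of the modifiers can be sketched in JavaScript as follows. The sample text is made up. The s flag assumes an ES2018-capable engine, and the x modifier is omitted because JavaScript does not support it.

```javascript
const lines = "Alpha\nbeta\nALPHA";

// g + i: every match, ignoring case.
const allMatches = lines.match(/alpha/gi);
console.log(allMatches); // ["Alpha", "ALPHA"]

// i only: without g, only the first match is returned.
const firstOnly = lines.match(/alpha/i)[0];
console.log(firstOnly); // Alpha

// m: ^ matches at the start of every line, not just the start of the string.
const lineStarts = lines.match(/^\w/gm);
console.log(lineStarts); // ["A", "b", "A"]

// s: the dot also matches the line break between "Alpha" and "beta".
const withoutS = /Alpha.beta/.test(lines);
const withS = /Alpha.beta/s.test(lines);
console.log(withoutS, withS); // false true
```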
JavaScript example using a literal regular expression object:
Test if a given string matches exactly a given pattern (case-insensitive /i/)
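
The original code for this example appears to be missing, so here is a minimal sketch of what it might look like. The pattern (letters followed by exactly two digits) and the function name `matchesExactly` are assumptions made for illustration. Anchoring with ^ and $ is what makes the match exact.

```javascript
// Literal regular expression: one or more letters, then exactly two digits.
// ^ and $ force the whole string to match; the i flag ignores case.
var pattern = /^[a-z]+\d{2}$/i;

// Hypothetical helper wrapping RegExp.prototype.test.
function matchesExactly(str) {
  return pattern.test(str);
}

console.log(matchesExactly("Abc42"));  // true
console.log(matchesExactly("Abc423")); // false (one digit too many)
console.log(matchesExactly(" abc42")); // false (leading space)
```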
JavaScript example using the RegExp object:
Replace first 'xx' in string (case-insensitive /i/)
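
The original code for this example also appears to be missing. A minimal sketch, assuming the same sample string as the examples that follow: without the g flag, `replace` changes only the first match.

```javascript
var str = "aaXXbbXX";

// Build the pattern from a string; "i" makes it case-insensitive.
var pat = new RegExp("xx", "i");

// No g flag, so only the first occurrence is replaced.
var result = str.replace(pat, "__");

console.log(result); // aa__bbXX
```

The RegExp constructor is mainly useful when the pattern is only known at run time; for fixed patterns the literal form is usually easier to read.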
JavaScript example using a pattern with flags:
Replace all occurrences of 'xx' in string (case-insensitive and global /gi/)

    var str = "aaXXbbXX";
    var pat = /XX/gi;                    // g: all matches, i: ignore case
    var result = str.replace(pat, "__");
    document.write(result);              // writes "aa__bb__"
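JavaScript example using a pattern and a group reference:
Replace all occurrences of 'xx' that are directly preceded by at least one 'a', keeping the captured 'a' characters via $1

    var str = "oaaaXXbabXX";
    var pat = /(a+)XX/gi;                  // group 1 captures the run of 'a' characters
    var result = str.replace(pat, "$1__");
    document.write(result);                // writes "oaaa__babXX"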
Advanced Regular Expression Site: www.regular-expressions.info
A good Regular Expression Test site: regextester.com/
Another Regular Expression Cheat Sheet (.NET): RegExLib.com/Cheatsheet