A regular expression (also known as regex or regexp) is a search pattern made up of a sequence of characters and optional flags. You can use one to find data in text.
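
As a minimal sketch of a pattern plus an optional flag, the snippet below assumes JavaScript (the flags listed in this reference match JavaScript's regex flavor). The sample text and variable names are illustrative only.

```javascript
// Two equivalent ways to build the same pattern with the "i" flag.
const literal = /data/i;                      // regex literal: /pattern/flags
const constructed = new RegExp("data", "i");  // constructor: (pattern, flags)

// Hypothetical sample text for the search.
const text = "Find DATA in a text.";

const found = literal.test(text);     // true: "DATA" matches when case is ignored
const position = text.search(literal); // 5: index where the match starts
const samePattern = constructed.source === literal.source; // true: both are "data"
const sameFlags = constructed.flags === literal.flags;     // true: both are "i"
```

The literal form is usually easier to read; the constructor form helps when the pattern is assembled from strings at run time.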
Flags (also called modifiers) change how a regular expression performs its search.
Flag | Description |
---|---|
g | Global match - finds all matches instead of stopping at the first one |
i | Ignore case - performs a case insensitive search |
m | Multiline - allows ^ and $ to match start and end of line |
u | Unicode - enables full Unicode support |
y | Sticky - matches only at the exact position given by lastIndex |
s | Singleline - also known as "dotall", allows . to match newlines \n |
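
The flags in the table can be tried out with a short sketch like the following. It assumes JavaScript; the sample string and variable names are made up for illustration.

```javascript
// Sample text with three lines; illustrative only.
const text = "Cat\ncat\nCAT";

// g + i: find every match, ignoring case.
const all = text.match(/cat/gi);              // ["Cat", "cat", "CAT"]

// Without g, match() stops at the first match.
const first = text.match(/cat/i)[0];          // "Cat"

// m: ^ and $ also match at the start and end of each line.
const wholeLines = text.match(/^cat$/gim).length; // 3

// s: the dot also matches the newline between the first two lines.
const dotAll = /Cat.cat/s.test(text);         // true
const noDotAll = /Cat.cat/.test(text);        // false

// y: the match must begin exactly at lastIndex.
const sticky = /cat/y;
sticky.lastIndex = 4;
const stickyHit = sticky.test(text);          // true: "cat" starts at index 4
sticky.lastIndex = 5;
const stickyMiss = sticky.test(text);         // false: nothing matches at index 5

// u: a code point above U+FFFF counts as a single character.
const emoji = "\u{1F600}";
const withU = /^.$/u.test(emoji);             // true
const withoutU = /^.$/.test(emoji);           // false: seen as two code units
```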
Brackets in regular expressions match a single character from a given set or range.
Expression | Description |
---|---|
[...] | One of the characters in the brackets |
[^...] | One of the characters NOT in the brackets |
[a-z] | One of the characters from a to z |
[^a-z] | One of the characters NOT from a to z |
[A-Z] | One of the characters from A to Z |
[^A-Z] | One of the characters NOT from A to Z |
[0-9] | One of the characters from 0 to 9 (a digit character) |
[^0-9] | One of the characters NOT from 0 to 9 (a non-digit character) |
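
A brief sketch of bracket expressions in use, assuming JavaScript. The sample string is hypothetical.

```javascript
// Illustrative sample text.
const sample = "Room 42B, floor 7";

// [0-9]: each digit is a separate match.
const digits = sample.match(/[0-9]/g);          // ["4", "2", "7"]

// [A-Z]: uppercase letters only.
const upper = sample.match(/[A-Z]/g);           // ["R", "B"]

// [aeiou]: any one character from an explicit set.
const vowels = sample.match(/[aeiou]/g);        // ["o", "o", "o", "o"]

// [^0-9]: everything that is NOT a digit; removing those leaves the digits.
const digitsOnly = sample.replace(/[^0-9]/g, ""); // "427"
```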
Groups in regular expressions are parts of a search pattern enclosed in parentheses (...).
Expression | Description |
---|---|
(...) | A capturing group |
(?:...) | A non-capturing group |
(a|b) | Either a or b |
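
The difference between the group types can be sketched as follows, assuming JavaScript. The date string and variable names are illustrative.

```javascript
// Capturing groups: each (...) stores the text it matched.
const date = "2024-03-15";
const parts = date.match(/(\d{4})-(\d{2})-(\d{2})/);
const year = parts[1];   // "2024"
const month = parts[2];  // "03"
const day = parts[3];    // "15"

// Non-capturing group: (?:...) groups without storing a result,
// so the only captured value here is "bar".
const mixed = "foobar".match(/(?:foo)(bar)/);
const captured = mixed[1];        // "bar"
const groupCount = mixed.length;  // 2: the full match plus one capture

// Alternation inside a group: either "cat" or "dog", with an optional "s".
const pets = /^(cat|dog)s?$/;
const dogs = pets.test("dogs");   // true
const birds = pets.test("birds"); // false
```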
Character classes are shorthand sequences that match any one character from a predefined set.
Character | Description |
---|---|
. | Any single character except line terminators such as \n (unless the s flag is set) |
\d | A digit character. Equivalent to [0-9] . |
\D | A non-digit character. Equivalent to [^0-9] . |
\w | A word character: a letter, digit, or underscore. Equivalent to [a-zA-Z0-9_] . |
\W | A non-word character: anything other than a letter, digit, or underscore. Equivalent to [^a-zA-Z0-9_] . |
\s | A whitespace character |
\S | A non-whitespace character |
[\b] | A literal backspace character |
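
The character classes above can be sketched on a small sample, assuming JavaScript. The sample line is invented for illustration.

```javascript
// Illustrative sample containing spaces and one tab.
const line = "user_1 paid $25.50\ton 3 May";

// \d: every digit.
const digits = line.match(/\d/g).join("");   // "125503"

// \w+: runs of word characters (letters, digits, underscore).
const words = line.match(/\w+/g);
// ["user_1", "paid", "25", "50", "on", "3", "May"]

// \s+: split on any run of whitespace, including the tab.
const fields = line.split(/\s+/);
// ["user_1", "paid", "$25.50", "on", "3", "May"]

// \W: strip everything that is not a word character.
const stripped = line.replace(/\W/g, "");    // "user_1paid2550on3May"

// . matches any character except a line terminator.
const dotMatches = /a.c/.test("abc");        // true
const dotNewline = /a.c/.test("a\nc");       // false

// [\b] matches a literal backspace (U+0008).
const backspace = /[\b]/.test("\b");         // true
```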
Escape sequences in regular expressions represent special or non-printable characters.
Character | Description |
---|---|
\ | An escape character |
\0 | A null character |
\n | A newline character |
\t | A tab character |
\v | A vertical tab character |
\r | A carriage return character |
\f | A form feed character |
\cX | A control character where X is a character from A-Z |
\ooo | The character specified by three octal digits |
\xhh | The character specified by two hexadecimal digits |
\uhhhh | The Unicode character specified by four hexadecimal digits |
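
A few of these escapes in action, as a rough sketch assuming JavaScript. The test strings are illustrative.

```javascript
// A backslash turns a special character into a literal one.
const literalDot = /1\.5/.test("1.5");      // true
const notLiteral = /1\.5/.test("125");      // false: the dot must be a real "."

// \t and \r\n match tab and Windows-style line endings.
const columns = "a\tb".split(/\t/);         // ["a", "b"]
const lines = "x\r\ny".split(/\r\n/);       // ["x", "y"]

// \xhh: two hex digits; 41 is "A".
const hexA = /\x41/.test("A");              // true

// \uhhhh: four hex digits; 00e9 is "e" with an acute accent.
const accented = /\u00e9/.test("caf\u00e9"); // true

// \cX: control character; Ctrl+J is line feed (U+000A).
const controlJ = /\cJ/.test("\n");          // true

// \0: the null character (U+0000).
const nul = /\0/.test("\0");                // true
```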
Quantifiers in regular expressions specify how many times a character, character class, or group must occur.
Character | Description |
---|---|
* | Zero or more times |
? | Zero or one time |
+ | One or more times |
{n} | Exactly n times |
{n,m} | n to m times |
{n,} | n times or more |
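
Each quantifier from the table is exercised once in the sketch below, which assumes JavaScript. The patterns are anchored with ^ and $ so that the whole string must match.

```javascript
// *: zero or more "b" characters between "a" and "c".
const starZero = /^ab*c$/.test("ac");        // true
const starMany = /^ab*c$/.test("abbbc");     // true

// ?: the "u" is optional.
const american = /^colou?r$/.test("color");  // true
const british = /^colou?r$/.test("colour");  // true

// +: at least one digit is required.
const plusEmpty = /^\d+$/.test("");          // false
const plusOk = /^\d+$/.test("2024");         // true

// {n}: exactly four digits.
const exact = /^\d{4}$/.test("2024");        // true
const tooShort = /^\d{4}$/.test("202");      // false

// {n,m}: two or three digits.
const inRange = /^\d{2,3}$/.test("123");     // true
const tooLong = /^\d{2,3}$/.test("1234");    // false

// {n,}: two or more "a" characters.
const atLeast = /^a{2,}$/.test("aaaa");      // true
const tooFew = /^a{2,}$/.test("a");          // false
```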
Assertions, which include anchors and lookaheads, test a condition at a position in the text without consuming characters. The match succeeds only if the condition holds at that position.
Character | Description |
---|---|
^ | Start of string, or start of line when the m flag is set |
$ | End of string, or end of line when the m flag is set |
\b | A word boundary |
\B | A non-word boundary |
(?=...) | Positive lookahead |
(?!...) | Negative lookahead |
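
The assertions above can be sketched as follows, assuming JavaScript. The sample strings are illustrative.

```javascript
const words = "cat concat cats";

// ^ and $: anchor the match to the start or end of the string.
const startsWithCat = /^cat/.test(words);     // true
const endsWithCat = /cat$/.test(words);       // false: the string ends with "cats"

// \b: "cat" as a whole word only.
const wholeWord = words.match(/\bcat\b/g);    // ["cat"]

// \B: "cat" that does NOT start at a word boundary (inside "concat").
const insideIndex = /\Bcat/.exec(words).index; // 7

// (?=...): digits only when followed by "px"; "px" itself is not consumed.
const sizes = "10px 20em 30px";
const pixels = sizes.match(/\d+(?=px)/g);     // ["10", "30"]

// (?!...): a "q" that is not followed by "u".
const iraq = /q(?!u)/.test("Iraq");           // true
const quit = /q(?!u)/.test("quit");           // false
```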