Regex Cheat Sheet
Regular Expression Quick Reference Cheat Sheet
A quick start regex cheat sheet reference guide for regular expressions, including regex syntax, symbols, ranges, grouping, assertions, Unicode handling, and some practical examples.
Here's a quick regular expressions cheat sheet with examples to get started:
- Basic Characters:
  - .: Matches any character except newline. Example: a.c matches abc, adc.
  - \w: Matches a word character (letters, digits, _). Example: \w+ matches hello123.
  - \d: Matches any digit (0-9). Example: \d+ matches 123.
  - \s: Matches whitespace (space, tab, newline). Example: \s+ matches the spaces in hello world.
- Anchors:
  - ^: Matches the start of a string. Example: ^hello matches hello world.
  - $: Matches the end of a string. Example: world$ matches hello world.
- Quantifiers:
  - *: Matches 0 or more occurrences. Example: a* matches aaa, a, or nothing.
  - +: Matches 1 or more occurrences. Example: a+ matches aaa, a, but not the empty string.
  - {n}: Matches exactly n occurrences. Example: a{3} matches aaa.
- Groups:
  - (abc): Captures abc as a group.
  - (?:abc): Matches abc without capturing.
  - (?<name>abc): Captures abc and names it name.

Regular expressions (regex) are powerful tools for text matching and manipulation. This cheat sheet doubles as a quick-start regex tutorial covering regex patterns, regex syntax, and some practical applications. Whether you need a Python regex, Java regex, or JavaScript regex, it is a good starting point for beginners. Check the compatibility tables later on this page before pulling your hair out wondering why \w does not work in your Bash script or why \p{Devanagari} is not working in your JavaScript regex.
Regex Features and Examples
Character Classes
Sample text: "John.Doe@techearl.com, 123-456-7890, 2024-01-15"

| Pattern | Description | Example | Match |
|---|---|---|---|
| . | Any character except newline. | J.h | Joh in "John.Doe" |
| \w | Word character (letters, digits, underscore) | \w+ | John, Doe, techearl, com, 123, 456, 7890, 2024, 01, 15 |
| \d | Digit (0-9) | \d+ | 123, 456, 7890, 2024, 01, 15 |
| \s | Whitespace (space, tab, newline) | \s+ | Spaces after commas in "John.Doe@techearl.com, 123-456-7890, 2024-01-15" |
| [abc] | Matches a, b, or c. | [abc] | c in "techearl.com", a in "techearl.com" |
| [^abc] | Matches anything except a, b, c. | [^abc] | J, o, h, n, ., D, e, etc. |
| [a-zA-Z] | Matches any letter. | [a-zA-Z]+ | John, Doe, techearl, com |
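
The character classes above can be checked against the same sample text. This is a minimal sketch using Python's built-in re module; the variable names are illustrative and not part of the original cheat sheet.

```python
import re

text = "John.Doe@techearl.com, 123-456-7890, 2024-01-15"

# findall returns every non-overlapping match, scanning left to right.
words = re.findall(r"\w+", text)
digits = re.findall(r"\d+", text)
letters = re.findall(r"[a-zA-Z]+", text)

# search returns only the first match; "." stands for any one character.
first_dot = re.search(r"J.h", text).group()

print(words)
print(digits)
print(letters)
print(first_dot)
```
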
Anchors and Boundaries
Sample text: "SuperHero saves the day! Not so super villain"

| Pattern | Description | Example Regex | Match |
|---|---|---|---|
| ^ | Start of string, or start of line in multi-line mode | ^Super | Matches Super at start in "SuperHero saves the day!" |
| $ | End of string, or end of line in multi-line mode | villain$ | Matches villain at end in "Not so super villain" |
| \A | Start of string (not affected by multi-line mode) | \ASuper | Only matches Super at very start in "SuperHero saves the day!" |
| \z | End of string (strict match) | villain\z | Only matches villain at very end in "Not so super villain" |
| \Z | End of string, ignoring trailing newline | villain\Z | Matches villain in both "Not so super villain" and "Not so super villain\n" |
| \G | Start of match or end of previous match | \G\w+\s* | Matches words consecutively: SuperHero, saves, the, day |
| \b | Word boundary | \bsuper\b | Matches super in "Not so super villain" but not in "SuperHero" |
| \b | Word boundary (Unicode - requires unicode flag) | \b\p{L}+\b | Matches привет, café, 안녕 in "привет café 안녕" (with unicode flag enabled) |
| \B | Not a word boundary | \BHero | Matches Hero in "SuperHero" (the position between r and H is not a word boundary) but not in "Hero saves" |
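
A short sketch of the anchors and boundaries that Python's re module supports, run against the sample text above. The second string, lines, is a made-up example added here to show multi-line mode.

```python
import re

text = "SuperHero saves the day! Not so super villain"

# ^ and $ anchor to the start and end of the whole string by default.
starts_with_super = re.search(r"^Super", text) is not None
ends_with_villain = re.search(r"villain$", text) is not None

# \b requires a word boundary on each side, so only the standalone word matches.
whole_word = re.findall(r"\bsuper\b", text)

# \B requires that the position is NOT a boundary: Hero must follow a word character.
inside_word = re.findall(r"\BHero", text)

# With re.M (multi-line mode), ^ also matches right after each newline.
lines = "cat\ndog\ncat"
single_line = re.findall(r"^cat", lines)
multi_line = re.findall(r"^cat", lines, re.M)
```
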
Quantifiers
| Pattern | Description | Example | Match |
|---|---|---|---|
| * | 0 or more occurrences. | ba* | b, ba, baa |
| + | 1 or more occurrences. | ba+ | ba, baa |
| ? | 0 or 1 occurrence. | ba? | b, ba |
| {n} | Exactly n occurrences. | a{3} | aaa |
| {n,} | n or more occurrences. | a{2,} | aa, aaa |
| {n,m} | Between n and m occurrences. | a{1,3} | a, aa, aaa |
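
The quantifier rows can be verified with a small helper. The sketch below uses Python's re.fullmatch so that the whole string has to be consumed; the helper name matches is hypothetical. The last two lines illustrate the difference between a greedy quantifier and its lazy form (the same quantifier followed by ?), using a made-up HTML snippet.

```python
import re

def matches(pattern: str, s: str) -> bool:
    """Return True if the pattern matches the entire string."""
    return re.fullmatch(pattern, s) is not None

results = {
    "ba* on b": matches(r"ba*", "b"),         # zero a's is allowed
    "ba+ on b": matches(r"ba+", "b"),         # at least one a is required
    "ba? on baa": matches(r"ba?", "baa"),     # at most one a is allowed
    "a{3} on aaa": matches(r"a{3}", "aaa"),
    "a{2,} on a": matches(r"a{2,}", "a"),
    "a{1,3} on aaaa": matches(r"a{1,3}", "aaaa"),
}

# Greedy quantifiers take as much as possible; adding ? makes them take as little as possible.
greedy = re.search(r"<.+>", "<b>bold</b>").group()
lazy = re.search(r"<.+?>", "<b>bold</b>").group()
```
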
Groups and Capturing
| Pattern | Description | Example | Match |
|---|---|---|---|
| (abc) | Capturing group. | (cat) | Matches cat. |
| (?:abc) | Non-capturing group. | (?:cat) | Matches cat without capturing. |
| (?<name>abc) | Named capturing group (JavaScript, Java, .NET, PCRE syntax; Python writes it as (?P<name>abc)). | (?<animal>cat) | Captures cat as animal. |
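
A minimal sketch of the three group types in Python. Note that Python's re module spells a named group as (?P<name>...), so the pattern differs slightly from the table row; the sample strings are made up for illustration.

```python
import re

# Named group: Python's spelling is (?P<name>...).
m = re.search(r"(?P<animal>cat)", "the cat sat")
by_name = m.group("animal")
by_index = m.group(1)  # named groups are still numbered

# A capturing group is reported by groups(); a non-capturing group is not.
capturing = re.search(r"(cat)s", "cats").groups()
non_capturing = re.search(r"(?:cat)s", "cats").groups()
```
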
Lookaheads and Lookbehinds
| Pattern | Description | Example Regex | Match |
|---|---|---|---|
| (?=abc) | Positive lookahead. | \d(?= dollars) | 5 in 5 dollars. |
| (?!abc) | Negative lookahead. | \d(?! dollars) | 5 in 5 euros. |
| (?<=abc) | Positive lookbehind. | (?<=\$)\d+ | 10 in $10. |
| (?<!abc) | Negative lookbehind. | (?<!\$)\d+ | 20 in 20 euros. |
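
The lookaround examples can be reproduced in Python as follows. Lookarounds check the surrounding text without including it in the match, which is why only the digits are returned. The last line is an extra caveat added here: a negative lookbehind only inspects the character immediately before the match, so it can still match in the middle of a number.

```python
import re

pos_ahead = re.findall(r"\d(?= dollars)", "5 dollars")
neg_ahead = re.findall(r"\d(?! dollars)", "5 euros")
neg_ahead_blocked = re.findall(r"\d(?! dollars)", "5 dollars")

pos_behind = re.findall(r"(?<=\$)\d+", "$10")
neg_behind = re.findall(r"(?<!\$)\d+", "20 euros")

# Caveat: in "$10" the 0 is preceded by 1, not by $, so it still matches.
neg_behind_partial = re.findall(r"(?<!\$)\d+", "$10")
```
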
Flags and Modifiers
| Flag | Description | Example | Effect |
|---|---|---|---|
| g | Global match. | /cat/g | Finds all cat instances. |
| i | Case-insensitive match. | /cat/i | Matches Cat, CAT. |
| m | Multiline mode. | /^cat/m | Matches cat at line start. |
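
The /cat/g notation above is JavaScript-style. As a rough Python equivalent, flags are passed as constants (re.I, re.M), and there is no g flag: re.search returns the first match, while re.findall returns all of them. The sample text is made up for illustration.

```python
import re

text = "Cat\ncat\nCAT"

first_only = re.search(r"cat", text, re.I).group()   # like /cat/i without g
all_exact = re.findall(r"cat", text)                  # like /cat/g
all_any_case = re.findall(r"cat", text, re.I)         # like /cat/gi

# Without re.M, ^ only matches at the very start of the string.
line_start_default = re.findall(r"^cat", text)
line_start_multiline = re.findall(r"^cat", text, re.M)

# Flags combine with the | operator.
combined = re.findall(r"^cat", text, re.I | re.M)
```
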
PCRE Control Verbs
Control verbs are advanced features specific to PCRE (Perl-Compatible Regular Expressions). They allow you to manipulate the regex engine's backtracking and matching behavior directly, providing a level of control unavailable in most other regex engines. These verbs can optimize performance, enforce logic, or debug complex patterns by altering how the engine handles matches and failures.
Below is a brief introduction to each control verb and its function.
| Control Verb | Description |
|---|---|
| (*COMMIT) | Prevents backtracking past this point. If the match fails after this point, the whole match fails and no later starting positions are tried. |
| (*PRUNE) | If the match fails after this point, the attempt at the current starting position is abandoned and the engine moves on to the next starting position. |
| (*SKIP) | Like (*PRUNE), but the next attempt starts at the position where (*SKIP) was reached instead of at the next character. |
| (*FAIL) | Forces an immediate match failure at the current position, which triggers backtracking. |
| (*ACCEPT) | Immediately ends the current match as successful, ignoring the remaining pattern. |
These verbs provide a high degree of control over how the regex engine processes and evaluates patterns, making them powerful tools for optimizing regex performance and logic in PCRE.
Advanced Examples
Validate an Email Address
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
Matches: user@example.com, hello.world@domain.org.
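
A minimal sketch of how this pattern might be wrapped in a validation helper in Python. The function name is_valid_email is hypothetical. The sketch uses fullmatch because in Python a trailing $ also matches just before a final newline; this is a syntax check only and does not prove the address exists.

```python
import re

EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

def is_valid_email(address: str) -> bool:
    """Return True if the address has the shape local@domain.tld."""
    return EMAIL_RE.fullmatch(address) is not None

checks = {
    "user@example.com": is_valid_email("user@example.com"),
    "hello.world@domain.org": is_valid_email("hello.world@domain.org"),
    "no-at-sign.example.com": is_valid_email("no-at-sign.example.com"),
    "user@localhost": is_valid_email("user@localhost"),        # no dot plus TLD
    "trailing newline": is_valid_email("user@example.com\n"),  # rejected by fullmatch
}
```
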
Match a Phone Number
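^\+?(?:\d{1,3})?[-.\s]?\(?\d{1,4}\)?[-.\s]?\d{1,4}[-.\s]?\d{1,9}$
Matches: +1-800-555-5555, (800) 555-5555.
The country code is wrapped in an optional group, (?:\d{1,3})?, because \d{1,3}? is a lazy quantifier that still requires at least one digit and would therefore reject a number starting with a parenthesis.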
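
The sketch below checks the two sample numbers in Python. It uses a form of the pattern in which the country code is an optional group, (?:\d{1,3})?, since a lazy \d{1,3}? would still demand one digit at that position. The helper name looks_like_phone is hypothetical, and the pattern is deliberately loose: it checks shape, not whether a number is real.

```python
import re

PHONE_RE = re.compile(
    r"^\+?(?:\d{1,3})?[-.\s]?\(?\d{1,4}\)?[-.\s]?\d{1,4}[-.\s]?\d{1,9}$"
)

def looks_like_phone(value: str) -> bool:
    """Return True if the value has a plausible phone-number shape."""
    return PHONE_RE.fullmatch(value) is not None

international = looks_like_phone("+1-800-555-5555")
with_parens = looks_like_phone("(800) 555-5555")
not_a_number = looks_like_phone("not a number")
```
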
Extract URLs
https?:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}(\/\S*)?
Matches: https://example.com, http://domain.org/path.
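
A short sketch of URL extraction in Python with this pattern. One detail worth knowing: because the pattern contains a capturing group, re.findall returns only that group, so the sketch uses finditer and group(0) to get the full URLs. The sample sentence is made up for illustration.

```python
import re

URL_RE = re.compile(r"https?:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}(\/\S*)?")

text = "Docs at https://example.com and http://domain.org/path today"

# group(0) is the whole match, regardless of any capturing groups.
urls = [m.group(0) for m in URL_RE.finditer(text)]

# findall returns the contents of the single capturing group instead.
paths_only = URL_RE.findall(text)
```
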
Replace Multiple Spaces with One
\s{2,}
Replace with a single space to clean up text like "Hello    world" into "Hello world".
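
In Python the replacement can be done with re.sub, as in the sketch below. Since \s also matches newlines, runs of blank lines are collapsed too; the second pattern, which targets literal spaces only, is an alternative added here for cases where line breaks should survive.

```python
import re

cleaned = re.sub(r"\s{2,}", " ", "Hello    world")

# \s includes newlines, so paragraph breaks are flattened as well.
flattened = re.sub(r"\s{2,}", " ", "line one\n\nline two")

# Matching only literal spaces leaves the line breaks untouched.
spaces_only = re.sub(r" {2,}", " ", "a  b\n\nc")
```
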
Find Duplicate Words
\b(\w+)\s+\1\b
Matches: the the, hello hello.
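
The backreference \1 repeats whatever the first group captured. A minimal Python sketch, using a made-up sentence, that lists the duplicated words and then removes the repeats:

```python
import re

DUP_RE = re.compile(r"\b(\w+)\s+\1\b")

text = "the the cat said hello hello"

# group(1) is the word that was repeated.
duplicates = [m.group(1) for m in DUP_RE.finditer(text)]

# Replacing each match with the captured word keeps a single copy.
deduplicated = DUP_RE.sub(r"\1", text)
```

The match is case-sensitive, so "The the" would need the re.I flag.
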
Zero Punctuation and Regex in Action
Regex can also validate what a string does not contain. For example, this pattern accepts only strings made up of word characters and whitespace, which rules out punctuation (apart from the underscore, which \w includes):
^[\w\s]+$
Matches: Hello World but not Hello, World!.
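
A quick Python check of this pattern; the helper name has_no_punctuation is hypothetical. Two edge cases are included: the underscore counts as a word character, and the empty string is rejected because + requires at least one character.

```python
import re

NO_PUNCT_RE = re.compile(r"^[\w\s]+$")

def has_no_punctuation(s: str) -> bool:
    """Return True if s contains only word characters and whitespace."""
    return NO_PUNCT_RE.search(s) is not None

plain = has_no_punctuation("Hello World")
punctuated = has_no_punctuation("Hello, World!")
underscore = has_no_punctuation("snake_case")
empty = has_no_punctuation("")
```
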
Regex Compatibility Tables
A quick compatibility reference for regular expressions across major engines including PCRE2 (PHP), ECMAScript 2024 (JavaScript), Python, Golang, Java, .NET, Rust, Ruby and POSIX, covering character classes, Unicode support and advanced features.
| Pattern | Description | PCRE2 | ECMAScript 2024 | Python >=3.9 | Golang | Java 17 | .NET 8.0 | Rust | Ruby 3.0+ | POSIX |
|---|---|---|---|---|---|---|---|---|---|---|
| . | Any character except newline. With 's', includes newlines. | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| \w | Word character (letters, digits, underscore). | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| \W | Non-word character (inverse of \w). | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| \d | Digit (0–9). | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| \D | Non-digit (inverse of \d). | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| \s | Whitespace (spaces, tabs, newlines, etc.). | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| \S | Non-whitespace (inverse of \s). | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |

Note: Check Unicode modes for differences in \w, \d, etc. POSIX does not support these shorthands.

| Pattern | Description | PCRE2 | ECMAScript 2024 | Python >=3.9 | Golang | Java 17 | .NET 8.0 | Rust | Ruby 3.0+ | POSIX |
|---|---|---|---|---|---|---|---|---|---|---|
| ^ | Start of string, or start of line in multi-line mode | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| $ | End of string, or end of line in multi-line mode | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| \A | Start of string (not affected by multi-line mode) | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| \z | End of string (strict match; Python before 3.14 uses \Z for this) | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| \Z | End of string, ignoring trailing newline (in Python, \Z is strict and does not ignore the newline) | ✓ | ✗ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ |
| \b | Word boundary | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| \B | Not a word boundary | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| \< | Start of word (GNU/POSIX extension) | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| \> | End of word (GNU/POSIX extension) | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |

| Pattern | Description | PCRE2 | ECMAScript 2024 | Python >=3.9 | Golang | Java 17 | .NET 8.0 | Rust | Ruby 3.0+ | POSIX |
|---|---|---|---|---|---|---|---|---|---|---|
| * | 0 or more occurrences. | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| + | 1 or more occurrences. | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| ? | 0 or 1 occurrences. | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| {n} | Exactly n occurrences. | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| {n,} | n or more occurrences. | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| {n,m} | Between n and m occurrences. | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| ? | Makes quantifiers lazy (e.g., .+?, .*?). | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |

| Pattern | Description | PCRE2 | ECMAScript 2024 | Python >=3.9 | Golang | Java 17 | .NET 8.0 | Rust | Ruby 3.0+ | POSIX |
|---|---|---|---|---|---|---|---|---|---|---|
| g | Global match (find all matches) | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| m | Multi-line mode | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| i | Case-insensitive matching | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| x | Ignore whitespace (verbose mode) | ✓ | ✗ | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ | ✗ |
| s | Dot matches newline (Ruby uses the m flag for this) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ |
| u | Unicode mode | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ | ✗ |
| X | Enable additional syntax features (PCRE-specific) | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| U | Ungreedy matching (inverts greediness; in Python, re.U means Unicode instead) | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ |
| A | Anchor match to the start of the string | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| J | Allow duplicate group names | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| n | Disable capturing groups (only named groups capture) | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| xx | Ignore all whitespace and comments (PCRE extended) | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |

| Pattern | Description | PCRE2 | ECMAScript 2024 | Python >=3.9 | Golang | Java 17 | .NET 8.0 | Rust | Ruby 3.0+ | POSIX |
|---|---|---|---|---|---|---|---|---|---|---|
| \n | New line (LF) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| \r | Carriage return (CR) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| \t | Tab character | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| \v | Vertical tab | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| \f | Form feed | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| \a | Bell character | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| \e | Escape character | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ |
| \h | Horizontal whitespace character (in Ruby, \h matches a hex digit instead) | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ |
| \H | Non-horizontal whitespace character (in Ruby, \H matches a non-hex-digit instead) | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ |
| \uFFFF | Unicode character by 4-digit hex code (PCRE2 only with the PCRE2_ALT_BSUX or PCRE2_EXTRA_ALT_BSUX option) | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ | ✗ |
| \x{FFFF} | Unicode character by variable-length hex code | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ |
| \xFF | Character by two-digit hex code | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |

| Pattern | Description | PCRE2 | ECMAScript 2024 | Python >=3.9 | Golang | Java 17 | .NET 8.0 | Rust | Ruby 3.0+ | POSIX |
|---|---|---|---|---|---|---|---|---|---|---|
| (*COMMIT) | No backtracking past this point. | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| (*PRUNE) | Directs the engine to “forget” any backtracking paths at this position. | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| (*SKIP) | Skips the current position and continues matching after the given point. | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| (*FAIL) | Forces an immediate match failure at this position. | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| (*ACCEPT) | Forcibly end the current match as successful right here. | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |

Note: Control verbs (a.k.a. verb directives) are advanced PCRE features that alter backtracking flow.

| Pattern | Description | PCRE2 | ECMAScript 2024 | Python >=3.9 | Golang | Java 17 | .NET 8.0 | Rust | Ruby 3.0+ | POSIX |
|---|---|---|---|---|---|---|---|---|---|---|
| [:upper:] | Uppercase letters | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ |
| [:lower:] | Lowercase letters | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ |
| [:alpha:] | All letters | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ |
| [:digit:] | Digits | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ |
| [:alnum:] | Letters and digits | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ |
| [:space:] | Whitespace | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ |
| [:punct:] | Punctuation | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ |
| [:graph:] | Printable characters except spaces | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ |
| [:print:] | Printable characters including spaces | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ |
| [:xdigit:] | Hexadecimal digits | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ |
| [:cntrl:] | Control characters | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ |

Note: These are POSIX character classes and must be used inside a bracket expression, i.e. [[:upper:]].

| Pattern | Description | PCRE2 | ECMAScript 2024 | Python >=3.9 | Golang | Java 17 | .NET 8.0 | Rust | Ruby 3.0+ | POSIX |
|---|---|---|---|---|---|---|---|---|---|---|
| (...) | Capturing group | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| (?:...) | Non-capturing group | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| (?<name>...) | Named capturing group (exact syntax varies by engine) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| [abc] | Character set matching a, b, or c | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| [^abc] | Negated set matching everything except a, b, or c | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| [a-q] | Range from a to q | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| [A-Q] | Range from A to Q | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| [0-7] | Range of digits 0 through 7 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
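
Note: Named-group syntax varies by engine. Python uses (?P<name>...); JavaScript, Java, .NET, and Ruby use (?<name>...); PCRE2 and Rust accept both, as does Go from version 1.22 (earlier Go versions accept only (?P<name>...)).
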
| Pattern | Description | PCRE2 | ECMAScript 2024 | Python >=3.9 | Golang | Java 17 | .NET 8.0 | Rust | Ruby 3.0+ | POSIX |
|---|---|---|---|---|---|---|---|---|---|---|
| (?=...) | Positive lookahead | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ |
| (?!...) | Negative lookahead | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ |
| (?<=...) | Positive lookbehind | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ |
| (?<!...) | Negative lookbehind | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ |
| (?>...) | Atomic group (once-only subexpression; Python 3.11+) | ✓ | ✗ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ |
| (?#...) | Inline comment ignored by engine | ✓ | ✗ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ |

| Pattern | Description | PCRE2 | ECMAScript 2024 | Python >=3.9 | Golang | Java 17 | .NET 8.0 | Rust | Ruby 3.0+ | POSIX |
|---|---|---|---|---|---|---|---|---|---|---|
| \p{L} | Any letter from any language | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| \p{M} | Marks (accents, diacritics) | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| \p{N} | Any numeric character | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| \p{Z} | Separator characters (spaces, etc.) | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| \p{Han} | Han script (Chinese characters, also used in Japanese and Korean) | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ |
| \p{Devanagari} | Devanagari script (e.g., Hindi, Sanskrit) | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ |
| \p{Cyrillic} | Cyrillic script (e.g., Russian) | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ |
| \p{Arabic} | Arabic script | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ |
| \p{Tamil} | Tamil script | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ |
| \p{Greek} | Greek script | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ |
| \p{Hebrew} | Hebrew script | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ |
| \p{Thai} | Thai script | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ |
| \p{Emoji} | Emoji characters | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ |

Note: The Python column refers to the built-in re module, which has no \p{...} support; the third-party regex module provides it. ECMAScript requires the u or v flag and writes scripts as \p{Script=Han} rather than \p{Han}. Java writes scripts with an Is prefix, e.g. \p{IsHan}. .NET supports general categories and named blocks such as \p{IsCyrillic}, but not scripts.

Tools for Testing and Debugging Regex
- Regex101: Interactive online regex tester.
- RegExr: Explore and test regular expressions visually.
- Regex Cheat Sheets: Downloadable PDFs for quick reference.

Ishan Karunaratne
Ishan Karunaratne is the Chief Technology Officer at Rehab Media Group and an accomplished software and DevOps engineer, known for delivering innovative solutions, optimizing workflows, and leading teams to achieve technical excellence.
