What is Raw String Notation in Python regular expression?
Raw string notation in Python regular expressions helps avoid backslash conflicts by treating backslashes as literal characters rather than escape sequences. This is especially important when writing regex patterns that contain backslashes.
The Problem with Regular Strings
In regular Python strings, backslashes have special meaning for escape sequences like \n (newline) and \t (tab). This collides with regex syntax, which also relies on backslashes, so a backslash meant for the regex engine has to be written as \\ in a regular string:
import re
# Without raw string - the backslash must be doubled
# ("\d+" happens to work too, but Python warns about the invalid escape \d)
pattern = "\\d+"
text = "There are 123 items"
matches = re.findall(pattern, text)
print("Without raw string:", matches)
Without raw string: ['123']
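To see why the doubling is needed, it can help to inspect what Python actually stores for a string literal before the regex engine ever sees it. The following is a minimal sketch; the variable names are illustrative only.

```python
# Python processes escape sequences before the re module sees the pattern.
newline = "\n"        # one character: a line feed
two_chars = "\\n"     # two characters: a backslash followed by 'n'

print(len(newline))    # 1
print(len(two_chars))  # 2

# The doubled backslash collapses to a single one in the stored string,
# so the regex engine receives \d+ (three characters).
pattern = "\\d+"
print(len(pattern))    # 3
print(pattern)         # \d+
```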
Using Raw String Notation
Raw strings are prefixed with r and keep backslashes as literal characters, so the pattern reaches the regex engine exactly as written. This makes regex patterns cleaner and more readable:
import re
# With raw string - single backslash works
pattern = r"\d+"
text = "There are 123 items"
matches = re.findall(pattern, text)
print("With raw string:", matches)
With raw string: ['123']
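The benefit grows when the pattern must match a literal backslash, for example in a Windows-style path. Here is a rough sketch, using a made-up path as sample text:

```python
import re

# Sample text containing literal backslashes (hypothetical path)
text = r"C:\Users\docs"

# The regex needs \\ to match one backslash;
# a regular string has to double each of those again.
regular = "\\\\"
raw = r"\\"

print(regular == raw)               # True
print(len(re.findall(raw, text)))   # 2
print(re.split(raw, text))          # ['C:', 'Users', 'docs']
```

Both spellings produce the same two-character pattern, but the raw version is much easier to read and to check by eye.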
Syntax
The general syntax for raw string notation in regex patterns is:
import re
pattern = r"your_regex_pattern"
result = re.search(pattern, text)
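As a concrete instance of this template, here is a small sketch that searches for an ISO-style date. The sample text and the date pattern are illustrative assumptions, not part of the syntax itself.

```python
import re

pattern = r"\d{4}-\d{2}-\d{2}"
text = "Released on 2024-01-15"

result = re.search(pattern, text)
# re.search returns a match object, or None when nothing matches
if result:
    print("Matched:", result.group())   # Matched: 2024-01-15
```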
Finding Email Addresses
Here's a practical example using raw strings to match email addresses:
import re
# Email pattern using raw string
pattern = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"
# Sample text
text = "Contact us at support@example.com or admin@test.org"
# Find all email addresses
emails = re.findall(pattern, text)
print("Found emails:", emails)
Found emails: ['support@example.com', 'admin@test.org']
Finding Word Boundaries
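Raw strings matter most with the word boundary \b. In a regular string, "\b" is the backspace character, so the boundary would never reach the regex engine. With a raw string it works as intended: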
import re
# Word boundary pattern
pattern = r"\b\w+\b"
# Sample text
text = "Hello, Python world!"
# Find all complete words
words = re.findall(pattern, text)
print("Words found:", words)
Words found: ['Hello', 'Python', 'world']
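For contrast, the sketch below runs the same kind of search with and without the r prefix. It uses a fixed word instead of \w+ so that the regular string contains no other escape sequences; that simplification is an assumption made for the demonstration.

```python
import re

text = "Hello, Python world!"

regular = "\bPython\b"   # "\b" becomes the backspace character (ASCII 8)
raw = r"\bPython\b"      # \b reaches the regex engine as a word boundary

print(len(regular), len(raw))      # 8 10
print(re.findall(regular, text))   # []
print(re.findall(raw, text))       # ['Python']
```

The regular string fails silently: no error is raised, and the pattern simply looks for backspace characters that the text does not contain.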
Comparison
| String Type | Pattern | Readability | Maintenance |
|---|---|---|---|
| Regular String | "\\d+" | Confusing | Error-prone |
| Raw String | r"\d+" | Clear | Easy |
Conclusion
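Raw string notation with r"" is the preferred way to write regex patterns in Python. It removes the extra layer of backslash escaping that regular string literals impose, so the pattern you type is the pattern the regex engine receives, which makes patterns more readable and maintainable.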
