What is the difference between '.', '?' and '*' in Python regular expressions?
This article will discuss the difference between dot '.', question mark '?', and asterisk '*' in Python regular expressions.
In regular expressions, '.' matches any one character except a newline (unless the re.DOTALL flag is set), '*' matches zero or more occurrences of the preceding character or group, and '?' matches zero or one occurrence of the preceding character or group. When placed directly after another quantifier (as in '*?' or '+?'), the question mark instead makes that quantifier non-greedy.
Let's understand these symbols using practical examples:
Usage of Dot (.)
The dot '.' matches any single character except a newline:
import re

# Define text or string here
txt = "cat cot cut"

# Match with the pattern
pat = r"c.t"

# Use findall method
res = re.findall(pat, txt)

# Print the result
print(res)
The output of the above code is:
['cat', 'cot', 'cut']
The pattern c.t matches any three-character string starting with 'c' and ending with 't', with any single character in between.
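The "except a newline" part of this rule can be seen in a short sketch. The sample string below is made up for illustration; re.DOTALL is the standard flag that lets the dot match newlines as well.

```python
import re

# Hypothetical sample text: 'c' and 't' separated by a newline, then "cat"
txt = "c\nt cat"

# By default, '.' does not match the newline between 'c' and 't'
default_matches = re.findall(r"c.t", txt)

# With re.DOTALL, '.' matches the newline too
dotall_matches = re.findall(r"c.t", txt, flags=re.DOTALL)

print(default_matches)  # ['cat']
print(dotall_matches)   # ['c\nt', 'cat']
```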
Usage of Question Mark (?)
The question mark '?' makes the preceding character optional, meaning it can occur 0 or 1 time:
import re

# Define text or string here
txt = "color colour"

# Match with the pattern
pat = r"colou?r"

# Use findall method
res = re.findall(pat, txt)

# Print the result
print(res)
The output of the above code is:
['color', 'colour']
The pattern colou?r matches both "color" (where 'u' appears 0 times) and "colour" (where 'u' appears 1 time).
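The question mark can also follow a group, in which case the whole group becomes optional. The following is a minimal sketch with made-up sample text; it assumes a non-capturing group (?:...) and the word-boundary anchor \b, neither of which appears in the example above.

```python
import re

# Hypothetical sample text
txt = "Nov November Novel"

# (?:ember)? makes the whole group "ember" optional;
# \b requires a word boundary on both sides, so "Novel" is rejected
pat = r"\bNov(?:ember)?\b"

res = re.findall(pat, txt)
print(res)  # ['Nov', 'November']
```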
Usage of Asterisk (*)
The asterisk '*' matches zero or more occurrences of the preceding character:
import re

# Define text or string here
txt = "ab aabb aaabbb"

# Match with the pattern
pat = r"a*b"

# Use findall method
res = re.findall(pat, txt)

# Print the result
print(res)
The output of the above code is:
['ab', 'aab', 'b', 'aaab', 'b', 'b']
The pattern a*b matches a 'b' preceded by zero or more 'a' characters, which is why each leftover 'b' in the text also matches on its own.
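Because '*' applies only to the element immediately before it, repeating a single character and repeating a group give different results. The sketch below uses made-up sample text and a non-capturing group (?:...) to contrast the two.

```python
import re

# Hypothetical sample text
txt = "c abc ababc"

# '*' applies only to 'b': an 'a', then zero or more 'b', then 'c'
char_res = re.findall(r"ab*c", txt)

# '*' applies to the whole group: zero or more "ab", then 'c'
group_res = re.findall(r"(?:ab)*c", txt)

print(char_res)   # ['abc', 'abc']
print(group_res)  # ['c', 'abc', 'ababc']
```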
Comparison
| Symbol | Function | Matches | Example Pattern | Example Match |
|---|---|---|---|---|
| . | Any single character | Exactly one character | c.t | cat, cot, cut |
| ? | Zero or one occurrence | 0 or 1 of preceding | colou?r | color, colour |
| * | Zero or more occurrences | 0+ of preceding | a*b | b, ab, aab, aaab |
Key Points
. (dot) matches any single character excluding newlines
* is a quantifier that matches zero or more of the preceding element
? is a quantifier that matches zero or one of the preceding element
? can also be used to make quantifiers non-greedy (e.g., *?, +?)
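The non-greedy behaviour mentioned in the last point can be illustrated with a small sketch. The tag-like sample text is invented for illustration; the patterns combine the dot, asterisk, and question mark discussed in this article.

```python
import re

# Hypothetical sample text
txt = "<b>bold</b> and <i>italic</i>"

# Greedy: '.*' consumes as much as possible, up to the last '>'
greedy = re.findall(r"<.*>", txt)

# Non-greedy: '.*?' stops at the first '>' it can reach
lazy = re.findall(r"<.*?>", txt)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```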
Conclusion
The dot '.' is a character matcher, while '?' and '*' are quantifiers that control repetition. Understanding these symbols is essential for writing effective regular expressions in Python.
