Python re Module (Regular Expressions)
What Is the Python re Module?
The re module in Python provides support for regular expressions (RegEx). A regular expression is a sequence of characters that defines a search pattern. It is used for string matching, pattern searching, data validation, and string manipulation.
Why Use Regular Expressions in Python?
- To search and extract patterns (e.g., emails, phone numbers)
- To validate inputs (e.g., password format, URL structure)
- To split, replace, or clean text using patterns
Importing the re Module
import re
Commonly Used re Functions in Python
| Function | Description |
|---|---|
| re.match() | Checks for a match only at the beginning of the string |
| re.search() | Searches the entire string for the first match |
| re.findall() | Returns a list of all non-overlapping matches |
| re.finditer() | Returns an iterator yielding match objects |
| re.sub() | Replaces matches with a string or with the return value of a function |
| re.split() | Splits a string by the matched pattern |
| re.compile() | Compiles a pattern into a regex object |
1. re.match() – Match at the Beginning
import re
result = re.match(r'Hello', 'Hello World')
print(result.group())
Output:
Hello
If the pattern does not occur at the very beginning of the string, match() returns None, and calling result.group() on that result raises AttributeError.
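Because match() can return None, it is safer to check the result before calling group(). A minimal sketch, where the variable name `message` is only illustrative:

```python
import re

# 'World' occurs in the string, but not at position 0,
# so re.match() returns None instead of a match object.
result = re.match(r'World', 'Hello World')

if result is None:
    message = 'no match at the start'
else:
    message = result.group()

print(message)  # no match at the start
```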
2. re.search() – Search for a Pattern Anywhere
import re
result = re.search(r'World', 'Hello World')
print(result.group())
Output:
World
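search() stops at the first match, even when the pattern occurs several times. The short sketch below shows this using the match object's span() method, which reports where the match was found:

```python
import re

# 'o' occurs twice in the string (indexes 4 and 7);
# re.search() reports only the first occurrence.
result = re.search(r'o', 'Hello World')

found = result.group() if result else None
position = result.span() if result else None

print(found)     # o
print(position)  # (4, 5)
```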
3. re.findall() – Find All Matches
import re
text = 'Email: test1@gmail.com and test2@yahoo.com'
emails = re.findall(r'\S+@\S+', text)
print(emails)
Output:
['test1@gmail.com', 'test2@yahoo.com']
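re.finditer() appears in the function table but has no example of its own. As a rough sketch, it can be applied to the same text to get match objects, which carry the position of each match as well as the matched text:

```python
import re

text = 'Email: test1@gmail.com and test2@yahoo.com'

# Each item yielded by finditer() is a match object,
# so both the text and its start index are available.
positions = [(m.group(), m.start()) for m in re.finditer(r'\S+@\S+', text)]

print(positions)
# [('test1@gmail.com', 7), ('test2@yahoo.com', 27)]
```

finditer() is also the better choice for large inputs, since it yields matches lazily instead of building the whole list at once.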
4. re.sub() – Replace Pattern in String
import re
text = "Hello 123, this is 456"
new_text = re.sub(r'\d+', '#', text)
print(new_text)
Output:
Hello #, this is #
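re.sub() accepts more than a fixed replacement string. The sketch below illustrates three common variations on the example above: limiting the number of replacements with `count`, reusing the matched text with `\g<0>`, and computing the replacement with a function.

```python
import re

text = "Hello 123, this is 456"

# count=1 replaces only the first match.
first_only = re.sub(r'\d+', '#', text, count=1)
print(first_only)  # Hello #, this is 456

# \g<0> in the replacement refers to the whole match.
bracketed = re.sub(r'\d+', r'[\g<0>]', text)
print(bracketed)  # Hello [123], this is [456]

# A callable receives each match object and returns the replacement.
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), text)
print(doubled)  # Hello 246, this is 912
```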
5. re.split() – Split String Using Pattern
import re
text = "one,two;three four"
parts = re.split(r'[;, ]', text)
print(parts)
Output:
['one', 'two', 'three', 'four']
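The character class above matches exactly one delimiter at a time, so consecutive delimiters produce empty strings in the result. The sketch below uses a slightly messier input (an assumption for illustration) to show the effect, and how adding `+` or `maxsplit` changes the outcome:

```python
import re

text = "one, two;  three four"

# One delimiter per split: adjacent delimiters leave empty strings.
single = re.split(r'[;, ]', text)
print(single)  # ['one', '', 'two', '', '', 'three', 'four']

# '+' treats a run of delimiters as a single separator.
merged = re.split(r'[;, ]+', text)
print(merged)  # ['one', 'two', 'three', 'four']

# maxsplit limits how many splits are performed.
head_tail = re.split(r'[;, ]+', text, maxsplit=1)
print(head_tail)  # ['one', 'two;  three four']
```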
6. re.compile() – Compile and Reuse Pattern
import re
pattern = re.compile(r'\d+')
matches = pattern.findall('Item1 = 10, Item2 = 20')
print(matches)
Output:
['1', '10', '2', '20']
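A compiled pattern exposes the same operations as the module-level functions (search, findall, sub, and so on) and can carry flags such as re.IGNORECASE. A small sketch, with the pattern name `item_name` chosen only for illustration:

```python
import re

# Compile once with a flag, then reuse the object for several operations.
item_name = re.compile(r'item\d+', re.IGNORECASE)
text = 'Item1 = 10, Item2 = 20'

names = item_name.findall(text)
first = item_name.search(text).group()
renamed = item_name.sub('X', text)

print(names)    # ['Item1', 'Item2']
print(first)    # Item1
print(renamed)  # X = 10, X = 20
```

Compiling is mainly a readability and reuse convenience when the same pattern is applied in many places.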
Python Regular Expression Patterns (RegEx Syntax)
| Pattern | Description |
|---|---|
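| . | Any character except newline (any character at all with re.DOTALL) |
| ^ | Beginning of string (or of each line with re.MULTILINE) |
| $ | End of string, or just before a trailing newline (end of each line with re.MULTILINE) |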
| * | 0 or more repetitions |
| + | 1 or more repetitions |
| ? | 0 or 1 repetition |
| {n} | Exactly n repetitions |
| {n,} | n or more repetitions |
| {n,m} | Between n and m repetitions |
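| [] | Any single character listed in the brackets |
| \d | Digit (0-9; for str patterns, any Unicode decimal digit unless re.ASCII is set) |
| \D | Non-digit |
| \w | Word character (letters, digits, and underscore) |
| \W | Non-word character |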
| \s | Whitespace |
| \S | Non-whitespace |
| \| | OR operator (alternation) |
| () | Capture group |
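Several of the patterns in this table can be tried out in a few lines. The sample strings below are assumptions made up for illustration:

```python
import re

# {n} quantifiers with () capture groups: pull apart a date.
m = re.search(r'(\d{4})-(\d{2})-(\d{2})', 'Released on 2024-01-15.')
date_parts = m.groups()
print(date_parts)  # ('2024', '01', '15')

# | alternation: match either word.
pets = re.findall(r'cat|dog', 'cat, dog, bird')
print(pets)  # ['cat', 'dog']

# ? makes the preceding character optional.
spellings = re.findall(r'colou?r', 'color and colour')
print(spellings)  # ['color', 'colour']

# \w matches letters, digits, and underscore, but not the hyphen.
words = re.findall(r'\w+', 'snake_case, kebab-case')
print(words)  # ['snake_case', 'kebab', 'case']

# ^ and $ anchor the pattern to the start and end of the string.
anchored = bool(re.search(r'^\d+$', '12345'))
unanchored = bool(re.search(r'^\d+$', '123a45'))
print(anchored, unanchored)  # True False
```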
Real-World Example: Extract Phone Numbers
import re
text = "Call me at 9876543210 or 1234567890"
phones = re.findall(r'\d{10}', text)
print("Phone numbers:", phones)
Output:
Phone numbers: ['9876543210', '1234567890']
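One caveat with `\d{10}`: it also matches the first ten digits of any longer run of digits. A possible refinement, sketched below with an assumed sample string, is to add `\b` word boundaries so that only standalone ten-digit numbers are returned:

```python
import re

text = "Call 9876543210, not 123456789012345"

# Without boundaries, part of the 15-digit number is matched too.
loose = re.findall(r'\d{10}', text)
print(loose)  # ['9876543210', '1234567890']

# \b requires a word boundary on both sides of the ten digits.
strict = re.findall(r'\b\d{10}\b', text)
print(strict)  # ['9876543210']
```

Note that `\b` treats letters and underscores as word characters as well, so a number glued to a letter (for example `id9876543210`) would not match the strict pattern.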
Real-World Example: Validate Email Address
import re
email = "user@example.com"
is_valid = re.match(r'^\w+@\w+\.\w+$', email)
print("Valid Email?", bool(is_valid))
Output:
Valid Email? True
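The pattern above is a deliberately simple check and rejects many real addresses, such as those containing dots, hyphens, or subdomains. The sketch below compares it with a somewhat looser pattern; `looser` is a hypothetical pattern written for this illustration, not a complete email validator. It also uses re.fullmatch(), since `$` with re.match() still accepts a trailing newline.

```python
import re

simple = re.compile(r'^\w+@\w+\.\w+$')
# Hypothetical looser pattern: allows '.', '+', '-' in the local part
# and one or more dot-separated labels in the domain.
looser = re.compile(r'[\w.+-]+@[\w-]+(\.[\w-]+)+')

addr = 'first.last@mail.example.com'
simple_ok = bool(simple.match(addr))
looser_ok = bool(looser.fullmatch(addr))
print(simple_ok, looser_ok)  # False True

# '$' matches just before a trailing newline, so match() accepts this input;
# fullmatch() requires the whole string to match and rejects it.
with_newline = 'user@example.com\n'
match_accepts = bool(simple.match(with_newline))
fullmatch_accepts = bool(looser.fullmatch(with_newline))
print(match_accepts, fullmatch_accepts)  # True False
```

For production use, a regex is usually only a first filter; confirming an address typically means sending a message to it.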