Python re Module (Regular Expressions)

What Is the Python re Module?

The re module in Python provides support for regular expressions (RegEx). A regular expression is a sequence of characters that defines a search pattern. The module is used for string matching, pattern searching, data validation, and text manipulation.

Why Use Regular Expressions in Python?

To search and extract patterns (e.g., emails, phone numbers)
To validate inputs (e.g., password format, URL structure)
To split, replace, or clean text using patterns

Importing the re Module

import re

Commonly Used re Functions in Python

Function	Description
re.match()	Checks for a match only at the beginning of the string
re.search()	Searches the entire string for the first match
re.findall()	Returns a list of all non-overlapping matches
re.finditer()	Returns an iterator yielding match objects
re.sub()	Replaces matches with a string or with the result of a function
re.split()	Splits a string at each match of the pattern
re.compile()	Compiles a pattern into a reusable pattern object

1. re.match() – Match at the Beginning

import re

result = re.match(r'Hello', 'Hello World')
print(result.group())

Output:

Hello

If the pattern is not at the beginning, match() returns None.
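Because of that, calling .group() directly on a failed match raises AttributeError. A minimal sketch of guarding against this, using a hypothetical helper named leading_match (not part of the re module):

```python
import re

def leading_match(pattern, text):
    """Return the text matched at the start of `text`, or None if there is no match."""
    result = re.match(pattern, text)
    # re.match() returns None on failure, so check before calling .group()
    if result is None:
        return None
    return result.group()

print(leading_match(r'Hello', 'Hello World'))  # Hello
print(leading_match(r'World', 'Hello World'))  # None
```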

2. re.search() – Search for a Pattern Anywhere

import re

result = re.search(r'World', 'Hello World')
print(result.group())

Output:

World
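Like match(), search() returns a match object on success and None on failure. The sketch below shows the position information a match object carries and contrasts the two functions on the same input:

```python
import re

text = 'Hello World'
found = re.search(r'World', text)

if found:
    # span() gives the (start, end) indexes of the match in the original string
    print(found.group(), found.span())  # World (6, 11)

# The same pattern fails with re.match() because 'World' is not at index 0
anchored = re.match(r'World', text)
print(anchored)  # None
```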

3. re.findall() – Find All Matches

import re

text = 'Email: test1@gmail.com and test2@yahoo.com'
# \S+@\S+ matches runs of non-whitespace characters on both sides of an "@"
emails = re.findall(r'\S+@\S+', text)
print(emails)

Output:

['test1@gmail.com', 'test2@yahoo.com']
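Two related behaviours are worth seeing side by side. When the pattern contains a capture group, findall() returns the group contents rather than the whole match, and finditer() yields match objects when positions are needed as well. A short sketch reusing the text from above:

```python
import re

text = 'Email: test1@gmail.com and test2@yahoo.com'

# With one capture group, findall() returns the group contents, not the whole match
domains = re.findall(r'\S+@(\S+)', text)
print(domains)  # ['gmail.com', 'yahoo.com']

# finditer() yields match objects, which also expose the match position
starts = []
for m in re.finditer(r'\S+@\S+', text):
    starts.append(m.start())
    print(m.group(), 'starts at index', m.start())
```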

4. re.sub() – Replace Pattern in String

import re

text = "Hello 123, this is 456"
# Replace every run of digits with '#'
new_text = re.sub(r'\d+', '#', text)
print(new_text)

Output:

Hello #, this is #
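re.sub() also accepts a count limit, backreferences to capture groups in the replacement string, and a function in place of the replacement string. A brief sketch of all three; the function name double is only illustrative:

```python
import re

text = "Hello 123, this is 456"

# count=1 limits the replacement to the first match
first_only = re.sub(r'\d+', '#', text, count=1)
print(first_only)  # Hello #, this is 456

# \1 in the replacement refers to the first capture group
bracketed = re.sub(r'(\d+)', r'[\1]', text)
print(bracketed)  # Hello [123], this is [456]

# A function receives each match object and returns the replacement text
def double(match):
    return str(int(match.group()) * 2)

doubled = re.sub(r'\d+', double, text)
print(doubled)  # Hello 246, this is 912
```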

5. re.split() – Split String Using Pattern

import re

text = "one,two;three four"
# Split on any single semicolon, comma, or space
parts = re.split(r'[;, ]', text)
print(parts)

Output:

['one', 'two', 'three', 'four']
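The pattern above splits at every single delimiter, so input with consecutive delimiters produces empty strings. The sketch below, on a made-up input string, shows that effect, how a + quantifier avoids it, and how maxsplit limits the number of splits:

```python
import re

text = "one, two;  three four"

# Splitting at each single delimiter leaves empty strings between neighbours
naive = re.split(r'[;, ]', text)
print(naive)  # ['one', '', 'two', '', '', 'three', 'four']

# Adding + treats a run of delimiters as one separator
clean = re.split(r'[;, ]+', text)
print(clean)  # ['one', 'two', 'three', 'four']

# maxsplit=1 stops after the first split and keeps the rest intact
head_tail = re.split(r'[;, ]+', text, maxsplit=1)
print(head_tail)  # ['one', 'two;  three four']
```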

6. re.compile() – Compile and Reuse Pattern

import re

# Compile once; the pattern object can then be reused for many calls
pattern = re.compile(r'\d+')
matches = pattern.findall('Item1 = 10, Item2 = 20')
print(matches)

Output:

['1', '10', '2', '20']
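Note that the output includes the digits inside "Item1" and "Item2". If only the item number and its value are wanted as pairs, one option is a compiled pattern with capture groups and a flag. This is a sketch; the name item_pattern and the exact pattern are assumptions chosen for this input:

```python
import re

# Compile once with a flag; IGNORECASE lets 'item' match 'Item'
item_pattern = re.compile(r'item(\d+)\s*=\s*(\d+)', re.IGNORECASE)

line = 'Item1 = 10, Item2 = 20'

# With two capture groups, findall() returns a list of tuples
pairs = item_pattern.findall(line)
print(pairs)  # [('1', '10'), ('2', '20')]

# Pattern objects offer the same methods as the module-level functions
first = item_pattern.search(line)
print(first.group(2))  # 10
```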

Python Regular Expression Patterns (RegEx Syntax)

Pattern	Description
.	Any character except newline
^	Beginning of string (of each line with re.MULTILINE)
$	End of string, or just before a trailing newline
*	0 or more repetitions
+	1 or more repetitions
?	0 or 1 repetition
{n}	Exactly n repetitions
{n,}	n or more repetitions
{n,m}	Between n and m repetitions
[]	Any one of the characters listed in the brackets
\d	Digit (0-9)
\D	Non-digit
\w	Word character (letter, digit, or underscore)
\W	Non-word character
\s	Whitespace
\S	Non-whitespace
|	OR operator
()	Capture group
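A few of the patterns in the table can be tried out directly. The inputs below are invented for illustration:

```python
import re

# ? makes the preceding character optional
spellings = re.findall(r'colou?r', 'color colour')
print(spellings)  # ['color', 'colour']

# | matches either alternative
animals = re.findall(r'cat|dog', 'cat, dog, bird')
print(animals)  # ['cat', 'dog']

# {n} with capture groups: groups() returns the captured parts as a tuple
year_month = re.search(r'(\d{4})-(\d{2})', 'Date: 2024-05').groups()
print(year_month)  # ('2024', '05')

# \w includes the underscore but not the hyphen
words = re.findall(r'\w+', 'snake_case kebab-case')
print(words)  # ['snake_case', 'kebab', 'case']

# [] matches any single character from the set
vowels = re.findall(r'[aeiou]', 'regex')
print(vowels)  # ['e', 'e']
```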

Real-World Example: Extract Phone Numbers

import re

text = "Call me at 9876543210 or 1234567890"
# \d{10} matches exactly ten consecutive digits
phones = re.findall(r'\d{10}', text)
print("Phone numbers:", phones)

Output:

Phone numbers: ['9876543210', '1234567890']
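One caveat: \d{10} also matches the first ten digits of a longer number. A possible refinement, assuming a phone number is exactly ten digits with no digit on either side, uses lookarounds. The sample text is invented:

```python
import re

text = "Call 9876543210, order no. 123456789012"

# Plain \d{10} also picks up the first ten digits of the 12-digit order number
loose = re.findall(r'\d{10}', text)
print(loose)  # ['9876543210', '1234567890']

# (?<!\d) and (?!\d) require that no digit precedes or follows the match
strict = re.findall(r'(?<!\d)\d{10}(?!\d)', text)
print(strict)  # ['9876543210']
```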

Real-World Example: Validate Email Address

import re

email = "user@example.com"
# Simplified check: word characters, "@", word characters, a dot, word characters
is_valid = re.match(r'^\w+@\w+\.\w+$', email)
print("Valid Email?", bool(is_valid))

Output:

Valid Email? True
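This pattern is intentionally simple. It rejects addresses with dots or hyphens before the "@", and because $ also matches just before a trailing newline, it accepts "user@example.com\n". The sketch below shows a somewhat more permissive check using re.fullmatch(); the helper name looks_like_email and the pattern are illustrative assumptions, not a complete validator for every valid address:

```python
import re

# Local part: word chars, dots, plus, hyphen; domain: labels separated by dots
EMAIL_PATTERN = re.compile(r'[\w.+-]+@[\w-]+(\.[\w-]+)+')

def looks_like_email(value):
    """Return True if the whole string has a plausible email shape."""
    # fullmatch() requires the entire string to match, trailing newline included
    return EMAIL_PATTERN.fullmatch(value) is not None

print(looks_like_email("user@example.com"))               # True
print(looks_like_email("first.last@mail.example.co.uk"))  # True
print(looks_like_email("user@example"))                   # False
print(looks_like_email("user@example.com\n"))             # False
```

For real applications, sending a confirmation message is usually a more reliable test than any pattern.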