Python re Module (Regular Expressions)


What is the Python re Module?

The re module in Python provides support for regular expressions (RegEx). A regular expression is a sequence of characters that defines a search pattern. The module is used for string matching, pattern searching, data validation, and string manipulation.

Why Use Regular Expressions in Python?

  • To search and extract patterns (e.g., emails, phone numbers)
  • To validate inputs (e.g., password format, URL structure)
  • To split, replace, or clean text using patterns

Importing the re Module

import re

Commonly Used re Functions in Python

Function        Description
re.match()      Checks for a match only at the beginning of the string
re.search()     Searches the entire string and returns the first match
re.findall()    Returns a list of all non-overlapping matches
re.finditer()   Returns an iterator yielding match objects
re.sub()        Replaces matched patterns with a replacement string
re.split()      Splits a string at each match of the pattern
re.compile()    Compiles a pattern into a reusable regex object


1. re.match() – Match at the Beginning

import re

result = re.match(r'Hello', 'Hello World')
print(result.group())

Output:

Hello

If the pattern is not at the beginning, match() returns None.
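Because match() can return None, calling .group() directly on the result raises AttributeError when there is no match. The sketch below shows one way to guard against that; the helper name leading_match is hypothetical and only used for illustration.

```python
import re

def leading_match(pattern, text):
    """Return the text matched at the start of `text`, or None if there is no match."""
    result = re.match(pattern, text)
    # re.match() returns None when the pattern does not match at position 0
    if result is None:
        return None
    return result.group()

print(leading_match(r'Hello', 'Hello World'))  # Hello
print(leading_match(r'World', 'Hello World'))  # None
```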



2. re.search() – Search for a Pattern Anywhere

import re

result = re.search(r'World', 'Hello World')
print(result.group())

Output:

World
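To make the difference from match() concrete, the short sketch below runs both functions with the same pattern and string. It also uses span() on the match object to show where the match was found.

```python
import re

text = 'Hello World'

found = re.search(r'World', text)   # scans the whole string
missed = re.match(r'World', text)   # only tries position 0

print(found.span())  # (6, 11)
print(missed)        # None
```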


3. re.findall() – Find All Matches

import re

text = 'Email: test1@gmail.com and test2@yahoo.com'
# \S+@\S+ matches a run of non-whitespace characters on each side of an @ sign
emails = re.findall(r'\S+@\S+', text)
print(emails)

Output:

['test1@gmail.com', 'test2@yahoo.com']
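re.finditer() is listed in the function table above but has no example of its own. A minimal sketch, reusing the same text and pattern: it yields match objects instead of strings, so the position of each match is available as well.

```python
import re

text = 'Email: test1@gmail.com and test2@yahoo.com'

# Each item is a match object, so start() and end() are available
found = [(m.group(), m.start(), m.end()) for m in re.finditer(r'\S+@\S+', text)]
print(found)
# [('test1@gmail.com', 7, 22), ('test2@yahoo.com', 27, 42)]
```

finditer() is usually the better choice for large inputs, since it produces matches lazily rather than building the whole list up front.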


4. re.sub() – Replace Pattern in String

import re

text = "Hello 123, this is 456"
# Replace every run of digits with '#'
new_text = re.sub(r'\d+', '#', text)
print(new_text)

Output:

Hello #, this is #
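re.sub() also accepts a count argument that limits the number of replacements, and the replacement can be a function that receives each match object. The sketch below illustrates both options with the same input text.

```python
import re

text = "Hello 123, this is 456"

# Replace only the first run of digits
first_only = re.sub(r'\d+', '#', text, count=1)
print(first_only)  # Hello #, this is 456

# Use a function to compute each replacement from the match
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), text)
print(doubled)  # Hello 246, this is 912
```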


5. re.split() – Split String Using Pattern

import re

text = "one,two;three four"
# Split on any single comma, semicolon, or space
parts = re.split(r'[;, ]', text)
print(parts)

Output:

['one', 'two', 'three', 'four']
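The character class above matches exactly one delimiter at a time, so two delimiters in a row produce empty strings in the result. A small sketch of the problem and one common fix, adding + so that a run of delimiters counts as a single separator (the sample text is made up for illustration):

```python
import re

text = "one, two;three  four"

single = re.split(r'[;, ]', text)
print(single)  # ['one', '', 'two', 'three', '', 'four']

# '+' treats consecutive delimiters as one separator
merged = re.split(r'[;, ]+', text)
print(merged)  # ['one', 'two', 'three', 'four']
```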


6. re.compile() – Compile and Reuse Pattern

import re

# Compile once, then reuse the pattern object's methods
pattern = re.compile(r'\d+')
matches = pattern.findall('Item1 = 10, Item2 = 20')
print(matches)

Output:

['1', '10', '2', '20']
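Note that the output above includes the digits inside the item names as well as the values. If only the values are wanted, one option is a capture group, since findall() returns just the group contents when the pattern has one group. The sketch below also passes a flag to re.compile(); the names value_pattern and name_pattern are illustrative.

```python
import re

line = 'Item1 = 10, Item2 = 20'

# With one capture group, findall() returns only the captured text
value_pattern = re.compile(r'=\s*(\d+)')
values = value_pattern.findall(line)
print(values)  # ['10', '20']

# Flags are passed as the second argument to re.compile()
name_pattern = re.compile(r'item\d+', re.IGNORECASE)
names = name_pattern.findall(line)
print(names)  # ['Item1', 'Item2']
```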

Python Regular Expression Patterns (RegEx Syntax)

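Pattern   Description
.         Any character except a newline (any character at all with re.DOTALL)
^         Beginning of string (also the start of each line with re.MULTILINE)
$         End of string, or just before a trailing newline (also the end of each line with re.MULTILINE)
*         0 or more repetitions
+         1 or more repetitions
?         0 or 1 repetition
{n}       Exactly n repetitions
{n,}      n or more repetitions
{n,m}     Between n and m repetitions, inclusive
[]        Any one of the characters listed in the brackets
\d        Digit (0-9; in str patterns also other Unicode decimal digits unless re.ASCII is set)
\D        Non-digit
\w        Word character: letter, digit, or underscore
\W        Non-word character
\s        Whitespace
\S        Non-whitespace
|         OR operator (alternation)
()        Capture group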
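A few of these pieces are easier to see in combination. The sketch below uses made-up sample strings to show capture groups with {n} quantifiers, alternation, the optional ? quantifier, and the fact that \w also matches an underscore.

```python
import re

# Capture groups with exact repetition counts
date = re.search(r'(\d{4})-(\d{2})-(\d{2})', 'Released on 2024-03-15.')
print(date.groups())  # ('2024', '03', '15')

# Alternation with |
pets = re.findall(r'cat|dog', 'cat, dog, bird')
print(pets)  # ['cat', 'dog']

# ? makes the preceding character optional
spellings = re.findall(r'colou?r', 'color colour')
print(spellings)  # ['color', 'colour']

# \w matches letters, digits, and the underscore
is_word = bool(re.search(r'^\w+$', 'snake_case'))
print(is_word)  # True
```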


Real-World Example: Extract Phone Numbers

import re

text = "Call me at 9876543210 or 1234567890"
# \d{10} matches exactly ten consecutive digits
phones = re.findall(r'\d{10}', text)
print("Phone numbers:", phones)

Output:

Phone numbers: ['9876543210', '1234567890']
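One caveat: \d{10} also matches the first ten digits of any longer number. A minimal sketch of a stricter variant, assuming phone numbers appear as standalone ten-digit words; the order ID string is invented for illustration.

```python
import re

loose = re.compile(r'\d{10}')
strict = re.compile(r'\b\d{10}\b')  # \b requires a word boundary on both sides

order_text = "Order ID 123456789012345"
print(loose.findall(order_text))   # ['1234567890']
print(strict.findall(order_text))  # []

text = "Call me at 9876543210 or 1234567890"
print(strict.findall(text))  # ['9876543210', '1234567890']
```

Real phone numbers often contain spaces, dashes, or country codes, so a production pattern would likely need to be broader than this.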

Real-World Example: Validate Email Address

import re

email = "user@example.com"
# Simplified check: word characters, @, word characters, a dot, word characters
is_valid = re.match(r'^\w+@\w+\.\w+$', email)
print("Valid Email?", bool(is_valid))

Output:

Valid Email? True
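The pattern above is deliberately simple. It rejects addresses with dots in the local part or with subdomains, and because $ also matches just before a trailing newline, it accepts "user@example.com\n". The sketch below is a slightly more permissive variant using re.fullmatch(); it is still a simplification, not a full implementation of the email address standard, and the helper name is_valid_email is hypothetical.

```python
import re

# Simplified pattern: local part, @, then one or more dot-separated domain labels
EMAIL_PATTERN = re.compile(r'[\w.+-]+@[\w-]+(\.[\w-]+)+')

def is_valid_email(address):
    """Return True if the whole string matches the simplified email pattern."""
    # fullmatch() requires the entire string to match, so no ^ or $ is needed
    return EMAIL_PATTERN.fullmatch(address) is not None

print(is_valid_email("user@example.com"))               # True
print(is_valid_email("first.last@mail.example.co.uk"))  # True
print(is_valid_email("user@example"))                   # False
print(is_valid_email("user@example.com\n"))             # False
```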