What Is Normalization in SQL? – 1NF, 2NF, and 3NF Explained
Introduction
In SQL, text data types are used to store alphanumeric values like names, addresses, emails, and descriptions. Choosing the correct text type — CHAR, VARCHAR, or TEXT — is important for optimizing storage space, query speed, and database performance.
In this section, you'll learn the definitions, differences, and best use cases for each text data type.
1. CHAR (Fixed-Length String)
CHAR is used to store fixed-length strings. If the stored string is shorter than the defined length, SQL automatically pads it with spaces to match the specified size.
Features:
- Fixed length
- Fast and predictable performance
- Uses extra storage if the data is often shorter than the specified length
Syntax:
column_name CHAR(length);
length = number of characters (1 to 255 depending on the database system)
Example:
CREATE TABLE countries (
country_code CHAR(2),
country_name CHAR(50)
);
country_code like 'US', 'IN', 'UK' will always take 2 characters.
When to Use CHAR:
- Data with a constant size, such as country codes, gender ('M', 'F'), state abbreviations
- Fixed-format fields like credit card types ('VISA', 'MC')
- When exact storage size is known and consistent
2. VARCHAR (Variable-Length String)
VARCHAR stands for Variable Character. It stores variable-length strings, meaning only the actual characters are stored without unnecessary padding.
Features:
- Variable length
- More space-efficient than CHAR for varying-length text
- Slightly slower than CHAR when processing large volumes (because of extra calculations for string lengths)
Syntax:
column_name VARCHAR(length);
length = maximum number of characters allowed
Example:
CREATE TABLE employees (
first_name VARCHAR(50),
email VARCHAR(100)
);
Names and emails can vary in length, making VARCHAR ideal.
When to Use VARCHAR:
- Data with unpredictable or variable length
- Names, emails, addresses, and descriptions under 255-65535 characters
- Most general-purpose text fields
3. TEXT (Large Text Field)
TEXT is used to store large amounts of text like long descriptions, blog posts, comments, or articles.
Features:
- Meant for large text storage (up to 65,535 characters for standard TEXT in MySQL)
- Cannot have a default value (in some databases like MySQL)
- TEXT fields are stored outside the main table with a pointer reference
- Different variants exist (TINYTEXT, MEDIUMTEXT, LONGTEXT) for various sizes
Syntax:
column_name TEXT;
Example:
CREATE TABLE articles (
id INT,
title VARCHAR(255),
body TEXT
);
body will store the full article content, which can be very large.
When to Use TEXT:
- Long-form text fields (comments, articles, reviews, reports)
- Data that exceeds normal VARCHAR limits
- When exact storage requirements are unknown or potentially very large
Quick Comparison: CHAR vs VARCHAR vs TEXT
Feature | CHAR | VARCHAR | TEXT |
---|---|---|---|
Storage | Fixed length | Variable length | Variable, large storage |
Max Size | Up to 255 chars | 65,535 bytes (typically) | 65,535+ chars (depends on type) |
Performance | Fast for fixed-size | Efficient for variable text | Slightly slower for queries |
Indexing | Full index support | Full index support | Limited in some DBs |
Best Use Case | Codes, fixed formats | Names, addresses, emails | Articles, long descriptions |
Important Tips
- Use CHAR only when all values will be exactly the same length
- VARCHAR is the best choice for most standard text fields
- Reserve TEXT for content that exceeds VARCHAR limits
- Consider VARCHAR(MAX) in SQL Server for large text that might need indexing
- Be aware that TEXT fields may have limitations on default values and full-text indexing
What Is Normalization?
Normalization is the process of organizing data in a database to reduce data redundancy and improve data integrity. It involves dividing large, complex tables into smaller, well-structured tables and defining relationships between them.
The goal:
Ensure that each piece of data is stored in one place and that the database is efficient and consistent.
Why Is Normalization Important?
- Reduces duplicate data
- Improves data consistency
- Makes updates and deletes more reliable
- Enhances query performance (when indexed properly)
- Simplifies future database changes
The Normal Forms: 1NF, 2NF, and 3NF
First Normal Form (1NF)
A table is in 1NF if:
- All columns contain atomic (indivisible) values
- Each column contains values of the same type
- Each row is unique
Violation Example:
Student_ID | Name | Courses |
---|---|---|
1 | Alice | Math, Science |
Courses column contains multiple values (not atomic)
Normalized (1NF):
Student_ID | Name | Course |
---|---|---|
1 | Alice | Math |
1 | Alice | Science |
Second Normal Form (2NF)
A table is in 2NF if:
- It is in 1NF
- Every non-key column is fully dependent on the entire primary key
- Applies mostly to tables with composite primary keys
Violation Example:
Order_ID | Product_ID | Product_Name |
---|---|---|
1 | 101 | Mouse |
Here, Product_Name depends only on Product_ID, not the full composite key (Order_ID, Product_ID).
Normalized (2NF):
Orders Table
Order_ID | Product_ID |
---|---|
1 | 101 |
Products Table
Product_ID | Product_Name |
---|---|
101 | Mouse |
Third Normal Form (3NF)
A table is in 3NF if:
- It is in 2NF
- All columns are only dependent on the primary key
- There are no transitive dependencies
Violation Example:
Student_ID | Name | Dept_ID | Dept_Name |
---|---|---|---|
1 | Alice | D01 | Science |
Here, Dept_Name is dependent on Dept_ID, which is not the primary key (Student_ID) — this is a transitive dependency.
Normalized (3NF):
Students Table
Student_ID | Name | Dept_ID |
---|---|---|
1 | Alice | D01 |
Departments Table
Dept_ID | Dept_Name |
---|---|
D01 | Science |
Summary of Normal Forms
Normal Form | Purpose | Rule |
---|---|---|
1NF | Atomicity | No repeating groups or arrays |
2NF | Full functional dependency | No partial dependency on composite key |
3NF | Remove transitive dependency | All columns depend only on primary key |