Data Visualization in Python using Matplotlib and Seaborn


What is Data Visualization?

Data visualization is the graphical representation of data and information using charts, graphs, and plots. Two of the most widely used Python libraries for data visualization are:

  • Matplotlib: Low-level, highly customizable plotting library.
  • Seaborn: Built on top of Matplotlib, offers a higher-level interface and attractive statistical plots.

Installing Libraries

pip install matplotlib seaborn


1. Introduction to Matplotlib

Basic Line Plot

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 35]

plt.plot(x, y)
plt.title("Line Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.grid(True)
plt.show()

Bar Chart

categories = ['A', 'B', 'C']
values = [10, 30, 20]

plt.bar(categories, values)
plt.title("Bar Chart")
plt.xlabel("Category")
plt.ylabel("Value")
plt.show()

Pie Chart

labels = ['Apple', 'Banana', 'Cherry']
sizes = [30, 50, 20]

plt.pie(sizes, labels=labels, autopct='%1.1f%%', startangle=140)
plt.title("Fruit Distribution")
plt.axis('equal')
plt.show()
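
The autopct='%1.1f%%' argument above formats each wedge's share of the total as a percentage with one decimal place. The sketch below reproduces those labels by hand so the arithmetic is visible. The helper name pie_labels is hypothetical and exists only for this illustration; it is not part of Matplotlib.

```python
# Reproduce the percentage labels that autopct='%1.1f%%' places on each wedge.
labels = ['Apple', 'Banana', 'Cherry']
sizes = [30, 50, 20]

def pie_labels(sizes, fmt='%1.1f%%'):
    """Return each value's share of the total, formatted with fmt (hypothetical helper)."""
    total = sum(sizes)
    return [fmt % (100 * size / total) for size in sizes]

percent_labels = pie_labels(sizes)
for label, text in zip(labels, percent_labels):
    print(f"{label}: {text}")
```

Because the shares are computed relative to the sum, the sizes do not need to add up to 100.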

Histogram

import numpy as np

# 1,000 samples drawn from a standard normal distribution
data = np.random.randn(1000)

plt.hist(data, bins=30, color='skyblue', edgecolor='black')
plt.title("Histogram of Random Data")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()
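
To see what bins=30 does to the data, the binning step can be run on its own with NumPy, without drawing anything. This is a minimal sketch: the seed value is an arbitrary assumption added here only to make the numbers reproducible.

```python
import numpy as np

# Seeded generator so repeated runs give the same samples (seed chosen arbitrarily).
rng = np.random.default_rng(0)
data = rng.standard_normal(1000)

# np.histogram splits the range [min, max] into equal-width bins and counts samples.
counts, edges = np.histogram(data, bins=30)

print(len(counts))   # number of bins
print(len(edges))    # bin edges: one more than the number of bins
print(counts.sum())  # every sample falls into exactly one bin
```

The counts array corresponds to the bar heights of the histogram, and edges holds the left and right boundary of every bar.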


2. Introduction to Seaborn

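import matplotlib.pyplot as plt  # needed for the titles and plt.show() calls below
import pandas as pd
import seaborn as sns

# Sample DataFrame used by all Seaborn examples in this section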
data = pd.DataFrame({
    'Age': [22, 25, 30, 35, 40, 45],
    'Salary': [25000, 32000, 48000, 58000, 60000, 75000],
    'Department': ['HR', 'HR', 'IT', 'IT', 'Finance', 'Finance']
})

Seaborn Line Plot

sns.lineplot(x='Age', y='Salary', data=data)
plt.title("Salary vs Age")
plt.show()

Seaborn Bar Plot

# barplot aggregates the rows of each category (mean by default) and draws error bars
sns.barplot(x='Department', y='Salary', data=data)
plt.title("Average Salary by Department")
plt.show()
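
The bar heights in this plot are per-department averages, because barplot uses the mean as its default estimator. The sketch below computes the same averages in plain Python from the sample data, so the values can be checked against the chart. The variable names are chosen for this illustration only.

```python
from collections import defaultdict

# Same sample values as the DataFrame above.
salaries = [25000, 32000, 48000, 58000, 60000, 75000]
departments = ['HR', 'HR', 'IT', 'IT', 'Finance', 'Finance']

# Collect the salaries belonging to each department.
grouped = defaultdict(list)
for dept, salary in zip(departments, salaries):
    grouped[dept].append(salary)

# The mean of each group is the height of the corresponding bar.
mean_salary = {dept: sum(vals) / len(vals) for dept, vals in grouped.items()}
print(mean_salary)
```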

Seaborn Histogram / Distribution Plot

sns.histplot(data['Salary'], bins=10, kde=True)
plt.title("Salary Distribution")
plt.show()

Box Plot

sns.boxplot(x='Department', y='Salary', data=data)
plt.title("Salary Distribution by Department")
plt.show()
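
A box plot summarizes each group by its quartiles: the box spans the first to the third quartile and the line inside marks the median. The following sketch computes those three numbers per department with the standard library. It assumes quartiles are taken with linear interpolation between data points (method='inclusive'), which matches NumPy's default percentile method; the exact convention used by a given plotting version may differ.

```python
from statistics import quantiles

# Same sample salaries as the DataFrame above, grouped by department.
salaries_by_dept = {
    'HR': [25000, 32000],
    'IT': [48000, 58000],
    'Finance': [60000, 75000],
}

box_stats = {}
for dept, values in salaries_by_dept.items():
    # n=4 returns the three cut points: first quartile, median, third quartile.
    q1, median, q3 = quantiles(values, n=4, method='inclusive')
    box_stats[dept] = (q1, median, q3)
    print(dept, q1, median, q3)
```

With only two values per department the boxes are narrow; a larger sample would give a more informative plot.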

Heatmap (Correlation Matrix)

# Pairwise Pearson correlation between the numeric columns
correlation = data[['Age', 'Salary']].corr()

sns.heatmap(correlation, annot=True, cmap='coolwarm')
plt.title("Correlation Heatmap")
plt.show()
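
Each cell of the heatmap holds a Pearson correlation coefficient, which is what DataFrame.corr() computes by default. As a rough illustration of where the off-diagonal number comes from, the sketch below calculates it by hand for the Age and Salary columns. The function name pearson is a hypothetical helper, not a pandas or Seaborn API.

```python
from math import sqrt

# Same sample values as the DataFrame above.
ages = [22, 25, 30, 35, 40, 45]
salaries = [25000, 32000, 48000, 58000, 60000, 75000]

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

r = pearson(ages, salaries)
print(round(r, 3))
```

The diagonal cells of the heatmap are always 1, since every column is perfectly correlated with itself.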


Comparing Matplotlib vs Seaborn

Feature         Matplotlib                   Seaborn
--------------  ---------------------------  ------------------------------------
Level           Low-level                    High-level
Customization   Full control                 Limited but beautiful by default
Ease of Use     Steeper learning curve       Easier for beginners
Use Case        Detailed plots, custom use   Quick statistical data visualization
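
The two libraries are not mutually exclusive. Because Seaborn draws onto Matplotlib axes, a plot can be created with Seaborn and then fine-tuned with Matplotlib calls. The sketch below shows one way this might look, using the sample data from the Seaborn section. The output filename is a hypothetical choice; in an interactive session plt.show() could be used instead.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

data = pd.DataFrame({
    'Age': [22, 25, 30, 35, 40, 45],
    'Salary': [25000, 32000, 48000, 58000, 60000, 75000],
    'Department': ['HR', 'HR', 'IT', 'IT', 'Finance', 'Finance']
})

# Create the figure and axes with Matplotlib, then let Seaborn draw on them.
fig, ax = plt.subplots(figsize=(6, 4))
sns.barplot(x='Department', y='Salary', data=data, ax=ax)

# Fine-tune the result with ordinary Matplotlib Axes methods.
ax.set_title("Average Salary by Department")
ax.set_ylabel("Salary")
ax.set_ylim(0, 80000)
ax.grid(axis='y', linestyle='--', alpha=0.5)

fig.tight_layout()
fig.savefig("salary_by_department.png")  # hypothetical filename
```

Passing ax=ax keeps control of the figure layout in Matplotlib while Seaborn handles the statistical aggregation and default styling.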