Introduction to Machine Learning

What is Machine Learning?

Machine Learning (ML) is a subset of Artificial Intelligence (AI) that enables computers to learn from data and make predictions or decisions without being explicitly programmed.

It allows software to improve its performance over time using data and algorithms.

Why Learn Machine Learning?

Automate decision-making.
Predict trends and outcomes.
Extract patterns and insights from large data sets.
Power modern applications like recommendation systems, fraud detection, and self-driving cars.

1. Types of Machine Learning

Supervised Learning

The model learns from labeled data: each training example pairs an input with its known output (input ➜ output).

Examples:

Email spam detection
Predicting house prices
Sentiment analysis

Algorithms:

Linear Regression
Logistic Regression
Decision Trees
Support Vector Machines (SVM)
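
As a minimal sketch of supervised learning, the snippet below fits Linear Regression to the house-price use case. The data is synthetic and purely illustrative: the names size_m2 and price, and the assumed relationship (about 1000 per square metre plus a base of 50000, with noise), are assumptions made for this example and do not come from a real dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical, synthetic data: house size (feature) -> price (label).
rng = np.random.default_rng(0)
size_m2 = rng.uniform(50, 200, size=100).reshape(-1, 1)
price = 1000 * size_m2.ravel() + 50000 + rng.normal(0, 5000, size=100)

# Fit a straight line to the labeled examples.
reg = LinearRegression()
reg.fit(size_m2, price)

# The learned slope should land close to the 1000 used to generate the data.
predicted_price = reg.predict([[120]])[0]
print("Learned slope:", reg.coef_[0])
print("Learned intercept:", reg.intercept_)
print("Predicted price for 120 m2:", predicted_price)
```

Because the labels were generated from a known rule, comparing the learned slope with the generating value gives a quick sanity check that the model recovered the pattern.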

Unsupervised Learning

The model learns from unlabeled data and finds hidden patterns or structure without being given target outputs.

Examples:

Customer segmentation
Market basket analysis
Anomaly detection

Algorithms:

K-Means Clustering
Hierarchical Clustering
PCA (Dimensionality Reduction)
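
A short sketch of unsupervised learning with K-Means Clustering follows. It assumes synthetic two-dimensional points produced by scikit-learn's make_blobs helper. The choice of three clusters is an assumption that matches how the data was generated; with real data the number of clusters usually has to be chosen or estimated.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic, unlabeled points; the true group labels are discarded on purpose.
points, _ = make_blobs(n_samples=150, centers=3, cluster_std=0.8, random_state=0)

# Group the points into three clusters without using any labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(points)

print("Points per cluster:", np.bincount(cluster_ids))
print("Cluster centers:")
print(kmeans.cluster_centers_)
```

The cluster numbers themselves are arbitrary; only the grouping carries meaning.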

Semi-Supervised Learning

Uses a small amount of labeled data together with a larger pool of unlabeled data. It is often applied where labeling is expensive or labeled data is limited.
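
One way to experiment with this idea is scikit-learn's LabelSpreading, sketched below on the Iris dataset. Hiding roughly 80% of the labels is an assumption made only to simulate scarce labels; scikit-learn marks unlabeled samples with -1. This is one semi-supervised technique among several, not the only approach.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelSpreading

X, y = load_iris(return_X_y=True)

# Simulate scarce labels: hide about 80% of them (-1 means "unlabeled").
rng = np.random.default_rng(0)
unlabeled = rng.random(len(y)) < 0.8
y_partial = y.copy()
y_partial[unlabeled] = -1

# Spread the few known labels to nearby unlabeled points.
spreader = LabelSpreading(kernel="knn", n_neighbors=7)
spreader.fit(X, y_partial)

# Compare the inferred labels with the true labels that were hidden.
inferred = spreader.transduction_[unlabeled]
hidden_accuracy = (inferred == y[unlabeled]).mean()
print("Labeled samples used:", int((~unlabeled).sum()))
print("Accuracy on hidden labels:", hidden_accuracy)
```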

Reinforcement Learning

An agent learns to make decisions by interacting with an environment and receiving rewards or penalties for its actions.

Examples:

Game AI
Robotics
Self-driving cars
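
The reward loop can be illustrated with a very small sketch: an epsilon-greedy agent on a three-armed bandit. The environment here (true_reward_prob) is hypothetical and far simpler than games, robotics, or driving, and epsilon-greedy is just one simple strategy chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical environment: each action pays a reward of 1 with a hidden probability.
true_reward_prob = np.array([0.2, 0.5, 0.8])
n_actions = len(true_reward_prob)

value_estimate = np.zeros(n_actions)        # the agent's running reward estimates
action_count = np.zeros(n_actions, dtype=int)
epsilon = 0.1                               # fraction of steps spent exploring

for step in range(5000):
    # Explore a random action occasionally, otherwise exploit the best estimate.
    if rng.random() < epsilon:
        action = int(rng.integers(n_actions))
    else:
        action = int(np.argmax(value_estimate))

    # The environment responds with a reward.
    reward = float(rng.random() < true_reward_prob[action])

    # Update the running average reward for the chosen action.
    action_count[action] += 1
    value_estimate[action] += (reward - value_estimate[action]) / action_count[action]

best_action = int(np.argmax(value_estimate))
print("Estimated values:", value_estimate)
print("Action the agent prefers:", best_action)
```

The agent is never told which action is best; it only sees rewards, and its estimates drift toward the hidden probabilities as it gathers experience.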

2. Key Machine Learning Terminology

Term	Definition
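Model	A function, learned from data by an ML algorithm, that maps inputs to predictions
Training	The process of fitting a model's parameters to data
Features	Input variables (typically the columns of a dataset)
Labels	Target variable (the output the model learns to predict)
Overfitting	When a model fits noise in the training data and performs poorly on new data
Underfitting	When a model is too simple to capture the underlying patterns, so it performs poorly even on training data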
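
Overfitting and underfitting are easier to see by comparing training and test accuracy. The sketch below assumes a synthetic, deliberately noisy dataset from make_classification and uses decision trees of different depths; the specific depths (1, 4, unlimited) are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with label noise (flip_y) so that memorizing it does not generalize.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=4, flip_y=0.2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

results = {}
for depth in (1, 4, None):  # None lets the tree grow until it fits every training point
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    results[depth] = (tree.score(X_train, y_train), tree.score(X_test, y_test))
    print(f"max_depth={depth}: train={results[depth][0]:.2f}, test={results[depth][1]:.2f}")
```

A large gap between training and test accuracy (the unlimited tree) points to overfitting, while low accuracy on both (the depth-1 tree) points to underfitting.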

3. Basic Machine Learning Workflow

Step 1: Data Collection
Step 2: Data Preprocessing (cleaning, normalization)
Step 3: Splitting Data (Train/Test)
Step 4: Model Selection (choose algorithm)
Step 5: Training the Model
Step 6: Evaluation (accuracy, precision, recall)
Step 7: Tuning Hyperparameters
Step 8: Deployment
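
The steps above can be sketched end to end with scikit-learn. Several choices here are assumptions for illustration: the bundled breast cancer dataset stands in for data collection, Logistic Regression is the selected model, only the C hyperparameter is tuned, and in-memory pickling stands in for deployment. The sketch splits the data before fitting the scaler so that normalization statistics come from the training set only.

```python
import pickle

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Step 1: data collection (a bundled dataset stands in for real collection).
X, y = load_breast_cancer(return_X_y=True)

# Step 3: hold out a test set before any statistics are computed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Steps 2 and 4: normalization and the chosen model, bundled in one pipeline.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Steps 5 and 7: train while searching over one hyperparameter with cross-validation.
search = GridSearchCV(pipeline, param_grid={"clf__C": [0.01, 0.1, 1, 10]}, cv=5)
search.fit(X_train, y_train)

# Step 6: evaluate on the held-out test set.
y_pred = search.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print("Best C:", search.best_params_["clf__C"])
print("Accuracy:", test_accuracy)
print("Precision:", precision_score(y_test, y_pred))
print("Recall:", recall_score(y_test, y_pred))

# Step 8: serialize the fitted pipeline, a stand-in for shipping it to production.
restored = pickle.loads(pickle.dumps(search.best_estimator_))
print("Restored model agrees:", (restored.predict(X_test) == y_pred).all())
```

Keeping preprocessing inside the pipeline means the same scaling is applied automatically at prediction time, which reduces the chance of a mismatch between training and deployment.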

4. Example: Supervised Learning with Scikit-learn (Iris Dataset)

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load dataset
data = load_iris()
X = data.data
y = data.target

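# Split into train/test (fixed random_state makes the split reproducible;
# stratify keeps the class proportions the same in both sets)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Train model
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)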

# Predict and evaluate
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
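
With only 150 samples, the accuracy from a single train/test split can vary noticeably from one split to another. As a hedged complement to the example above, the sketch below uses 5-fold cross-validation to average over several splits; the choice of five folds is an assumption, not a requirement.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Train and score the same model on five different train/test partitions.
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)

print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())
```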

5. Popular Machine Learning Libraries in Python

Library	Use
Scikit-learn	Core ML algorithms
Pandas	Data loading, cleaning, and preprocessing
NumPy	Numerical operations
Matplotlib/Seaborn	Data visualization
TensorFlow/Keras	Deep learning

6. Common Machine Learning Algorithms

Algorithm	Type	Use Case
Linear Regression	Supervised	Predict continuous values
Logistic Regression	Supervised	Classification (binary or multiclass)
Decision Trees	Supervised	Interpretable classification/regression
K-Means Clustering	Unsupervised	Group similar data
Naive Bayes	Supervised	Text classification (e.g., spam detection)
Random Forest	Supervised	Ensemble of decision trees, often more accurate than a single tree
K-Nearest Neighbors	Supervised	Classification/regression
PCA	Unsupervised	Dimensionality reduction
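
Several of the supervised algorithms in the table share the same scikit-learn interface, so they can be compared with a short loop. The sketch below assumes the bundled wine dataset and mostly default settings; the resulting ranking is specific to this dataset and should not be read as a general verdict on the algorithms.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)

# Scale-sensitive models get a StandardScaler in front of them.
candidates = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "K-Nearest Neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Score every candidate with the same 5-fold cross-validation.
mean_accuracy = {}
for name, estimator in candidates.items():
    scores = cross_val_score(estimator, X, y, cv=5)
    mean_accuracy[name] = scores.mean()
    print(f"{name}: {scores.mean():.3f}")
```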