Introduction to Machine Learning


What is Machine Learning?

Machine Learning (ML) is a subset of Artificial Intelligence (AI) that enables computers to learn from data and make predictions or decisions without being explicitly programmed.

Instead of following hand-written rules, an ML system improves its performance on a task as it is trained on more data.


Why Learn Machine Learning?

  • Automate decision-making.
  • Predict trends and outcomes.
  • Extract patterns and insights from large data sets.
  • Power modern applications like recommendation systems, fraud detection, and self-driving cars.

1. Types of Machine Learning

Supervised Learning

The model learns from labeled data: every training input is paired with the correct output (input ➜ output).

Examples:

  • Email spam detection
  • Predicting house prices
  • Sentiment analysis

Algorithms:

  • Linear Regression
  • Logistic Regression
  • Decision Trees
  • Support Vector Machines (SVM)
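
As a minimal sketch of supervised learning, the snippet below fits a one-feature linear regression with the closed-form least-squares formulas. The house sizes and prices are made-up illustrative numbers, and fit_line is a hypothetical helper written for this example, not a library function.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up labeled data: house size in square metres -> price in thousands.
sizes = [50, 60, 80, 100, 120]
prices = [150, 180, 240, 300, 360]  # exactly 3 * size, so the fit is easy to check

slope, intercept = fit_line(sizes, prices)
predicted = slope * 90 + intercept  # prediction for an unseen 90 m^2 house
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.1f}")
```

The labels (prices) are what make this supervised: the fit is judged by how close its outputs are to known correct answers.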

Unsupervised Learning

The model learns from unlabeled data and finds hidden patterns on its own.

Examples:

  • Customer segmentation
  • Market basket analysis
  • Anomaly detection

Algorithms:

  • K-Means Clustering
  • Hierarchical Clustering
  • Principal Component Analysis (PCA), used for dimensionality reduction
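
To make the idea of finding hidden patterns concrete, here is a small from-scratch sketch of K-Means on six made-up 2D points. The helper names (assign, kmeans) and the starting centroids are assumptions for this example; a real project would more likely use a library implementation.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def assign(points, centroids):
    """Assignment step: index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]

def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate the assignment and update steps."""
    centroids = list(centroids)
    for _ in range(iterations):
        labels = assign(points, centroids)
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = mean_point(members)
    return assign(points, centroids), centroids


# Made-up data: two visually obvious groups, no labels given.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
# Deliberately poor start: both initial centroids sit in the same group.
labels, centroids = kmeans(points, centroids=[points[0], points[1]])
print(labels)
print(centroids)
```

No labels are supplied anywhere; the grouping comes only from distances between points.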

Semi-Supervised Learning

Uses both labeled and unlabeled data. Often applied in situations where labeled data is limited.
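
One common semi-supervised idea is self-training: fit on the few labeled points, pseudo-label only the unlabeled points the model is confident about, and repeat. The sketch below is a simplified illustration on made-up one-dimensional data with a nearest-centroid rule; self_train, the margin value, and the class names are all assumptions for the example.

```python
def class_centroids(labeled):
    """Mean feature value per class, computed from (value, label) pairs."""
    sums, counts = {}, {}
    for x, label in labeled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def self_train(labeled, unlabeled, margin=2.0, max_rounds=10):
    """Pseudo-label confident points round by round (needs at least two classes)."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    for _ in range(max_rounds):
        centers = class_centroids(labeled)
        confident, uncertain = [], []
        for x in remaining:
            # Rank classes by the distance from x to each class centroid.
            ranked = sorted(centers, key=lambda label: abs(x - centers[label]))
            best, runner_up = ranked[0], ranked[1]
            gap = abs(x - centers[runner_up]) - abs(x - centers[best])
            if gap >= margin:
                confident.append((x, best))  # accept the pseudo-label
            else:
                uncertain.append(x)          # too close to call; leave unlabeled
        if not confident:
            break
        labeled.extend(confident)
        remaining = uncertain
    return labeled, remaining


# Made-up data: only two labeled points, five unlabeled ones.
labeled = [(1.0, "low"), (9.0, "high")]
unlabeled = [1.5, 2.0, 8.0, 8.5, 5.2]
expanded, still_unlabeled = self_train(labeled, unlabeled)
print(expanded)
print(still_unlabeled)
```

The point near the middle (5.2) stays unlabeled because neither class wins by a clear margin, which is the behaviour you usually want from pseudo-labeling.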

Reinforcement Learning

An agent learns to make decisions by interacting with the environment and receiving rewards.

Examples:

  • Game AI
  • Robotics
  • Self-driving cars
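
The agent, environment, and reward loop can be sketched with tabular Q-learning on a tiny made-up environment: a five-cell corridor where only reaching the last cell pays a reward. The environment, the constants, and the function names are assumptions chosen for illustration.

```python
import random

N_STATES = 5          # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate

def step(state, action):
    """Environment: move along the corridor; reward 1 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def choose_action(q, state, rng):
    """Epsilon-greedy: usually exploit the best-known action, sometimes explore."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[state].values())
    return rng.choice([a for a in ACTIONS if q[state][a] == best])  # random tie-break

def train(episodes=500, max_steps=100, seed=0):
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q, state, rng)
            next_state, reward = step(state, action)
            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            target = reward + GAMMA * max(q[next_state].values())
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if state == N_STATES - 1:
                break
    return q


q_table = train()
# Greedy policy for the non-terminal cells: the action with the highest Q-value.
policy = {s: max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)}
print(policy)
```

After training, the greedy policy should step right (+1) in every cell, even though the agent was never told where the goal is; it only saw rewards.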

2. Key Machine Learning Terminology

Term          Definition
Model         A function learned by the ML algorithm
Training      Feeding data to the algorithm so that it can learn
Features      Input variables (columns)
Labels        Target variable (output)
Overfitting   When a model memorizes training data and performs poorly on new data
Underfitting  When a model is too simple and misses patterns
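
The overfitting entry can be seen on a toy example. In the sketch below each row is a (feature, label) pair, all numbers are made up, and one training row carries a deliberately wrong label. A model that memorizes the training rows scores perfectly on them but stumbles on new data, while a simpler rule generalizes better. The model names are hypothetical.

```python
def accuracy(model, rows):
    """Fraction of (feature, label) rows the model predicts correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)

# Made-up data: one feature per row, label 1 when the feature is "large".
# The training row (2.5, 1) is deliberate label noise.
train = [(1, 0), (2, 0), (3, 0), (4, 0), (2.5, 1), (6, 1), (7, 1), (8, 1), (9, 1)]
test = [(2.4, 0), (2.6, 0), (3.6, 0), (6.5, 1), (8.5, 1)]

def memorizer(x):
    """1-nearest-neighbour: copies the label of the closest training row."""
    _, label = min(train, key=lambda row: abs(row[0] - x))
    return label

def threshold(x):
    """A much simpler model: a single cut-off on the feature."""
    return 1 if x > 5 else 0

for name, model in [("memorizer", memorizer), ("threshold", threshold)]:
    print(name,
          "train:", round(accuracy(model, train), 2),
          "test:", round(accuracy(model, test), 2))
```

The memorizer reaches 100% training accuracy because it has also memorized the noisy row, and that same row drags its test accuracy down to 60%. The threshold rule misses the noisy training row but classifies every test row correctly.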

3. Basic Machine Learning Workflow

Step 1: Data Collection
Step 2: Data Preprocessing (cleaning, normalization)
Step 3: Splitting Data (Train/Test)
Step 4: Model Selection (choose algorithm)
Step 5: Training the Model
Step 6: Evaluation (accuracy, precision, recall)
Step 7: Tuning Hyperparameters (using a validation set or cross-validation, not the test set)
Step 8: Deployment
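
Three of the steps above (preprocessing, splitting, evaluation) are small enough to sketch in plain Python. The helpers below are hypothetical stand-ins written for illustration; in practice a library such as Scikit-learn provides tested versions of each.

```python
import random

def min_max_scale(values):
    """Step 2 (preprocessing): rescale one feature column to the range [0, 1]."""
    low, high = min(values), max(values)
    span = (high - low) or 1  # avoid dividing by zero for a constant column
    return [(v - low) / span for v in values]

def split_rows(rows, test_fraction=0.25, seed=0):
    """Step 3 (splitting): shuffle the rows and hold out a fraction for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, round(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]  # (train, test)

def precision_recall(y_true, y_pred, positive=1):
    """Step 6 (evaluation): precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


scaled = min_max_scale([10, 20, 30, 50])
train_rows, test_rows = split_rows(range(8))
precision, recall = precision_recall([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(scaled, len(train_rows), len(test_rows), precision, recall)
```

One design point worth keeping in mind: scaling parameters (here the minimum and maximum) should be computed from the training rows only and then reused on the test rows, so that no information from the test set leaks into training.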

4. Example: Supervised Learning with Scikit-learn (Iris Dataset)

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load dataset
data = load_iris()
X = data.data
y = data.target

# Split into train/test (a fixed random_state makes the split reproducible)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model (random_state fixes how the tree breaks ties between equally good splits)
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
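
The example stops at accuracy. As a possible extension covering the tuning step and the other metrics named in the workflow, the sketch below searches over the tree's max_depth with 5-fold cross-validation and then reports macro-averaged precision and recall on the held-out test set. The depth grid and random_state=42 are arbitrary choices made for this illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
# stratify=y keeps the three classes balanced across the train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Try several tree depths with 5-fold cross-validation on the training set only.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid={"max_depth": [1, 2, 3, 4, 5]},
    cv=5,
)
search.fit(X_train, y_train)

# predict() uses the best tree found, refit on the whole training set.
y_pred = search.predict(X_test)
best_depth = search.best_params_["max_depth"]
precision = precision_score(y_test, y_pred, average="macro")
recall = recall_score(y_test, y_pred, average="macro")

print("Best max_depth:", best_depth)
print("Macro precision:", precision)
print("Macro recall:", recall)
```

The test set is touched only once, after the depth has been chosen, so the reported scores are not inflated by the tuning itself.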

5. Popular Machine Learning Libraries in Python

Library             Use
Scikit-learn        Core ML algorithms
Pandas              Data preprocessing
NumPy               Numerical operations
Matplotlib/Seaborn  Data visualization
TensorFlow/Keras    Deep learning

6. Common Machine Learning Algorithms

Algorithm            Type          Use Case
Linear Regression    Supervised    Predict continuous values
Logistic Regression  Supervised    Binary classification
Decision Trees       Supervised    Easy-to-interpret models
K-Means Clustering   Unsupervised  Group similar data
Naive Bayes          Supervised    Spam detection
Random Forest        Supervised    Ensemble method for better accuracy
K-Nearest Neighbors  Supervised    Classification/regression
PCA                  Unsupervised  Dimensionality reduction
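
To show how one of the algorithms listed above works internally, here is a from-scratch sketch of a multinomial Naive Bayes spam filter with add-one (Laplace) smoothing. The four training messages are made up, and the function names are hypothetical; Scikit-learn ships production-ready Naive Bayes classifiers.

```python
import math
from collections import Counter

def train_naive_bayes(messages):
    """messages: list of (text, label). Returns word counts, class counts, vocabulary."""
    word_counts = {}          # label -> Counter of words seen in that class
    class_counts = Counter()  # label -> number of messages in that class
    for text, label in messages:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocabulary

def predict(text, word_counts, class_counts, vocabulary):
    """Pick the class with the highest log-probability (add-one smoothing)."""
    total_messages = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total_messages)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up training messages.
training = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch tomorrow with the team", "ham"),
]
model = train_naive_bayes(training)
first = predict("claim your free money", *model)
second = predict("team meeting tomorrow", *model)
print(first, second)
```

Working in log-probabilities avoids multiplying many small numbers together, and the smoothing keeps a word that never appeared in one class from forcing that class's probability to zero.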