Introduction to Machine Learning
What is Machine Learning?
Machine Learning (ML) is a subset of Artificial Intelligence (AI) in which computers learn patterns from data and use them to make predictions or decisions, rather than following rules that were explicitly programmed.
An ML system's performance on a task generally improves as it is trained on more, and more representative, data.
Why Learn Machine Learning?
- Automate decision-making.
- Predict trends and outcomes.
- Extract patterns and insights from large data sets.
- Power modern applications like recommendation systems, fraud detection, and self-driving cars.
1. Types of Machine Learning
Supervised Learning
Learn from labeled data (input ➜ output).
Examples:
- Email spam detection
- Predicting house prices
- Sentiment analysis
Algorithms:
- Linear Regression
- Logistic Regression
- Decision Trees
- Support Vector Machines (SVM)
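As a minimal sketch of supervised learning, the snippet below fits Linear Regression to a house-price style problem. The data is synthetic and the numbers (a base price of 50,000 plus 150 per unit of area) are made up for illustration; the point is only that the model recovers the input ➜ output relationship from labeled examples.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, made-up data: price = 50_000 + 150 * area + noise
rng = np.random.default_rng(0)
area = rng.uniform(50, 200, size=100)                           # feature (input)
price = 50_000 + 150 * area + rng.normal(0, 1_000, size=100)    # label (output)

X = area.reshape(-1, 1)    # scikit-learn expects a 2D feature matrix
model = LinearRegression().fit(X, price)

slope = model.coef_[0]
intercept = model.intercept_
predicted = model.predict([[120.0]])[0]   # price estimate for an unseen area
print(f"slope={slope:.1f}, intercept={intercept:.0f}, price(120)={predicted:.0f}")
```

The learned slope and intercept should land close to the values used to generate the data, which is what "learning from labeled data" means in this setting.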
Unsupervised Learning
Learn from unlabeled data to find hidden patterns or structure.
Examples:
- Customer segmentation
- Market basket analysis
- Anomaly detection
Algorithms:
- K-Means Clustering
- Hierarchical Clustering
- PCA (Dimensionality Reduction)
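The sketch below illustrates two of the algorithms listed above on the Iris measurements, with the labels deliberately ignored. The choice of 3 clusters and 2 components is an assumption made for this example, not something the data dictates.

```python
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

X = load_iris().data   # features only; the labels are not used

# K-Means: group the samples into 3 clusters based on feature similarity
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# PCA: project the 4 original features onto 2 principal components
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print("Cluster sizes:", sorted(int((cluster_ids == k).sum()) for k in range(3)))
print("Variance kept by 2 components:", pca.explained_variance_ratio_.sum())
```

The cluster IDs are arbitrary integers, not class names: the algorithm groups similar samples but has no notion of what each group means.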
Semi-Supervised Learning
Uses a small amount of labeled data together with a larger amount of unlabeled data. It is often applied when labeling is expensive or labeled data is limited.
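One way to try this out is scikit-learn's `LabelSpreading`, which treats samples labeled `-1` as unlabeled. The sketch below artificially hides about 80% of the Iris labels to simulate scarce labels; the hiding rate and the `knn` kernel settings are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelSpreading

X, y = load_iris(return_X_y=True)

# Hide roughly 80% of the labels; scikit-learn marks unlabeled samples with -1
rng = np.random.default_rng(0)
hidden = rng.random(len(y)) < 0.8
y_partial = y.copy()
y_partial[hidden] = -1

# Spread the few known labels to nearby unlabeled samples
model = LabelSpreading(kernel="knn", n_neighbors=7)
model.fit(X, y_partial)

# transduction_ holds the label inferred for every training sample
inferred = model.transduction_[hidden]
accuracy_on_hidden = (inferred == y[hidden]).mean()
print("Accuracy on the hidden labels:", accuracy_on_hidden)
```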
Reinforcement Learning
An agent learns to make decisions by interacting with an environment and receiving rewards or penalties for its actions.
Examples:
- Game AI
- Robotics
- Self-driving cars
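The reward-driven loop can be sketched with a toy problem that is much simpler than the examples above: an epsilon-greedy agent choosing between three slot machines. The environment, its win probabilities, and the value of `epsilon` are all hypothetical; real applications use far richer environments and algorithms.

```python
import random

random.seed(0)

# Hypothetical environment: three slot machines with hidden win probabilities
true_win_prob = [0.2, 0.5, 0.8]

estimates = [0.0, 0.0, 0.0]   # agent's running estimate of each action's reward
counts = [0, 0, 0]            # how often each action has been tried
epsilon = 0.1                 # fraction of steps spent exploring at random

for step in range(5000):
    if random.random() < epsilon:
        action = random.randrange(3)                          # explore
    else:
        action = max(range(3), key=lambda a: estimates[a])    # exploit
    reward = 1.0 if random.random() < true_win_prob[action] else 0.0
    counts[action] += 1
    # Incremental mean: nudge the estimate toward the observed reward
    estimates[action] += (reward - estimates[action]) / counts[action]

best_action = max(range(3), key=lambda a: estimates[a])
print("Estimated rewards:", [round(e, 2) for e in estimates])
print("Best action found:", best_action)
```

The agent is never told which machine is best; it infers that from the rewards it receives, balancing exploration of unknown actions against exploitation of the best one found so far.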
2. Key Machine Learning Terminology
Term | Definition |
---|---|
Model | The function, learned from data, that maps inputs to predictions |
Training | Fitting the model's parameters to example data |
Features | Input variables (columns) |
Labels | Target variable (output) |
Overfitting | When a model memorizes training data and performs poorly on new data |
Underfitting | When a model is too simple and misses patterns |
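To make overfitting and underfitting concrete, the sketch below trains decision trees of different depths on noisy synthetic data and compares training and test accuracy. The dataset (`make_moons` with added noise) and the depths tried are assumptions chosen for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, noisy two-class data: X holds the features, y the labels
X, y = make_moons(n_samples=500, noise=0.35, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

scores = {}
for depth in (1, 4, None):   # None lets the tree grow until it fits every training point
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(X_train, y_train)   # training
    scores[depth] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"max_depth={depth}: train={scores[depth][0]:.2f}, test={scores[depth][1]:.2f}")
```

Typically the depth-1 tree scores comparatively low on both sets (underfitting), while the unrestricted tree fits the training set perfectly but does noticeably worse on the test set (overfitting).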
3. Basic Machine Learning Workflow
Step 1: Data Collection
Step 2: Data Preprocessing (cleaning, normalization)
Step 3: Splitting Data (Train/Test)
Step 4: Model Selection (choose algorithm)
Step 5: Training the Model
Step 6: Evaluation (accuracy, precision, recall)
Step 7: Tuning Hyperparameters (on a validation set or with cross-validation, not on the test set)
Step 8: Deployment
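The steps above can be sketched end to end with scikit-learn. This is one possible arrangement, not the only one: the built-in breast cancer dataset stands in for collected data, Logistic Regression is an arbitrary model choice, and the grid of `C` values is an assumption. Tuning is done with cross-validation on the training set so that the test set is used only once, for the final evaluation. Deployment is not shown.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Step 1: data collection (a built-in dataset stands in for real data)
X, y = load_breast_cancer(return_X_y=True)

# Step 3: hold out a test set before anything is fitted
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Steps 2 and 4: normalization and the chosen algorithm in one pipeline
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Steps 5 and 7: train while tuning C with 5-fold cross-validation
search = GridSearchCV(pipeline, {"logisticregression__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

# Step 6: evaluate once on the untouched test set
y_pred = search.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
print("Best C:", search.best_params_["logisticregression__C"])
print(f"accuracy={accuracy:.3f}, precision={precision:.3f}, recall={recall:.3f}")
```

Putting the scaler inside the pipeline means it is refitted on each cross-validation fold, so no information from the validation folds leaks into preprocessing.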
4. Example: Supervised Learning with Scikit-learn (Iris Dataset)
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
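# Load the Iris dataset: 150 flowers, 4 numeric features, 3 species
data = load_iris()
X = data.data      # features
y = data.target    # labels
# Split into train/test sets (80/20); random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train a decision tree on the training set
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)
# Predict on the held-out test set and measure accuracy
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))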
5. Popular Machine Learning Libraries in Python
Library | Use |
---|---|
Scikit-learn | Core ML algorithms |
Pandas | Data preprocessing |
NumPy | Numerical operations |
Matplotlib/Seaborn | Data visualization |
TensorFlow/Keras | Deep learning |
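As a small illustration of how Pandas and NumPy typically divide the work, the sketch below cleans a tiny table and standardizes its feature columns. The records are made up, and median imputation is just one of several reasonable ways to handle a missing value.

```python
import numpy as np
import pandas as pd

# Made-up records with one missing value in the "area" column
df = pd.DataFrame({
    "area": [50.0, 80.0, np.nan, 120.0],
    "rooms": [2, 3, 3, 4],
    "price": [100.0, 160.0, 170.0, 240.0],
})

# Pandas: fill the missing area with the column median
df["area"] = df["area"].fillna(df["area"].median())

# NumPy: standardize the feature columns to zero mean and unit variance
features = df[["area", "rooms"]].to_numpy(dtype=float)
scaled = (features - features.mean(axis=0)) / features.std(axis=0)
print(scaled.round(2))
```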
6. Common Machine Learning Algorithms
Algorithm | Type | Use Case |
---|---|---|
Linear Regression | Supervised | Predict continuous values |
Logistic Regression | Supervised | Binary classification |
Decision Trees | Supervised | Classification/regression with easy-to-interpret rules |
K-Means Clustering | Unsupervised | Group similar data |
Naive Bayes | Supervised | Text classification (e.g., spam detection) |
Random Forest | Supervised | Classification/regression with an ensemble of trees, often more accurate than a single tree |
K-Nearest Neighbors | Supervised | Classification/regression |
PCA | Unsupervised | Dimensionality reduction |
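Several of the supervised algorithms in the table share the same scikit-learn interface, so they can be compared with a few lines of code. The sketch below uses 5-fold cross-validation on the Iris dataset; the hyperparameters shown are illustrative choices, and a ranking on this small dataset says little about how the models compare on other problems.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Naive Bayes": GaussianNB(),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
}

mean_scores = {}
for name, model in models.items():
    # 5-fold cross-validation: mean accuracy over five train/validation splits
    mean_scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {mean_scores[name]:.3f}")
```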