Computer Vision Mind Map

Computer Vision Mastery
Beginner → Expert

📚 Prerequisites (1-2 weeks)

Python Fundamentals

Master Python basics before diving into CV

Variables, data types, operators
Control flow (if/else, loops)
Functions and lambda expressions
Data structures (lists, dicts, sets)
File I/O operations

📺 Python Tutorial 📺 Mosh Tutorial

NumPy Essentials

Fundamental library for array operations

Array creation and manipulation
Indexing and slicing
Broadcasting rules
Mathematical operations
Array reshaping
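
The topics above can be tried on a tiny array standing in for an image. This is a minimal sketch; the array values and variable names are illustrative assumptions, not part of the roadmap.

```python
import numpy as np

# A tiny 3x4 "image" with values 0..11.
img = np.arange(12).reshape(3, 4)

# Indexing and slicing: rows 0-1, columns 1-2.
patch = img[0:2, 1:3]

# Broadcasting: a length-4 row vector is stretched across all 3 rows.
col_offsets = np.array([10, 20, 30, 40])
shifted = img + col_offsets

# Mathematical operations are element-wise; reductions take an axis.
row_means = img.mean(axis=1)

# Reshaping: flatten to 1-D, then view the same 12 values as 2x6.
flat = img.reshape(-1)
wide = img.reshape(2, 6)
```

Broadcasting works here because the trailing dimension of `img` (4) matches the length of `col_offsets`.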

📖 NumPy Docs 📺 NumPy Tutorial

🌱 Beginner Level (4-8 weeks)

OpenCV Basics

Start with fundamental image operations

Read/write images and videos
Resizing, cropping, rotating
Drawing shapes and text
Color space conversions
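
OpenCV images are NumPy arrays, so several of the operations above reduce to array manipulation. The sketch below uses plain NumPy on a synthetic image so it runs without OpenCV installed; the image contents are made up for illustration, and the grayscale step assumes the common ITU-R BT.601 luma weights.

```python
import numpy as np

# Synthetic 4x6 color image in BGR channel order (the order cv2.imread returns).
img = np.zeros((4, 6, 3), dtype=np.uint8)
img[:, :3] = (255, 0, 0)   # left half: pure blue
img[:, 3:] = (0, 0, 255)   # right half: pure red

# Cropping is array slicing: rows (y) first, then columns (x).
crop = img[1:3, 2:5]

# Rotating by 90 degrees counter-clockwise swaps height and width.
rotated = np.rot90(img)

# Crude nearest-neighbor downscale: keep every second row and column.
small = img[::2, ::2]

# BGR -> grayscale as a weighted sum of the three channels.
b, g, r = img[..., 0], img[..., 1], img[..., 2]
gray = np.round(0.114 * b + 0.587 * g + 0.299 * r).astype(np.uint8)
```

In practice `cv2.resize`, `cv2.rotate`, and `cv2.cvtColor` cover these cases with proper interpolation and more color spaces.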

📺 OpenCV Course (4hr) 📺 Learn in 3 Hours 📺 ProgrammingKnowledge 📺 3-Hour Tutorial 📖 Official Docs

Image Filtering

Smoothing, sharpening, and kernel operations

Gaussian blur and smoothing
Median and bilateral filtering
Sharpening techniques
Kernel convolution basics
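
Smoothing and sharpening both come down to sliding a small kernel over the image and taking a weighted sum at each position. A minimal NumPy sketch follows, assuming odd-sized kernels and edge-replicated borders; the helper name `filter2d` is hypothetical. Like `cv2.filter2D`, it computes cross-correlation, which equals convolution for the symmetric kernels used here.

```python
import numpy as np

def filter2d(image, kernel):
    """Slide `kernel` over `image` and return the weighted sums (same size as input)."""
    kh, kw = kernel.shape
    h, w = image.shape
    padded = np.pad(image.astype(np.float64),
                    ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    # Accumulate one shifted copy of the image per kernel element.
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

box_blur = np.full((3, 3), 1.0 / 9.0)                      # averaging kernel
sharpen = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]],  # center boost minus neighbors
                   dtype=np.float64)

image = np.zeros((5, 5))
image[2, 2] = 9.0                       # single bright pixel
blurred = filter2d(image, box_blur)     # spreads the pixel over a 3x3 area
sharpened = filter2d(image, sharpen)    # amplifies it and darkens its neighbors
```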

📺 Image Processing 📖 Filtering Guide

Edge Detection

Detect boundaries and features in images

Sobel operator
Canny edge detection
Gradient computation
Laplacian method
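
Gradient computation is the common core of these methods: the Sobel operator estimates horizontal and vertical derivatives, and Canny builds on the resulting magnitude and direction. The sketch below is a NumPy-only illustration on a synthetic step edge; the helper names are hypothetical, and in practice `cv2.Sobel` and `cv2.Canny` would be used.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T

def correlate3x3(image, kernel):
    """Apply a 3x3 kernel with edge-replicated borders."""
    h, w = image.shape
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def sobel_edges(image):
    gx = correlate3x3(image, SOBEL_X)   # horizontal intensity change
    gy = correlate3x3(image, SOBEL_Y)   # vertical intensity change
    magnitude = np.hypot(gx, gy)        # edge strength
    direction = np.arctan2(gy, gx)      # gradient angle in radians
    return gx, gy, magnitude, direction

# Vertical step edge: dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0
gx, gy, magnitude, direction = sobel_edges(image)
```

The response is nonzero only in the two columns that straddle the step, which is the boundary an edge detector should report.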

📺 Edge Detection 📖 Canny Tutorial

Thresholding

Convert images to binary for segmentation

Binary thresholding
Adaptive thresholding
Otsu's method
Multi-level thresholding
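
Otsu's method picks the threshold automatically by maximizing the variance between the two resulting classes. A minimal NumPy sketch is shown below, assuming an 8-bit grayscale input; the function name and test image are illustrative. OpenCV exposes the same idea through `cv2.threshold` with the `cv2.THRESH_OTSU` flag.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the level t that maximizes between-class variance (uint8 input)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)                 # weight of class {0..t}
    w1 = 1.0 - w0                        # weight of class {t+1..255}
    cum_mean = np.cumsum(prob * levels)  # unnormalized mean of class {0..t}
    total_mean = cum_mean[-1]
    denom = w0 * w1
    valid = denom > 0
    between = np.zeros(256)
    between[valid] = (total_mean * w0[valid] - cum_mean[valid]) ** 2 / denom[valid]
    return int(np.argmax(between))

# Bimodal image: half the pixels at 50, half at 200.
gray = np.full((4, 4), 50, dtype=np.uint8)
gray[:, 2:] = 200
t = otsu_threshold(gray)

# Binary thresholding with the chosen level: pixels above t become 255.
binary = (gray > t).astype(np.uint8) * 255
```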

📖 Threshold Guide

PyImageSearch Tutorials

Practical computer vision projects and tutorials

Shape detection and recognition
Object detection and classification
Image preprocessing techniques
Real-world CV applications

📺 Shape Detection 📺 Object Detection 🌐 PyImageSearch

🚀 Intermediate Level (8-12 weeks)

Contour Analysis

Detect and analyze object shapes

Finding and drawing contours
Shape properties (area, perimeter)
Bounding boxes and convex hull
Shape matching and approximation
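
Once a contour is available as a list of (x, y) points, the basic shape properties are short formulas. The sketch below computes them with NumPy for a hand-written rectangle; the function names are hypothetical, and the bounding box is the geometric extent of the points. OpenCV offers `cv2.contourArea`, `cv2.arcLength`, and `cv2.boundingRect` for contours returned by `cv2.findContours`.

```python
import numpy as np

def contour_area(pts):
    """Polygon area via the shoelace formula; pts is an (N, 2) array of x, y."""
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def contour_perimeter(pts):
    """Length of the closed polygon through pts."""
    edges = np.roll(pts, -1, axis=0) - pts
    return float(np.linalg.norm(edges, axis=1).sum())

def bounding_box(pts):
    """Axis-aligned extent as (x, y, width, height)."""
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    return int(x_min), int(y_min), int(x_max - x_min), int(y_max - y_min)

rect = np.array([[1, 1], [5, 1], [5, 4], [1, 4]])
area = contour_area(rect)
perimeter = contour_perimeter(rect)
box = bounding_box(rect)
```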

📖 Contour Docs 📺 Contour Tutorial

Feature Detection

SIFT, SURF, ORB for image matching

Harris corner detection
SIFT (Scale-Invariant Feature Transform)
SURF (Speeded-Up Robust Features)
ORB (Oriented FAST and Rotated BRIEF)
Feature matching
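
Feature matching pairs descriptors from two images by distance. ORB descriptors are binary, so they are compared with Hamming distance, while SIFT descriptors are float vectors usually compared with Euclidean distance. The sketch below does brute-force Hamming matching with a mutual-best (cross-check) filter on random stand-in descriptors; the data and function names are assumptions for illustration. The OpenCV counterpart is `cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)`.

```python
import numpy as np

def hamming_matrix(desc_a, desc_b):
    """Pairwise Hamming distances between rows of two uint8 descriptor arrays."""
    xor = desc_a[:, None, :] ^ desc_b[None, :, :]
    return np.unpackbits(xor, axis=2).sum(axis=2)

def match_cross_check(desc_a, desc_b):
    """Keep (i, j, distance) only when i and j are each other's nearest neighbor."""
    dist = hamming_matrix(desc_a, desc_b)
    best_b = dist.argmin(axis=1)   # nearest b for every a
    best_a = dist.argmin(axis=0)   # nearest a for every b
    return [(i, int(j), int(dist[i, j]))
            for i, j in enumerate(best_b) if best_a[j] == i]

# Stand-in data: four random 32-byte (256-bit) descriptors, then a shuffled copy.
rng = np.random.default_rng(0)
desc_a = rng.integers(0, 256, size=(4, 32), dtype=np.uint8)
desc_b = desc_a[[2, 0, 3, 1]].copy()
desc_b[0, 0] ^= 1                  # flip one bit to simulate noise

matches = match_cross_check(desc_a, desc_b)
```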

📖 Feature Docs 📺 SIFT/SURF

Deep Learning Basics

Neural networks and CNNs fundamentals

Neural network architecture
Backpropagation
Convolutional layers
Pooling and activation functions
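
A single CNN stage chains the pieces listed above: convolution, an activation function, and pooling. The NumPy sketch below runs one forward pass for a single channel with a hand-picked kernel; the function names, kernel, and bias are illustrative assumptions. Real layers learn their kernels through backpropagation and handle many channels at once.

```python
import numpy as np

def conv2d_valid(x, kernel, bias=0.0):
    """Single-channel convolutional layer without padding (cross-correlation form)."""
    kh, kw = kernel.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * x[dy:dy + oh, dx:dx + ow]
    return out + bias

def relu(x):
    """Activation: keep positive values, zero out the rest."""
    return np.maximum(x, 0.0)

def max_pool2x2(x):
    """2x2 max pooling with stride 2; an odd trailing row or column is dropped."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.arange(36, dtype=np.float64).reshape(6, 6)   # 6x6 input
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])        # diagonal difference
feature_map = max_pool2x2(relu(conv2d_valid(x, kernel, bias=8.0)))
```

The 6x6 input shrinks to 5x5 after the 2x2 convolution and to 2x2 after pooling.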

📺 3Blue1Brown 📖 DeepLearning.AI

TensorFlow/PyTorch

Master deep learning frameworks

Model building and training
Data loading pipelines
Loss functions and optimizers
Model evaluation

📖 TensorFlow 📖 PyTorch 📺 PyTorch Course

Advanced CV Course

Hand tracking, pose estimation, face detection

Hand tracking and gestures
Face detection and mesh
Pose estimation
Real-time applications

📺 FreeCodeCamp (6hr) 🌐 Murtaza Hassan

🎓 Advanced Level (12+ weeks)

Stanford CS231n

Comprehensive CNN course from Stanford

CNN architectures deep dive
Image classification
Object detection
Semantic segmentation
Generative models

📺 Lectures 🌐 Course Site 📖 Notes

Object Detection

YOLO, R-CNN, and modern detectors

R-CNN family (Fast, Faster, Mask)
YOLOv3 through YOLOv8
SSD and RetinaNet
Real-time detection
Evaluation metrics (mAP, IoU)
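
IoU (intersection over union) underlies both detector evaluation and the non-maximum suppression step many detectors use to remove duplicate boxes. A minimal NumPy sketch follows, assuming boxes in [x1, y1, x2, y2] format and a hypothetical IoU threshold of 0.5.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an (N, 4) array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns indices of the boxes to keep."""
    order = np.argsort(scores)[::-1]        # highest score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Drop every remaining box that overlaps the kept one too much.
        order = rest[iou(boxes[best], boxes[rest]) <= iou_threshold]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=np.float64)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)
```

The second box overlaps the first with an IoU of 81/119 (about 0.68), so it is suppressed, while the distant third box survives.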

📺 YOLO Tutorial 📖 YOLOv8 Docs 📊 Papers

Image Segmentation

U-Net, Mask R-CNN, semantic segmentation

FCN (Fully Convolutional Network)
U-Net architecture
DeepLab family
Mask R-CNN
Instance vs. semantic segmentation

📺 U-Net Tutorial 📊 Research

Face Recognition

Face detection, alignment, recognition

MTCNN, RetinaFace detection
FaceNet embeddings
ArcFace, CosFace
Expression recognition
Face verification
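
Face verification typically compares two embedding vectors and accepts the pair when they are similar enough. The sketch below uses cosine similarity on random stand-in vectors; the 128-dimensional size, the noise level, and the 0.5 threshold are assumptions for illustration, and real embeddings would come from a trained model such as FaceNet or ArcFace with a threshold tuned on validation data.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_person(emb_a, emb_b, threshold=0.5):
    """Verification decision: accept when similarity reaches the threshold."""
    return cosine_similarity(emb_a, emb_b) >= threshold

# Stand-in embeddings (not produced by a real face model).
rng = np.random.default_rng(0)
anchor = rng.normal(size=128)
same_identity = anchor + 0.1 * rng.normal(size=128)   # slightly perturbed copy
other_identity = rng.normal(size=128)                 # unrelated vector

accept = same_person(anchor, same_identity)
reject = same_person(anchor, other_identity)
```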

📺 Face Recognition 💻 Face Recognition

Pose Estimation

Human pose detection and tracking

OpenPose architecture
MediaPipe solutions
HRNet for pose
Multi-person pose estimation
Action recognition
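
Many heatmap-based pose models, HRNet among them, output one heatmap per joint, and the keypoint is read off as the location of the peak. The sketch below decodes synthetic heatmaps with NumPy; the array shape (K, H, W) and the function name are assumptions for illustration.

```python
import numpy as np

def decode_heatmaps(heatmaps):
    """Return one (x, y, score) row per keypoint from a (K, H, W) heatmap stack."""
    k, h, w = heatmaps.shape
    flat = heatmaps.reshape(k, -1)
    idx = flat.argmax(axis=1)                 # peak position in the flattened map
    ys, xs = np.unravel_index(idx, (h, w))    # back to row (y) and column (x)
    scores = flat[np.arange(k), idx]          # peak value as confidence
    return np.stack([xs, ys, scores], axis=1)

# Two synthetic 8x8 heatmaps with a single peak each.
heatmaps = np.zeros((2, 8, 8))
heatmaps[0, 3, 5] = 0.9    # joint 0 at x=5, y=3
heatmaps[1, 6, 1] = 0.7    # joint 1 at x=1, y=6
keypoints = decode_heatmaps(heatmaps)
```

Production decoders usually add sub-pixel refinement and rescale the coordinates to the input image size.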

📖 MediaPipe 📺 Pose Tutorial

Video Analysis

Object tracking and motion analysis

Optical flow (Lucas-Kanade)
Object tracking (SORT, DeepSORT)
Action recognition
Video segmentation
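
Lucas-Kanade optical flow assumes that brightness is constant and that all pixels in a small window share one displacement, which turns flow estimation into a least-squares problem. The sketch below estimates a single (u, v) for a whole synthetic window with NumPy; the Gaussian test pattern and function names are assumptions for illustration. `cv2.calcOpticalFlowPyrLK` implements the pyramidal, per-feature version.

```python
import numpy as np

def lucas_kanade_window(frame0, frame1):
    """Least-squares solution of Ix*u + Iy*v = -It over the whole window."""
    # Spatial gradients averaged over both frames; np.gradient returns (d/dy, d/dx).
    iy0, ix0 = np.gradient(frame0)
    iy1, ix1 = np.gradient(frame1)
    ix, iy = 0.5 * (ix0 + ix1), 0.5 * (iy0 + iy1)
    it = frame1 - frame0                      # temporal gradient
    a = np.stack([ix.ravel(), iy.ravel()], axis=1)
    flow, *_ = np.linalg.lstsq(a, -it.ravel(), rcond=None)
    return flow                               # (u, v): shift in x and y, in pixels

def gaussian_blob(cx, cy, size=33, sigma=5.0):
    """Smooth synthetic pattern centered at (cx, cy)."""
    y, x = np.mgrid[0:size, 0:size]
    return np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))

# The blob moves by +0.4 px in x and -0.3 px in y between frames.
frame0 = gaussian_blob(16.0, 16.0)
frame1 = gaussian_blob(16.4, 15.7)
u, v = lucas_kanade_window(frame0, frame1)
```

The linearization only holds for small motions, which is why practical implementations work coarse-to-fine on an image pyramid.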

📖 Optical Flow 📺 Tracking

University Courses

Academic computer vision courses and lectures

University of Michigan Deep Learning for CV
Academic computer vision lectures
Research-focused tutorials
Theoretical foundations

📺 Michigan Deep Learning CV 📺 Academic Playlist

🔬 Expert Level (Ongoing)

Generative Models

GANs, VAEs, Diffusion Models

GAN architectures (DCGAN, StyleGAN)
Variational Autoencoders
Diffusion models (Stable Diffusion)
Image-to-image translation
Neural style transfer

📺 GANs Explained 💻 Diffusers 🌐 Stable Diffusion

Vision Transformers

ViT, DETR, attention mechanisms

Vision Transformer (ViT)
DETR (Detection Transformer)
Attention mechanisms
Self-attention in vision
Multi-head attention
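
In a Vision Transformer, image patches become tokens and every token attends to every other token through scaled dot-product attention, computed in several heads in parallel. The NumPy sketch below runs one multi-head self-attention layer on random stand-in tokens; the sizes, random weights, and function names are assumptions for illustration, and biases, layer normalization, and position embeddings are left out.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(tokens, wq, wk, wv, wo, num_heads):
    """tokens: (n, d). Returns the (n, d) output and (heads, n, n) attention weights."""
    n, d = tokens.shape
    dh = d // num_heads

    def split(x):                              # (n, d) -> (heads, n, dh)
        return x.reshape(n, num_heads, dh).transpose(1, 0, 2)

    q, k, v = split(tokens @ wq), split(tokens @ wk), split(tokens @ wv)
    weights = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))  # per-head attention
    heads = weights @ v                                        # (heads, n, dh)
    merged = heads.transpose(1, 0, 2).reshape(n, d)            # concatenate heads
    return merged @ wo, weights

rng = np.random.default_rng(0)
n_patches, dim, num_heads = 4, 8, 2
tokens = rng.normal(size=(n_patches, dim))     # stand-in patch embeddings
wq, wk, wv, wo = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(4))
out, weights = multi_head_self_attention(tokens, wq, wk, wv, wo, num_heads)
```

Each row of `weights` sums to 1, so every output token is a weighted average of the value vectors within its head.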

📺 Vision Transformers 📊 Research

📚 Additional Learning Resources

Computer Vision Concepts

📺 CV Concepts 📺 Advanced Topics 📺 Image Processing 📺 CV Applications 📺 Deep Learning CV 📺 Neural Networks 📺 CV Algorithms 📺 Image Analysis 📺 Computer Vision
