Stanford CS231n
Comprehensive CNN course from Stanford
- CNN architectures deep dive
- Image classification
- Object detection
- Semantic segmentation
- Generative models
Object Detection
YOLO, R-CNN, and modern detectors
- R-CNN family (Fast, Faster, Mask)
- YOLOv3 through YOLOv8
- SSD and RetinaNet
- Real-time detection
- Evaluation metrics (mAP, IoU)
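
IoU (intersection over union), listed above, is the overlap score that mAP builds on: a predicted box counts as a match when its IoU with a ground-truth box exceeds a chosen threshold. A minimal sketch, assuming axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates; the function name `iou` and the box format are illustrative choices, not taken from any particular course or library:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Assumption: each box is (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    """
    # Corners of the overlapping rectangle (if any).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes give zero intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    # Guard against degenerate zero-area boxes.
    return inter / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap 1, union 7
```

Two 2x2 boxes offset by one unit share an area of 1 out of a union of 7, so the score is 1/7. Detection benchmarks commonly use 0.5 as the match threshold, though the exact value depends on the benchmark.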
Image Segmentation
U-Net, Mask R-CNN, semantic segmentation
- FCN (Fully Convolutional Network)
- U-Net architecture
- DeepLab family
- Mask R-CNN
- Instance vs. semantic segmentation
Face Recognition
Face detection, alignment, recognition
- MTCNN, RetinaFace detection
- FaceNet embeddings
- ArcFace, CosFace
- Expression recognition
- Face verification
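
Face verification with embedding models such as those listed above usually reduces to comparing two embedding vectors. A minimal sketch, assuming the embeddings are already computed and given as plain lists of floats; the function names and the 0.5 threshold are hypothetical placeholders, and a real threshold would be tuned on validation data for the specific model:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as dissimilar.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def same_identity(emb_a, emb_b, threshold=0.5):
    """Verification decision. The threshold is an assumed placeholder."""
    return cosine_similarity(emb_a, emb_b) >= threshold


# Toy 2-D "embeddings"; real models produce far higher-dimensional vectors.
print(same_identity([1.0, 0.0], [0.9, 0.1]))
print(same_identity([1.0, 0.0], [0.0, 1.0]))
```

Cosine similarity is used here because margin-based losses like ArcFace and CosFace are defined on angles between normalized embeddings; Euclidean distance on L2-normalized vectors is a common alternative.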
Pose Estimation
Human pose detection and tracking
- OpenPose architecture
- MediaPipe solutions
- HRNet for pose
- Multi-person pose estimation
- Action recognition
Video Analysis
Object tracking and motion analysis
- Optical flow (Lucas-Kanade)
- Object tracking (SORT, DeepSORT)
- Action recognition
- Video segmentation
University Courses
Academic computer vision courses and lectures
- University of Michigan: Deep Learning for Computer Vision
- Academic computer vision lectures
- Research-focused tutorials
- Theoretical foundations