Stanford University CS231n: Deep Learning for Computer Vision

Schedule

Lectures are held on Tuesdays and Thursdays from 1:30 to 3:00pm Pacific Time in the NVIDIA Auditorium.
Discussion sections are (generally) held on Fridays from 1:30 to 2:30pm Pacific Time on Zoom. Check Ed for any exceptions.

Updated lecture slides will be posted here shortly before each lecture. For ease of reading, we have color-coded the lecture category titles in blue, discussion sections (and final project poster session) in yellow, and the midterm exam in red. Note that the schedule is subject to change as the quarter progresses.

Date	Description	Course Materials	Events	Deadlines
03/29	Lecture 1: Introduction. Computer vision overview; Historical context; Course logistics [slides 1] [slides 2]
———	Deep Learning Basics
03/31	Lecture 2: Image Classification with Linear Classifiers. The data-driven approach; K-nearest neighbor; Linear Classifiers; Algebraic / Visual / Geometric viewpoints; SVM and Softmax loss [slides]	Image Classification Problem; Linear Classification
04/01	Python / Numpy Review Session [Colab] [Tutorial]	1:30-2:30pm PT	Assignment 1 out [handout] [colab]
04/05	Lecture 3: Regularization and Optimization. Regularization; Stochastic Gradient Descent; Momentum, AdaGrad, Adam; Learning rate schedules [slides]	Optimization
04/07	Lecture 4: Neural Networks and Backpropagation. Multi-layer Perceptron; Backpropagation [slides]	Backprop; Linear backprop example. Suggested Readings: Why Momentum Really Works; Derivatives notes; Efficient backprop. More backprop references: [1], [2], [3]
04/08	Backprop Review Session [slides]	1:30-2:30pm PT
———	Perceiving and Understanding the Visual World
04/12	Lecture 5: Image Classification with CNNs. History; Higher-level representations, image features; Convolution and pooling [slides]	Convolutional Networks
04/13	Final Project Overview and Guidelines [slides]	3:00-4:00pm PT
04/14	Lecture 6: CNN Architectures. Batch Normalization; Transfer learning; AlexNet, VGG, GoogLeNet, ResNet [slides]	AlexNet, VGGNet, GoogLeNet, ResNet
04/15			Assignment 2 out [handout] [colab]	Assignment 1 due
04/18				Project proposal due
04/19	Lecture 7: Training Neural Networks. Activation functions; Data processing; Weight initialization; Hyperparameter tuning; Data augmentation [slides]	Neural Networks, Parts 1, 2, 3. Suggested Readings: Stochastic Gradient Descent Tricks; Efficient Backprop; Practical Recommendations for Gradient-based Training; Deep Learning, Nature 2015; An Overview of Gradient Descent Algorithms; A Disciplined Approach to Neural Network Hyper-Parameters
04/21	Lecture 8: Visualizing and Understanding. Feature visualization and inversion; Adversarial examples; DeepDream and style transfer [slides]
04/22	PyTorch Review Session [slides]	1:30-2:30pm PT
04/26	Lecture 9: Object Detection and Image Segmentation. Single-stage detectors; Two-stage detectors; Semantic/Instance/Panoptic segmentation [slides]	FCN, R-CNN, Fast R-CNN, Faster R-CNN, YOLO
04/28	Lecture 10: Recurrent Neural Networks. RNN, LSTM, GRU; Language modeling; Image captioning; Sequence-to-sequence [slides]	Suggested Readings: DL book RNN chapter; Understanding LSTM Networks
04/29	Object Detection & RNNs Review Session [slides]	2:30-3:30pm PT
05/02				Assignment 2 due
05/03	Lecture 11: Attention and Transformers. Self-Attention; Transformers [slides]	Suggested Readings: Attention is All You Need [Original Transformers Paper]; Attention? Attention [Blog by Lilian Weng]; The Illustrated Transformer [Blog by Jay Alammar]; ViT: Transformers for Image Recognition [Paper] [Blog] [Video]; DETR: End-to-End Object Detection with Transformers [Paper] [Blog] [Video]
05/05	Lecture 12: Video Understanding. Video classification; 3D CNNs; Two-stream networks; Multimodal video understanding [slides]
05/06	Midterm Review Session	2:30-3:30pm PT
05/07				Project milestone due
05/10	In-Class Midterm	1:30-3:00pm	Assignment 3 out [handout] [colab]
———	Reconstructing and Interacting with the Visual World
05/12	Lecture 13: Generative Models. Supervised vs. Unsupervised learning; Pixel RNN, Pixel CNN; Variational Autoencoders; Generative Adversarial Networks [slides]	Suggested Readings: Image GPT: Generative Pretraining From Pixels [Paper] [Blog]
05/17	Lecture 14: Self-supervised Learning. Pretext tasks; Contrastive learning; Multisensory supervision [slides]	Suggested Readings: Lilian Weng Blog Post; DINO: Emerging Properties in Self-Supervised Vision Transformers [Paper] [Blog] [Video]
05/19	Lecture 15: Low-Level Vision (Guest Lecture by Prof. Jia Deng from Princeton University). Optical flow; Depth estimation; Stereo vision [slides]
05/24	Lecture 16: 3D Vision. 3D shape representations; Shape reconstruction; Neural implicit representations [slides]			Assignment 3 due
———	Human-Centered Applications and Implications
05/26	Lecture 17: Human-Centered Artificial Intelligence. AI & healthcare
05/31	Lecture 18: Fairness in Visual Recognition (Guest Lecture by Prof. Olga Russakovsky from Princeton University)
06/02				Project final report due
06/04	Final Project Poster Session	Note: Only open to the Stanford community and invited guests. Time: 3:30-6:30pm. Location: Alumni Center McCaw Hall/Ford Gardens. Click here for the logistics and expectations.
06/05				Project poster PDF due


Stanford - Spring 2022
