CSE676
Deep Learning

Deep Learning algorithms learn multi-level representations of data, with each level explaining the data in a hierarchical manner. Such algorithms have been effective at uncovering underlying structure in data, e.g., features to discriminate between classes. They have been successful in many artificial intelligence problems including image classification, speech recognition, and natural language processing.

The course, which will be taught through lectures and projects, will cover the theory underlying deep learning, the range of its applications, and learning from very large data sets. The course will cover connectionist architectures commonly associated with deep learning, e.g., basic neural networks, convolutional neural networks, and recurrent neural networks. The main focus will be on methods to train and optimize these architectures and on methods to perform effective inference with them. Students will be encouraged to use open-source software libraries such as PyTorch.

Prerequisite: Introductory Machine Learning (ML). A course on Probabilistic Graphical Models (PGMs) is helpful but not necessary.

Instructor Information

Course Instructor: Jue Guo [C]

  • Research Area: Optimization for machine learning, Adversarial Learning, Continual Learning, and Graph Learning
  • Interested in participating in our research? Reach out to me by email.

Course Outline and Logistics

Check out the course material under lecture notes.

Course Hours: Session [C]; Tuesday and Thursday 2:00PM-3:20PM

Office Hours: Friday 3:00PM-4:00PM

TA: Shijie Zhou ( shijiezh@buffalo.edu )


Week(s) | Topics Covered
Week 1 and Week 2 | Math, Machine Learning Review, and Linear Regression
Week 3 and Week 4 | Review on Linear Regression, Softmax Regression, and MLP
Week 5 and Week 6 | Optimization, CNN, and Efficient-Net Paper Reading
Week 7 (One Class) | Midterm (Coverage on Weeks 1, 2, 3, 4, 5)
Week 8 and Week 9 | Recurrent Neural Networks and Paper Read on Transformer
Week 10, Week 11, Week 12, and Week 13 | Graph Neural Network Paper Read
Week 14 and Week 15 | Catch up Time on the Material if Needed
Final and Review

Evaluation Components

Component | Weight / Details
Attendance | 10% (Random Attendance Check)
Programming Assignments | 30% (2 PAs)
Midterm | 30%
Final | 30%

Note on Logistics

  • The midterm will be announced one week ahead; its exact date depends on the pace of the course.
  • The logistics are subject to change based on the overall pace and the performance of the class.

Grading

The following is the outline of the grading:

Grading Rubric

This course uses absolute grading, meaning there is no curve, because there is a certain standard we need to uphold for students to have a good knowledge of deep learning.

Percentage | Letter Grade | Percentage | Letter Grade
95-100 | A | 70-74 | C+
90-94 | A- | 65-69 | C
85-89 | B+ | 60-64 | C-
80-84 | B | 55-59 | D
75-79 | B- | 0-54 | F
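
As a rough illustration of how the evaluation weights and the rubric combine, here is a minimal Python sketch. The function names, the dictionary keys, and the 0-100 scale for each component are assumptions made for this example, not part of the course policy. The syllabus also does not say how fractional totals (e.g., 94.5) are handled; the sketch assumes a total must reach a band's lower bound to earn that letter.

```python
# Weights (in percent) from the Evaluation Components table.
WEIGHTS = {
    "attendance": 10,
    "programming_assignments": 30,
    "midterm": 30,
    "final": 30,
}

# Lower bound of each band from the Grading Rubric table, highest first.
CUTOFFS = [
    (95, "A"), (90, "A-"), (85, "B+"), (80, "B"), (75, "B-"),
    (70, "C+"), (65, "C"), (60, "C-"), (55, "D"),
]


def weighted_percentage(scores):
    """Combine per-component scores (each assumed to be on a 0-100 scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percentage):
    """Map a percentage to a letter with no curve (absolute grading).

    Assumption: a total must reach a band's lower bound to earn its letter.
    """
    for lower_bound, letter in CUTOFFS:
        if percentage >= lower_bound:
            return letter
    return "F"


# Hypothetical student scores, for illustration only.
example_scores = {
    "attendance": 100,
    "programming_assignments": 90,
    "midterm": 80,
    "final": 85,
}
total = weighted_percentage(example_scores)
print(total, letter_grade(total))
```

With these hypothetical scores, the total is 10 + 27 + 24 + 25.5 = 86.5, which falls in the 85-89 band.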

Lecture Notes

The notes are based on Dive into Deep Learning. Throughout my teaching, I have noticed that students sometimes struggle with understanding the derivations in the textbook due to the omission of several steps. To address this, I have expanded the derivations and provided more detailed explanations.

Below are the lecture notes. Please note that these notes are updated regularly, so be sure to check back often for the latest updates.

Topic Notes
Introduction
Optimization
DNN and CNN
RNN
GNN