Hardware Accelerators for Machine Learning
This course provides in-depth coverage of the architectural techniques used to design accelerators for training and inference in machine learning systems. We begin with classical ML algorithms, including linear regression and support vector machines, and then focus primarily on DNN models such as convolutional and recurrent neural networks. The course explores acceleration and hardware trade-offs for both training and inference of these models. We also examine how parameters such as batch size, precision, sparsity, and compression affect the efficiency-vs-accuracy design space. The course features several guest lectures from top groups in industry and academia.
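As a taste of the precision-vs-accuracy trade-off mentioned above, here is a minimal sketch of symmetric linear quantization to int8, one common low-precision technique; the function names and the scale convention (mapping the largest magnitude to 127) are illustrative assumptions, not a prescribed course method.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric linear quantization of a float tensor to int8.
    Maps the largest magnitude in x to +/-127 and returns the
    quantized values plus the scale needed to dequantize."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from int8 values."""
    return q.astype(np.float32) * scale
```

Storing weights this way cuts memory traffic 4x versus fp32, at the cost of a bounded rounding error per element.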
What you will learn
- How to implement the core computational kernels used in ML using parallelism, locality, and low precision
- How to design energy-efficient accelerators, making trade-offs between ML model parameters and hardware implementation techniques
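To make the first learning goal concrete, the sketch below shows a blocked (tiled) matrix multiply, the locality technique at the heart of most ML kernels; the tile size and function name are illustrative choices, not part of the course material.

```python
import numpy as np

def tiled_matmul(A, B, tile=32):
    """Blocked matrix multiply: works on tile x tile sub-blocks so each
    block of A, B, and C can stay resident in fast local memory,
    improving locality; the independent (i, j) tiles also expose
    parallelism for a hardware accelerator."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((n, m), dtype=A.dtype)
    for i in range(0, n, tile):
        for j in range(0, m, tile):
            for p in range(0, k, tile):
                C[i:i+tile, j:j+tile] += A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
    return C
```

The same loop structure, with the inner block product mapped to a systolic array or SIMD unit, underlies many accelerator designs.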
Prerequisites: CS 149 or EE 180. CS 229 is ideal, but not required.
Topics
- Accelerator design for ML model inference and training
- Linear algebra fundamentals and accelerating linear algebra
- Neural networks: MLP and CNN inference
- Evaluating performance: energy efficiency, parallelism, locality, memory hierarchy, and the roofline model
- Generalization and regularization in training
- Fast inference and training
- Distributed training
- Sparsity, low precision, and asynchronous training
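The roofline model listed among the topics above can be captured in a few lines: attainable performance is the minimum of the machine's compute peak and its memory bandwidth times the kernel's arithmetic intensity (FLOPs per byte moved). The peak and bandwidth figures below are made-up numbers for illustration, not any particular accelerator's specs.

```python
def attainable_gflops(intensity_flops_per_byte, peak_gflops, mem_bw_gb_per_s):
    """Roofline model: a kernel is either compute-bound (capped at the
    peak) or memory-bound (capped at bandwidth * arithmetic intensity)."""
    return min(peak_gflops, mem_bw_gb_per_s * intensity_flops_per_byte)

# Hypothetical accelerator: 100 GFLOP/s peak, 25 GB/s memory bandwidth.
# The ridge point sits at 100 / 25 = 4 FLOPs/byte: kernels below that
# intensity are memory-bound, kernels above it are compute-bound.
```

A low-intensity kernel (e.g. a vector add) lands on the sloped bandwidth roof, while a large matrix multiply reaches the flat compute roof.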