Lecture 14 - Distributed Training and Gradient Compression (Part II) | MIT 6.S965
Lecture 14 introduces the two communication bottlenecks of distributed training: bandwidth and latency. It covers gradient compression, including gradient pruning and gradient quantization, to address the bandwidth bottleneck, and delayed gradient averaging to alleviate the latency problem.
Keywords: Distributed Training, Bandwidth, Latency, Deep Gradient Compression, Delayed Gradient Averaging
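Gradient pruning of the kind used in Deep Gradient Compression can be sketched as top-k sparsification with local error accumulation: each worker sends only the largest-magnitude gradient entries and keeps the remainder for the next step. A minimal NumPy sketch, not the lecture's implementation; the function name and the 99% sparsity level are illustrative:

```python
import numpy as np

def sparsify_gradient(grad, residual, sparsity=0.99):
    """Top-k gradient pruning with local error accumulation
    (illustrative sketch in the spirit of Deep Gradient Compression)."""
    acc = grad + residual                        # add back gradients withheld earlier
    k = max(1, int(acc.size * (1.0 - sparsity))) # number of entries to keep
    idx = np.argpartition(np.abs(acc), -k)[-k:]  # indices of the k largest magnitudes
    sparse = np.zeros_like(acc)
    sparse[idx] = acc[idx]                       # only these values are communicated
    new_residual = acc - sparse                  # the rest is accumulated locally
    return sparse, new_residual

# usage: each worker sends `sparse` (values + indices) instead of the dense gradient
grad = np.random.randn(1000).astype(np.float32)
residual = np.zeros_like(grad)
sparse, residual = sparsify_gradient(grad, residual, sparsity=0.99)
print(np.count_nonzero(sparse))  # 10 of 1000 entries survive
```

Because the residual is added back before selection, small gradients are not lost; they accumulate until they are large enough to be transmitted.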
------------------------------------------------------------------------------------
TinyML and Efficient Deep Learning Computing
Instructors:
Have you found it difficult to deploy neural networks on mobile and IoT devices? Have you ever found it too slow to train neural networks? This course is a deep dive into efficient machine learning techniques that enable powerful deep learning applications on resource-constrained devices. Topics cover efficient inference techniques, including model compression, pruning, quantization, neural architecture search, and distillation; efficient training techniques, including gradient compression and on-device transfer learning; application-specific model optimization techniques for videos, point clouds, and NLP; and efficient quantum machine learning. Students will get hands-on experience implementing deep learning applications on microcontrollers, mobile phones, and quantum machines through an open-ended design project related to mobile AI.
Website: