tinyML Talks Local Manu Rastogi: Tutorial on micro-kernel based hardware acceleration

tinyML Talks local Webcast - recorded August 13, 2020
"Tutorial on micro-kernel based hardware acceleration"
Manu Rastogi

Energy and compute are both scarce for deep learning deployment at the edge. Rapid innovation in new layer types and network topologies makes deployment even more challenging. There is also increased pressure on hardware designs and toolchain development for automated and efficient model deployment. Often the hardware and toolchains lag behind in supporting new layers. As deep learning becomes more ubiquitous, there is stiff competition among hardware vendors to provide the most energy-efficient solutions. The key pieces of model deployment at the edge are the micro-kernels, or micro-code, that orchestrate the data movement and computation of these networks on hardware. As part of this talk, we will walk through the matrix multiplication micro-code. We will examine the trade-offs between different optimization strategies and extend these principles to neural networks.
