ASPLOS'23 - Session 1A - Heron: Automatically Constrained High-performance Library Generation for De

ASPLOS'23: The 28th International Conference on Architectural Support for Programming Languages and Operating Systems
Session 1A: Systems for ML
Session Chair: Tushar Krishna, Georgia Inst. of Technology
Title: Heron: Automatically Constrained High-performance Library Generation for Deep Learning Accelerators
Authors: Jun Bi (Univ. of Science and Technology of China); Qi Guo, Xiaqing Li, Yongwei Zhao (Inst. of Computing Tech., Chinese Academy of Sciences); Yuanbo Wen, Yuxuan Guo, Enshuai Zhou (Univ. of Science and Technology of China); Xing Hu, Zidong Du (Inst. of Computing Tech., Chinese Academy of Sciences); Ling Li (Inst. of Software, Chinese Academy of Sciences); Huaping Chen (Univ. of Science and Technology of China); Tianshi Chen (Cambricon Technologies)