Leveraging Wasm for Portable AI Inference Across GPUs, CPUs, OS & Cloud-Native Environments - Miley Fu & Hung-Ying Tai, Second State
This talk will focus on the advantages of using WebAssembly (Wasm) for running AI inference tasks in a cloud-native ecosystem. We will explore how Wasm empowers developers to build on their own PCs and run their AI inference uniformly across different hardware (GPUs and CPUs), operating systems, and edge-cloud environments. We will discuss how Wasm and Wasm runtimes integrate seamlessly into cloud-native frameworks, enhancing the deployment and scalability of AI applications. This presentation will specifically highlight how Wasm provides a flexible, efficient solution suited to diverse cloud-native architectures, including Kubernetes, allowing developers to fully tap the potential of LLMs, especially open-source LLMs. The session offers insights into maximizing the potential of AI applications by leveraging the cross-platform capabilities of Wasm, ensuring consistency, low cost, and efficiency in AI inference across different computing environments.