Ray Serve: Patterns of ML Models in Production

(Simon Mo, Anyscale)

You trained an ML model, now what? The model needs to be deployed for online serving and offline processing. This talk walks through the journey of deploying ML models in production. I will cover common deployment patterns, backed by concrete use cases drawn from 100+ user interviews for Ray and Ray Serve. Lastly, I will cover how we built Ray Serve, a scalable model serving framework, from these learnings.

Related videos

Comments

I’d like to see an end-to-end, hands-on example of this, e.g. a coding tutorial that shows how to do large time-series analysis or train a large model and carry it to production. Simply cases where you need to train on many machines / GPUs to get through the data (see NVTabular), and how Ray fits into that across the research, training, validation (Weights & Biases / TensorBoard), and finally serving / inference phases. Right now Ray feels like an “all or nothing” choice compared to RAPIDS etc.

dinoscheidt

Introducing Ray Serve: Scalable and Programmable ML Serving Framework - Simon Mo, Anyscale

Deploying Many Models Efficiently with Ray Serve

TALK / Simon Mo / Patterns of ML Models in Production

Ray Serve: Tutorial for Building Real Time Inference Pipelines

Building Production AI Applications with Ray Serve

Ray: A Framework for Scaling and Distributing Python & ML Applications

Seamlessly Scaling your ML Pipelines with Ray Serve - Archit Kulkarni

Enabling Cost-Efficient LLM Serving with Ray Serve

Productionizing ML at scale with Ray Serve

Introduction to Model Deployment with Ray Serve

State of Ray Serve in 2.0

Advanced Model Serving Techniques with Ray on Kubernetes - Andrew Sy Kim & Kai-Hsun Chen

Multi-model composition with Ray Serve deployment graphs

Faster Model Serving with Ray and Anyscale | Ray Summit 2024

Leveraging the Possibilities of Ray Serve

Scaling AI & Machine Learning Workloads With Ray on AWS, Kubernetes, & BERT

Scalable machine learning workloads with Ray AI Runtime

KubeRay: A Ray cluster management solution on Kubernetes

Ray (Episode 4): Deploying 7B GPT using Ray

Keynote: The Future of Ray - Robert Nishihara, Anyscale

Scaling AI Workloads with the Ray Ecosystem

Ray Serve for IOT at Samsara

Highly available architectures for online serving in Ray