vLLM Office Hours - Deep Dive into Mistral on vLLM - October 17, 2024


For this special-topic deep dive, we were joined by Patrick von Platen, a research engineer at Mistral AI, who shared insights into Mistral's architecture choices and how to deploy Mistral's models efficiently on vLLM.

During the Q&A, we tackled audience questions on topics such as architecture redesign strategies, rotary position embeddings, vLLM support for ARM architecture, OpenAI Whisper, Seq2Seq support in v0.6.3, and more.

Comments

Hi, great video. I would like to know whether it is possible to use e5-mistral-7b-instruct in vLLM for both embeddings and completions with only one instance of vLLM.

micuentadecasa