Serving Fine-Tuned Models in Production

When serving fine-tuned LLMs in production, it is essential to share the base model across tenants while hosting the LoRA parameters in the same container. This reduces costs but poses new challenges, such as tenant isolation, scaling, and dynamic loading of new adapters. In this talk we cover the different aspects of the SAP Generative AI Hub and how these serving issues can be solved.
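The multi-adapter pattern the abstract describes can be sketched in a few lines: one shared base weight matrix, with per-tenant low-rank LoRA adapters registered in a lookup table and applied only for the requesting tenant. This is a minimal illustrative sketch, not the SAP Generative AI Hub implementation; all class and method names here are hypothetical.

```python
def matvec(W, x):
    # Plain matrix-vector product over nested lists.
    return [sum(w * v for w, v in zip(row, x)) for row in W]

class SharedBaseModel:
    """Hypothetical sketch: one base model shared by all tenants,
    with per-tenant LoRA adapters loaded dynamically."""

    def __init__(self, W):
        self.W = W            # base weights, shared across tenants
        self.adapters = {}    # tenant_id -> (A, B, scaling)

    def load_adapter(self, tenant_id, A, B, scaling=1.0):
        # Dynamic loading: new adapters are registered without
        # touching or duplicating the shared base weights.
        self.adapters[tenant_id] = (A, B, scaling)

    def forward(self, x, tenant_id=None):
        y = matvec(self.W, x)  # shared base path
        if tenant_id is not None:
            # Tenant isolation via keyed lookup: a request can only
            # reach the adapter registered under its own tenant_id.
            A, B, s = self.adapters[tenant_id]
            delta = matvec(B, matvec(A, x))  # low-rank update B @ (A @ x)
            y = [yi + s * di for yi, di in zip(y, delta)]
        return y
```

A request without a `tenant_id` runs the plain base model; a tenant-scoped request adds only that tenant's low-rank delta, which is why many adapters can share one hosted base model cheaply.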

Speaker: Karim Mohraz, SAP
