LLMs in Production: Fine-Tuning, Scaling, and Evaluation at Atlassian

We will dive into the practicalities of deploying LLMs in business settings. We'll explore when to leverage LLMs and how to minimize the complexity of the problem. Our discussion will guide you through designing an evaluation methodology and detail the circumstances that necessitate fine-tuning for optimal performance. We will elaborate on the nuances of training data selection, establishing a flexible training ecosystem, hyperparameter optimization, and scalable training and fine-tuning workflows. As part of the practical session, we will go through the ETL process, how to format and structure data for fine-tuning, and how to organize, save, and manage these datasets. We will demonstrate a few fine-tuning configurations, show you how to monitor and evaluate your fine-tuned LLMs, and collect further datasets to improve your fine-tuned LLM over time.
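
The abstract mentions formatting and structuring data for fine-tuning. As a rough illustration of that step, here is a minimal sketch that converts raw records into chat-style JSONL; the schema, field names, and sample records are illustrative assumptions, not the speakers' actual pipeline:

```python
import json

# Hypothetical raw records produced by an upstream ETL step.
raw_records = [
    {"question": "How do I reset my API token?", "answer": "Go to account settings and revoke the old token."},
    {"question": "What is the rate limit?", "answer": "100 requests per minute per workspace."},
]

def to_chat_example(record):
    """Convert one record into the chat-style schema many fine-tuning APIs accept."""
    return {
        "messages": [
            {"role": "system", "content": "You are a helpful support assistant."},
            {"role": "user", "content": record["question"]},
            {"role": "assistant", "content": record["answer"]},
        ]
    }

# Write one JSON object per line (JSONL), a common on-disk format for fine-tuning sets.
with open("train.jsonl", "w") as f:
    for record in raw_records:
        f.write(json.dumps(to_chat_example(record)) + "\n")
```

One example per line keeps records independent, which makes the resulting dataset easy to shard, sample, and version.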

Talk by: Brian Law, Sr. Specialist Solutions Architect, Databricks; Nathan Azrak, Senior Machine Learning Engineer, Atlassian

Comments

Interesting but vague. E.g., IMHO, clustering results and manually reviewing each cluster actually compounds errors in evaluation. For starters, clustering depends on embedding text with some pre-trained model, which is highly domain-dependent. Then there are the distance metrics used in clustering, which work OK with smallish vectors, but embeddings from any LM are at the very least in the hundreds of dimensions .. good luck with that! I am not saying that directly sampling 5-10% is any better, BUT you lose a lot of time and energy on techniques like clustering that are perfect for purely numerical data. Similarly, suggesting fine-tuning with 10-20 or even a few hundred examples oversimplifies fine-tuning by a lot .. if you DON'T have a 5-10k dataset, I don't think you should even start contemplating FT .. another to-do before starting :P

pumplove
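
For context, the cluster-then-review workflow the comment critiques looks roughly like the sketch below. This is a minimal illustration, assuming sentence-transformers and scikit-learn; the model name, number of clusters, and sample outputs are assumptions, not anything from the talk:

```python
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

# Stand-in model outputs; in practice these come from your evaluation runs.
outputs = [
    "You can reset your API token from account settings.",
    "API tokens are reset under account settings.",
    "The rate limit is 100 requests per minute.",
    "Rate limiting allows 100 requests each minute.",
    "Webhooks retry failed deliveries up to three times.",
    "Failed webhook deliveries are retried three times.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
# Normalizing embeddings to unit length makes Euclidean k-means behave like
# cosine clustering, which mitigates (but does not remove) the high-dimensional
# distance issue raised in the comment.
embeddings = model.encode(outputs, normalize_embeddings=True)

k = 3  # illustrative; choose via silhouette score or domain knowledge
labels = KMeans(n_clusters=k, random_state=0, n_init=10).fit_predict(embeddings)

# Review a small sample per cluster instead of reading every output.
rng = np.random.default_rng(0)
for cluster in range(k):
    idx = np.where(labels == cluster)[0]
    for i in rng.choice(idx, size=min(2, len(idx)), replace=False):
        print(f"cluster {cluster}: {outputs[i]}")
```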