Hallucinations and Hyperparameters: Navigating the Quirks of LLMs - Yonatan Alexander
In my talk, Hallucinations and Hyperparameters: Navigating the Quirks of LLMs, I’ll start with an overview of the current state of LLMs before moving into the technical details of deployments. I’ll cover key elements such as hyperparameters, batching, GPU utilization, routing strategies, and token summarization—critical factors in optimizing both performance and cost.
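The abstract itself contains no code, but the effect of sampling hyperparameters can be illustrated with a toy sketch. The function below implements temperature scaling and top-p (nucleus) filtering over a small logit table in pure Python; the names and the toy vocabulary are illustrative, not taken from the talk:

```python
import math
import random

def sample_token(logits, temperature=1.0, top_p=1.0, seed=None):
    """Sample one token from {token: logit} using temperature and top-p."""
    # Temperature rescales logits: lower values sharpen the distribution,
    # higher values flatten it toward uniform.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}

    # Softmax over the scaled logits (shifted by the max for stability).
    m = max(scaled.values())
    exps = {tok: math.exp(v - m) for tok, v in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}

    # Top-p filtering: keep the smallest set of highest-probability tokens
    # whose cumulative mass reaches top_p, then renormalize and sample.
    kept, cum = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept[tok] = p
        cum += p
        if cum >= top_p:
            break
    norm = sum(kept.values())
    r = random.Random(seed).random() * norm
    for tok, p in kept.items():
        r -= p
        if r <= 0:
            return tok
    return tok  # fallback for floating-point rounding
```

With a very small `top_p`, only the single most likely token survives the filter, which is why low-temperature, low-top-p settings make output nearly deterministic.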
We’ll then explore considerations for companies looking to adopt LLMs, including the decision between platform-based solutions and self-deployment. I’ll explain the differences between fine-tuning, Retrieval-Augmented Generation (RAG), and prefix caching, as well as how to choose the right model based on factors like cost, scalability, and control. Security will be a central focus, with discussions on prompt injection, jailbreak risks, and mitigation strategies.
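To make the RAG option concrete, here is a minimal sketch of the retrieve-then-prompt loop. Word-overlap scoring stands in for the embedding similarity a real pipeline would use, and all function names are hypothetical:

```python
def retrieve(query, documents, k=1):
    # Score each document by word overlap with the query -- a crude
    # stand-in for the vector similarity search used in real RAG systems.
    q = set(query.lower().split())
    scored = sorted(documents, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

def build_prompt(query, documents):
    # Augment the prompt with retrieved context so the model answers
    # from supplied facts rather than (possibly hallucinated) memory.
    context = "\n".join(retrieve(query, documents, k=1))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The design trade-off the talk points at: RAG injects fresh knowledge per request without retraining, fine-tuning bakes behavior into weights, and prefix caching only reuses computation for a shared prompt prefix rather than adding any new knowledge.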
For end users, I’ll provide best practices for prompt engineering, emphasizing how to maximize the effectiveness of LLMs in tasks like code generation and solving complex problems.
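One widely used prompt-engineering practice is to give the model an explicit role, constraints, and a few worked examples. The template below is a generic illustration of that structure, not a template from the talk:

```python
def make_prompt(task, examples, constraints):
    """Assemble a role + rules + few-shot prompt for a code-generation task."""
    # Few-shot examples show the model the expected input/output format.
    shots = "\n\n".join(f"Input: {i}\nOutput: {o}" for i, o in examples)
    # Explicit constraints reduce ambiguity in what counts as a valid answer.
    rules = "\n".join(f"- {c}" for c in constraints)
    return (
        "You are a careful Python code generator.\n"
        f"Rules:\n{rules}\n\n"
        f"{shots}\n\n"
        f"Input: {task}\nOutput:"
    )
```

The structure, not the exact wording, carries the benefit: the role sets the domain, the rules constrain the output, and the examples fix the format.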
Finally, we’ll discuss how to stay ahead in the rapidly evolving AI landscape.