Test Driven PROMPT Engineering: Using Promptfoo to COMPARE Prompts, LLMs, and Providers.

Wouldn't it be great if you KNEW your prompt was CHEAP, FAST, and ACCURATE? Relying on trial and error isn't enough when you're using prompts in production tools and applications. What you need is a methodical approach to prompt evaluation and testing. 'Test Driven PROMPT Engineering' is your key to unlocking this potential. This video showcases how Promptfoo can be a game-changer in comparing and optimizing your prompts, LLMs, and providers. Gain insights into cost-effective LLM choices and learn prompt testing essentials to ensure your prompts are as efficient, cheap, and accurate as they can be. This tutorial is straightforward and packed with value, designed for prompt engineers, full stack engineers, and product builders who want to make informed, confident, data-driven decisions in prompt engineering.
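For orientation, a minimal `promptfooconfig.yaml` comparing two prompt variants across two OpenAI models might look like the sketch below. The prompt text, variable names, and model IDs are illustrative, not taken from the video:

```yaml
# promptfooconfig.yaml — minimal sketch; prompts, vars, and model IDs are illustrative
prompts:
  - "Summarize the following in one sentence: {{text}}"
  - "You are a concise editor. TL;DR: {{text}}"

providers:
  - openai:gpt-4-1106-preview   # GPT-4 Turbo
  - openai:gpt-3.5-turbo

tests:
  - vars:
      text: "Promptfoo runs every prompt against every provider and scores the results."
```

Running `npx promptfoo@latest eval` and then `npx promptfoo@latest view` evaluates every prompt-by-provider combination and opens a side-by-side results matrix, including token counts and latency.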

What might surprise you is how simple prompt testing can be (shout out to the Promptfoo developers). Promptfoo will enhance your prompt engineering skills and AI Agents with simple yet customizable LLM testing and evaluation. Promptfoo even has support for testing the new OpenAI Assistants API! It doesn't matter if you're using AutoGen, the Assistants API, Ollama, ChatDev, Aider, custom agents, multi-agent systems, or really any other prompt engineering tool. At the end of the day, every tool is driven by prompts, and that means LLM testing and evaluations will help you gain confidence, cut costs, and optimize the results from your prompts.

This tutorial provides a hands-on approach to understanding the intricacies of prompt comparison and optimization. You'll learn to track token usage and time to completion, and how to compare different prompts to choose THE WINNER. Promptfoo enables you to effectively compare and select LLMs, with a focus on achieving the best balance between speed, accuracy, and cost. We discuss key strategies for testing prompts in various scenarios, highlighting the importance of prompt evaluation and testing by looking at real LLM test cases using GPT-4 Turbo and GPT-3.5 Turbo.
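To make "choosing the winner" concrete, Promptfoo lets you attach assertions to test cases so that speed, accuracy, and cost become pass/fail signals. A sketch, where the question, expected value, and thresholds are made up for illustration:

```yaml
# Test cases with assertions — expected values and thresholds are illustrative
tests:
  - vars:
      question: "What is the capital of France?"
    assert:
      - type: contains      # output must mention the expected answer
        value: "Paris"
      - type: latency       # fail if the response takes longer than 2 seconds
        threshold: 2000
      - type: cost          # fail if the inference cost exceeds $0.002
        threshold: 0.002
```

With assertions like these, a cheaper or faster model that still passes every test case can be promoted with confidence instead of gut feel.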

Let me know if you're interested in more prompt testing tutorials, frameworks, and methodologies.

📺 Quick Start LLM Testing

🔗 LINKS:

📖 CHAPTERS:
00:00 Are your prompts even good?
00:45 For real apps, prompt FEEL is not enough.
01:14 Cheaper, Faster, Accurate Prompts with PROMPTFOO
01:35 Quick Start LLM Testing
03:20 Immediate Results with GPT-4-Turbo vs GPT-3.5-Turbo
04:00 Clean and Reusable testing structures
06:00 Asserts and Test Cases
07:45 Learn these 3 components and you're good to go
09:35 LLM Evaluation and Testing 2nd Run
10:00 GPT-4 is breaking the bank and our timeline
12:33 Promptfoo has a lot more to offer for llm testing
13:12 Promptfoo has OpenAI, Anthropic, and Ollama providers, with Gemini coming soon
13:30 Three reasons you should test your prompts
14:42 Test Driven Prompts

💬 Hashtags
#gpt #promptengineering #aiagents
Comments

Thank you! Really good video and I also love that you do not just show their docs, but take the time to create and show actual examples.

timkoehler

This video is ridiculously insightful. You've explained everything in such a lucid manner, and your video is so well structured - after explaining something it's like you were addressing the next question that popped in my head. Fantastic job!

HerroEverynyan

I don't even need to know what this video is about to know that I need to watch it.

gigglesmclovin

Perfect, was looking for exactly this. Thanks for sharing!

s_streichsbier

Thank you!! Great video and content, definitely gonna try and implement this in my processes. Keep up the good work!

vazquezsebastian

love the insights related to gpt3 vs 4 comparison, and the message around how testing saves time!

judymou

You said that in the case of recognizing NLQ, gpt-3.5-turbo is 10x faster and 4x cheaper than gpt-4. It is actually 40 times cheaper, as gpt-3.5 is 10 times cheaper per 1K tokens

macoson

Wait. Can I only test prompts against barebones models? How would I test an agent? Something that can be executed and returns a response?

cutmasta-kun

Thanks for the lesson Andy, though after the ttydb project I was sure this would showcase how to automate prompt optimization on top of promptfoo. Like ttydb, I believe we can make an agenticFoo that constantly and consistently improves prompts throughout any project.
What do you think?
Meanwhile all the best ❤

fire

Hi @IndydevDan, I have been enjoying your AutoGen tutorials and experimenting. But recently I came to know about LangChain. Call me a novice, but what's the difference between these two?

SynonAnon-viql