OpenAI Realtime Voice API: A 7-Minute Getting Started Guide

In this video, I walk you through setting up the new OpenAI Realtime API, whose WebSocket-based architecture opens up new interactive possibilities for developers. You will learn how to clone the repository, configure the environment with an OpenAI API key, and set up a relay server for backend communication. The API offers real-time, two-way interaction through a stateful interface and supports function calls, such as fetching weather updates. I also explore the 'set memory' function and demonstrate running a basic application. Stay tuned for future episodes, where I'll cover deploying this in a production environment. By the end of this tutorial, you'll have a working setup to experiment with and expand upon!
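
As a rough illustration of the function calling mentioned above, here is a minimal sketch of the kind of `session.update` event a client might send over the WebSocket to register a weather tool. The event shape follows my understanding of the Realtime API at the time of writing and should be checked against the current reference; the tool name `get_weather` and its single `location` parameter are hypothetical and are not taken from the video's code.

```python
import json


def build_session_update() -> str:
    """Return a JSON-encoded `session.update` event that registers one tool.

    Assumption: tools are declared as flat function definitions inside the
    session object. Verify the exact schema against the API reference.
    """
    event = {
        "type": "session.update",
        "session": {
            "tools": [
                {
                    "type": "function",
                    "name": "get_weather",  # hypothetical tool name
                    "description": "Look up the current weather for a location.",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            # hypothetical parameter, for illustration only
                            "location": {"type": "string", "description": "City name"},
                        },
                        "required": ["location"],
                    },
                }
            ],
            "tool_choice": "auto",
        },
    }
    return json.dumps(event)


if __name__ == "__main__":
    # The resulting string would be sent as a text frame over the WebSocket.
    print(build_session_update())
```

The function only builds the payload; opening the connection and sending the frame are left to whichever WebSocket client or relay server you use.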

Links:
Introducing the Realtime API
OpenAI Realtime Console

Learn the fundamentals of becoming an AI Engineer on Scrimba:

00:00 Introduction to the OpenAI Realtime API
00:38 Understanding WebSockets and Real-Time Interaction
01:13 Function Calling Demonstration
01:39 Stateful API and Memory Functions
02:52 Setting Up the Repository
03:11 Configuring the Environment
03:49 Running the Application
04:34 Handling Function Call Outputs
05:11 Exploring the Code and Next Steps
07:12 Conclusion and Next Steps
Comments
Author

Talked with it for 5 min in the playground today. The cost was $2.35. Not too shabby.

MaliRasko
Author

Great tool, if this was cheaper I would develop with it. Also, just emailed you about a sponsor opportunity. Cheers!

BrianDevJourney
Author

This API is too expensive; I think we should avoid sending all chunks. We need a local VAD (Voice Activity Detection) to send only the chunks that contain voice; otherwise, it could become costly.

ibrahimaba
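
The local VAD idea in the comment above could be sketched roughly as follows. This is a simple energy gate, not a production-grade voice activity detector: it assumes 16-bit mono PCM chunks in the machine's native byte order, and the `threshold` and `hangover` values are arbitrary placeholders that would need tuning for a real microphone. Whether skipping silent chunks actually lowers the bill depends on how usage is metered, which is not covered here.

```python
import math
from array import array
from typing import Iterable, Iterator


def rms(chunk: bytes) -> float:
    """Root-mean-square amplitude of a 16-bit PCM chunk (native byte order)."""
    samples = array("h")
    # Drop a trailing odd byte so the buffer is a whole number of samples.
    samples.frombytes(chunk[: len(chunk) - len(chunk) % samples.itemsize])
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def voiced_chunks(
    chunks: Iterable[bytes],
    threshold: float = 500.0,  # placeholder energy threshold
    hangover: int = 2,         # quiet chunks to keep after speech ends
) -> Iterator[bytes]:
    """Yield only chunks that look like speech, plus a short trailing tail."""
    remaining = 0
    for chunk in chunks:
        if rms(chunk) >= threshold:
            remaining = hangover
            yield chunk
        elif remaining > 0:
            # Keep a little trailing audio so word endings are not clipped.
            remaining -= 1
            yield chunk


if __name__ == "__main__":
    loud = array("h", [4000] * 160).tobytes()
    quiet = array("h", [0] * 160).tobytes()
    kept = list(voiced_chunks([quiet, loud, quiet, quiet, quiet]))
    print(f"kept {len(kept)} of 5 chunks")
```

Only the chunks yielded by `voiced_chunks` would then be forwarded to the API; the rest are dropped locally.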
Author

What was the latency? Also is there a way to have it await the function call return via the websocket? Def a non starter if we just have to deal with it coming back in pieces

jaysonp
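
On the question above about function calls arriving in pieces: one possible approach is to buffer the argument fragments on the client and act only once the call is complete. The sketch below assumes the server emits `response.function_call_arguments.delta` and `response.function_call_arguments.done` events carrying a `call_id`, as I understand the API at the time of writing; the `FunctionCallAssembler` class itself is hypothetical and is not part of any SDK.

```python
import json
from typing import Any, Dict, Optional, Tuple


class FunctionCallAssembler:
    """Buffer streamed argument fragments until a function call is complete."""

    def __init__(self) -> None:
        # Partial JSON argument strings, keyed by call_id.
        self._buffers: Dict[str, str] = {}

    def feed(self, event: Dict[str, Any]) -> Optional[Tuple[str, Dict[str, Any]]]:
        """Consume one decoded server event.

        Returns (call_id, parsed_arguments) when a call has finished,
        otherwise None.
        """
        kind = event.get("type")
        if kind == "response.function_call_arguments.delta":
            call_id = event["call_id"]
            self._buffers[call_id] = self._buffers.get(call_id, "") + event["delta"]
            return None
        if kind == "response.function_call_arguments.done":
            call_id = event["call_id"]
            buffered = self._buffers.pop(call_id, "")
            # Prefer the full string on the final event if it is present.
            raw = event.get("arguments") or buffered
            return call_id, json.loads(raw)
        return None  # unrelated event types are ignored


if __name__ == "__main__":
    assembler = FunctionCallAssembler()
    stream = [
        {"type": "response.function_call_arguments.delta", "call_id": "c1", "delta": '{"loc'},
        {"type": "response.function_call_arguments.delta", "call_id": "c1", "delta": 'ation": "Oslo"}'},
        {"type": "response.function_call_arguments.done", "call_id": "c1"},
    ]
    for evt in stream:
        result = assembler.feed(evt)
        if result is not None:
            print(result)
```

With this pattern the application code sees one complete call per `call_id` and never has to handle partial JSON itself.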