Using OpenAI Realtime API to build a Twilio Voice AI assistant with Node.js

We're so excited for our friends at OpenAI, who launched their Realtime API today, and this tutorial is built on it. The API opens up Speech to Speech (S2S) capabilities for their GPT-4o multimodal model, which supports direct audio input and output, avoiding translation back and forth from text with a speech-to-text (STT) or text-to-speech (TTS) step.

Note: OpenAI is rolling out Realtime API access incrementally. Please watch their site for updates.

This video will help you build an AI assistant using Twilio Voice and OpenAI's Realtime API. Here's what you'll need to build it:

Resources:
Comments

Now you just have to provide customer data from Segment to the model. Then, when a customer calls, the model can give a personalized answer.

For example, a customer calls a car repair shop. The model uses RAG to access the customer's data and check the status of the car repair, then responds with that status.

All the customer has to do is call the car repair shop and ask a simple question by voice. A great customer experience if you ask me 😊

riley_blackwell

That Twilio robotic voice needs an update. Thanks for the content!!!

nlarchive

For the interruption issue:
you need to clear the Twilio buffer and then send response.cancel.
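
A minimal sketch of that suggestion, assuming the relay keeps two open WebSocket-like objects (here called `twilioWs` and `openAiWs`, both hypothetical names) and has stored the `streamSid` from Twilio's `start` message. It reacts to the Realtime API's `input_audio_buffer.speech_started` server event by sending Twilio's Media Streams `clear` message and then a `response.cancel` client event. This is one way to wire it up, not necessarily how the tutorial's repo does it.

```javascript
// Sketch only: `twilioWs`, `openAiWs`, and `streamSid` are assumed to be
// supplied by the surrounding relay code; they are not names from the video.
function handleSpeechStarted(event, { twilioWs, openAiWs, streamSid }) {
  // Only act when the server-side VAD reports that the caller started talking.
  if (event.type !== 'input_audio_buffer.speech_started') return false;

  // 1. Ask Twilio to drop audio it has buffered but not yet played.
  twilioWs.send(JSON.stringify({ event: 'clear', streamSid }));

  // 2. Ask the Realtime API to stop generating the in-progress response.
  openAiWs.send(JSON.stringify({ type: 'response.cancel' }));
  return true;
}
```

In a relay, this would be called from the OpenAI socket's message handler after parsing each incoming event with `JSON.parse`.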

nexgenpcshop

Already built this on my channel. It will be crazy.

SaminYasar_

There's a small bug in the blog post guide. The WebSocket connection URL is mistyped: it should contain a single model= parameter, but it currently has two.
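
One way to avoid that kind of typo is to build the query string programmatically rather than by hand. The sketch below assumes the endpoint is `wss://api.openai.com/v1/realtime` and uses `gpt-4o-realtime-preview-2024-10-01` as an example model name. Both are assumptions to check against the current OpenAI documentation.

```javascript
// Sketch only: the endpoint and the model name passed in are assumptions.
function buildRealtimeUrl(model) {
  const url = new URL('wss://api.openai.com/v1/realtime');
  // searchParams.set() replaces any existing value, so `model` appears once.
  url.searchParams.set('model', model);
  return url.toString();
}

const realtimeUrl = buildRealtimeUrl('gpt-4o-realtime-preview-2024-10-01');
```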

markustrasberg

Would like to see a tutorial about using OpenAI to get on-screen transcriptions of phone calls

PraiseYeezus

This is going to really help you guys. I started working on this as soon as it dropped, but this setup has a weakness: interruptions don't work when the agent is in the middle of a longer audio playback. Ask it to read an example paragraph, then try to interrupt it partway through, and it won't stop. I tried messing with it, but nothing worked.

cyruszad

Very good video! When I was testing with Twilio's dev phone, I found an issue. We are unable to interrupt the conversation directly, like we can when using OpenAI Realtime. How should this problem be resolved?

QianliangHuang

This is great. But I have been struggling with the ability to interrupt the AI when on a call with Twilio.

HarborProjectB

Also, do you guys have any thoughts you would care to share on outbound calling?

cscrowley

Got this working using my Azure endpoint with some help from ChatGPT!

I did notice this example doesn't handle interruptions. Will you be updating the repo with more features in the future?

jothamdudley

Hey! I have used function calling in this Realtime API for calendar booking, but I am struggling with how to send the function's response back to the API for TTS. Can you please help me with that?
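
A hedged sketch of one way to return a function result: create a `function_call_output` conversation item carrying the `call_id` from the model's function call, then send `response.create` so the model speaks about the result. The `openAiWs` socket, the helper name, and the booking payload in the usage note are hypothetical.

```javascript
// Sketch only: `openAiWs` is assumed to be the open Realtime API socket, and
// `callId` is assumed to come from the function-call event the model emitted.
function sendFunctionResult(openAiWs, callId, result) {
  // 1. Attach the function's output to the conversation.
  openAiWs.send(JSON.stringify({
    type: 'conversation.item.create',
    item: {
      type: 'function_call_output',
      call_id: callId,
      output: JSON.stringify(result), // the output field is a string
    },
  }));

  // 2. Ask the model to generate a new (audio) response that uses it.
  openAiWs.send(JSON.stringify({ type: 'response.create' }));
}
```

For example, after a calendar lookup finishes, `sendFunctionResult(openAiWs, event.call_id, { booked: true })` would hand the outcome back to the model.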

fantasticshorts

Can you show how we can integrate Function Calling as well?
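
As a rough starting point, tools can be declared in the `session.update` payload the relay already sends. The sketch below assumes a hypothetical `book_appointment` function and schema. Only the general shape (`tools` entries with `type`, `name`, `description`, `parameters`, plus `tool_choice`) follows the Realtime API. The rest is illustrative.

```javascript
// Sketch only: the tool name, description, and schema are hypothetical.
function buildToolSessionUpdate() {
  return {
    type: 'session.update',
    session: {
      tools: [
        {
          type: 'function',
          name: 'book_appointment',
          description: 'Book a calendar appointment for the caller.',
          parameters: {
            type: 'object',
            properties: {
              date: { type: 'string', description: 'ISO 8601 date' },
            },
            required: ['date'],
          },
        },
      ],
      tool_choice: 'auto', // let the model decide when to call the tool
    },
  };
}
```

The relay would send `JSON.stringify(buildToolSessionUpdate())` over the OpenAI socket once it is open, then watch for the model's function-call events.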

dawid_dahl

Has anyone here figured out how to modify this code for interrupts?

gurumack

One thing I don't understand: how do I make OpenAI speak first when it answers the call?
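
One possible approach, sketched under assumptions: right after the session is configured, add a user-role text item that asks the model to greet the caller, then request a response. The `openAiWs` socket, the helper name, and the prompt wording are hypothetical.

```javascript
// Sketch only: `openAiWs` is assumed to be the open Realtime API socket.
function sendInitialGreeting(openAiWs, prompt) {
  // 1. Seed the conversation with a text instruction as if the user said it.
  openAiWs.send(JSON.stringify({
    type: 'conversation.item.create',
    item: {
      type: 'message',
      role: 'user',
      content: [{ type: 'input_text', text: prompt }],
    },
  }));

  // 2. Trigger a response so the assistant talks before the caller does.
  openAiWs.send(JSON.stringify({ type: 'response.create' }));
}
```

A call such as `sendInitialGreeting(openAiWs, 'Greet the caller and ask how you can help.')` would go right after the session update is sent.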

randotkatsenko

"Thank you, Brent! Do I need a Twilio subscription for communication between two valid numbers? (The trial only provides one valid number.) When I try to make a call using the Twilio dev phone with the same number, I don't receive anything." It seems I need two numbers?

mustaphaboutzoua

Can you make a tutorial for this on Azure as well?

limebulls

Confused. The instructions say "Step 2: Get your Account Sid and Auth Token from the Twilio Console to get started.", but nowhere do they say what to do with them. Also, the call connects, but it can't seem to hear me, and then it disconnects after 5 seconds. Related?
Connected to the OpenAI Realtime API
Sending session update: {"type":"session.update", "session":{"turn_detection":{"type":"server_vad"}, "input_audio_format":"g711_ulaw", "output_audio_format":"g711_ulaw", "voice":"alloy", "instructions":"You are a helpful and bubbly AI assistant who loves to chat about anything the user is interested about and is prepared to offer them facts. You have a penchant for dad jokes, owl jokes, and rickrolling – subtly. Always stay positive, but work in a joke when appropriate.", "modalities":["text", "audio"], "temperature":0.8}}
Disconnected from the OpenAI Realtime API

BrainCandyQuiz

How can we get access to the Realtime API on an OpenAI account? (I already have a paid account.) I integrated the code and added my OpenAI key, but the problem is that during the call it starts talking and doesn't listen to me (no two-way communication). Can someone help me out?

MohsinAli-xrr

Can I use it in Danish, Turkish, or German?

sarzzfish