Tutorial and Demo of a ChatGPT Assistant with Python, Gradio and ElevenLabs APIs... code included!
#VTuber video — we get right to the point and share the code for building a voice chatbot using the latest OpenAI models and APIs. We not only use ChatGPT and Whisper but also integrate the ElevenLabs API to get an AI-synthesized voice for our responses, and we use the D-ID service to create an AI avatar to present and explain it. The script for the video was written by ChatGPT after sharing the code with it. #Virtualhumans #digitalhumans #Virtualinfluencers
Here is the code (the listing in the original description was missing the transcription call, the completion call, the text-to-speech step, and the launch line; they are restored below):

import io

import gradio as gr
import openai
import winsound
from pydub import AudioSegment
from elevenlabslib import ElevenLabsUser

import config

openai.api_key = config.OPENAI_API_KEY
user = ElevenLabsUser(config.ELEVENLABS_API_KEY)

messages = ["You are an advisor. Please respond to all input in 50 words or less."]

def transcribe(audio):
    global messages

    # Transcribe the microphone recording with Whisper
    audio_file = open(audio, "rb")
    transcript = openai.Audio.transcribe("whisper-1", audio_file)
    messages.append(transcript["text"])
    print(transcript["text"])

    # Generate a short reply with a completion model
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=messages[-1],
        max_tokens=80,
        n=1,
        stop=None,
        temperature=0.5,
    )
    system_message = response["choices"][0]["text"]
    messages.append(system_message)

    # Speak the reply with an ElevenLabs voice, save it as WAV, and play it
    voice = user.get_voices_by_name("Antoni")[0]
    audio_bytes = voice.generate_audio_bytes(system_message)
    sound = AudioSegment.from_file(io.BytesIO(audio_bytes), format="mp3")
    sound.export("reply.wav", format="wav")
    winsound.PlaySound("reply.wav", winsound.SND_FILENAME)

    chat_transcript = "\n".join(messages)
    return chat_transcript

iface = gr.Interface(
    fn=transcribe,
    inputs=gr.Audio(source="microphone", type="filepath"),
    outputs="text",
    title="🤖 My Desktop ChatGPT Assistant 🤖",
    description="🌟 Please ask me your question and I will respond both verbally and in text to you...",
)

iface.launch()
_____________
#ChatGPT #chatgpttutorial #Whisper #OpenAI #AI #ML #pythonprogramming #pythonchatbot #ElevenLabs #Gradio #naturallanguageprocessing #audioresponse #voicebot #voicetechnology #Python #APIkeys #transcription #texttospeech #machinelearning #learningvideo #chatgptprompts #artificialintelligence #aispeechrecognition #chatgpt4 #chatgpt3 #howtoprogramchatgpt #howtoprogram
_____________
Code walk courtesy of ChatGPT:
The first step is to import the necessary libraries, including Gradio, OpenAI, ElevenLabs, and PyDub. The config file is also imported, which contains API keys necessary for accessing the OpenAI and ElevenLabs services.
Next, we define a list called "messages" that will store the conversation history between the user and the chatbot. Initially, it contains the standing instruction for the model: "You are an advisor. Please respond to all input in 50 words or less."
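The conversation bookkeeping here is plain Python list handling. A minimal sketch of how the history grows and how it is joined into the transcript shown in the Gradio output box (the helper names and sample strings are illustrative, not part of the original code):

```python
# Minimal sketch of the assistant's conversation bookkeeping.
# The first entry is the standing instruction; user transcripts and
# model replies are appended afterwards as plain strings.
messages = ["You are an advisor. Please respond to all input in 50 words or less."]

def add_turn(user_text, reply_text):
    """Append one user/assistant exchange to the history."""
    messages.append(user_text)
    messages.append(reply_text)

def chat_transcript():
    """Join the whole history into the text returned to Gradio."""
    return "\n".join(messages)

add_turn("What is Gradio?", "Gradio builds quick web UIs for Python functions.")
print(chat_transcript())
```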
Then, we define a function called "transcribe" that handles the audio input from the user, transcribes it using OpenAI's audio transcription API, and sends it to OpenAI's natural language processing engine for a response. The function also generates an audio response using an ElevenLabs voice and saves it as a WAV file that is played using the winsound module.
To begin, the function reads in the audio file and transcribes it using the OpenAI Audio API. The resulting transcription is then appended to the "messages" list as a string. We also print the transcript to the console for debugging purposes.
After that, we use OpenAI's completion API to generate a response to the user's query. We pass in the most recent message from the "messages" list as the prompt and ask for a maximum of 80 tokens in the response. We also set the temperature parameter to 0.5, so the model makes somewhat varied choices rather than always selecting the most likely word. The resulting system message is also appended to the "messages" list.
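The shape of the response object matters for this step: with the legacy (v0.x) openai Python library, the generated text sits under choices[0]["text"]. A sketch using a hard-coded sample response so no network call or API key is needed (the sample text is made up; the dict mirrors the documented completion-response shape):

```python
# Hard-coded sample of the structure returned by openai.Completion.create
# in the legacy (v0.x) openai library -- no API call is made here.
sample_response = {
    "model": "text-davinci-003",
    "choices": [
        {
            "text": "\n\nGradio is a Python library for building simple web UIs.",
            "index": 0,
            "finish_reason": "stop",
        }
    ],
}

# The walkthrough's extraction step: take the first choice's text and
# strip the leading newlines that completion models often emit.
system_message = sample_response["choices"][0]["text"].strip()
print(system_message)
```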
Next, we use ElevenLabs' text-to-speech API to generate an audio response to the user's query. We select a voice named "Antoni" and pass in the system message as the text to be spoken. The resulting audio is saved as a WAV file. The file is then played.
Next, we join all of the messages in the "messages" list into a single string and return it to the Gradio interface, where it is displayed to the user as text.
Finally, we create a Gradio interface using the "transcribe" function as the callback. We set the input to be audio from the microphone and the output to be text, then launch the interface so the chatbot can listen and respond.
voicechatbot, OpenAI, ElevenLabs, Gradio, natural language processing, chatbot, audio response, voice technology, Python, API keys, text-to-speech, machine learning, voice assistant, artificial intelligence, speech recognition, and #VTuber