Use Gemini 2.0 to Build a Realtime Chat App with Multimodal Live API

In this video, I introduce how to build a real-time chat app with voice and video interaction using the Gemini 2.0 Multimodal Live API.

TIME STAMPS:
00:00 Overview of Gemini 2.0
02:04 Google AI Studio
04:52 Multimodal Live API
08:18 Code Walkthrough
17:09 Run the App

USEFUL LINKS:

MY CONNECT:
Comments

google-genai has been upgraded, and the session.send() definition was changed. Make sure you run the demo code with google-genai==0.3.0.

yeyulab
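
For reference, a minimal sketch of a text-turn Live session against the pinned version. The model name, the v1alpha api_version flag, and the send()/receive() signatures are taken from the public google-genai 0.3.0 examples of that period, not from the video's exact code:

```python
import asyncio
from google import genai

# Pin the SDK: pip install google-genai==0.3.0
# The Live API was served under the v1alpha API version at the time.
client = genai.Client(api_key="YOUR_API_KEY",
                      http_options={"api_version": "v1alpha"})

CONFIG = {"generation_config": {"response_modalities": ["TEXT"]}}

async def main():
    async with client.aio.live.connect(model="gemini-2.0-flash-exp",
                                       config=CONFIG) as session:
        # In 0.3.0, send() takes the payload plus an end_of_turn flag.
        await session.send("Hello, Gemini!", end_of_turn=True)
        async for response in session.receive():
            if response.text:
                print(response.text, end="")

asyncio.run(main())
```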

"Incredible video! This is super valuable for all developers. Would love to see an implementation of Twilio incoming calls using the Google Gemini 2.0 Flash model!"

bunnynikhil

This video was so helpful, thank you for the good work. Always worth buying you some coffee. I was wondering if it's possible to train this system on my plumbing business data, such that when customers live-stream a video of their plumbing issues, the API can analyse the issue, advise accordingly, and then tell the customer that we have the solution in stock/inventory and what it costs. Maybe the customer could even purchase the item at the same time. (My wild thought.)

thabisonaha

May I know what input the LLM takes from the websocket? I mean, is there any specific format for the audio, and what does it return to the websocket?

learningtech
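
For what it's worth: the Live API docs of that period specify raw 16-bit little-endian PCM at 16 kHz, mono, as audio input, with output returned as 16-bit PCM at 24 kHz; over the raw websocket the chunks travel base64-encoded inside a realtime_input message. A hedged sketch of the wrapper, with field names assumed from the BidiGenerateContent websocket docs:

```python
import base64
import json

def make_audio_message(pcm_chunk: bytes) -> str:
    """Wrap one chunk of raw 16 kHz, 16-bit mono PCM for the websocket."""
    return json.dumps({
        "realtime_input": {
            "media_chunks": [{
                # Some examples use "audio/pcm;rate=16000" for the mime type.
                "mime_type": "audio/pcm",
                "data": base64.b64encode(pcm_chunk).decode("ascii"),
            }]
        }
    })
```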

Great video! Are we able to choose between different voices?

Swollphin
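
The Live API at launch documented a handful of prebuilt voices (Puck, Charon, Kore, Fenrir, and Aoede) selectable through a speech_config block. A sketch of the config, with the field nesting assumed from the v1alpha reference rather than the video's code:

```python
# Assumed config shape: speech_config nests inside generation_config in
# the Live API setup; "Puck" is one of the documented voice names.
CONFIG = {
    "generation_config": {
        "response_modalities": ["AUDIO"],
        "speech_config": {
            "voice_config": {
                "prebuilt_voice_config": {"voice_name": "Puck"}
            }
        },
    }
}
```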

Isn't Gemini 2.0 Flash multimodal-capable from the beginning anyway? Or was your development for local purposes?

RealLexable

The video was very helpful; with its help I was able to build my own application. But I'm having trouble deploying it. Could you please make a video on how to deploy this application? For information, I'm having an issue deploying it on Render: pyaudio cannot be used. How do I solve this? Please help.

mayank
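
On the pyaudio point: pyaudio is only needed for local microphone/speaker access, so a deployed web app can drop it from requirements entirely and let the browser capture and play the audio while the server just relays bytes. A rough relay sketch, assuming the websockets package and google-genai 0.3.0; the names and message shapes are illustrative, not the repo's exact code:

```python
import asyncio
import base64
import json

import websockets
from google import genai

client = genai.Client(http_options={"api_version": "v1alpha"})
CONFIG = {"generation_config": {"response_modalities": ["AUDIO"]}}

async def relay(browser_ws, path=None):  # path kept for older websockets
    """Bridge one browser websocket to one Gemini Live session."""
    async with client.aio.live.connect(model="gemini-2.0-flash-exp",
                                       config=CONFIG) as session:
        async def browser_to_gemini():
            async for message in browser_ws:
                # Browser sends {"mime_type": "audio/pcm", "data": <base64>};
                # the SDK expects raw bytes in the blob.
                chunk = json.loads(message)
                await session.send({"mime_type": chunk["mime_type"],
                                    "data": base64.b64decode(chunk["data"])})

        async def gemini_to_browser():
            async for response in session.receive():
                if response.data:  # PCM bytes from the model
                    await browser_ws.send(response.data)

        await asyncio.gather(browser_to_gemini(), gemini_to_browser())

async def main():
    async with websockets.serve(relay, "0.0.0.0", 8080):
        await asyncio.Future()  # run until killed

asyncio.run(main())
```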

Hey, I have one doubt; I have already tried implementing this and am facing some issues. Here is the approach I followed for the live audio streaming I want: first I take input through the microphone, convert it to base64, and send it to the websocket; there I decode the audio and send it to the LLM, and I want to send the response audio back to the user via websockets. But I'm running into problems: I'm unable to get the response from the LLM. Could you please help?

learningtech
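
For the pipeline described above, a common cause of getting nothing back is sending audio at the wrong rate (input should be 16 kHz, 16-bit mono PCM) or forgetting to base64-decode on the server before handing the bytes to the model. A minimal capture-side sketch, assuming pyaudio and the websocket-client package, with the relay URL hypothetical:

```python
import base64
import json

import pyaudio
import websocket  # pip install websocket-client

RATE, CHUNK = 16000, 1024  # Live API input wants 16 kHz, 16-bit mono PCM
ws = websocket.create_connection("ws://localhost:8080")  # hypothetical relay

audio = pyaudio.PyAudio()
stream = audio.open(format=pyaudio.paInt16, channels=1, rate=RATE,
                    input=True, frames_per_buffer=CHUNK)
try:
    while True:
        pcm = stream.read(CHUNK, exception_on_overflow=False)
        ws.send(json.dumps({"mime_type": "audio/pcm",
                            "data": base64.b64encode(pcm).decode("ascii")}))
finally:
    stream.close()
    audio.terminate()
```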

Can this API be used to deploy and launch a web application to be used by others?

adarsh

Did Google update the API? Mine seems to be failing with a "timed out during handshake" error since yesterday.

aryanchauhan

Hello my friend, I'm a person with low vision/sight. I have some questions:

1- Is this running Gemini locally on your device, depending on my GPU and CPU?

If yes, what is the minimum to get this working locally?

2- Can I use it to share my screen with it?

I'm asking about this because I feel like it would be helpful for my case when using my PC. I know I can do this with Google AI Studio, but I want it local to be as fast as possible.

ahmedal-ani

Could you check the git repo? It's not working properly. The config with text works okay-ish, but the audio config isn't working.

niv

Can I get the response in both audio and text?

I tried:

CONFIG = {"generation_config": {"response_modalities": ["AUDIO", "TEXT"]}} but it gives an error.

AbdulRahman-vjel
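
Beyond the missing quote in the snippet, the Live API at that time accepted only one response modality per session, which is most likely why ["AUDIO", "TEXT"] is rejected. A sketch of the working shape:

```python
# One modality per session (a documented Live API limit at the time):
CONFIG = {"generation_config": {"response_modalities": ["AUDIO"]}}
# or, for text only:
# CONFIG = {"generation_config": {"response_modalities": ["TEXT"]}}
```

A common workaround for showing text alongside speech is to run the session with AUDIO and transcribe the returned audio locally, or to open a second TEXT session.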

This tutorial is misleading. Ran `pip install google-genai` in the console successfully, but the project wasn't populated with all the files (`1.txt`, `index.html`, etc.), and `Demo Source Code` is just a list of YouTube video links.

chrisBBGun

It is only showing the response in text. How do I make it talk back?

simphyy
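
Most likely the config requests TEXT; switching the response modality to AUDIO and playing the returned 24 kHz PCM is the usual route. A playback sketch assuming google-genai 0.3.0 and pyaudio, with response.data carrying the PCM bytes assumed from the SDK examples of that period:

```python
import asyncio

import pyaudio
from google import genai

client = genai.Client(api_key="YOUR_API_KEY",
                      http_options={"api_version": "v1alpha"})
CONFIG = {"generation_config": {"response_modalities": ["AUDIO"]}}

async def main():
    pa = pyaudio.PyAudio()
    # The Live API returns 16-bit PCM at 24 kHz.
    speaker = pa.open(format=pyaudio.paInt16, channels=1, rate=24000,
                      output=True)
    async with client.aio.live.connect(model="gemini-2.0-flash-exp",
                                       config=CONFIG) as session:
        await session.send("Say hello out loud.", end_of_turn=True)
        async for response in session.receive():
            if response.data:
                speaker.write(response.data)
    speaker.close()
    pa.terminate()

asyncio.run(main())
```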

Getting this error:
Error in Gemini session: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)
Gemini session closed.

lohithnh
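
That error usually means Python cannot find a usable CA bundle (common on macOS installs). Two frequently suggested fixes: run the "Install Certificates.command" script that ships in the Python application folder, or point the websocket client at certifi's bundle. A sketch of the latter, assuming the connection is opened with the websockets package:

```python
import ssl

import certifi  # pip install certifi

# Build an SSL context backed by certifi's CA bundle and pass it to the
# websocket client, e.g. websockets.connect(uri, ssl=ssl_context).
ssl_context = ssl.create_default_context(cafile=certifi.where())
```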