Gemini 2.0 for developers

Discover Gemini 2.0, the latest of Google’s multimodal AI models. The model can generate native image and audio output and features enhanced spatial understanding and tool use (Google Search, code execution, and function calling). Explore the new Multimodal Live API, which lets developers build real-time multimodal applications with audio and video streaming inputs from cameras or screens. Try Gemini 2.0 Flash (experimental) in the Gemini API, Google AI Studio, and Vertex AI.
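The Multimodal Live API runs over a websocket: the client opens a connection and sends a "setup" frame naming the model and the response modalities it wants before streaming audio or video. The sketch below builds such a frame as JSON; the exact field names and the model identifier follow the publicly documented protocol as I understand it, so treat them as assumptions rather than a definitive client.

```python
import json

def build_setup_message(model: str = "models/gemini-2.0-flash-exp") -> str:
    """Build the first frame a Live API client sends after connecting.

    Field names are assumptions based on the documented bidirectional
    streaming protocol; verify against the current API reference.
    """
    setup = {
        "setup": {
            "model": model,
            "generation_config": {
                # Ask for spoken responses; "TEXT" is the other option.
                "response_modalities": ["AUDIO"],
            },
        }
    }
    return json.dumps(setup)

print(build_setup_message())
```

In a real session this string would be sent as the first websocket message, after which the client streams chunked audio/video input and receives streamed output frames.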

Chapters:
0:00 - Gemini 2.0 Developer Announcement
0:41 - Multimodal Live API
3:58 - Multilingual native audio out
5:25 - Gemini 2.0 features
6:11 - Start building with Gemini 2.0

Resources:
Try Gemini 2.0 Flash (experimental)

Try code examples for Gemini 2.0 Flash

Watch videos on Gemini 2.0:
Building with Gemini 2.0:
Multimodal Live API:

Follow the new “Google AI for Developers” social channels

#GoogleAI #Gemini

Products Mentioned: Gemini 2.0, Google AI Studio, Gemini API, Vertex AI
Comments:

I just used it to analyze gameplay almost in real time. It blew my mind, but it crashed every few minutes. Incredible model.

mrbananapsychooo

Great improvement! Good job to all involved 👏

banzai

These interactions won't last more than 3 minutes; I get "something went wrong" and then have to start all over.

bitcode_

I tried to share screen with Gemini while watching a movie, but it just kept responding to the movie every two or three seconds. It seems it cannot tell the difference between my voice and sound in the video.

kekewan-ll

I feel like I can make an assistant for.... anything... with these APIs. Multi-lingual assistants, even though I am not. Insane.

NewNerdInTown

Wait a second, if this is out.. Astra can't be far off!! 😊

Maybe by the end of this year?!

EchoYoutube

It can listen to you while you interrupt it. So crazy! 😮

aigriffin

I'm so disappointed. There's no file access.

dr.mikeybee

This is extraordinary; it opens up an impressive range of possibilities.

stanleyillidge

Google AI Studio always returns the error "An internal error has occurred".

orikla

What about user voice isolation? Without that, the voice experience is very limited.

tijendersingh

Would be nice… if Studio didn't throw an error every minute or two.

Jason_vinion

That's very impressive, nice work Google! One remark at 3:08: wouldn't it be better if, when a user interrupts the AI while it is speaking, the AI stopped immediately and listened to what the user wants, rather than waiting until the user finishes the new request?

Richard_GIS

When it started to speak English, French, and Korean, it was mind-blowing 😮

Lruiz

Can TTS output the word timings, too? We'd like to have closed captions for accessibility requirements.

DangRenBo

I can't reproduce the car-to-convertible demo; no image is generated. Anyone else?

jackquiver

How do I embed voice-enabled help in my website using Gemini? For example, I own a bank and I want my customers to learn how to use the website, say, connecting another bank to transfer money.

ArunKumar-jkpq

Hi, is it possible to share my screen with Gemini, show my visual trading setup across some days and trades, and explain the setup with audio at the same time, in order to code these human trading decisions in C#, for example for the Quantower API? Thanks for your answer.

evanbassmusic

We could put this Multimodal Live API in a robot and interact with it more naturally.

checkoverstripes

Websocket technology is simply not designed for such a high volume of data transfer and concurrency; that's why we see so many complaints about errors. Before, we were using SSE (Server-Sent Events) and it was much more robust. We need to improve websockets in order to be able to establish long, stable, multimodal conversations.

hoomansedghamiz
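Several commenters above report sessions dropping with "something went wrong" after a few minutes. Until long-lived websocket streams are hardened server-side, a client can mask transient disconnects by reconnecting with capped exponential backoff and resending its setup frame. The sketch below shows that pattern; all names are illustrative, not part of any official SDK.

```python
def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0):
    """Return capped exponential backoff delays in seconds (no jitter)."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]

def run_with_reconnect(connect, max_attempts: int = 5, sleep=lambda s: None):
    """Call `connect()` until it succeeds or attempts are exhausted.

    `connect` would open the websocket and replay the setup message;
    here it is any callable that raises ConnectionError on failure.
    `sleep` is injectable for testing; in real code pass time.sleep.
    """
    for delay in backoff_delays(max_attempts):
        try:
            return connect()
        except ConnectionError:
            sleep(delay)  # wait before the next reconnect attempt
    raise ConnectionError(f"gave up after {max_attempts} attempts")

print(backoff_delays(7))  # delays grow 0.5, 1.0, 2.0, ... capped at 30.0
```

In a real Live API client, jitter would normally be added to the delays so many clients do not reconnect in lockstep after a server-side outage.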