Gemini 1.5 Pro for Video Analysis

Next Gemini video will look at Code with Gemini 1.5

For more tutorials on using LLMs and building Agents, check out my Patreon:

🕵️ Interested in building LLM Agents? Fill out the form below

👨‍💻Github:

⏱️Time Stamps:
00:00 Intro
00:14 Let's examine a talk by Jeff Dean
01:03 Uploading a 50 minute video to Google AI Studio
02:58 Prompt: General Short Summary
05:08 Where is the talk being held?
05:54 When does he start talking about Gemini?
08:16 Citing Noam Shazeer
10:25 Gemini Training Data
12:25 Adding the video transcript
14:27 Slides breakdown
16:31 Bard Name Change
17:51 Write a blog post
Comments

Gemini 1.5 seems like a truly gigantic leap in LLMs. Probably the first time I've been wowed since the release of GPT-4.

pratikindap

Sam - Great video! More Google content, please. New features made Gemini useful in my workflows.

micbab-vgmu

I hope they give access to this model soon :/

hiramcoriarodriguez

Awesome video! Thanks for being the hero we needed! Keep going forward and enjoy Singapore!
I wonder if it could be good at coding / making coding agents for people who don't know code at all.

rezakn

Could you try a narrative video? This would be really useful to understand the model's capacity to understand semantics of juxtaposed images.

dus

Very interesting and informative! I am wondering how this would work for literature-review-type workflows. Say you choose a technical topic like text-to-video and upload 5-10 key relevant papers (like the ones hf summarized after the SORA release): how well would the model perform in synthesizing the papers? An even crazier task is to add another literature review paper as an example, so it would be 5-10 papers with a 1-shot prompt. If the model can kind of reason through this, the implications would be huge.

randomgc

Can you please do a video on Gemini 1.5 Pro reading an entire college-level science textbook? That would be so awesome!

JJBoi

Can you share your Gemini Chat like ChatGPT allows you to ❓
What were the costs to process all that video multiple times ❓
Keep up the good work 👍

ScottzPlaylists

Can you upload a storybook or novel and ask about the characterization in some new book that may have just been released? So excited about this. Can't wait to try it out.

jenishpatelbmc

Legal doc review basically fully automated at this point.

delxinogaming

I am making an automated video editor using GPT vision and a separate speech-to-text API. It does work this way, but I would like to see what Gemini can do!

Can you please test if Gemini 1.5 can act as a Professional Video Editor and output timestamps where to place zoom in/out effects, emojis or sound effects?

randotkatsenko

Nice! Could you try something harder? Say: show a security camera video of a break-in and ask it to describe what happens in the video (don't mention the break-in).

Emerson

Anything on data analysis tasks, please? (CSV, XLS, ...)

mandlasibanda

You could've asked Gemini to return the timestamp for each of its responses, so that you could then verify whether it was said around the timestamp it returned. That way you'd actually have a higher likelihood of seeing if it was really said in the video.

jmg

Hi! How are you? I'm trying to "develop" a tool that could analyze advertising videos automatically.
I'm trying to put it into Google Sheets with the Gemini AI assistant, but it can't read or analyze videos. Do you think there is a way to create an automatic tool where I put in all the videos (YouTube links, for example) and it creates an analysis? For example: summary, main objective, advertising tone, something like that.

What do you think?
Thanks in advance, great content!!!!

Niikolses

Can you do a video with audio summarization? Feed it a large audio file and ask for a per-timestamp summary?

JacobAsmuth-jwuc

Why blur the release date of the video? It was the 16th of Feb, if you're wondering.

parthcosic

Any sense of whether it could understand a video with no captions or spoken words? Like maybe 30 seconds of a stream in a snowstorm?

larryvelezbx

Great video. However, the video that was uploaded was probably not the one that would best demonstrate its potential. Frame-by-frame analysis using conventional entity extraction could have yielded similar results, since the context is written in text on the slides. Something like sports analytics, where motion is tested, might have been a bigger stretch.

dusanbosnjakovic

How long a response can you get out of it? Could it describe a full video like a normal human does, and if so, how long a video? Will it ever be able to work with audio and video at once?

abhishekak