GPT-4o is WAY More Powerful than OpenAI is Telling Us...

OpenAI just unveiled their new GPT-4o model, and it's more powerful than we ever imagined! In this video, we dive deep into what makes GPT-4o truly multimodal, capable of generating text, images, audio, and even video. Discover the groundbreaking features and hidden capabilities that OpenAI didn't fully reveal. From stunning image creation to lifelike audio generation, GPT-4o is set to revolutionize the AI landscape. Watch now to uncover the full potential of this game-changing model!

Thanks for watching Matt Video Productions! I make all sorts of videos here on YouTube: technology, tutorials, and reviews! Enjoy your stay here, and subscribe!

All Suggestions, Thoughts And Comments Are Greatly Appreciated… Because I Actually Read Them.

Timestamps:
00:00 Introduction and Initial Reactions
00:36 Overview of GPT-4o and Multimodal AI
01:42 Comparison with GPT-4 Turbo
03:22 Text Generation Capabilities
07:22 Audio Generation Capabilities
12:22 Image Generation Capabilities
19:04 Advanced Features
23:27 Video Understanding Capabilities
27:34 Conclusion
Comments

I think the image editing is one of THE most mind blowing pieces of this... What do you guys think?

MattVidPro

14:17 Matt, the multiple whiteboards/chalkboards at the top ARE realistic. This is actually how chalkboards in older classrooms used to work. They would have multiple chalkboards on sliders that you could pull up and down.

itsallgoodaversa

I don't know about everyone else, but most of the people I come in contact with have no clue about the rapid developments in AI. Kind of eerie...

chrisbtr

One of the things I think I would try with GPT-4o is to take a photo of a page from a manga, comic book, or even a novel and ask it to read back the text in the voices of the characters as they speak.

reifuTD

Idk if I'm more impressed with the lifelike sound of the voice or how human it feels to interact with (i.e., it understands our emotions)

MikeWoot

Chalkboards often have multiple boards that slide on top of each other

evilknight

Timestamps for y'all:
00:00 - Introduction and Initial Reactions

Introduction to the video.
Reaction to OpenAI's real-time AI companion.
00:36 - Overview of GPT-4o and Multimodal AI

Explanation of GPT-4o.
What does "multimodal" mean?
01:42 - Comparison with GPT-4 Turbo

Differences between GPT-4o and GPT-4 Turbo.
Audio capabilities of GPT-4o.
03:22 - Text Generation Capabilities

Speed and quality of GPT-4o's text generation.
Examples of high-speed text generation.
07:22 - Audio Generation Capabilities

Demonstration of GPT-4o's audio generation.
Examples of emotive and natural voice outputs.
12:22 - Image Generation Capabilities

Explanation of GPT-4o's image generation.
Examples of high-quality image outputs.
19:04 - Advanced Features

Image recognition and video understanding.
Examples of practical applications and scenarios.
23:27 - Video Understanding Capabilities

Discussion on GPT-4o's video capabilities.
Potential future developments and limitations.
27:34 - Conclusion

Final thoughts on GPT-4o's impact and potential.
Invitation to viewers to subscribe and join the community.

MattVidPro

GPT-4o is also A LOT more reliable when it comes to long-form text processing. Not even comparable to either GPT-4 or Gemini. It follows the prompt much better, doesn't get lazy so easily, and doesn't start to hallucinate so quickly. I tried for four hours to get GPT-4 and Gemini to do what I wanted, and they failed miserably. GPT-4o completed the whole damn task in 40 minutes without so much as a hiccup.

helge

Services like Audible should release AI that reads the books but also lets you talk about the topics, take quizzes, and more, making the entire book library an instant interactive homeschooling study resource for anyone wanting to level up in life. That contrasts with just 'consuming' audiobooks, as we do in today's passive, one-way relationship dynamic.

fynnjackson

The most mind-blowing thing is the speed. With that speed and variety of natural voices you can make a real RPG with AI NPCs

kfrfansub

About the chalkboard. I think the dual chalkboards are not unrealistic. We had those a lot when I was studying. You could move them up and down to have more space.

fabiankliebhan

They didn't showcase these features because it's 100x SENSORY OVERLOAD. Their 4o demo was strategic enough to get the world's attention but mysterious enough for inquisitive minds to dig deeper, and you did. That said, I've been a dev for 2 decades and after watching this, I'M TRIPPING BALLS RIGHT NOW

justinoneill

Man, the image understanding of GPT-4o is crazy

SpikyBlade

15:53 Actually no, the image generation didn't screw up. If you look, that's actually EXACTLY what is written, including capitalisation (or lack thereof). What's even more impressive is that it actually split the word "sound's" across multiple lines and it did it completely correctly! Actually mind-blowing! 🤯🤯🤯

starblaiz

Honestly, regarding images: what we really need IS multi-modality. The images produced by common models like SD are good enough. The problem is that the model doesn't really understand what it is doing. If they can keep the quality of current models and just add a deep understanding to it, that multiplies the actual quality of the outcome by orders of magnitude, in the sense that you get what you actually want AND can change specific things, instead of getting images that only loosely follow a prompt and then inpainting and hoping for the best.

johannesdolch

An odd thing about GPT-4o is that it's better at poetry than it used to be. It has a better idea of the meter of a limerick or a sonnet than it did before it had a multimodal understanding of what words sounded like. Words like "love" and "prove" don't rhyme any more. You can see this by asking GPT-4 Turbo and GPT-4o to produce poems using the existing text interface. It's also the first time I found a model that can reliably produce a Petrarchan/Italian sonnet instead of a Shakespearean/Elizabethan sonnet--previous models always used the much-more-common Elizabethan rhyming scheme.

nathanbanks

This is the first AI model that I feel the urge to use. The capabilities are incredible.

wannaBtraceur

14:10
Many university blackboards like this come in sets of three at different depths above the wall. You can slide them up and down to access the other boards. It allows the lecturer to keep writing on a new board while allowing students to still see previous steps in the lesson if they need to look back, and it also means the professor doesn't have to waste time erasing the whole board every 5/10 mins.

alansmithee

Cracked me up at “I wouldn’t even be able to tell you this was a missile in the first place! This thing’s a professional!” 😂

iamjohnbuckley

12:27 Unless it's an app-specific feature, GPT-4o in the ChatGPT interface explicitly states that it generates images using DALL-E 3.

WordsInVain