OpenAI's New Q* (Q-star) Breakthrough Explained For Beginners (GPT-5)


Welcome to our channel where we bring you the latest breakthroughs in AI. From deep learning to robotics, we cover it all. Our videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on our latest videos.

Was there anything we missed?

#LLM #LargeLanguageModel #ChatGPT
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience
#IntelligentSystems
#Automation
#TechInnovation
Comments

With Q*, it's not just about finding answers; it's about discovering possibilities we didn't even know existed! Creativity is coming to AI.

noorm

Funny thing: if Q* is *too* creative, we might not reward it for taking us in the right direction... because we can't comprehend how it is moving us in the right direction. (Just as many people would have tried to "fix" AlphaGo's move 37.)

meterfeeder

I love how AI is advancing more in a day than it used to in several years.

NickDrinksWater

Timestamps:
00:02 Q learning is a type of machine learning used in reinforcement learning.
01:54 Q learning helps computers learn and improve by finding optimal solutions.
03:39 Q-learning is a learning process that helps an agent make optimal decisions in an environment.
05:29 Q* (Qstar) is being explored as a viable option for the future of large language models.
07:24 Limitations of large language models
09:14 Traditional LLMs have limitations like static knowledge, lack of context understanding, and biases.
11:01 Q learning (Q*) has dynamic learning and optimization of decisions, making it suitable for goal-oriented tasks.
12:53 Researchers are exploring advanced techniques to overcome the limitations of standard AI methods.
14:48 Gemini is currently delayed and it will be interesting to see how GPT-5 compares to GPT-4 and whether it will contain Q*.

Nick-Quick

What will be crazy is when AI becomes able to think of things humans haven't before. When it starts getting creative and inventing things that have never existed and that we cannot comprehend, things will truly get interesting… and scary.

SHEQUAN

What many people don't get is that GPT-4 was built before the release of GPT-3.5 and all the hype. At that time, not many people or companies worked on AI. Now every major company and country is working on AI, with billions of dollars of investment from all sides.
Still, people are saying there won't be exponential growth or an explosion, and that AGI is 5-10 years away from us. That's delusional at best, really.

David.Alberg

The optimal Q-value (for a given state-action pair) is called Q* in machine learning. I have no clue who came up with this A* crap and why; probably someone who has never heard of Q-learning, or even of reinforcement learning (Q-learning is the most common form of reinforcement learning). So instead of reading an introductory book on reinforcement learning, they just googled around and accidentally bumped into this A* stuff.
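For context, the optimal Q-value this comment refers to is conventionally defined by the Bellman optimality equation (standard reinforcement-learning notation, not anything specific to OpenAI's system):

```latex
Q^*(s, a) = \mathbb{E}\left[\, r_{t+1} + \gamma \max_{a'} Q^*(s_{t+1}, a') \;\middle|\; s_t = s,\ a_t = a \,\right]
```

Here $r_{t+1}$ is the immediate reward, $\gamma \in [0, 1)$ is the discount factor, and the expectation is taken over the environment's transition dynamics.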

jan

Great video, I keep hearing Q* everywhere lately...thanks for the explanations🙏

andycampano

0:50 What do you mean the A* paper was written in 2019? It was published in 1968.

Bokbind

Reinforcement learning in a maze-like environment isn't really learning, because there is nothing to learn in a single maze: the best the agent can do is memorize the way out. The exceptions are if there is something special about this set of mazes that needs to be learned, or if the goal is the quickest time, in which case the algorithm will learn to drive the time it spends going in circles down to zero.

sharon

5:20 This is kind of like how the feedback system in a school is the teachers answering the kids' questions or correcting their behavior. Human feedback systems are variable and intertwined, though, so I'm guessing that's what the AI has achieved: the ability to intertwine understanding and feedback.

Jacobk-gr

The title of your last video was "Official: AGI achieved". I'd love to know why we need a breakthrough in future LLMs in that case 😂

luizbattistel

7:20 On creativity: possibilities are infinite, and it is beyond simple to generate a random thing... but that thing is likely to be garbage. So it isn't about "newness"; it is about a higher level of selecting just what we would evaluate as "appropriate" or "good".
If we already had that space of what is "good", we could randomly pick from it and we wouldn't need search.

judgeomega

This could lead to AGI. However, applying it to one game is one thing; making it general across multiple "games" at once will be a nightmare, and extrapolating from one game to another should be insanely hard, maybe even worse.

l.lawliet

Move 37 actually came from trillions of trial-and-error algorithmic experiences; its vast, quick learning is just the method it follows. I think we've hit the AGI benchmark.

maxiimillion

The name "Q*" likely comes from the Q-learning and A* algorithms, used respectively in reinforcement learning and in pathfinding/graph traversal. The six key steps in understanding Q-learning are: an environment and an agent; states and actions; a Q table; learning by doing; updating the Q table; and reinforcement. The Q table contains the best action to take in each state, and is updated using a formula that considers both current and potential future rewards. OpenAI's potential breakthrough involves training large language models with a Q-learning-style reinforcement learning approach, allowing them to learn and improve from experience. This can solve complex problems and find the best solutions, similar to how one might figure out the best way to beat a video game. Q-learning is like training a pet: positive actions are rewarded and negative actions are penalized.
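The six steps listed in the comment above can be sketched in a few lines of Python. This is an illustrative toy (a five-state corridor), not OpenAI's actual method; all names and numbers here are made up for the example:

```python
import random

# A toy illustration of the six Q-learning steps: states 0..4 in a corridor,
# actions 0 (left) / 1 (right), and a reward of +1 for reaching state 4.

N_STATES, GOAL = 5, 4
ACTIONS = [0, 1]                       # 0 = left, 1 = right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration

# Step 3: the Q table -- one row per state, one column per action.
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(state, action):
    """Steps 1-2: the environment maps (state, action) to (next state, reward)."""
    nxt = max(0, min(GOAL, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == GOAL else 0.0)

random.seed(0)
for _ in range(200):                   # Step 4: learning by doing
    s = random.randrange(GOAL)         # start each episode in a random state
    while s != GOAL:
        # Step 6 (reinforcement): epsilon-greedy -- mostly exploit the table,
        # sometimes explore a random action
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[s][x])
        s2, r = step(s, a)
        # Step 5: update the table using the current reward plus the
        # discounted best future reward
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2

# The greedy policy after training: "go right" in every non-goal state.
policy = [max(ACTIONS, key=lambda x: Q[s][x]) for s in range(GOAL)]
print(policy)  # -> [1, 1, 1, 1]
```

With more states or stochastic rewards the same loop applies unchanged; only the `step` function (the environment) would differ.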

marchlopez

Great video. I feel they should add a step 7, "Values", as well, to ensure the whole framework (agents, actions, goals, everything) is aligned with responsible AI and the betterment of humanity, with those values built into the model so that there is less chance of misuse by design.

bhishammalani

Best explanation so far, and I have listened to many. Well done.

attainconsult
