Q* - Clues to the Puzzle?

Are these some clues to the Q* (Q star) mystery? Featuring barely noticed references, YouTube videos, article exclusives and more, I put together a theory about OpenAI’s apparent breakthrough. Join me for the journey and let me know what you think at the end.

Comments

My computer crashed 7 times while making this video and I had a hard deadline to get a flight. There is little of my normal editing in here, or captions, just my raw investigation! Do follow the links for more details.

aiexplained-official

Dude woke up and thought to himself, "How thorough will I be today?" and said: "Yes!" You definitely should get some interviews with those top researchers.

SaInTDomagos

I'm truly grateful for this channel. Finding accurate news about almost anything is hard as heck, and having accurate AI news is especially important. We can't afford to be misled.

nathanfielding

Ah, the Q* video I have been waiting for, from the only YouTuber I really trust on the subject. Thanks!

DevinSloan

THIS was the technical dive I've wanted to find for the last few days. Thank you so much for taking the time to dig into the development of these papers and the technologies they represent.

pedxing

I'd expect the Q to refer to Q-learning. Human beings think/function by predicting the future and acting upon those predictions, at least at a subconscious level. The way we make these predictions is by simulating our environment and observing what would happen in different variations of that simulation given the different choices we make. We then pick the future we feel is best and take the actions to manifest that future.
I think a good example might be walking through a messy room with Legos everywhere. You observe the environment (the room), identify the hazards (Legos), then plan a course through the room of where you can step safely (not on a Lego). You might imagine that stepping in one spot would leave you stuck or land you on a Lego, so that whole route is bad and you try another. Repeat until you find a solution, or decide there isn't one and just pick some Legos up, or give up, or whatever. Of course not everyone does this; some people just walk on through without thought and either accept stepping on Legos or regret that they did not stop to think. These emotional responses, accepting the consequences or regretting them, are more akin to reinforcement learning imo. There are times when you need to act without thought: for example, if the room was on fire you might not have the time (or compute) to plan it all out.

The Q-learning stuff, in the context of these LLMs, seems like it would be their version of simulating the future/environment. It would generate a whole bunch of potential options (futures), then pick the best one. The difficult task there is creating a program that knows what the best option actually is, but they apparently already have that figured out.

My bet is we will need to add in a few different systems of ‘thought’ that the AI can choose from given different contexts and circumstances, these different methods of decision-making will become tools for the AI to use and deploy and at that point it will really look like AGI. That’s just my guess and who knows how many tools it will even need.
Either way it's cool to see progress and all this stuff is so cool and exciting.
Now to go look for some mundane job so I can eat and pay off student loans lmao, post-money world come quickly plz XD.

dcgamer
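
The "generate a whole bunch of options, then pick the best one" idea in the comment above can be sketched as best-of-N selection. This is only an illustration under loud assumptions: `propose_candidates` and `verifier_score` are hypothetical stand-ins (a noisy adder and a scorer that simply checks the arithmetic), not anything known about Q* or any real model. As the comment notes, building a scorer that actually knows which option is best is the hard part, and the toy below sidesteps it.

```python
import random
from typing import Callable, List, Tuple

Question = Tuple[int, int]


def propose_candidates(question: Question, n: int, rng: random.Random) -> List[int]:
    """Hypothetical generator: noisy guesses at a + b, standing in for sampled answers."""
    a, b = question
    return [a + b + rng.choice([-2, -1, 0, 0, 1, 2]) for _ in range(n)]


def verifier_score(question: Question, answer: int) -> float:
    """Hypothetical scorer: higher is better. Here it just checks the arithmetic."""
    a, b = question
    return -abs((a + b) - answer)


def best_of_n(
    question: Question,
    n: int,
    generate: Callable[[Question, int, random.Random], List[int]],
    score: Callable[[Question, int], float],
    seed: int = 0,
) -> int:
    """Sample n candidate answers and return the one the scorer likes most."""
    rng = random.Random(seed)
    candidates = generate(question, n, rng)
    return max(candidates, key=lambda c: score(question, c))


if __name__ == "__main__":
    print(best_of_n((17, 25), 16, propose_candidates, verifier_score))
```

The selection step is a single `max` over candidates; everything interesting lives in how candidates are proposed and how they are scored.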

We all spent the last week watching the soap opera drama and listening to wild ideas, and nobody put it all together in a nice package with a bow on it until you posted this video. It is a theory, but one that is well thought out, has references, and seems extremely logical. Thanks for putting so much work into this; it's not falling on deaf ears, we truly appreciate you. Thanks, Bill Borgeson

Madlintelf

This channel outpaces in quality ANY other channel on AI news on YouTube. The way you try your best to keep the hype out and reduce the amount of speculation is really something to be proud of, and really what makes your content so different from other creators'.

Yours, sir, is the only channel on the topic where I am happy to watch (and like) every video. ❤

Cheers from Brazil!

caiorondon

Wow. Very impressive investigative journalism. No other AI channel does its homework better than you. Well done, sir.

bobtivnan

Q* as an optimizing search through the action space sounds quite plausible, much like the A* algorithm, which is a generic optimal pathfinding algorithm.

Peteismi

You are my first source for AI news. You go deep into the details and do not cut corners, like a true teacher.

a.s

The New York Times or another major newspaper should hire you, seriously. The amount and quality of research and the way you explain and convey AI news and information is truly remarkable. You are currently my favourite yt channel.

gmmgmmg

At 17:20 Łukasz Kaiser says multi-modal chain of thought would be basically a simulation of the world. Unpacking this, you can think of our own imaginations as essentially a multi-modal "next experience predictor", which we run forwards as part of planning future actions. We imagine a series of experiences, evaluate the desirability of those experiences, and then make choices to select the path to the desired outcome. This description of human planning sounds a lot like Q-learning: modeling the future experience space as a graph where the nodes are experiences and the edges are choices, then evaluating paths through that space based on expected reward. An A* algorithm could also be used to navigate the space of experiences and choices, possibly giving rise to the name Q*. But it's been many years since I formally studied abstract pathfinding as a planning method for AI, and as far as I can tell from googling just now over my morning coffee, the A* algorithm would not be an improvement over the Markov decision process traditionally used to map the state space underlying Q-learning.

My extrapolation gets a bit muddy at that point, but maybe there's something there. To me, a method that allows AI to choose a path to a preferred future experience would seem a valuable next step in AI development, and a possible match for both the name Q* and the thoughts of a researcher involved with it.

nescirian
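
The "nodes are experiences, edges are choices" picture in the comment above maps directly onto tabular Q-learning, so a minimal sketch may help. In reinforcement learning, Q* is the standard notation for the optimal action-value function, and the update below is the textbook Q-learning rule that approaches it. The graph, state names, rewards, and hyperparameters are all made up for illustration; whether OpenAI's Q* relates to any of this is speculation, as it is in the thread.

```python
import random

# Hypothetical toy "experience graph": state -> {choice: (next_state, reward)}.
GRAPH = {
    "start": {"left": ("mess", -1.0), "right": ("hall", 0.0)},
    "mess": {"push_on": ("goal", 1.0)},
    "hall": {"push_on": ("goal", 5.0)},
    "goal": {},  # terminal state: no choices left
}


def q_learning(graph, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with purely random exploration."""
    rng = random.Random(seed)
    q = {state: {action: 0.0 for action in actions} for state, actions in graph.items()}
    for _ in range(episodes):
        state = "start"
        while graph[state]:  # run until a terminal state is reached
            action = rng.choice(list(graph[state]))
            next_state, reward = graph[state][action]
            # Value of the best choice available from the next state (0 if terminal).
            best_next = max(q[next_state].values(), default=0.0)
            # Move the estimate toward reward + discounted best future value.
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
    return q


def greedy_path(graph, q, state="start"):
    """Follow the highest-valued choice from each state until terminal."""
    path = [state]
    while graph[state]:
        action = max(q[state], key=q[state].get)
        state = graph[state][action][0]
        path.append(state)
    return path


if __name__ == "__main__":
    table = q_learning(GRAPH)
    print(greedy_path(GRAPH, table))
```

In this toy graph the learned values favor going "right", because the discounted reward waiting behind "hall" outweighs anything reachable through "mess".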

This really is the best AI channel around, we're lucky to have you

grimaffiliations

18:49 I believe Q* is a reference to the "A* search algorithm" in graph theory. Much of machine learning can be described in terms of graphs, and an algorithm like A* (which uses a heuristic to find a lowest-cost path through a graph while exploring as few nodes as it can) would make total sense.

rcnhsuailsnyfiue

As someone outside the industry, this is such a great resource. Thank you very much for the hard work and keeping us in the loop! I've been waiting for this video since the Reuters article.

garrettmyles

I was in two minds about whether to take the Q* thing seriously until you posted about it. Now I accept that it is at least not just sensational hype. Thanks for keeping us up to date!

apester

Amazing work. Thanks for, ahem, pushing back the veil of ignorance 😁

So refreshing to get an informed and non-sensational take on this latest OpenAI X-Files case. It doesn't even matter if your educated guess ends up missing the mark. It's this kind of detective work that is sorely needed in any case, at least before we get some official and/or trustworthy info on this James Bond-style "great achievement" called Q*.

etunimenisukunimeni

This channel and Dave Shapiro are my go-to for AI news!

DomainAspect

Game developers will be familiar with the "A*" algorithm, used to find optimal shortest paths between two points on a grid containing obstacles (e.g. a path between the player's location and some target, or between an AI opponent's position and the player's position). I wonder if Q* is some similar shortest-path-finding algorithm between two more abstract nodes in an AI network problem containing some kind of obstruction that has to be navigated around?

colinutube
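
For readers who have not met it, the grid version of A* described in the comment above can be sketched in a few lines. The grid layout, the `#` obstacle marker, and the function name `astar` are choices made for this illustration; the heuristic is Manhattan distance, which never overestimates on a 4-connected grid with unit step cost, so the returned path is a shortest one. Nothing here is a claim about what Q* actually is.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (row, column)


def astar(grid: List[str], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Shortest 4-connected path on a grid; '#' cells are obstacles."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell: Cell) -> int:
        # Manhattan distance to the goal.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    # Heap entries are (estimated total cost, cost so far, cell).
    open_heap = [(heuristic(start), 0, start)]
    came_from: Dict[Cell, Cell] = {}
    best_cost: Dict[Cell, int] = {start: 0}

    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start.
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry; a cheaper route was already found
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None  # goal unreachable


if __name__ == "__main__":
    room = [
        "..#.",
        ".##.",
        "....",
    ]
    print(astar(room, (0, 0), (2, 3)))
```

The heuristic is what separates A* from plain breadth-first search: cells that look closer to the goal are expanded first, so fewer cells are visited on open maps.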