Can ChatGPT o1-preview Solve PhD-level Physics Textbook Problems? (Part 2)

I test ChatGPT o1 with some more (astro)physics problems I solved in graduate school. This time, I pick a set of problems that were hand-crafted by my professor, meaning that the probability that these problems exist on the internet is slim. Needless to say, the results were surprising.
Comments

You’re saying that it didn’t do it your way, but that’s a good thing. One of the things we should expect is new and novel ways of solving things.

adamsigel

I watched both parts. It is great that this content is created by an actual researcher. Greatly impressed by Kyle’s content and o1’s capabilities.

sujoy

Beginning of 2023 we were flabbergasted that AI can produce a more or less coherent chat. Now we're disappointed if it's not perfect on a PhD-Level math test. This is pretty astounding in my book.

CodepageNet

It is laughable that some are saying this model is no big deal. I have thrown some tough questions at it and it got everything right. In fact, it had unique insights into those problems that I hadn't considered.

jeffwads

super cool seeing your experiments, excited to see what o1 can do with your dissertation problems!

rpraka

By far the most insightful and useful experiment to assess the capabilities of this new model in solving complex non-coding problems. Thank you very much for sharing this!!

seregv

This is absolutely insane. What a time to be alive.

atsoit

I would start a new chat if the question is completely unrelated.

pcdowling

The reason it's possible is that it's trained by rewarding it for learning the next step in reasoning to solve problems. So you take a set of physics and math problems and have it learn how to project one step forward at a time until it gets the answer, and then it's able to learn how to reason generally across new domains. It's trained on perfecting the step-by-step process so that it's able to figure out new problems by assessing what is most likely the next step to get to the solution.

Khari

It got the right answer in 5 seconds. To say it went a little overboard and took unnecessary steps is silly to me. It took only 5 seconds! Do you know of any human on earth, living or dead, who can get the right answer in 5 seconds and have it all typed out in a clear format, with explanations of the thought process?

jdsguam

This is our first glimpse of superhuman general intelligence. It's a problem that humans can solve, but it solves it exponentially faster. Soon, the reasoning for each answer will be far more in-depth, and it will happen at instantaneous speed. If this is 11 steps of reasoning in 15 seconds, imagine 11,000 steps in 0.25 seconds, like a chess engine, but for real-world problems.

EGarrett

Guys who invested billions in AI saw it coming. Now everybody can have their personal team of PhD assistants at hand.

tomaszzielinski

Thank you for such a useful video! Really impressive model. I had to resort to using the standard GPT-4o to come up with tasks in various domains difficult enough to challenge o1-preview. By the way, a possible reason why it went for such a convoluted solution in Problem 1 might be that you put it in the old chat/conversation from your previous video. Because you had much harder Jackson problems prior to this, the model kept all of them (and its reasoning steps) in context while answering a much easier Problem 1 from this video, so it might have assumed that the difficulty level would be comparable. For this reason I try to start a new chat for new topics/problems, and it also saves Microsoft/OpenAI compute resources as the model doesn't have to keep all the previous context in its head :D

AlexisLionel

I can't believe it can do university level physics. Part of me doesn't even want to believe what it just did even though I saw it. Genuinely looking at this in awe.

duduzilezulu

Labs around the world have figured out the shape of 200,000 proteins over all these years, and AlphaFold did 200 million in just months. I think this will be true for mathematics: there will be no unsolved problem anywhere 😅

amirsafari

Really impressed with those tests. I did my PhD (engineering) back in 1998, and I was using the most powerful PCs we had in the department back then, with just 32 MB of RAM, to run my mathematical models and my heuristic and GA approaches. That was just the beginning of graphics acceleration (CUDA) back then, although I had no access to that kind of CUDA equipment, so my models needed about 10 h of computer time to execute.
I can imagine nowadays using this kind of AI in an agent, giving it access to tools to execute and test different model alternatives in order to test and advance the research exponentially faster.
I cannot imagine how much easier and faster research can go today with tools like this.

If we wish to climb a mountain and there are 3 people sharing that idea, there are perhaps going to be as many as 3 paths to that experience of standing on the apex of the mountain. That another doesn't arrive there by the same path isn't a failure; it is the revelation of the validity of a different path. I'm glad that you shared a completely real experience. My life is changed as thoroughly as yours.

dennycote

Firstly, this is only the preview; the actual o1 is even better. And members of OpenAI have said the rate of improvement in these models is significantly faster than in the previous GPT models. Even in a month's time we should see significant improvements. Exciting times ahead.

Junior-zfyy

o1-preview is astounding and I hope it gets smarter!

mrshankj

Btw, this is o1-preview, and OpenAI has confirmed the next model will drop next month, which will be the o1 full release. It's apparently 30% better than the current o1-preview.

llsamtapaill-ocsh