GPT-4: MIT Exams w/ 100% score for Mathematics? No MIT!

MIT published a pre-print claiming that GPT-4 scores a perfect 100% on final exams of the MIT Mathematics major, and that a plain vanilla GPT-4, without any prompt engineering at all, scores exactly 90%. We prove this statement wrong.

Is an autoregressive transformer architecture the perfect mathematical reasoning machine, as this pre-print by MIT, Harvard, Stanford and Boston University suggests? A pure vanilla GPT-4 without (!) any prompt engineering receives a 90% MIT exam score? No way, MIT!

All rights remain with the authors of this arXiv pre-print (not a peer-reviewed publication):
Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models
Sarah J. Zhang, Samuel Florin, Ariel N. Lee, Eamon Niknafs, Andrei Marginean, Annie Wang, Keith Tyser, Zad Chin, Yann Hicke, Nikhil Singh, Madeleine Udell, Yoon Kim, Tonio Buonassisi, Armando Solar-Lezama, Iddo Drori

#gpt4
#massachusettsinstituteoftechnology
#harvarduniversity
#stanforduniversity
Comments

I don't understand the statement that "they didn't show the data". Didn't they use GPT-4? It has whatever data was used in its training, right? ANYONE using GPT-4 has the same limitation of not being able to show the data that was used in their study.

kevinboles

Awe-inspiring! I was hooked and drawn in to the end, and I matched your excitement… thank you!

Pure_Science_and_Technology

On point 1: with plugins (Wolfram), GPT can perform calculations, can't it? I assumed the GPT-4 entry was allowed plugins.

OM-ynpt

Hello!
Great video, and thank you for covering topics like this! I would like to ask you for some advice. Where can I contact you?

jalalelzein

Was this published as a warning about preprints? Some papers that have passed peer review seem to me to be using questionable performance metrics.

densonsmith

I have a feeling a major backtrack on this is coming

fkxfkx

When I viewed your first video on this subject (when the word "Hoax" still appeared in the title), it was obvious, given GPT-4's inability to deliver anything but trivial mathematical calculations, that the MIT math tests were purely symbolic manipulations.
Long ago, when I encountered professional mathematicians at university (a limited set, to be sure), they were contemptuous of numerical calculations, which destroy the precision of a pure mathematical vision. If I wanted to talk about numbers, they would direct me to the engineering side of the campus, in much the same way that a fancy hotel desk would direct the Roto-Rooter (sewer cleaner) man to the tradesman's entrance at the back of the building.
It is with that block of salt that I understood the claims of GPT-4 mathematical perfection.

johnbrisbin

I genuinely hope that this is just some sort of misunderstanding.

NeuroScientician