Why Is ChatGPT Bad At Math?


Sometimes, you ask ChatGPT to do a math problem that an arithmetically-inclined grade schooler can do with ease. And sometimes, ChatGPT can confidently state the wrong answer. It's all due to its nature as a large language model, and the neural networks it uses to interact with us.

Hosted by: Stefan Chin
----------
----------
Huge thanks go to the following Patreon supporters for helping us keep SciShow free for everyone forever: Matt Curls, Alisa Sherbow, Dr. Melvin Sanicas, Harrison Mills, Adam Brainard, Chris Peters, charles george, Piya Shedden, Alex Hackman, Christopher R, Boucher, Jeffrey Mckishen, Ash, Silas Emrys, Eric Jensen, Kevin Bealer, Jason A Saslow, Tom Mosner, Tomás Lagos González, Jacob, Christoph Schwanke, Sam Lutfi, Bryan Cloer
----------
Looking for SciShow elsewhere on the internet?

#SciShow #science #education #learning #complexly
----------

Sources:

Images

Comments

Speaking of ChatGPT being bad at math, ChatGPT also makes up sources and quotes.

ironiccookies

Numberphile did a video about this a few months ago. They made the point that, when you really think about it, it's surprising that chatGPT is as good at arithmetic as it is, simply because it wasn't _designed_ to do arithmetic. Sure, calculators and calculator apps can do arithmetic problems nearly instantly and with nearly 100% accuracy ("nearly" because there's always some small probability that _something_ goes wrong, such as the famous video where a particular TI calculator erroneously gives an answer to a random arithmetic problem in terms of π), but they were specifically _designed_ to do that. ChatGPT and similar models, on the other hand, were designed to model _natural languages._ All they're _really_ doing is picking up on patterns in their training data and using those patterns to predict what the most likely output to a given input is. So, that it can usually at least get _close_ to the right answers to arithmetic problems on large numbers, even though it's unlikely its training data contained a large amount of such problems, is surprising. It says something about the similarities between the grammatical and syntactic rules of a natural language and the algorithmic rules of arithmetic that a tool designed to model the former can do reasonably well at the latter.
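The "picking up on patterns to predict the most likely output" idea in the comment above can be sketched in miniature. This is a toy character-level frequency model, not anything like GPT's actual architecture: it simply emits whichever character most often followed the current one in a tiny training corpus, with no notion of arithmetic at all.

```python
from collections import Counter, defaultdict

# Toy next-character predictor: count which character follows which
# in the training text, then always predict the most frequent follower.
corpus = "2+2=4 3+3=6 2+3=5 4+4=8 "
counts = defaultdict(Counter)
for ctx, nxt in zip(corpus, corpus[1:]):
    counts[ctx][nxt] += 1

def predict(ctx: str) -> str:
    # Greedy prediction: the single most common next character.
    return counts[ctx].most_common(1)[0][0]

print(predict("+"))  # '3': the digit that most often followed '+' in the corpus
```

The "model" never adds anything; it only reproduces statistical patterns from its training text, which is why such a system can look numerate without being so.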

Lucky

I tried multiple math problems and found it failed them quite often, and in the ways humans do. A more interesting one: I asked how long something took in minutes and seconds. It gave something like 7 min and 80 seconds. It was right, but weird. Asking for it in mins and seconds again gave the same answer, so I told it to limit the seconds to a range of 0-59.9. Result was "7 min and 20 seconds". Wrong. I spent several minutes telling it how to do this correctly and it eventually did, apologizing each time it was wrong. Now here's the twist: you can then tell it the answer is something entirely wrong, like 9 min and 20 sec, and it will again apologize, and then make up bogus math to show why your new opinion is correct. It will ALWAYS pander, and never correct you or raise a concern regarding accuracy. Be warned.
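The normalization the commenter was asking for is deterministic in ordinary code: fold the overflow seconds into minutes with `divmod`, so 7 min 80 s becomes 8 min 20 s (not the 7 min 20 s ChatGPT produced).

```python
# Normalize a (minutes, seconds) pair so seconds land in 0-59.
def normalize(minutes: int, seconds: int) -> tuple[int, int]:
    extra, seconds = divmod(seconds, 60)  # carry whole minutes out of seconds
    return minutes + extra, seconds

print(normalize(7, 80))  # (8, 20)
```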

LFTRnow

I found that it is more helpful explaining math concepts than it is at actual math. I used it for help with linear algebra and it was honestly better than my professor’s office hours.

edwardduda

Another theory is that training data contained enough calculations for the model to memorize. For short numbers it almost certainly saw every possible combination. For long numbers that it hasn't seen before it may smoosh together number prefix and suffix combinations that it has seen during training, to get something that resembles a correct result. With addition the output is very predictable, so this gets quite close to the real solution. Typically just a few numbers in the middle that are wrong, as per the example. Multiplication is much less predictable, hence results are much worse.
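The commenter's prefix/suffix theory can be illustrated with a deliberately broken adder. This is purely a sketch of the *hypothesis*, not of how GPT actually computes: short sums are "memorized" (returned exactly), while long sums keep plausible leading and trailing digits but garble the middle.

```python
import random

# Toy model of the "smooshed prefix/suffix" hypothesis: right magnitude,
# right first and last digits, noise in the middle.
def smooshed_add(a: int, b: int, seed: int = 0) -> int:
    true = str(a + b)
    if len(true) <= 4:
        return a + b  # short sums: every combination plausibly seen in training
    rng = random.Random(seed)
    middle = "".join(rng.choice("0123456789") for _ in true[2:-2])
    return int(true[:2] + middle + true[-2:])

print(smooshed_add(123456789, 987654321))  # close to 1111111110, wrong in the middle
```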

jfolz

It’s a far better idea to ask ChatGPT _how_ to solve a math problem, step by step, and then do the math yourself. It’ll be much more reliable since asking for guidance in language terms is its specialty, not necessarily acting as a calculator

joshp

3:22 Programmers have trouble with this, too: getting the customer to explain what they're looking for so we can translate it for the computer.

nebulan

TL/DR: Don't let literature majors teach math.

wterlep

Wolfram Alpha is impressive in its own right. A bit too impressive, since some have started to rely on it too much rather than trying to understand the logic behind the answers, even though it presents that logic precisely so you can understand them. But I recommend people check it out if they haven't already.

Cythil

I asked ChatGPT how many days are between 19 April and 8 May this year, and it said "19 + 12 + 8 = 39 days" 😅 I was just trying to calculate how many days old my pigeon hatchling is, and thankfully this error was obvious (no way it was over a month old), but I can imagine it causing major problems in less obvious cases. In another request it casually mentioned that the planet Jupiter is located in the Cancer constellation. 🤔 Makes me think it's unwise to ask it things you don't already know, because it would be extremely hard to tell what is true and what was made up and spilled out with absolute confidence.
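The commenter's question is a one-liner with real date arithmetic (the year 2023 here is an assumption, since the comment says "this year" without stating it):

```python
from datetime import date

# Days between 19 April and 8 May of the same year.
hatched = date(2023, 4, 19)
asked = date(2023, 5, 8)
print((asked - hatched).days)  # 19, not the 39 ChatGPT reported
```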

LoneIrbis

I found that ChatGPT was very useful for AP physics because, weirdly, it almost always gives the correct way of solving the problem and explains it so I can learn how to do it myself, far better than anything I can accomplish looking through notes or googling things. However, in answering these questions, where the math was wrapped up in all sorts of physics stuff, any problem with multiple steps just about always had at least one weird error in the math. It also sometimes would explain what math I had to do correctly, then proceed to do entirely different math. I eventually had to just use it to see what to do, then redo all the math myself.

kevincronk

As a mathematician, I am extremely grateful my career isn’t in danger (yet). EDIT: I see that this video was mostly talking about arithmetic. Chat GPT is also pretty spotty at higher, more conceptual math.

anthonyymm

There's a concept that helps interpret ChatGPT outputs. Garbage in, garbage out. If a large body of factually correct work exists for a subject, you'll probably get good results. If it's a contentious or obscure topic, not so much. This applies to your prompts as well, it weighs your text heavily. If you tell it to do something dumb, it'll do it without question beyond self censorship.

One thing I enjoy doing with ChatGPT is asking it to interpret phrases as sophisms.

Leadvest

We would recognize a true AI if it knew it's socially totally acceptable to say: "I hate maths. I'm so bad with numbers." 🖖🙄

susanne

Can a normal calculator not just be permanently, or even temporarily, integrated into ChatGPT? It seems like it would be effective to have ChatGPT interact with randomized equations, using a separate calculator to confirm the numbers; then it could use its normal process for updating incorrect answers. Feeding it a few billion or trillion calculations should be enough to train its neural network correctly. I'm sure it's more complicated, but it certainly seems doable.
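A minimal sketch of the "bolt a calculator on" idea, purely hypothetical and not OpenAI's actual implementation: a wrapper that checks whether a prompt is plain arithmetic and, if so, routes it to a real evaluator instead of the language model. (In practice this became "tool use"; the routing and safe evaluator below are illustrative inventions.)

```python
import ast
import operator
import re

# Safe arithmetic evaluator: walk the parsed expression tree,
# allowing only numeric constants and the four basic operators.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str):
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("not plain arithmetic")
    return walk(ast.parse(expr, mode="eval"))

def answer(prompt: str) -> str:
    # If the prompt is pure arithmetic, use the calculator, not the model.
    if re.fullmatch(r"[\d\s+\-*/().]+", prompt):
        return str(safe_eval(prompt))
    return "(hand the prompt to the language model)"

print(answer("123456 * 789"))  # exact: 97406784
```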

BenCDawson

I'm glad this video is out; it was one of the first things I noticed. It's not just with math, it's with coding as well. It seems impressive at first, but with real tasks it will fail very quickly. It's actually kind of funny: if you ever ask it to make a calculation and then ask it to recalculate, it just assumes that it made a miscalculation and will force itself onto a slightly different answer. It always apologizes and assumes that the user is right.

EDIT: Okay, actually if you ask it to calculate something REALLY simple like 2+2, it will be very firm about its answer 😂 but if you really want to bully it then you could say you wanted to know 2+2 in base 3.
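The "bully it with base 3" follow-up has a definite right answer, computed here with a small base converter:

```python
# Convert a non-negative integer to its representation in a given base.
def to_base(n: int, base: int) -> str:
    digits = ""
    while n:
        n, r = divmod(n, base)
        digits = str(r) + digits
    return digits or "0"

print(to_base(2 + 2, 3))  # "11": four, written in base 3
```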

starship

I found something it can't do a few weeks ago: generate a list of words satisfying some criteria (like "adjectives that start with the letter j"), _sorted into alphabetical order._
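The sorting half of that task is trivial in ordinary code (the word list here is just an illustration):

```python
# Alphabetically sort a list of words.
adjectives = ["jolly", "jagged", "jubilant", "jaded", "joyful"]
print(sorted(adjectives))  # ['jaded', 'jagged', 'jolly', 'joyful', 'jubilant']
```

The hard part for an LLM is not the sort itself but producing the list and the ordering jointly, token by token, without a scratchpad.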

Ice_Karma

ChatGPT is amazing for the first hour or so of use. After that you quickly realise how limited it is. It's a super powerful source of information, but only when you hold its hand and literally guide it towards the answer.

Xyt

Yep, ChatGPT told me (even after I argued my point repeatedly) that cos(89°) was larger than cos(80°).
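Checking the claim directly: cosine is decreasing on 0° to 90°, so cos(89°) must be smaller than cos(80°).

```python
import math

# Cosine is decreasing on [0°, 90°], so the angle nearer 90° gives the smaller value.
cos80 = math.cos(math.radians(80))
cos89 = math.cos(math.radians(89))
print(f"cos(80°) = {cos80:.4f}, cos(89°) = {cos89:.4f}")  # 0.1736 vs 0.0175
assert cos89 < cos80
```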

davelordy

8:56 I've never forgotten to carry a 2

frostebyte