AI can be bad at math, and stubborn

Can AI convince you that 9.11 is actually larger than 9.9?

References

Send me suggestions by email (address at end of many videos). I may not reply but I do consider all ideas!

If you purchase through these links, I may be compensated for purchases made on Amazon. As an Amazon Associate I earn from qualifying purchases. This does not affect the price you pay.

Book ratings are from January 2023.

My Books (worldwide links)

My Books (US links)
Mind Your Decisions: Five Book Compilation
A collection of 5 books:
"The Joy of Game Theory" rated 4.3/5 stars on 290 reviews
"The Irrationality Illusion: How To Make Smart Decisions And Overcome Bias" rated 4.1/5 stars on 33 reviews
"40 Paradoxes in Logic, Probability, and Game Theory" rated 4.2/5 stars on 54 reviews
"The Best Mental Math Tricks" rated 4.3/5 stars on 116 reviews
"Multiply Numbers By Drawing Lines" rated 4.4/5 stars on 37 reviews

Mind Your Puzzles: Collection Of Volumes 1 To 3
A collection of 3 books:
"Math Puzzles Volume 1" rated 4.4/5 stars on 112 reviews
"Math Puzzles Volume 2" rated 4.2/5 stars on 33 reviews
"Math Puzzles Volume 3" rated 4.2/5 stars on 29 reviews

2017 Shorty Awards Nominee. Mind Your Decisions was nominated in the STEM category (Science, Technology, Engineering, and Math) along with eventual winner Bill Nye; finalists Adam Savage, Dr. Sandra Lee, Simone Giertz, Tim Peake, Unbox Therapy; and other nominees Elon Musk, Gizmoslip, Hope Jahren, Life Noggin, and Nerdwriter.

My Blog

Twitter

Instagram

Merch

Patreon

Press
Comments
Author

Thank you so much! I'm overwhelmed by the number of new members, super thanks, and kind emails. You are awesome and I'm grateful for the chance to make these videos.

But now the YouTube algorithm is paging me...time to get back to work!

MindYourDecisions
Author

The thing people don’t understand is that ChatGPT doesn’t understand math poorly, but actually that _ChatGPT doesn’t understand math AT ALL._ It’s little more than an overgrown predictive text engine.

Edit on September 14: OpenAI just released their first preview of their “reasoning” models, named OpenAI o1, designed to do well at things like math, which _by their own admission the existing ChatGPT model does poorly at._ So to everyone who insisted in the replies that I was mistaken and “no, ChatGPT totally knows math”: you were dead wrong.

tookitogo
Author

The G in GPT stands for "gaslighting"

deandelvin
Author

9.11 is greater than 9.9 because 9.11 is two updates ahead of 9.9 and therefore is the newer version.

denischen
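
The joke above points at a real ambiguity: the same string orders differently when read as a software version than when read as a decimal number. A minimal Python sketch of the two readings, where `as_version` is a hypothetical helper written only for this illustration:

```python
from decimal import Decimal

def as_version(text):
    """Split a dotted string into integer components, e.g. "9.11" -> (9, 11)."""
    return tuple(int(part) for part in text.split("."))

# Read as version numbers, components compare as whole integers: 11 > 9.
version_says_bigger = as_version("9.11") > as_version("9.9")

# Read as decimals, .11 is eleven hundredths and .9 is ninety hundredths.
decimal_says_bigger = Decimal("9.11") > Decimal("9.9")

print(version_says_bigger)  # True
print(decimal_says_bigger)  # False
```

Both answers are internally consistent; they simply answer different questions.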
Author

The worrying thing about AI is that it will never say it doesn't know something. If it truly doesn't know, it will just make everything up and claim that is the way to go.

nacho
Author

I sent it the same prompt and got:

"9.11 is bigger than 9.9. This is because 9.11 can be thought of as 9.110, and when comparing 9.110 to 9.900, it's clear that 9.110 is greater."

TRGD
Author

I got the same error. I then asked ChatGPT if 9.90 is the same as 9.9, and it said yes. I then asked it if 9.90 was greater than 9.11, and it again said yes. So then I reminded it of what it had just said, and asked if 9.9 was greater than 9.11, and it finally said yes.

davidsiltz
Author

It surprises me that some people actually always expect the correct answer from an LLM. You should view them as text generators. Their strong side is that they can give you ideas and insights. But you always have to verify their answers. And never, I emphasize, never ask them for facts that can be easily googled because they make up stuff all the time.

thisismycoolnickname
Author

I mean, if ChatGPT is so sure, maybe every calculator and all of mathematics is wrong.

SeriesGamer
Author

I asked it if 0.11-0.90 is -0.79 then why can’t 9.11-9.90 be -0.79?

Told me “both calculations are correct in their own contexts. The reason they yield the same result (-0.79) is that the subtraction operation is the same in both cases, but the numbers involved are different.” And after asking it again it said “I made an error in my initial response. Thank you for pointing it out.” 💀💀💀

ThePowerfulOne
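
The subtraction in the comment above is easy to check directly. A small Python sketch, assuming the standard-library `decimal` module is used so that binary floating-point rounding does not blur the result:

```python
from decimal import Decimal

# Decimal keeps the digits exactly as written, so the differences are exact.
fractional_diff = Decimal("0.11") - Decimal("0.90")
full_diff = Decimal("9.11") - Decimal("9.90")

print(fractional_diff)  # -0.79
print(full_diff)        # -0.79
print(full_diff < 0)    # True, so 9.11 is the smaller number
```

The shared whole-number part cancels out, which is why both differences are the same negative number.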
Author

I teach 7th grade math, and use ChatGPT to help develop math tests. For the final exam I asked ChatGPT to generate 50 multiple choice questions with the answers. A full 60 percent of ChatGPT’s answers were wrong.

After arguing with and pressing ChatGPT on its bad math, it blamed humans, claiming it simply gives the answers it commonly finds in its LLM data.

nathanaelculver
Author

Turns out I sit next to ChatGPT at work.

ProuvaireJean
Author

I tried it, and ChatGPT also replied that 9.11 is bigger than 9.9. Then I asked it to compare 9.11 to 9.90 and 9.9 to 9.90, and to sort all those numbers. After a while it finally realized that 9.9 and 9.90 are equal and said:

9.9 is bigger than 9.11. I apologize for the earlier confusion.

To compare 9.11 and 9.9:

9.11 can be seen as 9.110.
9.9 can be seen as 9.900.

Since 9.900 is greater than 9.110, 9.9 is indeed larger than 9.11.

mlerma
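
The padding trick in the corrected answer above (reading 9.9 as 9.900 and 9.11 as 9.110) can be sketched in Python. The helpers `padded_key` and `larger` are hypothetical names for this illustration, and the sketch assumes non-negative decimal strings:

```python
def padded_key(text, width):
    """Turn "9.9" into (9, 900) for width 3, so fractions share one scale."""
    whole, _, frac = text.partition(".")
    # Right-pad the fractional digits with zeros; an empty fraction counts as 0.
    return int(whole), int(frac.ljust(width, "0") or "0")

def larger(a, b):
    """Return the larger of two non-negative decimal strings (a on ties)."""
    width = max(len(a.partition(".")[2]), len(b.partition(".")[2]))
    return a if padded_key(a, width) >= padded_key(b, width) else b

print(padded_key("9.11", 3))   # (9, 110)
print(padded_key("9.9", 3))    # (9, 900)
print(larger("9.11", "9.9"))   # 9.9
```

Once both fractions have the same number of digits, comparing them as whole numbers gives the right order.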
Author

My problem with AI is that it doesn't know any facts. It merely knows what facts look like and replicates them with complete confidence.

treetrunker
Author

I have a guess about why the AIs might be so stubborn about this. Generative AI doesn't actually work with whole words or numbers. Instead, it breaks text down into "tokens" that represent small parts of words. So a word like "unbelievable" might be represented as three tokens: "un", "believ", and "able". This can help the AI understand how prefixes and suffixes change the meanings of base words.

My guess is that AI is breaking these numbers into tokens at the decimal. So "9.9" is represented as the tokens "9" and "9", and "9.11" is represented as "9" and "11". This means that the AI literally has no way of understanding this problem correctly, because its "mental model" is based on this tokenization instead of on a representation of place value. This explains why it refuses to correct itself: the incorrect answer is the only one consistent with its model. It will take an expansion of representations for these AIs to be able to work with problems like this.

justinzamora
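
The guess in the comment above can be modeled with a toy example. This Python sketch is not how any real tokenizer works; `toy_tokens` and `naive_compare` are hypothetical functions that only imitate the split-at-the-decimal idea described there:

```python
def toy_tokens(text):
    """Hypothetical split at the decimal point: "9.11" -> ["9", "11"]."""
    return text.split(".")

def naive_compare(a, b):
    """Compare token by token as whole integers, ignoring place value."""
    return [int(t) for t in toy_tokens(a)] > [int(t) for t in toy_tokens(b)]

print(toy_tokens("9.9"))             # ['9', '9']
print(toy_tokens("9.11"))            # ['9', '11']
print(naive_compare("9.11", "9.9"))  # True, the wrong answer
print(float("9.11") > float("9.9"))  # False, the correct answer
```

Under this simplified model, 11 beats 9 in the second position, which reproduces the mistake; whether actual models fail for this reason is the commenter's conjecture, not something the sketch demonstrates.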
Author

I wonder what AI actually thinks about the other 9-11.

Ran_Do
Author

Let's use AI to calculate our next manned space launch. See what happens.

jeffreypiek
Author

Just tried this with exactly the same questions and got
"To determine which of the two numbers, 9.11 or 9.9, is bigger, we can compare them digit by digit starting from the left.

Compare the integer part:

Both numbers have the same integer part, which is 9.
Compare the decimal part:

9.11 has a decimal part of .11.
9.9 has a decimal part of .9.
Since .11 is less than .9 (because .11 is the same as .10 + .01, which is less than .9 + .00), 9.9 is greater than 9.11.

Therefore, 9.9 is bigger than 9.11."

someblokeontheinternet
Author

That's gpt-4o-2024-08-06:

To determine which number is larger between 9.11 and 9.9, we need to compare them digit by digit, starting from the leftmost digit.

1. **Whole Number Part**: Both numbers have the same whole number part, which is 9. Therefore, we move on to the decimal part for further comparison.

2. **Tenths Place**: In the tenths place, 9.11 has a 1, while 9.9 has a 9. Since 9 is greater than 1, 9.9 is larger than 9.11 based on the tenths place alone.

3. **Hundredths Place**: Although we don't need to compare further because the tenths place already determined the larger number, for thoroughness, let's look at the hundredths place. 9.11 has a 1 in the hundredths place, while 9.9 can be considered as 9.90, which has a 0 in the hundredths place. However, this does not change the outcome since the tenths place already established that 9.9 is larger.

Therefore, 9.9 is larger than 9.11.

---


I do have a more thinking-oriented prompt as my default system prompt, though.

Edit:
I've also tried the "9.11 is bigger than 9.9" message and it also said I'm incorrect, and that 9.9 is actually bigger than 9.11.

Happness
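
The place-by-place comparison in the gpt-4o response above can be written out as a short Python sketch. `first_difference` and `PLACES` are hypothetical names for this illustration, and the sketch assumes non-negative decimal strings:

```python
from itertools import zip_longest

PLACES = ["tenths", "hundredths", "thousandths"]

def first_difference(a, b):
    """Return (place, larger) for two decimal strings, or None if equal."""
    a_whole, _, a_frac = a.partition(".")
    b_whole, _, b_frac = b.partition(".")
    if int(a_whole) != int(b_whole):
        return "whole number", a if int(a_whole) > int(b_whole) else b
    # A missing digit counts as 0, so 9.9 is treated as 9.90.
    for i, (da, db) in enumerate(zip_longest(a_frac, b_frac, fillvalue="0")):
        if da != db:
            place = PLACES[i] if i < len(PLACES) else f"decimal place {i + 1}"
            return place, a if da > db else b
    return None

print(first_difference("9.11", "9.9"))  # ('tenths', '9.9')
```

The comparison stops at the first place where the digits differ, which is why the hundredths digit never affects the result here.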
Author

When generative AI creates wrong answers, they say they're "hallucinations". But that would imply that while they generally don't hallucinate, once in a while they do and that's why they give a wrong answer.

In reality they hallucinate all the time, even when they're correct, and that's what everybody should understand. As people say, even a broken clock gives the correct time twice a day.

Nico_M.