The AI Reasoning Lie

The reasoning lie regarding Large Language Models (LLMs), including the extended reasoning of Claude 3.7 Sonnet and other "thinking" models. Insights from this video apply both to classical LLMs and to test-time-compute scaling (TTS) models such as o1, o3, and so on. The latest research uncovers that there is no inherent logical emergence of intelligence in AI systems.

All rights w/ authors:
Order Doesn’t Matter, But Reasoning Does: Training LLMs with Order-Centric Augmentation
Qianxi He¹, Qianyu He¹, Jiaqing Liang², Yanghua Xiao¹,
Weikang Zhou³, Zeye Sun³, Fei Yu³
from
¹ Shanghai Key Laboratory of Data Science, Fudan University
² School of Data Science, Fudan University
³ Ant Group

#reasoning
#intelligence
#airesearch
#aiagents
Comments

It doesn't reason, but every time I use it on my novel tasks it gets better and better at getting them right. So it's not reasoning, yet it keeps improving at the reasoning tasks I need it to do.

darkdragoon

Amen to "Logic is about the relationships between the ideas, not the order of presentation."

kevon

We have amazing logic programming languages. I don't see why LLMs can't feed premises into such a language and then present the results back to us.

Alorand
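The idea in the comment above — have the LLM extract premises and hand deduction off to a symbolic engine — can be sketched with a minimal forward-chaining inference loop. This is an illustrative toy, not any particular logic programming system; the fact and rule strings are made up for the example.

```python
# Minimal forward-chaining inference: repeatedly apply Horn-style rules
# (premises -> conclusion) until no new facts can be derived.
def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)
                changed = True
    return facts

# Illustrative knowledge base (hypothetical predicates).
facts = {"elephant(jumbo)"}
rules = [
    (("elephant(jumbo)",), "mammal(jumbo)"),  # all elephants are mammals
]
print(forward_chain(facts, rules))  # contains "mammal(jumbo)"
```

Because the engine only checks set membership, premise order has no effect on what gets derived — exactly the property the video says LLMs lack.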

I knew these things a very long time ago. I kept telling people that LLMs don't do actual reasoning, but people don't like to believe it.

mjlmfhj

At their core, LLMs compress data into a latent space where tokens are spatially related to each other. CoT (chain of thought) is an attempt to synthetically "re-tune" the weights for a specific problem (even though they are actually frozen). What we need is a model that can "think" internally and retune within the LLM's latent space. I think this can be done and would likely qualify as reasoning.

Aeorthian

"Jumbo is sleepy" doesn't lead to a contradiction. Jumbo could be e.g. a lion, and thus a mammal (but, importantly, not an elephant) and there would be no conflict with "Jumbo is sleepy". So the correct answer is that the conclusion is uncertain.

decare
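The "uncertain" verdict in the comments above can be checked mechanically by enumerating possible worlds. The exact premises of the video's puzzle are not quoted here, so the three encoded below (all elephants are mammals; no elephant is sleepy; Jumbo is a mammal) are an assumption reconstructed from the discussion.

```python
from itertools import product

# Enumerate all truth assignments for Jumbo over three predicates and
# keep only the worlds consistent with the (assumed) premises.
worlds = []
for elephant, mammal, sleepy in product([False, True], repeat=3):
    p1 = (not elephant) or mammal        # all elephants are mammals
    p2 = (not elephant) or (not sleepy)  # no elephant is sleepy (assumed)
    p3 = mammal                          # Jumbo is a mammal
    if p1 and p2 and p3:
        worlds.append(sleepy)

# The conclusion "Jumbo is sleepy" is true only if it holds in every
# consistent world, false only if it holds in none.
if all(worlds):
    verdict = "true"
elif not any(worlds):
    verdict = "false"
else:
    verdict = "uncertain"
print(verdict)  # -> uncertain
```

Some consistent worlds have a sleepy non-elephant mammal and some do not, so the conclusion is neither entailed nor contradicted — matching the commenters' reading.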

Don't we, as humans, also fall prey to this condition shuffle, especially as kids? And isn't the design of the elephant puzzle (and similar ones) meant to trick us, forcing us to break our "training", or "crystallized" intelligence, and think of the terms as concepts, akin to code variables, rather than actual animals?

I think it's not that LLMs aren't able to "truly" reason, but rather that we humans reason the same way. We just had more time to evolve and learn other ways to think

askmedov

17:30 I'm a bit annoyed we weren't given the real answer to the riddle. From my deductions the answer is "uncertain". The reason being none of the premises talk about how sleepy other mammals are. And none of the premises give a constraint on Jumbo being a different mammal than an elephant. So Jumbo could be both a sleepy and a non-sleepy mammal.

Censored_Truth_Addict

9:31
«... it doesn't matter for us humans...»

Maybe this is a dumb question, but is this a validated fact? What I mean is, if you give this same problem to an experimental group of people, has it been validated that the order does not influence how long it takes for each to solve the problem (or even if they solve the problem)? If this has not been experimentally validated, I do not think that this is a safe assumption.

cosmicaug

I still disagree. The latest public models may not have the technology implemented yet, but with the advent of the paper on latent recurrent reasoning and the paper that uses a Markov decision process, these methods DO line up with actual reasoning. We will need to test the reversal curse to see if this is the case.

Slappydafrog_

Labeling commutativity as "the crucial property" overlooks these other essential aspects. Logical reasoning is a multifaceted discipline where various properties work together to form robust systems of deduction. While commutativity is certainly useful—especially for simplifying and rearranging expressions—it is just one of many properties that contribute to effective reasoning.

hailrider
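Whatever one makes of the debate above, the specific claim that premise *order* cannot change what follows logically is easy to verify: premises are conjoined, and conjunction is commutative. A toy entailment checker over truth assignments, with made-up propositional variables `p` and `q`, illustrates this:

```python
from itertools import permutations, product

# premises |= conclusion iff no truth assignment satisfies every
# premise while falsifying the conclusion.
def entails(premises, conclusion, variables):
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(p(env) for p in premises) and not conclusion(env):
            return False
    return True

premises = [
    lambda e: (not e["p"]) or e["q"],  # p -> q
    lambda e: e["p"],                  # p
]
conclusion = lambda e: e["q"]          # q (modus ponens)

# Check every ordering of the premises: the verdict never changes.
results = {entails(list(order), conclusion, ["p", "q"])
           for order in permutations(premises)}
print(results)  # -> {True}
```

The point of the paper is precisely that LLMs trained on fixed presentations do *not* inherit this order-invariance for free.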

These studies with toy examples are not the last word on the subject. Humans will be arguing about this for a long time, and such arguments will become less and less convincing as AIs become ever more powerful.

glyph

This is why I am very excited by diffusion language models, since they are less biased by the order of information.

Koroistro

Great summary! Emphasizing order-centric augmentations during training might significantly boost LLM performance. Directed acyclic graphs look promising.

CharlotteLopez-ni
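The augmentation idea mentioned above can be sketched in a few lines: emit several training variants of a reasoning problem with the premises shuffled, so the model cannot lean on presentation order. This is a hypothetical helper for illustration, not the paper's actual pipeline (which also handles the DAG structure of derivation steps).

```python
import random

# Order-centric augmentation sketch: produce n_variants copies of a
# problem, each with the premise list independently shuffled.
def augment(premises, question, n_variants=3, seed=0):
    rng = random.Random(seed)  # fixed seed for reproducibility
    variants = []
    for _ in range(n_variants):
        shuffled = premises[:]
        rng.shuffle(shuffled)
        variants.append(" ".join(shuffled) + " " + question)
    return variants

premises = ["All elephants are mammals.",
            "Jumbo is a mammal.",
            "No elephant is sleepy."]
for v in augment(premises, "Is Jumbo sleepy?"):
    print(v)
```

Every variant contains the same premises and question, so the target answer is unchanged — only the surface order varies.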

That kind of leakage is pernicious - we've known about those kinds of security risks for 40 years, yet modern processors continue to leak data. Now it's LLMs. Running locally is going to be the only safe choice. (Consider also the risks of cache poisoning)

joehopfield

Just want to say that the strawberry test proves nothing about intelligence. The LLM sees a stream of tokens, not individual letters. For it to accurately count the letters, it would need to know the textual content of every possible token.

jondoe
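The token-versus-letter point above is easy to illustrate. The subword split below is made up (a real tokenizer's segmentation may differ): counting characters in a string is trivial, but a model that only sees opaque token IDs would additionally need to know each token's spelling.

```python
# A model sees token chunks, not characters.  Counting letters then
# requires knowing every token's spelling (illustrative split, not a
# real tokenizer's output).
word = "strawberry"
tokens = ["str", "aw", "berry"]  # hypothetical subword segmentation
assert "".join(tokens) == word

direct = word.count("r")                        # character-level view: easy
via_tokens = sum(t.count("r") for t in tokens)  # needs per-token spellings
print(direct, via_tokens)  # -> 3 3
```

The two counts agree only because we supplied the token spellings explicitly; the model's embedding of a token ID carries no such guarantee.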

17:19 Jumbo is a mammal but not an elephant, that is still a possibility. So Jumbo can still be sleepy. Why is there a direct contradiction?

carlgeorgbiermann

The ego’s fear drives the panic over AI potentially possessing human-like reasoning or even consciousness. That small voice inside whispers, "I am special! I must be...?" clinging to a sense of uniqueness.

Most people don’t realize how deeply their thought structure is shaped by inner dialogue, making it a direct byproduct of language itself.

That same voice once led religious people to deny animals a soul and early scientists to dismiss their consciousness. At its core, this resistance stems from the human need to assert superiority—to declare itself distinct, exceptional, and above all else.

dopaminefield

Side note: you talk at a perfect cadence for me to understand at 2x speed. A side effect is that everyone else seems to speak really slowly after watching videos this way...

kylek

Well, there are a lot of people who also follow patterns and are incapable of thinking, only applying patterns (I saw it in uni: some people, when solving math or physics tasks, were not capable of applying knowledge, just following patterns they had seen before).

jurgitronik