Possible Impossibilities and Impossible Possibilities

Yejin Choi (University of Washington)
Large Language Models and Transformers

In this talk, I will question if there can be possible impossibilities of large language models (i.e., the fundamental limits of transformers, if any) and the impossible possibilities of language models (i.e., seemingly impossible alternative paths beyond scale, if at all).
Comments

Talk starts at 1:30 — and it is mind-blowing 🤯

afrozenator

🎯 Key Takeaways for quick navigation:

00:06 🎤 Introduction of the next speaker, Yejin Choi from the University of Washington, who specializes in natural language processing (NLP).
01:26 🧪 Choi discusses the potential and limitations of artificial general intelligence (AGI), pondering the various possibilities and impossibilities of the coming decades, specifically around 2050.
02:23 🌌 Speculation about the future: Climate change forcing a move to metaverse or Mars, AI writing and reviewing papers, and potential advancements in AI technology.
03:44 💭 Considering the potential breakthroughs and debates surrounding AI, including the possibility of quantum computing and other developments that might change our understanding of AI.
04:54 📚 Drawing parallels with the history of physics, hinting at the potential of unforeseen discoveries and breakthroughs in AI.
06:03 ❓ Raising several questions about the future and current capabilities of AI, touching upon topics like embodiment, factuality, and compositionality.
06:59 🧮 Discussing a study exploring the multiplication capabilities of GPT-4, whose accuracy varies with the number of digits in the operands.
08:17 🔍 Experimenting with supervised training to improve the multiplication abilities of the AI models.
10:36 🙋‍♂️ Engaging with the audience to gauge their expectations about the performance of Transformers in mathematical calculations after supervised training.
13:00 📝 Introducing the concept of using a scratch pad to enhance the AI's ability to perform multiplication with higher accuracy (a sketch of the idea appears after this list).
15:24 🗣 Sharing community reactions and discussing various attempts and strategies to improve the algorithmic reasoning capabilities of AI models.
18:13 💡 Delving deeper into why multiplication presents a challenge for AI and exploring if similar challenges exist in other areas of computation or reasoning.
20:00 📊 Analyzing the types of errors and the strategies AI models utilize when attempting multiplication tasks.
22:55 🤔 Posing further questions about the limits and potentials of AI, and encouraging an open discussion about the future directions in AI research.
27:31 🧩 Discusses the complexities and limitations of current AI models in representing certain algorithms.
28:00 💡 Mention of the challenging circumstances for Indian startups and academia in developing foundation models.
28:41 🤔 Introduces a thought experiment about improving a low-quality model (GPT-2) to compete with higher-quality models.
29:50 📊 Discusses the limitation of distilling knowledge from larger models and the potential of smaller models in specific tasks.
30:43 🧠 Shares fascination with human ability to abstract knowledge compared to AI models and hints at exploring this area further.
32:35 💽 Emphasizes the pivotal role of training data in AI development and experiments with not relying on larger models or supervised data.
35:09 📉 Details the failures of GPT-2 in summarization tasks and the efforts to enhance its capabilities.
37:29 🔄 Describes a novel technique where GPT-2 is prompted to summarize its own outputs, increasing the success rate to 10%.
39:18 🔍 Explains the filtering process involving length, diversity, and entailment checks used to build a high-quality dataset (sketched in the second example after this list).
41:35 🔄 Talks about the iterative training process where students can become teachers for the next generation.
45:41 📝 Presents examples of text summarization comparing their model's output with GPT-3's.
47:30 🤯 Introduces the concept of a "common sense paradox" in AI, comparing the capabilities and limitations of various GPT models.
50:00 📏 Describes an experiment testing GPT-4's understanding of a physical problem and how its answers changed over time and across prompts.
52:25 🔄 Highlights the need for specific custom prompts to get better responses from GPT-4 and the ongoing cycle of correction and updates.
53:04 🤖 Discussion of the puzzle that humans inherently know certain answers while it remains mysterious how machines could.
53:46 💡 Mention of GPT-4's ability to provide a substantial amount of common sense prior knowledge.
54:13 🤔 Proposition that higher-level reasoning might be easier for computers than for humans, especially in the realm of language understanding.
54:55 💼 A quick discussion on the ongoing challenges and paradoxes in AI, focusing on generation and understanding capabilities.
56:19 🧪 Highlighting a surprising observation: altering the phrasing of a question can lead the AI back to its original mistakes.
57:00 ⚖ Discussion on the potential alignment issues between the assumptions held by humans and those by AI during decision-making processes.
58:01 📚 A query about the existence of textbooks on common sense and the attempts to codify common sense rules for AI.
59:54 ✍ Analysis of GPT's generation capabilities, especially in comparison to its understanding capabilities, and the challenges in aligning the two.
01:01:05 🧐 Delving into how the AI produces intuitive guesses from its language modeling, and open questions about the reliability of those guesses.
01:02:36 📝 Suggestion of improving the Scratch Pad learning model to enhance understanding and performance in AI systems.
01:03:30 🔄 Inquiry about incorporating recursion into neural programming synthesis to enhance generalization and problem-solving capabilities.
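
For the scratch-pad idea at 13:00: instead of asking for the product in one shot, the prompt (or the fine-tuning data) spells out every partial product as a small, checkable step. Below is a minimal sketch of how such scratch-pad examples could be generated; the exact format is my illustration, not the one used in the talk.

```python
def scratchpad_example(a: int, b: int) -> str:
    """Render a * b as an explicit scratch pad of partial products.

    Illustrative only: the talk's experiments used their own format,
    but the principle is the same - each line is one small step
    instead of a single opaque multi-digit multiplication.
    """
    lines = [f"Compute {a} * {b} step by step."]
    # Walk the digits of b from least to most significant, so each
    # partial product is a one-digit multiplication plus a shift.
    for i, digit in enumerate(reversed(str(b))):
        d = int(digit)
        lines.append(f"{a} * {d} * 10^{i} = {a * d * 10**i}")
    lines.append(f"Sum of partial products: {a * b}")
    return "\n".join(lines)

print(scratchpad_example(367, 89))
```

Pairs generated this way would be one plausible form of the supervised training data mentioned at 08:17.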

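For the filtering step at 39:18 (length, diversity, and entailment checks over model-generated summaries), here is a hedged sketch. The roberta-large-mnli checker and every threshold below are stand-ins chosen for illustration; the talk does not specify which entailment model or cutoffs were used.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Any off-the-shelf NLI model can serve as the entailment checker;
# roberta-large-mnli is an assumption, not necessarily the authors' choice.
tok = AutoTokenizer.from_pretrained("roberta-large-mnli")
nli = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")

def entailment_prob(premise: str, hypothesis: str) -> float:
    """Probability that the premise entails the hypothesis."""
    inputs = tok(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = nli(**inputs).logits
    # roberta-large-mnli label order: contradiction, neutral, entailment.
    return logits.softmax(dim=-1)[0, 2].item()

def filter_summaries(document: str, candidates: list[str]) -> list[str]:
    """Keep candidates passing length, entailment, and diversity checks.

    All thresholds are illustrative placeholders, not values from the talk.
    """
    kept: list[str] = []
    doc_len = len(document.split())
    for cand in candidates:
        words = cand.split()
        # Length: a summary should actually compress the document.
        if not 0 < len(words) < 0.5 * doc_len:
            continue
        # Entailment: the document must support everything in the summary.
        if entailment_prob(document, cand) < 0.9:
            continue
        # Diversity: drop near-duplicates of summaries already kept.
        if any(len(set(words) & set(k.split())) / len(set(words)) > 0.8
               for k in kept):
            continue
        kept.append(cand)
    return kept
```

The surviving document-summary pairs would form the high-quality dataset, and a student trained on them could serve as the next round's teacher, matching the iterative loop described at 41:35.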

AntonioEvans

Imo the drying-clothes math question is less of a commonsense test than people sometimes think. It is a trick question that many people would get wrong as well: the whole setup is designed to trip you up, because it is phrased like the typical question you pose to children to test their knowledge of linear relationships and proportionality. Chat models are trained to be good at zero-shot answering, so they must make broad assumptions that maximize their chance of being right on the first try (in some ways they are already dumbed down the same way Google search became over time).

I particularly dislike this style of math question because it requires you to disable the usual conversational assumption of good intent. Even for a genuinely well-intended math question, often no direct answer is expected at all, but rather follow-up questions (or explicit speculation) that reduce the degrees of freedom in the original question.


Often, if you poke the model after such an answer, it will give you a reasonable rationalization for it, e.g. in this case that there is only limited drying space (not unrealistic at all, in my family's experience).

That said, in my experiments, trying to condition the model into understanding that this is a trick question and that it should not make any assumptions beyond what is stated seldom improves the reasoning or the results.

gambistics

Oh, this is very important, probably beyond a lot of folks' imagination. If neural nets can handle +, -, *, /, and % properly, they can interact with almost everything we have built in code on top of integers; for instance, they could use an existing database as short-term memory. This is probably the last piece of the puzzle before we get to human level.
But why not in base 2?
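
One concrete reading of the base-2 question: binary shrinks the digit vocabulary to {0, 1} and turns every partial product into "shift or skip", so each scratch-pad step becomes trivially checkable. A toy sketch (my illustration, not something from the talk):

```python
def binary_scratchpad(a: int, b: int) -> str:
    """Multiply a * b in base 2: every partial product is either zero
    or a shifted copy of a, so each step is a shift plus an addition."""
    steps = []
    acc = 0
    # Walk the bits of b from least to most significant.
    for i, bit in enumerate(reversed(format(b, "b"))):
        if bit == "1":
            acc += a << i
            steps.append(f"bit {i} is 1: add {a} << {i} = {a << i}; running sum = {acc}")
    steps.append(f"{a} * {b} = {acc} (binary {acc:b})")
    return "\n".join(steps)

print(binary_scratchpad(23, 13))
```

Whether that representation is actually easier for a transformer to learn is exactly the kind of open question the talk raises.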

hanyanglee

Why are supposedly intelligent people using "be like" in their talks?

redryan

Just another boring talk with quirky details

nullvoid