WE MUST ADD STRUCTURE TO DEEP LEARNING BECAUSE...

Dr. Paul Lessard and his collaborators have written a paper, "Categorical Deep Learning: An Algebraic Theory of Architectures". They aim to make neural networks more interpretable, composable and amenable to formal reasoning. The key is mathematical abstraction, exemplified by category theory - using monads to develop a more principled, algebraic approach to structuring neural networks.

We also discussed the limitations of current neural network architectures in terms of their ability to generalise and reason in a human-like way - in particular, the inability of neural networks to do unbounded computation equivalent to a Turing machine. Paul expressed optimism that this is not a fundamental limitation, but an artefact of current architectures and training procedures.

We talked about the power of abstraction - allowing us to focus on the essential structure while ignoring extraneous details. This can make certain problems more tractable to reason about. Paul sees category theory as providing a powerful "Lego set" for productively thinking about many practical problems.

Towards the end, Paul gave an accessible introduction to some core concepts in category theory: categories, morphisms, functors, monads and so on. He explained how these abstract constructs can capture essential patterns that arise across different domains of mathematics.
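To make two of those concepts concrete, here is a minimal Python sketch (our illustration, not from the episode or the paper): morphism composition with an identity, and the familiar Maybe/Option monad, with `None` standing in for failure. All names are illustrative.

```python
# Morphism composition: (g . f)(x) = g(f(x)).
def compose(g, f):
    return lambda x: g(f(x))

identity = lambda x: x  # the identity morphism

# A monad is given by `unit` (return) and `bind`, satisfying the monad laws.
def unit(x):
    return x  # wrap a plain value; None plays the role of "Nothing"

def bind(m, f):
    """Sequence computations that may fail: short-circuit on None."""
    return None if m is None else f(m)

# Example: safe division chained through bind.
def safe_div(a, b):
    return None if b == 0 else a / b

result = bind(bind(unit(10), lambda x: safe_div(x, 2)),
              lambda y: safe_div(y, 0))
print(result)  # None: the failure propagates without raising
```

The point of the abstraction is that `bind` captures the plumbing (here, failure propagation) once, so the individual steps stay clean - the same algebraic pattern the paper applies to structuring architectures.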

Paul is optimistic about the potential of category theory and related mathematical abstractions to put AI and neural networks on a more robust conceptual foundation to enable interpretability and reasoning. However, significant theoretical and engineering challenges remain in realising this vision.

Please support us on Patreon. We are entirely funded from Patreon donations right now.
If you would like to sponsor us, so we can tell your story - reach out on mlstreettalk at gmail

Links:
Categorical Deep Learning: An Algebraic Theory of Architectures
Bruno Gavranović, Paul Lessard, Andrew Dudzik,
Tamara von Glehn, João G. M. Araújo, Petar Veličković

Symbolica:

Dr. Paul Lessard (Principal Scientist - Symbolica)

Neural Networks and the Chomsky Hierarchy (Grégoire Delétang et al)

Interviewer: Dr. Tim Scarfe

Transcript:

More info about NNs not being recursive/TMs:

Geometric Deep Learning blueprint:

TOC:
00:00:00 - Intro
00:05:07 - What is the category paper all about
00:07:19 - Composition
00:10:42 - Abstract Algebra
00:23:01 - DSLs for machine learning
00:24:10 - Inscrutability
00:29:04 - Limitations with current NNs
00:30:41 - Generative code / NNs don't recurse
00:34:34 - NNs are not Turing machines (special edition)
00:53:09 - Abstraction
00:55:11 - Category theory objects
00:58:06 - Cat theory vs number theory
00:59:43 - Data and Code are one and the same
01:08:05 - Syntax and semantics
01:14:32 - Category DL elevator pitch
01:17:05 - Abstraction again
01:20:25 - Lego set for the universe
01:23:04 - Reasoning
01:28:05 - Category theory 101
01:37:42 - Monads
01:45:59 - Where to learn more cat theory
Comments:

I like Dr. Paul's thinking - clear, concise and very analytical. LLMs don't reason, but they can do some form of heuristic search. When applied to a given structure, this can lead to very powerful search over that structure and increase their reliability.

johntanchongmin

This looks like very early-stage academic research with very low prospects of a return in the near/mid term; surprised that somebody was willing to put their money into it. Very interesting, but too academic for a company - all the best to the guys.

SLAM

As a public service I offer this reading list of polymaths to help do what the title of the video seems to be looking for. Most of these were mathematically literate: one even invented the calculus.
Aristotle on categories. Spinoza on substance and essence. Leibniz on universal language. Locke on empiricism and experience. Roget's synopsis of categories - still the only comprehensive workable schema of categories that I have been able to find. C. S. Peirce on semiotics. Nicolai Hartmann on levels of reality. Joseph Needham on integrative levels. Norbert Wiener on cybernetics. Turing not only on computing but on patterning, activation-inhibition. George Spencer-Brown on distinctions, or the calculus of indications. Heinz von Foerster on cybernetics. You could go back and read McCulloch et al. on neural networks if you haven't already. Only then go back and continue coding, or trying to, if you haven't given up. But I still don't know what problem machine learning is trying to solve or how we will know when we've solved it.

briancornish

How many people started watching this and feel like their passion for AI somehow tricked them into getting a maths degree?

andrewwalker

You cannot use those sound effects
Please find a new sound library
I'm going to flip

felicityc

Dizzying abstract complexity surfing on a sea of reasonable issues and goals.

asdfasdfasdfasdf

Removing the distinction between a function and a data type is at the heart of Algorithmic Information Theory. And gee, guess what? That is at the heart of Ockham's Razor!

jabowery

With regard to inscrutability around the 26-minute mark: my personal feeling is that the issue we face is the overloading of models. As an example, take an LLM. Current language models take a kitchen-sink approach where we press them to generate coherent output and also to apply reasoning. This doesn't really scale well when we introduce different modalities like vision, hearing or the central nervous system. We don't really want to be converting everything to text all the time and running it through a black box - not simply because it is inefficient, but because it isn't the right abstraction.

It seems to me we should be training multiple models as an ensemble that compose from the outset, where we have something akin to the pre-frontal cortex doing the planning in response to stimuli from other systems running in parallel. I have done quite a bit of thinking on this and I'm reasonably confident it can work.

As for category theory and how it applies: if I squint I can kind of see it, but mostly in an abstract sense. I have built some prototypes for this that I guess you could say were type-safe and informed by category theory. I can see it might help to have formalism at this level to aid interpretability (because that's why I built them). Probabilistic category theory is more along the lines of what I have been thinking.

jumpstar

Bro can build an AI but doesn't know how to turn off Twitch alerts 💀

Wulk

Damn this is the first podcast I couldn’t just leave on 2x speed

Edit: nvm, it was just the first 5 min

bwhit

Both branches in an if expression in Haskell have to be of the same type. There are no union types like in other languages.

colbyn-wadman

The point of abstraction is to enable a view of some particular forest without being blinded by the sight of some trees.

derricdubois

I believe category theory is the route to uncovering how DNNs and LLMs work. For now, I think of a category as a higher-level object that represents a semantics or a topology. Imagine how lovely it would be if LLMs could be trained on categories, possibly flattened into bytes.

adokoka

Every researcher on planet earth wants the bitter lesson to be false.
Which is more exciting:
1) getting more data + GPUs, or
2) engineering smart solutions?

bacon_boat

This is an amazing video. I really love this episode. The idea of building a formal language based on category theory to reason about systems isn't limited to applications in neural networks, for sure. I can definitely see this being used in gene regulatory pathways. Thank you for the video, and I will definitely check out the paper.

aitheignis

I am not convinced. Of course you can slap category theory onto this thing - it has arrows and compositions, so, well, you can put things in this language. I am missing the clinching argument where category theory reveals why it works, the same way functional analysis reveals why the Fourier transform works.

radupopescu

I was thinking of Plato's Allegory of the Cave the whole time through this episode.

alivecoding

I'm skeptical that composability explains neural networks, because small neural networks do not show the same properties as many chained together. Composability seems like a useful tool once the nets you're composing are already quite large.

I think that the main contribution of category theory will be providing a dependent type theory for neural net specification.

The next hype in explainable AI seems to be coming from "energy-based methods".

srivatsasrinivas

I feel that my IQ increases just by watching this video.

AutomatedLiving

Yet another invaluable episode. Thank you Tim

AliMoeeny