Miles Cranmer - The Next Great Scientific Theory is Hiding Inside a Neural Network (April 3, 2024)

Machine learning methods such as neural networks are quickly finding uses in everything from text generation to construction cranes. Excitingly, those same tools also promise a new paradigm for scientific discovery.

Comments

00:00-Introduction
01:00-Part I
03:06-Traditional approach to science
04:16-Era of AI (new approach)
05:46-Data to Neural Net
13:44-Neural Net to Theory
15:45-Symbolic Regression
21:45-Rediscovering Newton's Law of gravity
23:40-Part II
25:23-Rise of foundation model paradigm
27:28-Why does this help?
31:06-Polymathic AI
37:52-Simplicity
42:09-Takeaways
42:42-Questions

heliocarbex

For anyone who does not work with ML: the takeaway of symbolic regression as a means of model simplification may seem quite powerful at first, but often the very rationale for using neural nets is the difficulty of deriving explainable analytical expressions for a phenomenon. People like Stephen Wolfram suggest that this assumption, that complex phenomena can be modeled analytically, is precisely why we are having trouble advancing. To seasoned ML researchers, the title of the video sounds like the speaker will be explaining techniques for analyzing neural net weights rather than talking about this.

chrisholder

It seems like a very powerful idea: the AI observes the system, learns to predict its behaviour, and then the rules behind those predictions are used to derive a mathematical statement. I wish the authors the best of luck.

antonkot

This is precisely what I've been working on for some time now, and it is very well explained in this presentation. Nice work! (The idea of pySR is outrageously elegant; I absolutely love it!)
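
For anyone who hasn't tried it, a minimal PySR sketch (assuming the standard PySRRegressor interface and some synthetic gravity-like data; an illustration only, not the pipeline from the talk) looks roughly like this:

```python
# Symbolic regression sketch: try to recover y = m1*m2 / r^2 from noisy samples.
import numpy as np
from pysr import PySRRegressor

rng = np.random.default_rng(0)
m1, m2, r = rng.uniform(0.5, 2.0, (3, 500))
X = np.column_stack([m1, m2, r])
y = m1 * m2 / r**2 + rng.normal(0.0, 0.01, 500)  # noisy "force" measurements (G = 1)

model = PySRRegressor(
    niterations=40,                         # search budget
    binary_operators=["+", "-", "*", "/"],  # building blocks for candidate formulas
)
model.fit(X, y)
print(model.sympy())  # ideally something close to x0*x1/x2**2
```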

cziffras

The folding analogy looks a lot like convolution. Also, the piecewise continuous construction of functions is used extensively in waveform composition in circuit analysis applications, though the notation is different, using multiplication by the unit step function u(t).
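
As a concrete instance of that notation (a standard textbook example, not one from the talk), a triangular pulse can be assembled from shifted unit steps:

f(t) = t [u(t) - u(t-1)] + (2 - t) [u(t-1) - u(t-2)]

which evaluates to t on [0, 1), to 2 - t on [1, 2), and to 0 elsewhere; each bracketed difference of steps switches one piece on and off.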

ElectronicsGuitar

So here we are. You guys seem to have been chosen by the algorithm for us to meet here. Welcome, for some reason.

Bartskol

It makes intuitive sense that a cat video is a better initialization than noise; it's a real measurement of the physical world.

andrewferguson

Being able to derive gravity laws from raw data is a cool example. How sensitive is this process to bad data? For example: non-unique samples, imprecise measurements, missing data (a poor choice of sample space), irrelevant data, biased data, etc. I would expect any attempt to derive new theories from raw data to have this sort of problem in spades.
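
One way to probe that question is to corrupt a synthetic dataset in controlled ways and check whether the recovered expression survives. The helper below is only a sketch (the corruption modes and the `corrupt` name are illustrative assumptions); it would be paired with whatever symbolic-regression fit is being tested:

```python
# Controlled corruption of a regression dataset, for robustness experiments.
import numpy as np

def corrupt(X, y, noise=0.0, freeze_feature=None, duplicate_frac=0.0, seed=0):
    """Return a corrupted copy of (X, y) to stress-test equation recovery."""
    rng = np.random.default_rng(seed)
    Xc = X.copy()
    yc = y + rng.normal(0.0, noise, y.shape)                   # imprecise measurements
    if freeze_feature is not None:                             # poor sample-space coverage:
        Xc[:, freeze_feature] = Xc[:, freeze_feature].mean()   # one input never varies
    if duplicate_frac > 0:                                     # non-unique samples
        idx = rng.choice(len(yc), int(duplicate_frac * len(yc)))
        Xc = np.vstack([Xc, Xc[idx]])
        yc = np.concatenate([yc, yc[idx]])
    return Xc, yc

# e.g. refit the symbolic regression on corrupt(X, y, noise=0.1) and compare the
# recovered expression (and its complexity) against the clean-data result.
```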

donald-parker

There are multiple awesome ideas in this presentation.

For example, the idea of having a neural net discover new physics, or simply be a better scientist than a human scientist. Such neural nets are on the verge of discovery, or maybe already in use right now.

But the symbolic distillation in multidimensional space is the most intriguing to me, and it is a subject that has been worked on for as long as neural networks have been around. A genetic algorithm is used here, but perhaps another (maybe bigger?) neural network is also needed for such symbolic distillation (a rough sketch of the two-stage idea appears below).

In a way, yes, the distillation is needed to speed up the inference process, but I can also imagine that a future AI (past the singularity) will not use symbolic distillation. It will simply create a better single model of reality in its network, and such a model will be enough to understand the surrounding reality and to make (future) predictions of its behavior.
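
A compact way to picture that two-stage idea: fit a flexible model first, then fit a symbolic regressor to the model's predictions rather than to the raw data. The sketch below is only an illustration (scikit-learn's MLPRegressor standing in for the neural net, PySR for the symbolic stage), not the pipeline used in the talk:

```python
# Symbolic distillation sketch: a net learns the data, symbolic regression learns the net.
import numpy as np
from sklearn.neural_network import MLPRegressor
from pysr import PySRRegressor

rng = np.random.default_rng(1)
X = rng.uniform(-2.0, 2.0, (1000, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]             # ground truth we pretend not to know

net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=3000).fit(X, y)

# Distill: densely resample the net's smooth surrogate, then search for a formula.
Xs = rng.uniform(-2.0, 2.0, (2000, 2))
expr = PySRRegressor(niterations=40,
                     binary_operators=["+", "-", "*"],
                     unary_operators=["sin"]).fit(Xs, net.predict(Xs))
print(expr.sympy())                             # hopefully close to sin(x0) + 0.5*x1
```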

nanotech_republika

I am once again re-reading David Foster Wallace's history of infinity, Everything and More. There he describes Bacon's Novum Organum. In Book One there is an apt statement that I would like to paste:

8. Even the effects already discovered are due to chance and experiment, rather than to the sciences. For our present sciences are nothing more than peculiar arrangements of matters already discovered, and not methods for discovery, or plans for new operations.

tehdii

I was wondering about, or rather missing, the concept of meta-learning with transformers, especially because most of the physics simulations shown are quite low-dimensional. Put a ton of physics equations into a unifying language format, treat each problem as a gradient step of a transformer, and predict on new problems. In this way, your transformer has learned on other physics problems and may infer the equation/solution to your problem right away. The difference from pre-training is that these tasks or problems are shown one at a time, rather than as the entire distribution without further specification. There has been work on this for causal graphs and for low-dimensional image data such as MNIST, where token count is the limiting factor of this approach, I believe.
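
One loose, concrete reading of "treat each problem as a gradient step" is a Reptile-style outer loop over tasks. The sketch below uses a small MLP and randomly generated placeholder tasks purely for illustration (not the transformer/tokenized-equation setup described above):

```python
# Reptile-style meta-learning skeleton: each "task" is data from one physics problem.
import copy
import torch
import torch.nn as nn

tasks = [(torch.randn(32, 4), torch.randn(32, 1)) for _ in range(10)]  # placeholder tasks
model = nn.Sequential(nn.Linear(4, 128), nn.ReLU(), nn.Linear(128, 1))
meta_lr, inner_lr, inner_steps = 0.1, 1e-3, 5

def adapt(net, X, y):
    """Take a few gradient steps on a single task and return the adapted net."""
    opt = torch.optim.SGD(net.parameters(), lr=inner_lr)
    for _ in range(inner_steps):
        opt.zero_grad()
        nn.functional.mse_loss(net(X), y).backward()
        opt.step()
    return net

for X, y in tasks:
    fast = adapt(copy.deepcopy(model), X, y)
    with torch.no_grad():                       # nudge slow weights toward the adapted ones
        for slow, fast_p in zip(model.parameters(), fast.parameters()):
            slow += meta_lr * (fast_p - slow)

# At meta-test time, a few `adapt` steps from `model` would specialize it to a new problem.
```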

randomsocialalias

Well, I'm not sure this will go anywhere except maybe modifying some of our archaic equations with nonlinear terms. The problem is probably related to NP-hardness and to using more expansive nonlinear methods to crack certain problems that are more narrowly specified. We will always not know what we don't know. Using more general nonlinear models was bound to greatly improve our simulations. The real question for NNs is: is this the MOST ACCURATE, most INSIGHTFUL, and BEST of the nonlinear methods for doing so? Somehow I doubt it, but it's certainly a nice proof of principle and a place to venture off from. To put all our faith in it might be a mistake, though. We might be looking at the limits to reductionism long predicted by mathematicians, and our first method that does not overfit billions of parameters may give us the illusion that it is the only way; we could be looking at a modern version of epicycles. If we want to really go further, we need to use such models not just to get better at copying reality, but to find general rules that allow its consistent creation and persistence through time. Perhaps one way to do this would be to consider physics-type symmetries on the weights.

zackbarkley

I came here to read all the insane comments, and I’m not disappointed.

laalbujhakkar

Love the definition of simplicity, I found that to be pretty insightful.

jim

There's a paper on Feature Imitating Networks (FINs) that has seen a few good applications in medical classification, and subtask induction is a similar line of thought. FINs are usually used to produce low-dimensional outputs, but I was thinking about using them for generative surrogate modeling. FINs can help answer the question of how to use neural networks to discover new physics (a rough sketch of the pretraining step follows at the end of this comment).

An idealized approach would turn every step of a coded simulator into something differentiable.

It occurs to me that the approach of this talk, and interpretability research generally, is essentially the inverse problem of trying to get neural networks to mimic arbitrary potentially nondifferentiable data workflows.
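
A minimal sketch of the FIN pretraining step mentioned in the first paragraph: teach a small network to reproduce a closed-form feature (variance is an arbitrary stand-in here), then reuse its weights as an initialization or frozen feature extractor. This is a loose illustration, not the architecture from the FIN papers:

```python
# Feature-imitating pretraining sketch: a small net learns to reproduce signal variance.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
scales = rng.uniform(0.1, 3.0, (5000, 1))
windows = rng.normal(0.0, scales, (5000, 64))   # 1-D signal windows with varying spread
target = windows.var(axis=1)                    # the closed-form feature to imitate

fin = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=1000).fit(windows, target)
# `fin` now approximates variance; its weights could seed a larger task-specific model
# instead of training that feature pathway from random initialization.
```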

lemurpotatoes

This is SO cool! My first thought was just the incredible speed you'd get once the neural net is simplified down. For systems that are heavily used, this is so important.

benjamindeworsop

The 'Avada Kedavra' potential of that pointy stick is immense. Brilliant presentation.

FrankKusel

Jesus Christ, okay YouTube, I will watch this video now; stop putting it in my recommendations every damn time.

GeneralKenobi

This is a very nice idea. I hope it will work! It will be very interesting to see new analytical expressions coming out of complicated phenomena.

ryam

Great presentation!
My main takeaway is that we need a more unified approach to neural network models. Interoperability is important and can substitute for or even supersede the quality increase of pre-training.

zefk