Using NeuralODEs in Real Life Applications | JuliaCon 2023

Despite the great potential of NeuralODEs - the structural combination of an artificial neural network and an ODE solver - they are not yet a standard tool for modeling physical systems. We believe there are two main reasons for that. First, NeuralODEs develop their full potential when paired with existing models that capture some basic physics. However, existing models are typically set up in dedicated simulation tools and are thus not available in Julia. Second, training NeuralODEs is tricky and not yet plug-and-play. In this tutorial, we share methods to employ models from various simulation tools in NeuralODEs, as well as training strategies that can deal with challenges of industrial scale. Both were validated in real examples ranging from automotive to medical use cases.
The workflow has four steps:
(1) Export your model from your favorite simulation tool as an FMU,
(2) Import the FMU in Julia,
(3) Use the FMU as a layer of an ANN in Julia,
(4) Train the resulting NeuralODE, called a NeuralFMU.
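To make the structure behind steps (3) and (4) concrete, here is a minimal sketch in plain Python rather than Julia. It is an illustration under stated assumptions, not the speakers' implementation: `physics_derivative` is a hypothetical stand-in for an imported FMU (a frictionless pendulum), `ann_correction` is a hand-written toy network, and explicit Euler replaces a real ODE solver. No actual FMI or Julia API is used.

```python
import math

def physics_derivative(state):
    """Hypothetical stand-in for the imported FMU: frictionless pendulum, dx/dt = f(x)."""
    angle, velocity = state
    return [velocity, -9.81 * math.sin(angle)]

def ann_correction(derivative, state, weights):
    """Toy one-hidden-layer network that adjusts the physics derivative."""
    inputs = derivative + state  # 4 inputs: [d_angle, d_velocity, angle, velocity]
    hidden = [math.tanh(sum(w * v for w, v in zip(row, inputs)))
              for row in weights["hidden"]]
    delta = sum(w * h for w, h in zip(weights["out"], hidden))
    # Only the acceleration is corrected; d(angle)/dt = velocity stays exact.
    return [derivative[0], derivative[1] + delta]

def hybrid_rhs(state, weights):
    """Right-hand side of the hybrid model: physics layer followed by the ANN."""
    return ann_correction(physics_derivative(state), state, weights)

def solve(state, weights, dt=0.01, steps=200):
    """Explicit Euler for brevity; a real setup would use an adaptive ODE solver."""
    trajectory = [list(state)]
    for _ in range(steps):
        d = hybrid_rhs(state, weights)
        state = [s + dt * ds for s, ds in zip(state, d)]
        trajectory.append(state)
    return trajectory

# Zero weights make the ANN a no-op, so the hybrid model starts out as pure physics.
zero_weights = {"hidden": [[0.0] * 4 for _ in range(3)], "out": [0.0] * 3}
trajectory = solve([0.5, 0.0], zero_weights)
```

With all weights at zero the hybrid model reproduces the physics model exactly, which is one simple way to give training a neutral starting point. Training would then adjust `weights` so that `trajectory` matches measured data.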
Unfortunately, the fourth step is more complicated than it looks at first glance. NeuralFMUs, and NeuralODEs in general, tend to exhibit expensive gradient computation and thus very long training times. Further, they tend to converge to local minima or enter unstable regions. To cope with these challenges, we present strategies to:
• design a target-oriented topology for such a model
• initialize or even pre-train such models
• deal with large data and use batching
• efficiently train such multi-domain models
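As an illustration of the batching idea, the sketch below cuts one long recording into short windows and evaluates the loss on a single window at a time. This is one common approach and not necessarily the strategy presented in the talk. It assumes the full state is measured, so each window can start from data. All names are hypothetical, and `toy_simulate` merely stands in for the hybrid model.

```python
import random

def make_batches(times, measurements, batch_len):
    """Cut one long recording into consecutive windows of batch_len samples."""
    batches = []
    for start in range(0, len(times) - batch_len + 1, batch_len):
        stop = start + batch_len
        batches.append({
            "t": times[start:stop],
            "x0": measurements[start],          # initial state taken from the data
            "target": measurements[start:stop],
        })
    return batches

def batch_loss(simulate, batch):
    """Mean squared error of one short simulation against its data window."""
    predicted = simulate(batch["x0"], batch["t"])
    errors = [(p - m) ** 2 for p, m in zip(predicted, batch["target"])]
    return sum(errors) / len(errors)

def toy_simulate(x0, t):
    """Hypothetical stand-in for the hybrid model: exponential decay from x0."""
    return [x0 * 0.9 ** (ti - t[0]) for ti in t]

times = list(range(100))
measurements = [0.9 ** ti for ti in times]      # synthetic scalar "recording"
batches = make_batches(times, measurements, batch_len=10)

random.seed(0)
batch = random.choice(batches)                  # one training step uses one window
loss = batch_loss(toy_simulate, batch)
```

Short windows keep each simulation, and therefore each gradient computation, cheap. Restarting from measured states also limits how far an untrained model can drift into unstable regions.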

Equipped with knowledge about typical pitfalls and their workarounds, workshop participants will have an easier time dealing with their own NeuralODE applications.

Time Stamps:
00:00 Welcome!

Comments

Very impressive work. I am looking forward to using those tools and also reading the paper for the 2023 Modelica Conference.

A few questions:
1) How are events handled? For example, the model could be in different states ("state" as in state machine) due to time or state events while the der(x) vector passed into the ANN is identical, so I am wondering how the ANN would react to that.
2) How do you get the training "started"? I might have missed this in your presentation, but how would the ANN make reasonable predictions during the initial training phases, when the prediction quality is low? I guess this is done through the eigen-informed training?
3) Have you tried CS-FMUs, which would correspond to a discrete ANN? I assume the process is similar?

hereisthecase