Hanna Meyer: 'Machine-learning based modelling of spatial and spatio-temporal data'

Remote sensing is a key method in bridging the gap between local observations and spatially comprehensive estimates of environmental variables. For such spatial or spatio-temporal predictions, machine learning algorithms have proven to be a promising tool for identifying nonlinear patterns between locally measured and remotely sensed variables. While easy access to user-friendly machine learning libraries fosters their use in the environmental sciences, applying these methods is far from trivial. This holds especially true for spatio-temporal data, since dependencies in space and time bear the risk of overfitting and of considerable misinterpretation of the model performance.
In this introductory lecture I will introduce the idea of using machine learning for the (remote sensing based) monitoring of the environment and show how these methods can be applied in R via the caret package. In this context, error assessment is a crucial topic, and I will show the importance of "target-oriented" spatial cross-validation strategies when working with spatio-temporal data to avoid an overoptimistic view of model performance. As spatio-temporal machine-learning models are highly prone to overfitting caused by misleading predictor variables, I will introduce a forward feature selection method from the CAST package that works in conjunction with target-oriented cross-validation.
In summary, this talk aims to show how "basic" spatial machine-learning tasks can be performed in R, but also what needs to be considered for more complex spatio-temporal prediction tasks in order to produce scientifically valuable results. Based on this talk, we will go into a practical session on Tuesday, where machine-learning algorithms will be applied to two different spatial and spatio-temporal prediction tasks.
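
The "target-oriented" cross-validation mentioned in the description can be illustrated with a minimal sketch of one common variant, leave-location-out folds, in which every sample from a held-out location goes to the test set together. This is not the CAST or caret implementation (the talk itself uses R); it is a standard-library Python illustration, and the function name `leave_location_out_folds`, the station labels, and the fold count are all hypothetical.

```python
import random
from collections import defaultdict


def leave_location_out_folds(location_ids, k, seed=0):
    """Build k (train_idx, test_idx) pairs that hold out whole locations.

    location_ids: one location label per sample (hypothetical station IDs).
    All samples sharing a label land in the same test fold, so a model is
    never evaluated on a location it has already seen during training.
    """
    # Group sample indices by their location label.
    by_location = defaultdict(list)
    for idx, loc in enumerate(location_ids):
        by_location[loc].append(idx)

    locations = sorted(by_location)
    if k > len(locations):
        raise ValueError("k must not exceed the number of distinct locations")

    # Shuffle locations (not samples) so the fold assignment is random
    # but reproducible for a given seed.
    random.Random(seed).shuffle(locations)

    folds = []
    for fold in range(k):
        held_out = set(locations[fold::k])
        test_idx = sorted(i for loc in held_out for i in by_location[loc])
        train_idx = [i for i, loc in enumerate(location_ids)
                     if loc not in held_out]
        folds.append((train_idx, test_idx))
    return folds


# Hypothetical data: four stations with three time steps each.
location_ids = ["A", "A", "A", "B", "B", "B", "C", "C", "C", "D", "D", "D"]
folds = leave_location_out_folds(location_ids, k=2)

for train_idx, test_idx in folds:
    train_locs = sorted({location_ids[i] for i in train_idx})
    test_locs = sorted({location_ids[i] for i in test_idx})
    print("train locations:", train_locs, "| test locations:", test_locs)
```

With a purely random split, samples from the same station would typically appear in both training and test sets, which is the situation the description warns can make performance estimates overoptimistic. Grouping by location before splitting avoids that leakage.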

Comments

I think there was a misunderstanding of the last question, asked at the end (50:31):
There are, of course, no data available for the response variable in the more remote areas of Antarctica. The question was how a different approach to cross-validation would give better predictions for those areas.

perfectmoments

Awesome video. Big shout-out from Brazil.

theforester_

Thanks for sharing, it's helpful for me!

gezahagnnegash

Thanks for posting, very helpful and interesting.

ritwek

In the end, two factors are mixed here: the features and the CV method. It is therefore not possible to tell what the effect of the CV method alone is.

In the lecture, the problem seems to lie with the coordinate features, which cause overfitting. The solution indeed addressed this through feature selection with FFS, which removed those features from training. So whichever CV method is used, the cause of the overfitting is the features and not the CV method, at least in this case.

Only if the model had been trained with spatial CV while keeping the coordinate features, and had not overfitted, could one conclude that the CV method is indeed the cause and the solution.

natannvw

This seems to be completely disconnected from the field of climate informatics and all the sophisticated methods used there. There is no mention of physically informed, deterministic models, which already make good global predictions, or of anything regarding data assimilation; it seems weird to ignore this. This talk boils down to quite simple things: we have observations and we model them with simple ML models because they can deal with complex relationships. We validate these algorithms appropriately. Not much more than that, when a key issue, *what exactly it is you are trying to model* aside from tree species, is right there for discussion.

TheSwordfish-gr
