Deep Learning for Tabular Data: A Bag of Tricks | ODSC 2020

Table of Contents

Motivation: 0:15
Impute missing values: 1:37
Prepare categoricals, text, and numerics: 2:49, 3:10, 3:31
Properly validate: 3:54
Establish a benchmark: 5:24
Start with a low capacity network: 6:10
Determine output activation and loss function for classification and regression: 7:17, 8:26
Determine hidden activation: 9:46
Choose batch size: 10:57
Build learning rate schedule: 12:02
Determine number of epochs: 14:35
Track and interpret regression predictions: 15:30
Track metric and/or loss: 16:09
Track and interpret classification predictions: 16:45
Benchmark the network: 17:11
Dealing with discontinuities: 18:16
Tuning the network: 19:31
Handling overfitting vs. underfitting: 20:41
All tricks in one place: 21:35
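A couple of the talk's rules of thumb (output activation and loss function chosen by task type, batch size of roughly 1% of the dataset) can be sketched as a small helper. This is a minimal illustration, not code from the talk: the function name, the return format, and the floor of 32 on the batch size are my own assumptions.

```python
def suggest_config(task, n_rows):
    """Rule-of-thumb network settings for tabular deep learning.

    task: "binary", "multiclass", or "regression".
    n_rows: number of training examples.
    """
    # Batch size ~1% of the dataset; the floor of 32 is an assumption.
    cfg = {"batch_size": max(32, n_rows // 100)}
    if task == "binary":
        cfg.update(output_activation="sigmoid", loss="binary_crossentropy")
    elif task == "multiclass":
        cfg.update(output_activation="softmax", loss="categorical_crossentropy")
    elif task == "regression":
        cfg.update(output_activation="linear", loss="mse")
    else:
        raise ValueError(f"unknown task: {task}")
    return cfg

print(suggest_config("binary", 50_000))
```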

Stay connected with DataRobot!
Comments

Thank you for this. As a materials science researcher dabbling in applying ML techniques to my datasets, I found this great.

michaeljuhasz

I’ve also noticed a lack of emphasis on tabular data with respect to NNs. This is a great presentation and very informative. Thanks for putting it together.

briantroy

This is awesome - so glad to see some serious, methodical work on this.

markryan

Very interesting point on the suggested loss functions based on the distribution of the target variable. Learned a lot. Thank you.

alirezaamani

Great use of Grant Sanderson's graphics library

ZachMeador

Amazing amazing video, I learnt so much. Thank you

rupjitchakraborty

By the way, one-hot encoding and making an embedding are the same thing, except the embedding is faster. What do you do with your one-hot encoding? You multiply it by a matrix. What happens when you multiply a vector of 0's with only a single 1 by a matrix? That's right, you basically choose a column of the matrix. And that column is the embedding.

MiguelRaggi
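The equivalence described in the comment above can be checked numerically. A minimal sketch with numpy; the variable names and sizes are illustrative (with the one-hot written as a row vector, the product selects a row of the weight matrix):

```python
import numpy as np

vocab_size, embed_dim = 5, 3
# A toy embedding/weight matrix: row i is the embedding of category i.
W = np.arange(vocab_size * embed_dim, dtype=float).reshape(vocab_size, embed_dim)

token = 2                      # category index to encode
one_hot = np.zeros(vocab_size)
one_hot[token] = 1.0

# Multiplying the one-hot row vector by W selects row `token` of W:
# exactly what an embedding lookup does, without the wasted multiplies.
print(np.allclose(one_hot @ W, W[token]))   # True
```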

Nice vid, but tbh this sounds like a crazy amount of work for something that will only ever tangentially approach boosted trees performance on most tabular datasets

jivan

Thanks for sharing, these are golden pieces of advice!

sayedathar

You helped me a lot, amazing content and perfect presentation, keep it up.

mohamedesdairi

Thanks, really cool video! But I have a question. You said to set the batch size to 1% of the dataset. Does that advice apply only to deep learning on tabular datasets, or to other types of data too?

mehdiozel

How do you select a random subset from a given dataset?

TheOraware

What about 1D sensor data collected from physical and chemical instruments? I know we can still treat it as tabular data, but what about when we have thousands of variables and only hundreds of samples, and the variables are not single entities but rather groups of related features? How should the analysis be approached?

username