[S1E5] Calculating Training Errors | 5 Minutes With Ingo

If overfitting is such a common problem in machine learning, how can you tell whether you have fallen into this trap? And more generally, how do you know how well a predictive model will perform in the future? This is a challenge, since analysts don’t know in advance exactly what situations their model will encounter once it is deployed.

In this episode, Ingo Mierswa, CEO & Data Scientist-In-Demand at RapidMiner, discusses why calculating the training error is always a bad idea. He explains why optimizing for it leads to an always-perfect 1-Nearest-Neighbors classifier, and what analysts should do instead to figure out whether their models are already overfitting and how well they will predict future data. Sounds easy, right?
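The 1-Nearest-Neighbors point can be seen in a few lines of code: on the training set, each example's nearest neighbor is itself, so the training error is exactly zero no matter how noisy the data is, while the error on held-out data stays high. The sketch below is illustrative only (the 1-D data, the 20% label noise, and all names are assumptions, not from the episode):

```python
import random

def one_nn_predict(train, x):
    # 1-NN: return the label of the closest training point.
    nearest = min(train, key=lambda pt: abs(pt[0] - x))
    return nearest[1]

def error_rate(train, data):
    # Fraction of examples in `data` that 1-NN misclassifies.
    wrong = sum(1 for x, y in data if one_nn_predict(train, x) != y)
    return wrong / len(data)

def sample(n):
    # Noisy 1-D data: true label is 1 for x > 0, but 20% of labels are flipped.
    points = []
    for _ in range(n):
        x = random.uniform(-1, 1)
        y = 1 if x > 0 else 0
        if random.random() < 0.2:
            y = 1 - y
        points.append((x, y))
    return points

random.seed(0)
train, test = sample(200), sample(200)

# Training error: every point is its own nearest neighbor, so this is 0.0.
train_err = error_rate(train, train)
# Held-out error: the label noise alone guarantees roughly 20% mistakes.
test_err = error_rate(train, test)
```

The zero training error here says nothing about predictive quality; the gap between `train_err` and `test_err` is exactly the overfitting the episode warns about, which is why evaluation needs held-out data.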

Plus, Lloyd Christmas pops in and "likes it a lot", we see the reflection of Nirmal Patel and, finally, Ingo works with Data Scientist Number 7 to get over his temporary depression by helping him with a logistic regression. (It's funny because it rhymes!)


MUSIC CREDITS:
Opening: Theme from Mission Impossible, Island Records, Inc., 1996.
Closing: Everybody Hurts, Automatic For The People, R.E.M., Warner Brothers, 1992.