Model Metrics | Introduction to Text Analytics with R Part 9

This talk provides an overview of model metrics, with specific coverage of:
1. The importance of metrics beyond accuracy for building effective models.
2. Sensitivity and specificity, and their importance for building effective binary classification models.
3. The importance of feature engineering for building the most effective models.
4. How to identify if an engineered feature is likely to be effective in Production.
5. Improving our model with an engineered feature.
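A minimal base-R sketch of the sensitivity and specificity ideas above, using a made-up 2x2 confusion matrix for a ham/spam classifier (caret's confusionMatrix() reports the same statistics):

```r
# Hypothetical counts: rows = predictions, columns = actual labels,
# matching caret's confusionMatrix(data = predicted, reference = actual).
cm <- matrix(c(960,  15,    # predicted ham:  960 true ham, 15 missed spam
                10, 140),   # predicted spam: 10 false alarms, 140 true spam
             nrow = 2, byrow = TRUE,
             dimnames = list(Prediction = c("ham", "spam"),
                             Reference  = c("ham", "spam")))

accuracy    <- sum(diag(cm)) / sum(cm)                 # overall hit rate
sensitivity <- cm["ham", "ham"]   / sum(cm[, "ham"])   # correct ham / actual ham
specificity <- cm["spam", "spam"] / sum(cm[, "spam"])  # correct spam / actual spam
```

Accuracy alone can look excellent on imbalanced data like this (mostly ham); sensitivity and specificity reveal how each class fares separately.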

The data and R code used in this series are available here:


#modelmetrics #textanalytics
Comments

Hi Dave, absolutely loved the video series! I haven't seen ANY other tutorial that goes into so much depth and walks step-by-step through such a problem. Great work! Please do keep sharing more such videos.

archowdhury

Probably the best NLP-in-R and overall ML-in-R series I've seen.

djangoworldwide

I started the series today and I'm addicted to it... Congratulations on your work! I'm already a fan!

rafaelsilva

Hi Dave, this is one of the best tutorials I have ever seen, thank you very much. I was wondering if you have any plans to cover testing the model with test data and, eventually, how to put this into production?

BhakthiLiyanage

Amazing work. Stunning lecture. So exciting!

TomerBenDavid

Hi Dave! If I do not have these two categories (ham and spam), just respondents row by row with their text, what should I do?

vaz.felipe

Another great video... waiting for the next one. How many videos are there in this series?

junaideffendi

I can't run the rf.cv.1 function. Can anyone please help me?

farhanamim

Hi Dave, your videos are great. One quick question: if we have more than two classes, and thereby a multi-dimensional confusion matrix, how are we going to deal with accuracy, sensitivity, and specificity?

WorldAroundWe
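On the multi-class question above: caret's confusionMatrix() reports per-class sensitivity and specificity in a one-vs-rest fashion (each class in turn is treated as "positive" and all others as "negative"). A minimal base-R sketch with a hypothetical 3-class matrix:

```r
# Hypothetical 3-class confusion matrix: rows = predicted, columns = actual.
cm <- matrix(c(50,  3,  2,
                4, 45,  6,
                1,  2, 37),
             nrow = 3, byrow = TRUE,
             dimnames = list(Prediction = c("A", "B", "C"),
                             Reference  = c("A", "B", "C")))

# One-vs-rest sensitivity/specificity for class k.
one_vs_rest <- function(cm, k) {
  tp <- cm[k, k]
  fn <- sum(cm[, k]) - tp   # actually k, predicted something else
  fp <- sum(cm[k, ]) - tp   # predicted k, actually something else
  tn <- sum(cm) - tp - fn - fp
  c(sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp))
}

metrics <- sapply(rownames(cm), function(k) one_vs_rest(cm, k))
```

Accuracy stays a single number (sum of the diagonal over the grand total); sensitivity and specificity become a vector, one value per class.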

Thanks a lot Dave for these, immensely helpful.

One correction, the correct confusion matrix command should have been

confusionMatrix(rf.cv.1$finalModel$predicted, train.svd$Label)

instead of

confusionMatrix(train.svd$Label, rf.cv.1$finalModel$predicted)

since data is the first parameter and reference the second. Hence, spam precision is good but spam recall is poor, rather than the other way round.

AmitYadav-zkzm
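The argument-order point above can be sketched with tiny made-up vectors using base R's table(); caret's confusionMatrix(data, reference) follows the same prediction-first convention. Swapping the arguments transposes the table, which swaps row-based statistics (precision) with column-based ones (recall/sensitivity):

```r
# Predictions first, truth second (caret's convention).
pred  <- factor(c("spam", "ham", "ham",  "spam", "ham"), levels = c("ham", "spam"))
truth <- factor(c("spam", "ham", "spam", "spam", "ham"), levels = c("ham", "spam"))

correct <- table(Prediction = pred,  Reference = truth)  # right order
swapped <- table(Prediction = truth, Reference = pred)   # arguments reversed

# The swapped table is the transpose of the correct one, so any statistic
# computed from rows vs columns trades places.
spam_recall    <- correct["spam", "spam"] / sum(correct[, "spam"])  # column total
spam_precision <- correct["spam", "spam"] / sum(correct["spam", ])  # row total
```

With these counts, spam precision is perfect while spam recall is not, which is exactly the distinction the comment makes.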

When using the confusion matrix, I believe we have to be careful about the order of the actual and predicted parameters that go into the function. What I mean is that confusionMatrix(actual, predicted) would yield different results compared to confusionMatrix(predicted, actual) - is that correct?

rajeshwaran

Hi Dave, you are talking about loading your cached results. How can I cache results myself? For my own project it seems caching results will save me a lot of time when I want to return to my own generated results.

mbeekink
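One simple answer to the caching question above is base R's saveRDS()/readRDS(): run the expensive step once, save the result to disk, and reload it on later runs. A minimal sketch (the file name and the slow_fit() placeholder are illustrative, not from the series):

```r
# Cache file location (tempdir() here just to keep the sketch self-contained;
# a real project would use a stable path inside the project folder).
cache_file <- file.path(tempdir(), "rf.cv.1.rds")

slow_fit <- function() {
  Sys.sleep(0.1)           # stand-in for a long cross-validation run
  list(model = "fitted")   # stand-in for the fitted model object
}

if (file.exists(cache_file)) {
  rf.cv.1 <- readRDS(cache_file)   # cheap: load the cached result
} else {
  rf.cv.1 <- slow_fit()            # expensive: compute once
  saveRDS(rf.cv.1, cache_file)     # cache it for next time
}
```

saveRDS() serializes a single R object, so the whole fitted model (including its resampling results) round-trips intact.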

Hey, I need help. I want to run confusionMatrix on rpart.cv.2, and when I use the code confusionMatrix(train.tokens.tfidf.df$Label, I get: Error: `data` and `reference` should be factors with the same levels.

Any suggestions for my problem?

Thank you

PCI
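The error above usually means the two inputs to confusionMatrix() are not factors sharing one level set (e.g. one side is a character vector, or has extra or missing levels). A minimal sketch of a common fix, with illustrative variable names, is to re-factor the predictions using the levels of the reference labels:

```r
# Reference labels are a factor with known levels.
truth <- factor(c("ham", "spam", "ham", "ham"), levels = c("ham", "spam"))

# Predictions came back as plain character, which triggers the error.
pred <- c("ham", "ham", "spam", "ham")

# Re-factor the predictions against the truth's levels so both sides match.
pred <- factor(pred, levels = levels(truth))

table(Prediction = pred, Reference = truth)  # now both sides share levels
```

The same trick helps when a class is absent from one side: factor(x, levels = levels(truth)) keeps the missing level with a zero count instead of dropping it.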

Hi Dave, sensitivity is the ratio of correct ham predictions over the actual total ham values, not over the total ham predictions. The column totals are the actual values, and the rows are the total predictions for the corresponding label class. Your formula is correct, but what you said @8:04 was different: you defined precision, which is correct ham predictions over total ham predictions (the row total).
Similarly, specificity is the ratio of correct spam predictions over the actual total spam values (the second column total), not the total spam predictions (the second row total).
I hope I've not confused you - I'm pointing out the difference between total predictions and total values...

shobhamourya
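The row-total vs column-total distinction above, as a minimal base-R sketch with made-up counts (rows = predictions, columns = actual labels):

```r
cm <- matrix(c(90, 20,
                5, 35),
             nrow = 2, byrow = TRUE,
             dimnames = list(Prediction = c("ham", "spam"),
                             Reference  = c("ham", "spam")))

# Sensitivity divides by the COLUMN total: all messages that truly are ham.
ham_sensitivity <- cm["ham", "ham"] / sum(cm[, "ham"])   # 90 / 95

# Precision divides by the ROW total: all messages predicted to be ham.
ham_precision   <- cm["ham", "ham"] / sum(cm["ham", ])   # 90 / 110
```

Both use the same numerator (correct ham predictions); only the denominator changes, which is exactly the difference between "total values" and "total predictions".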

Hi Dave, I am running through the videos and applying this to the Random Acts of Pizza dataset from Kaggle. I am up to the point of running random forest on train.svd and viewing the results. I have used the same stratified splits as in the videos; the data comes out at roughly 75/25. However, when I view the resulting confusion matrix, it looks like this.

I expected a similar split on the reference, but this is way off. Am I doing something wrong, or is my expectation incorrect and this is more to do with the quality of the model so far? I'd appreciate any tips.

           Reference
Prediction FALSE TRUE
     FALSE  2123   10
     TRUE    692    4

terrybrooks