Credit Card Fraud Detection - Dealing with Imbalanced Datasets in Machine Learning

Error: The neural net predictions function uses shallow_nn every time instead of the model passed in; sorry about that! This changes the results a bit, but the main point, choosing and creating a model, is unaffected.
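For reference, a corrected version of that helper might look like the sketch below. The function name, the 0.5 threshold, and the assumption that the model's `predict` returns positive-class probabilities are illustrative, not taken from the video.

```python
import numpy as np

def neural_net_predictions(model, x, threshold=0.5):
    # Use the model that was passed in, not a hard-coded shallow_nn
    probs = np.asarray(model.predict(x)).flatten()
    return (probs >= threshold).astype(int)

# Any object with a .predict method works, e.g. this stand-in:
class ConstantModel:
    def __init__(self, p):
        self.p = p
    def predict(self, x):
        return np.full((len(x), 1), self.p)
```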

Subscribe if you enjoyed the video!


Comments

Stunning, bro, just a clear-cut explanation, not wasting a single minute. It's a gold mine of information.
Best video on a project, explained step by step.

vishnusunil

One thing worth mentioning would be the data wrangling part. It's often a good idea to check for feature relevance and feature importance. Funny enough, the transaction amount and time turned out not to be features with a substantial impact on whether the model classed a transaction as fraudulent or not.
Dropping them not only reduces bias in our data frame, it can also substantially increase the model's computation speed! (Mine got a 36% speed boost while losing only 0.01 points of F1 score and 0.02 of precision.)
Another thing would be to write a function that fits the training and validation data on each of the models automatically. It would substantially help with the cleanliness and readability of the project.
I would also consider hyperparameter tuning and pipelining everything together to make it a robust project. Still, great video and a great demonstration of how to check each model and measure its suitability for the problem at hand.

somechad
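The fit-and-evaluate helper suggested above might look something like this sketch, using scikit-learn and a synthetic imbalanced dataset as a stand-in for the fraud data; the function name, model choices, and dataset parameters are all illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def fit_and_score(models, x_train, y_train, x_val, y_val):
    """Fit each model on the training split; return its F1 on validation."""
    scores = {}
    for name, model in models.items():
        model.fit(x_train, y_train)
        scores[name] = f1_score(y_val, model.predict(x_val))
    return scores

# Synthetic stand-in for the fraud data: roughly 10% positive class
x, y = make_classification(n_samples=600, weights=[0.9], random_state=0)
x_tr, x_va, y_tr, y_va = train_test_split(x, y, stratify=y, random_state=0)
scores = fit_and_score(
    {"logreg": LogisticRegression(max_iter=1000),
     "rf": RandomForestClassifier(random_state=0)},
    x_tr, y_tr, x_va, y_va,
)
```

The same loop extends naturally to the other models in the video; swapping the dict values for `GridSearchCV` objects would cover the hyperparameter-tuning suggestion too.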

I'll be trying this soon, thanks Greg

mellowftw

Great video on classification. Good luck with the channel!

petarganev

Hope to see more of this kind in the coming days!!

machinelearning

How do you balance the test set when you don't have labels in real life?

sushantpargaonkar

Great video ❤❤ looking forward to more videos like this.

saitejatangudu

What is your opinion on doing oversampling (SMOTE) on the minority class?

joxa
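The standard answer to the SMOTE question is `SMOTE` from the `imbalanced-learn` package, applied to the training split only. As a rough illustration of what it does under the hood, here is a hand-rolled sketch of the core idea (interpolating between minority-class neighbours); the function and its parameters are my own, not any library's API.

```python
import numpy as np

def smote_oversample(x_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a randomly chosen minority point and one of its k nearest
    minority-class neighbours."""
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(x_min))
        dists = np.linalg.norm(x_min - x_min[i], axis=1)
        neighbours = np.argsort(dists)[1:k + 1]  # skip the point itself
        j = rng.choice(neighbours)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(x_min[i] + lam * (x_min[j] - x_min[i]))
    return np.array(synthetic)
```

One caveat either way: oversample only after the train/validation split, otherwise near-duplicates of training points leak into evaluation and inflate the scores.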

I'm getting errors on the test, train, and val runs for the numpy part.

KeKuHauPiOx

After training the model on the balanced population, please show its performance on the original, imbalanced population.

motilalmeher

Thanks Greg!!
Is it okay to do projects by following tutorial videos? And when should we start doing them on our own?

sakshirathi

12:51 Shouldn't the shape of y_train be (240000, 1), since it consists of exactly one column?

devjain
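On the shape question: both forms hold one label per row; NumPy just distinguishes a 1-D vector from an explicit single-column matrix. A quick illustration (the 240000 row count and 31-column layout are taken from the commenter's numbers, not verified against the video):

```python
import numpy as np

y = np.zeros(240000)        # 1-D label vector: shape (240000,)
y_col = y.reshape(-1, 1)    # explicit column matrix: shape (240000, 1)

# Slicing one column out of a 2-D array also yields a 1-D result,
# which is why y_train often shows up as (240000,) not (240000, 1):
data = np.zeros((240000, 31))
labels = data[:, -1]
```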

In the predict function, you're taking model as an input arg but calling shallow_nn on the return. Is that correct, or should it be model.predict()? 28:31

amannagarkar

Aren't you leaking information if you normalize before splitting the data?

MatTheBene
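That leakage concern is valid in general: normalizing before splitting lets test-set statistics influence the training features. The safe pattern is to fit the scaler on the training split only and reuse those statistics everywhere else; a minimal scikit-learn sketch (the synthetic data here is just a stand-in):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(1000, 4))
y = (x[:, 0] > 5.0).astype(int)

x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=0)

# Fit on the training split only, then transform both splits with the
# same train-derived means and stds, so no test statistics leak in.
scaler = StandardScaler().fit(x_train)
x_train_s = scaler.transform(x_train)
x_test_s = scaler.transform(x_test)
```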

Thank you for your amazing efforts! I don't have much experience building different models, so this video helped me a lot! Btw, I tried increasing max_depth to 6 in the random forest model, and it improved the model's performance more than I expected. Thanks again!

prathameshmore

Great video. I was just wondering if holding out a slice of the original dataset as a test set would be a more consistent way to evaluate the resampling procedure, because in production the model still has to deal with imbalanced data.

mahelvson

Really like your video!
One thing though: when you downsample the data, shouldn't you still validate/test on the original ratio of the data?
In your case, you are basically assuming the test data also has a 50/50 split, which in reality will never be the case.

garlicman
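The usual fix for the concern above is to balance only the training split and leave validation/test at the natural class ratio, so reported metrics reflect production conditions. A sketch with made-up data (the names, the 5% fraud rate, and the 800/200 split are all illustrative):

```python
import numpy as np

def undersample(x, y, seed=0):
    """Balance a binary dataset by randomly dropping majority-class rows."""
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    keep = np.concatenate([pos, rng.choice(neg, size=len(pos), replace=False)])
    rng.shuffle(keep)
    return x[keep], y[keep]

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 3))
y = (rng.random(1000) < 0.05).astype(int)   # ~5% "fraud"

# Balance the training portion only; the test split keeps its real ratio.
x_tr, y_tr = undersample(x[:800], y[:800])
x_te, y_te = x[800:], y[800:]
```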

Hey Greg, thank you for the video, but I have a question. At first we had a dataset with 280000 rows and 30 columns, but towards the end of the video we shrank it to only 984 rows. Doesn't this make the model worse, because it's trained on less data?
Or was the real problem that we were getting bad results at first because we had so much not_fraud data compared to fraud?

unlucky-

Thanks man. I'm going to try this one. It's really helpful. 🙏😍

sivanujansivakumar

I just wanna know whether it only reports accuracy metrics, or actually detects whether a card transaction is fraudulent or not.

vinsanargeese