Amazon Data Scientist Mock Interview - Fraud Model

====== ✅ Details ======

🤔 Try these machine learning questions asked in Amazon's data science interviews:

"Q1 - What is the variance and bias trade-off?"

"Q2 - What's the difference between boosting and bagging?"

"Q3 - How would you detect seller fraud on Amazon?"

This is a mock interview session covering machine learning questions asked in Amazon's data science interviews. The interviewer was a data scientist at Google and PayPal. The interviewee is a candidate preparing for data science interviews at FAANG companies.

====== ⏱️ Timestamps ======

00:00 Intro
00:55 Variance & Bias Trade-Off
03:47 Boosting vs Bagging
06:33 Seller Fraud Modeling
26:24 Assessment

====== 📚 Other Useful Contents ======

1. Principles and Frameworks of Product Metrics | YouTube Case Study

2. How to Crack the Data Scientist Case Interview

3. How to Crack the Amazon Data Scientist Interview

====== Connect ======

DataInterview

Comments

Great video, Dan, it was eye-opening! Thank you so much from NYC! Just one note: boosting and bagging methods are not limited to tree-based ML systems and can be used with any ML method. However, they are much more popular with tree-based methods because of their fast training time and relatively straightforward application.

hsoley
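hsoley's point that bagging is not tied to trees can be illustrated with a minimal sketch. It assumes a nearest-centroid classifier as the base learner and made-up two-dimensional toy data; the function names and the "legit"/"fraud" labels are hypothetical and not from the video.

```python
import random
from collections import Counter

def fit_centroids(rows):
    """Base learner: a nearest-centroid classifier (deliberately not a tree)."""
    sums, counts = {}, Counter()
    for features, label in rows:
        counts[label] += 1
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_centroid(model, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(features, model[label]))
    return min(model, key=sq_dist)

def bagging_fit(rows, n_models=25, seed=0):
    """Train each base learner on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_centroids([rng.choice(rows) for _ in rows]) for _ in range(n_models)]

def bagging_predict(models, features):
    """Aggregate the base learners by majority vote."""
    votes = Counter(predict_centroid(m, features) for m in models)
    return votes.most_common(1)[0][0]

# Toy data (made up): two Gaussian blobs standing in for two seller classes.
rng = random.Random(42)
rows = [((rng.gauss(0, 1), rng.gauss(0, 1)), "legit") for _ in range(20)]
rows += [((rng.gauss(5, 1), rng.gauss(5, 1)), "fraud") for _ in range(20)]

models = bagging_fit(rows)
print(bagging_predict(models, (0.0, 0.0)), bagging_predict(models, (5.0, 5.0)))
```

Only `fit_centroids` and `predict_centroid` know anything about the base model, so swapping in a different learner leaves the bootstrap and voting code unchanged.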

PCA is a feature extraction technique. Feature selection techniques choose from the existing list of features, while extraction techniques create new features that capture the majority of the variance. Whatever the interviewee chose for feature selection seems good to me.

gpprudhvi
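The distinction gpprudhvi draws can be sketched on two correlated columns. This is a toy illustration under stated assumptions: the column names `feature_a` and `feature_b` and the data are invented, selection is reduced to "keep the existing column with the largest variance", and extraction is the first principal component computed in closed form for the 2x2 covariance matrix.

```python
import math
import random

def mean(values):
    return sum(values) / len(values)

def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def covariance(u, v):
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

def select_feature(columns):
    """Selection: keep one of the ORIGINAL columns (here, the highest-variance one)."""
    return max(columns, key=lambda name: variance(columns[name]))

def first_principal_component(x, y):
    """Extraction: build a NEW feature as a linear combination of x and y."""
    a, c, b = variance(x), variance(y), covariance(x, y)
    # Largest eigenvalue of the covariance matrix [[a, b], [b, c]].
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if b != 0:
        vx, vy = b, lam - a          # eigenvector belonging to lam
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    vx, vy = vx / norm, vy / norm
    mx, my = mean(x), mean(y)
    scores = [(xi - mx) * vx + (yi - my) * vy for xi, yi in zip(x, y)]
    return scores, (vx, vy)

# Made-up data: feature_b is feature_a plus a little noise.
rng = random.Random(0)
feature_a = [rng.gauss(0, 1) for _ in range(200)]
feature_b = [v + rng.gauss(0, 0.3) for v in feature_a]
columns = {"feature_a": feature_a, "feature_b": feature_b}

kept = select_feature(columns)
pc1, direction = first_principal_component(feature_a, feature_b)
print(kept, variance(columns[kept]), variance(pc1), direction)
```

The selected feature is one of the input columns, whereas `pc1` is a new column that mixes both inputs and carries at least as much variance as either of them.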

In bagging, we don't call the models weak learners. We use the term "weak learners" only in boosting, and specifically in AdaBoost, because it uses only stumps for prediction rather than full trees. That is why we call AdaBoost's models weak learners.

shilashm
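To make the "stumps as weak learners" idea in the comment above concrete, here is a minimal AdaBoost sketch with one-dimensional decision stumps. The toy dataset, the function names, and the threshold grid are assumptions chosen for illustration; this is a textbook-style sketch, not the method discussed in the video.

```python
import math

def stump_predict(stump, x):
    """A stump is (threshold, polarity): predict polarity below the threshold."""
    threshold, polarity = stump
    return polarity if x < threshold else -polarity

def best_stump(xs, ys, weights):
    """Pick the threshold/polarity pair with the lowest weighted error."""
    thresholds = [min(xs) - 0.5] + [x + 0.5 for x in sorted(set(xs))]
    best, best_err = None, float("inf")
    for threshold in thresholds:
        for polarity in (1, -1):
            stump = (threshold, polarity)
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(stump, x) != y)
            if err < best_err:
                best, best_err = stump, err
    return best, best_err

def adaboost_fit(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, err = best_stump(xs, ys, weights)
        err = min(max(err, 1e-12), 1 - 1e-12)    # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote strength of this stump
        ensemble.append((alpha, stump))
        # Up-weight the points this stump got wrong, then renormalize.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

def n_correct(ensemble, xs, ys):
    return sum(adaboost_predict(ensemble, x) == y for x, y in zip(xs, ys))

# Toy labels that no single stump can separate: +1, then -1, then +1 again.
xs = list(range(9))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1]

print(n_correct(adaboost_fit(xs, ys, rounds=1), xs, ys),
      n_correct(adaboost_fit(xs, ys, rounds=3), xs, ys))
```

Each stump on its own is only slightly better than guessing on this data, yet the weighted vote of three of them fits all nine points, which is the sense in which boosting combines weak learners.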

Concerning the '# of positive reviews' feature: I have to assume that there exists a subset of fraudulent sellers using bots or review farms to boost the number or ratio of positive reviews. If positive reviews are locally important for non-fraudulent true positives, I imagine that this could lead to a recall problem in our model. Thoughts?

rr

In classification we usually have a precision-recall trade-off, right?

shilashm
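The precision-recall trade-off raised in the comment above can be seen by moving the decision threshold on a classifier's scores. The scores and labels below are invented for illustration (1 = fraud), and `precision_recall` is a hypothetical helper, not a library function.

```python
def precision_recall(scores, labels, threshold):
    """Flag a case as positive when its score is at or above the threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up model scores and true labels.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

strict = precision_recall(scores, labels, threshold=0.75)
lenient = precision_recall(scores, labels, threshold=0.35)
print("strict threshold :", strict)
print("lenient threshold:", lenient)
```

On this toy data the strict threshold flags only true fraud cases but misses one of them, while the lenient threshold catches every fraud case at the cost of two false alarms.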

Great mock interview and I believe it is pretty representative! Thanks for providing this!

danielxing

Is this a typical interview for an L4 or L5 role?

aaronrasquinha

I would have asked about the provenance of the data, i.e. on what grounds the sellers and transactions were classified as fraud. If these were simply reported as fraud by other users, a fair share of these could be from bad-faith competitors. In this case, I would think of alternative ways to gather data, and propose a more lenient decision boundary for fraud.

The interviewee did not have time for a deep dive into the mechanisms for determining image/title misrepresentation. A subset of misrepresentation (image/title mismatch) could be captured by CV models.

qingyangzhang

Hi sir, I am Zakiyah Fathima M. I am 12 years old. I used to watch your videos and Sundas ma'am's channel. My dream is to become a data scientist. I know the programming language Python.

zakiyahfathimam.

Higher variance means more flexibility? In general, can't you look at variance in the same way you look at overfitting? That is, a model with very high variance will capture outliers and tend to overfit data that doesn't accurately represent the underlying phenomena that produced the data. In this case, wouldn't it make sense to say it does NOT correspond to more flexibility, since the higher variance means it is better suited for ONLY the training data? Just curious where my logic is straying from the interviewer's. Thank you for posting this, it has been very informative!

Drewbie_T
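One way to look at Drewbie_T's question: in the bias-variance sense, "variance" measures how much a model's prediction changes when the training set is redrawn, and a more flexible model can follow the noise in each draw. The sketch below compares a rigid model (predict the global mean) with a very flexible one (1-nearest neighbour). The data-generating function, noise level, and all names are assumptions made up for this illustration.

```python
import math
import random
import statistics

def sample_training_set(rng, n=30):
    """Draw a fresh noisy training set from y = sin(x) + noise (made-up process)."""
    xs = [rng.uniform(0, 6) for _ in range(n)]
    return [(x, math.sin(x) + rng.gauss(0, 1.0)) for x in xs]

def predict_mean(train, x0):
    """Rigid model: ignores x0 and always predicts the average target."""
    return sum(y for _, y in train) / len(train)

def predict_1nn(train, x0):
    """Flexible model: copies the target of the single closest training point."""
    return min(train, key=lambda point: abs(point[0] - x0))[1]

def prediction_variance(predict, x0=3.0, trials=200, seed=0):
    """Variance of the prediction at x0 across independently drawn training sets."""
    rng = random.Random(seed)
    preds = [predict(sample_training_set(rng), x0) for _ in range(trials)]
    return statistics.pvariance(preds)

var_mean = prediction_variance(predict_mean)
var_1nn = prediction_variance(predict_1nn)
print("variance of mean model:", var_mean)
print("variance of 1-NN model:", var_1nn)
```

The 1-NN model is flexible enough to reproduce any training set exactly, and that same flexibility is why its prediction swings with the noise in each sample. So flexibility and overfitting describe the same behaviour from two angles rather than contradicting each other.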

Is it just me, or would you rather do clustering to find labels, then classify?

xEl_ence

I feel like the dude got lost in the sauce with seller based, listing based type shit.

yoyo-uepf

Where do hyperparameters come into the decision boundary? What kind of intangible things are they cooking up on their own? God, please save us.

MrMandarpriya

Amazon Data Scientist Mock Interview - Fraud Model

Amazon Data Scientist Mock Interview - AB Testing

Amazon Data Scientist Interview Prep | Interview Coach

Amazon Data Science Interview: Linear Regression

Amazon Behavioral Interview Questions | Leadership Principles Explained

Amazon Data Science Business Case | FAANG Interview Prep

Amazon Data Scientist Interview Questions - Can You Solve Them?

The Amazon Data Science Interview

Amazon Interview Strategy - CRUSH The Loop

Amazon SQL Interview Question | Shipments Per Month | CONCAT & DATE_PART

Amazon Data Scientist Interview Questions | Statistics for Data Science

Amazon Prep Video - Applied Scientist (AS)

Amazon Deep Learning Interview: Justify a Neural Network

Data Science Job Interview – Full Mock Interview

Data Scientist Interview Tips & Career Advice (Uber, ex-Amazon)

STAR Method - How to Ace Your Amazon Interview

How To Crack Data Science Interviews In Amazon| Discussing The Entire Process

Amazon BI Engineer Interview - Executive Metrics | Business Case + SQL

How to use the STAR Method in Job Interviews 🌟 #careeradvice

Acing the Product Data Science Interview (for Facebook, Google, and Amazon Interviews)

Can YOU Pass this Amazon Job Interview Question?

Data Science Interview - Increasing Sales through A/B Testing (with Tinder Sr. Data Science Mgr)

Can You Solve This Amazon Interview Question? | Puzzles for Software Engineers Part-4 🔍

Solving an Amazon Data Science Interview Question in Python Pandas (medium difficulty)