How data mining works


In this video we describe data mining in the context of knowledge discovery in databases (KDD). Nowadays (around 2020) people use the term 'data science' instead, but it has a lot in common with what we see in this video, which is based on a reference from 1997.

Comments

Dear 叶熙泽
Suppose you have 3 variables with the following feature values:
x1 = [1, 2, 5, 6]
x2 = [4, 2, 7, 9]
x3 = [1, 2, 5, 6]

x1 and x3 are perfectly correlated (identical, in this case). It is therefore inefficient for any algorithm to use both variables, since x3 contributes exactly the same values as x1 to computations such as distances. So the proposal here is to use only x1 and x2.

Regards

tkorting
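
The reply above can be checked numerically. The following is a minimal sketch, not the method used in the video: it computes the Pearson correlation between the three variables from the reply and greedily drops any variable that is too correlated with one already kept. The helper names `pearson` and `drop_correlated` and the 0.95 threshold are assumptions made for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def drop_correlated(variables, threshold=0.95):
    """Keep a variable only if |r| with every already-kept variable is below threshold.

    The threshold is an illustrative assumption, not a value from the video.
    """
    kept = []
    for name, values in variables.items():
        if all(abs(pearson(values, variables[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Feature values taken from the reply above.
variables = {
    "x1": [1, 2, 5, 6],
    "x2": [4, 2, 7, 9],
    "x3": [1, 2, 5, 6],
}

r13 = pearson(variables["x1"], variables["x3"])  # identical variables
r12 = pearson(variables["x1"], variables["x2"])
kept = drop_correlated(variables)
print("r(x1, x3) =", r13)
print("r(x1, x2) =", round(r12, 2))
print("kept:", kept)
```

With these values x3 is dropped and x1 and x2 are kept. Note that x1 and x2 are themselves fairly correlated (about 0.90), so the outcome depends on the threshold chosen; a stricter threshold would also drop x2.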

This is the BEST explanation I have found so far, so thanks a lot!

fredhm

Could you please explain why we remove the highly correlated variables?

sajeedarahuman

Thanks Thales for a great video. It was short and to the point. I will be pointing my students here.

One question I see that you get over and over again is about the problems of correlated data. Another reason to find uncorrelated data is that many knowledge discovery engines and tools, such as neural networks and almost every statistical package, do not work well with highly correlated variables. That is because if you end up using two or more correlated variables in a prediction algorithm, the correlated variables tend to bias the prediction toward whatever those variables represent. So if you are using five variables, three correlated and two independent, then the three correlated variables potentially own 60% of the influence on the outcome (assuming the prediction engine starts with every variable having equal effect).

stevewalczak
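
The 60% figure in the comment above can be illustrated with a small sketch. It assumes a squared Euclidean distance with equal weight on every variable, which is one possible reading of "balanced effect" and not something the comment specifies. The function name `group_share` and the sample points are hypothetical.

```python
def group_share(point_a, point_b, group):
    """Fraction of the squared Euclidean distance contributed by the columns in `group`."""
    squared_diffs = [(a - b) ** 2 for a, b in zip(point_a, point_b)]
    total = sum(squared_diffs)
    return sum(squared_diffs[i] for i in group) / total

# Columns 0-2 are three copies of the same measurement (perfectly correlated);
# columns 3 and 4 are independent. Every variable differs by 1 between the points.
p = [1.0, 1.0, 1.0, 4.0, 7.0]
q = [2.0, 2.0, 2.0, 5.0, 8.0]

share = group_share(p, q, group=[0, 1, 2])
print(share)  # 0.6
```

The three copies carry a single piece of information but account for three fifths of the distance, which is the bias the comment describes.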

I'm trying to learn how to find unreleased music ...

vanessamerced

How do I data mine in Free Fire?
Can you reply to me?

rajnarayanmandi

Please give some tips on selecting the target data.

sadhasivamraman

Can you please tell me which data mining process methodology (CRISP-DM, SEMMA, KDD) is best for heart attack prediction and classification, and why?

nowxdi

Hi, how can we implement the algorithms that you mentioned in the video to create patterns?

ahmedobeid

Why is correlated data useless? Why do we have to make it uncorrelated?

NilloBandiKowa

Sir, can you explain how data mining works on road accidents?

leemajelifa

There is no volume in this video. Can you help?

vin

This is too abstract for me to understand why that data is processed in the first place. Is it something useful to the Bitcoin network, or just random data?

andrandonan

Thanks so much. I have liked, subscribed, and shared this video with my peers.

RazerBlackShark

Nicely explained! But I still have one question. In the transformation step you said "using two correlated variables is useless". I am not sure I understand why. Thanks.

nickye

Great video! I finally understand the relation between KDD and data mining, and now I can see the big picture. Thanks.

l.k.alghamdi

Nice, and thanks. It helped with my presentation on data mining.

raghavabattineni

Dear Thales,

It was an awesome presentation! Very happy to learn from you, now and always. Subscribed!

sivaji

If video game companies are data mining, I know it's generally seen as unethical for obvious security reasons. But it can help stabilize future launches and make them smoother, and it could even be used to track down a bug or glitch by finding a pattern in data from many players. Companies are then able to release day-one patches for new releases, cutting bug fixes that would take a few weeks to a few months down to an overnight patch update.

jx

If variables are correlated, it doesn't mean they are useless.

emmanuelameyaw