#8 Data Preprocessing In Data Mining - 4 Steps |DM|

Abroad Education Channel :

Company Specific HR Mock Interview :
A seasoned professional with over 18 years of experience across the Product, IT Services, and Agri industries, with extensive Human Resource Management experience in Talent Acquisition, Personnel Management, Compensation and Benefits, Performance Reviews, Training & Development, and all other facets of Human Resources, will conduct mock HR interviews, give feedback on each session, and offer guidance on interview techniques.

paper presentation for semester exams :
Comments
Author

Super mam ..it's very helpful those who r studying one night before exam ❤️

Skvalibhasha
Author

Legends watched it till the end. Thanks ma'am it's easier than other tutors.

manifestation
Author

The best thing in your videos is every topic is broken down into simple one. We are waiting for more and more subjects.

horc_rux_
Author

Ma'am can you upload unit 2, because we have exams on 11th, and thanks for videos we love ur videos ❤️. We are waiting for further videos

kingofdwarka
Author

All the steps of pre-processing are explained very good and that too in simple language

sahilkale
Author

Thanks for creating DM playlist and explaining evth. so simply and in an easy way.
It is really helpful. Thanks again for creating wonderful videos.
Stay safe. Tc.

eklavyak
Author

08- Data Preprocessing In Data Mining
(the process of transforming raw data into an understandable format)

Four major tasks


1- data cleaning - removing noisy data (incorrect, incomplete, or inconsistent values) and filling in missing values
For missing values, replace with N/A, the mean (normally distributed data), the median (skewed, non-normal data), or the most probable value
fill manually for small data sets, automatically for large data sets
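A minimal sketch of the automatic replacement described above, assuming missing entries are marked with `None`. The function name `fill_missing` and the sample ages are made up for illustration and are not from the video.

```python
from statistics import mean, median

def fill_missing(values, strategy="mean"):
    """Replace None entries with the mean or median of the known values."""
    known = [v for v in values if v is not None]
    if not known:
        return list(values)  # nothing to estimate from
    fill = mean(known) if strategy == "mean" else median(known)
    return [fill if v is None else v for v in values]

# Hypothetical data: 90 is an extreme value that skews the distribution.
ages = [22, None, 25, 27, None, 90]
filled_mean = fill_missing(ages, "mean")      # fill value is pulled up by 90
filled_median = fill_missing(ages, "median")  # fill value is less affected by 90
```

With skewed data like this, the mean fill (41) sits far from most of the values, which is why the median (26) is usually the safer choice for non-normal data.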

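Noisy data is handled with three methods
-1 Binning - sort the data, then assign the values into bins
smoothing process - replaces erroneous values within each bin
-smooth by bin mean
-smooth by bin median
-smooth by bin boundary (each value moves to the closer of the bin's min or max)

-2 Regression - fit the data to a function and use the predicted values to smooth it

-3 Clustering - similar data items are grouped into clusters
dissimilar items fall outside the clusters and are treated as outliers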
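The binning step can be sketched as follows, assuming equal-frequency bins (each bin holds the same number of sorted values). The helper names and the price list are illustrative assumptions, and ties in boundary smoothing are sent to the lower boundary by choice.

```python
from statistics import mean, median

def equal_frequency_bins(values, bin_size):
    """Sort the values and cut them into consecutive bins of bin_size items."""
    ordered = sorted(values)
    return [ordered[i:i + bin_size] for i in range(0, len(ordered), bin_size)]

def smooth_by_mean(bins):
    """Replace every value in a bin with the bin mean."""
    return [[mean(b)] * len(b) for b in bins]

def smooth_by_median(bins):
    """Replace every value in a bin with the bin median."""
    return [[median(b)] * len(b) for b in bins]

def smooth_by_boundary(bins):
    """Move each value to the closer of the bin's min or max (ties go to min)."""
    smoothed = []
    for b in bins:
        lo, hi = b[0], b[-1]  # bins are sorted, so these are min and max
        smoothed.append([lo if v - lo <= hi - v else hi for v in b])
    return smoothed

prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]  # hypothetical sample
bins = equal_frequency_bins(prices, 3)
```

For this sample the bins are `[4, 8, 15]`, `[21, 21, 24]`, and `[25, 28, 34]`, so the bin means are 9, 22, and 29.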

2- data integration - multiple heterogeneous sources of data are combined into a single dataset
Two types of data integration
1- tight coupling - data is copied from the sources and combined in one physical location
2- loose coupling - only an interface is created; data is combined and accessed through the interface
while it stays stored in the source DBs
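A toy sketch of the difference between the two types, assuming two in-memory "sources" with different column names. The source data, `tight_coupling`, and the `LooseCoupling` class are all hypothetical; real systems would use a warehouse load or a query mediator instead.

```python
# Two hypothetical sources with different schemas.
crm_rows = [{"cust_id": 1, "name": "Asha"}, {"cust_id": 2, "name": "Ravi"}]
billing_rows = [{"customer": 1, "amount": 250}, {"customer": 2, "amount": 90}]

def tight_coupling(crm, billing):
    """Copy and merge the sources into one physical store."""
    amounts = {row["customer"]: row["amount"] for row in billing}
    return [
        {"id": r["cust_id"], "name": r["name"], "amount": amounts.get(r["cust_id"])}
        for r in crm
    ]

class LooseCoupling:
    """Interface only: nothing is copied, each lookup reads the sources."""

    def __init__(self, crm, billing):
        self._crm = crm
        self._billing = billing

    def lookup(self, cust_id):
        name = next((r["name"] for r in self._crm if r["cust_id"] == cust_id), None)
        amount = next(
            (r["amount"] for r in self._billing if r["customer"] == cust_id), None
        )
        return {"id": cust_id, "name": name, "amount": amount}

warehouse = tight_coupling(crm_rows, billing_rows)  # a separate combined copy
view = LooseCoupling(crm_rows, billing_rows)        # reads sources on demand
```

If a source row changes later, `view.lookup(...)` sees the change immediately, while `warehouse` keeps the old value until it is reloaded.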

3- data reduction - the volume of data is reduced to make analysis easier
methods for data reduction
1- dimensionality reduction
reduces the number of input variables in the dataset, because too many input variables -> poor performance
2- data cube aggregation - data is combined to form a data cube and redundant, noisy data is removed
3- attribute subset selection (attributes are columns)
highly relevant attributes should be used, others are discarded (data is reduced)
4- numerosity reduction - store only a model or a sample of the data rather than the entire dataset
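Two of these methods can be sketched briefly. This assumes "relevance" is measured by the absolute Pearson correlation with a target column, and that numerosity reduction is done by simple random sampling; both are just one possible choice, and the column names, threshold, and data are invented for the example.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def select_attributes(columns, target, threshold=0.5):
    """Attribute subset selection: keep columns strongly related to the target."""
    return [
        name for name, col in columns.items()
        if abs(pearson(col, target)) >= threshold
    ]

def sample_rows(rows, k, seed=0):
    """Numerosity reduction: keep a random sample of k rows (without replacement)."""
    return random.Random(seed).sample(rows, k)

# Hypothetical dataset: columns are attributes, score is the target.
columns = {
    "hours_studied": [1, 2, 3, 4, 5],
    "shoe_size": [9, 6, 8, 6, 9],
    "classes_missed": [5, 4, 3, 2, 1],
}
score = [52, 58, 65, 71, 80]
kept = select_attributes(columns, score)
subset = sample_rows(list(range(100)), 10)
```

Here `shoe_size` is dropped because it has almost no relationship with the score, while the other two attributes are kept.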

4- data transformation - data is transformed into a form suitable for the DM process
Four methods

1- Normalization - scale the data values into a specified range (eg; -1.0 to 1.0 or 0 to 1)
2- Attribute construction - new attributes are created from existing ones
3- Discretization - raw values are replaced by interval labels
eg; 10, 12, 13, 14, 21, 22, 34, 36 -> 10-20, 20-30, 30-40
4- Concept hierarchy generation - converting attributes from a lower-level concept to a higher-level one
eg; city -> country
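The four transformation methods can be sketched in a few lines each. This assumes min-max scaling for the range step and fixed-width intervals of 10 for discretization (matching the example above); the function names, the `area` attribute, and the city-to-country table are hypothetical.

```python
def min_max(values, new_min=0.0, new_max=1.0):
    """Scale values linearly into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [new_min for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]

def add_area(rows):
    """Create a new 'area' attribute from the existing width and height."""
    return [{**r, "area": r["width"] * r["height"]} for r in rows]

def discretize(values, width=10):
    """Replace each value with the label of its interval [start, start + width)."""
    labels = []
    for v in values:
        start = (v // width) * width
        labels.append(f"{start}-{start + width}")
    return labels

# Hypothetical lookup table for the city -> country hierarchy.
CITY_TO_COUNTRY = {"Mumbai": "India", "Pune": "India", "Lyon": "France"}

def generalize(cities):
    """Map each low-level value (city) to its higher-level concept (country)."""
    return [CITY_TO_COUNTRY.get(c, "Unknown") for c in cities]

values = [10, 12, 13, 14, 21, 22, 34, 36]
scaled = min_max(values)
labels = discretize(values)
```

Note that each interval here includes its lower bound and excludes its upper bound, so a value of exactly 20 would land in `20-30`, not `10-20`.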



stephanieezat-panah
Author

Lots of effort, mam, thank you for this ❤ knowledge. Our exams are near; no need to read all the stuff, just watch your videos 2 times and done for this topic for exam😊

graphics
Author

Really well explained, it's a perfect combination of detailed but brief, all clear thnks to you❣

uditgupta
Author

Thank you so much best content on internet with explanation thats what we need😌

shanugamer
Author

Ma'am your teaching style is too good, easily understandable. Would you please upload Support Vector Machine (SVM) in data mining...❤️

vengat
Author

Really Thanks Mam for easy content to understand and ur way of teaching is really awesome ⚡✨.

gp-qkqg
Author

Outstanding video. Thoroughly and articulately summarizes each aspect of data preprocessing. Will definitely return to this channel in the future for more help!

CharlieArgust
Author

Fuff!! OMG your voice is so calm and clean. ♥

vinushan
Author

🎯 Key points for quick navigation:

00:44 *📊 Data preprocessing transforms raw data into an understandable format, such as tables or graphs.*
01:41 *🔄 Four main steps in data preprocessing: data cleaning, data integration, data reduction, and data transformation.*
02:07 *🧹 Data cleaning involves removing incorrect, incomplete, or missing data and handling missing values.*
03:06 *🛠️ Missing values can be filled manually or automatically using methods such as median or mean replacement.*
04:44 *📦 Handling noisy data can involve binning, regression, and clustering methods to smooth out errors.*
07:15 *📊 Data integration combines data from multiple sources into a single dataset using tight or loose coupling.*
08:24 *🔍 Data reduction reduces data volume to simplify analysis, using methods such as dimensionality reduction and data cube aggregation.*
10:15 *🤏 Data transformation adapts data into a suitable form for mining, involving steps like normalization and discretization.*
14:50 *✨ Concept hierarchy generation converts low-level attributes to high-level ones, enhancing data abstraction.*

Made with HARPA AI

yagnasaipravallika_jonnakuti
Author

this video helped me so much with my exam thank you

themoonkiddo
Author

Tq for explaining the preprocessing mam will help me for my exams

Vamsi_here
Author

Best explanation according to computer science and business systems data mining and analytics syllabus.

debajyotibanerjee
Author

My whole topic in 18 minutes! Great Job!

harshwardhankurale
Author

Thanks a lot mam, actually I enjoyed the video and understood it well ❤

tamilupdate