Dealing with Missing Data in R


Data imputation is a technique that replaces missing values with estimated ones while aiming to preserve the trends in the analysis. It can be done in a large number of ways. R has many packages that make imputation easy, as long as you understand the method you want and why you are using it. In this video I showcase how you can use the mice package to replace missing data in a matrix and how you can compare the performance of each algorithm using ggplot2.
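As a rough sketch of the workflow described above (not the script from the video), the snippet below imputes a small data set with two mice methods and overlays the resulting distributions with ggplot2. The `nhanes` example data shipped with mice, the choice of `"mean"` and `"pmm"` as the two methods, and the seed value are all assumptions made for illustration.

```r
library(mice)
library(ggplot2)

# Assumption: use the small nhanes example data set bundled with mice
data(nhanes, package = "mice")

# Impute with two methods; m = 1 keeps the sketch short
imp_mean <- mice(nhanes, m = 1, method = "mean", maxit = 1,
                 seed = 123, printFlag = FALSE)
imp_pmm  <- mice(nhanes, m = 1, method = "pmm",
                 seed = 123, printFlag = FALSE)

# Extract the first completed data set from each imputation
completed_mean <- mice::complete(imp_mean, 1)
completed_pmm  <- mice::complete(imp_pmm, 1)

# Stack observed and completed bmi values for a visual comparison
plot_df <- rbind(
  data.frame(source = "observed", bmi = nhanes$bmi[!is.na(nhanes$bmi)]),
  data.frame(source = "mean",     bmi = completed_mean$bmi),
  data.frame(source = "pmm",      bmi = completed_pmm$bmi)
)

# Overlaid densities: a method that distorts the distribution stands out
p <- ggplot(plot_df, aes(x = bmi, colour = source)) +
  geom_density()
print(p)
```

Mean imputation typically produces a spike at the mean, while predictive mean matching tends to stay closer to the observed distribution, which is the kind of difference such a plot is meant to reveal.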

Slides

Github

Chapters
0:00 Introduction
1:05 What is imputation?
1:45 Types of missing data
3:22 Measuring success
3:55 A number of different imputation techniques
9:05 R Script: introduction of the rmd format
10:06 Mean Imputation
11:40 LOCF and NOCB
14:36 kNN and kNN imputation
19:00 Advanced imputation with mice()
23:00 How did pmm and rf perform?
25:07 TCGA data Imputation
30:13 Effectiveness of Imputation


Comments

Fantastic video! Really really helpful and informative! I recommend! Thanks for your video!

helenahh

Woow.. This is wonderful.. Thank you for creating and sharing informative videos

mangalahegde

Thanks for this thorough demonstration! I wonder what percentage of missing values you think is acceptable for imputation. The number of available complete cases might also be important. E.g., if I have 3,000 complete cases, is it okay to impute 12,000 missing values in the other cases? Information on these considerations is rarely to be found.

Philantrope

Thank you for your informative video! At 15:03, could you explain why the data need to be normalised before applying kNN imputation? What would the consequences be if the actual values were used for kNN imputation directly? Also, are there quantitative methods that could be used to assess the accuracy of the imputation, rather than visualisation? My data contain more than three thousand rows, so it is hard to assess the accuracy using the three types of plots described in the video.

LeviRafal

If I have panel data from 2000 to 2019 with health indices as predictors, and data for these indices are missing for some years because of the frequency of data reporting, what type of missingness is that?

elizabethnalule

It would be nice to know where some of the functions you are using come from (without having to visit GitHub). I cannot find locf, nocb or forbak in nomemica. I checked the zoo package. It does not have those, but it has similar ones (na.locf for both LOCF and NOCB).

haraldurkarlsson

Nice presentation. However, I find it difficult to find a good account of the differences between the classes of missingness (MCAR, MAR, MNAR). After reading the descriptions of these classes by different YouTubers, I am just left at a loss. Perhaps no one can explain these things?

haraldurkarlsson

How do you check the quality of the imputation with mice?

abdulbouraa
