How to Use SPSS: Replacing Missing Data Using Multiple Imputation (Regression Method)


Resources:

Schafer, Joseph L. "Multiple imputation: a primer." Statistical methods in medical research 8.1 (1999): 3-15.

Sterne, Jonathan AC, et al. "Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls." BMJ: British Medical Journal 338 (2009).

McKnight, Patrick E., Katherine M. McKnight, and Aurelio Jose Figueredo. Missing data: A gentle introduction. Guilford Press, 2007.

Haukoos, Jason S., and Craig D. Newgard. "Advanced statistics: missing data in clinical research—part 1: an introduction and conceptual framework." Academic Emergency Medicine 14.7 (2007): 662-668.

Newgard, Craig D., and Jason S. Haukoos. "Advanced statistics: missing data in clinical research—part 2: multiple imputation." Academic Emergency Medicine 14.7 (2007): 669-678.
Comments
Author

A simple thank you might not be enough for this great work you did and shared with the public, and thereby with me.
So I want to tell you, if you ever feel down or even worthless, remember that somewhere in Austria you made someone really happy by doing this tutorial!!! Thanks a lot.
At first I thought it might be a bit long but it was worth every second and you did a really good job.

Flaya
Author

This was a very informative video. I am currently examining some longitudinal data and of course there is a significant amount of attrition. I initially ran a regression analysis using exclude cases listwise but I didn't feel this was the best way to analyze the data. This technique definitely helps address some of those issues. Thank you so much for posting this!

stephaniesmith
Author

Thank you for the tutorial. I just ran this on my dataset successfully. However, I was wondering if there is a way to obtain pooled means and 95% CIs across imputations. For inferential analyses (e.g., correlation), I am able to obtain the pooled statistics. However, when I use Analyze -> Descriptive Statistics -> Explore, it will only give me the descriptives for the original data and each imputation *individually*. Is there a way to obtain the pooled descriptives for variables? Also, is there a way for SPSS to generate a dataset that only contains the imputed data after the final iteration?

Thanks!

gregl
Author

What happens if your data is missing not at random? I did Little's test and it was significant. I can't figure out which MI to do in that case.

jessicabarton
Author

The observed discrepancy was because some cases had missing values in all three variables included in the multiple imputation. The problem was resolved by adding variable(s) in which these cases had values.

sameeral-abdi
Author

This video was very useful; thank you. However, even when splitting the file by imputation, I cannot get pooled analyses. SPSS will perform the analysis for the original data and each of the five imputations but will then only give me the means and standard deviations for the pooled data, not, for example, chi-square or t-test values; nor will it give me a p-value. Why might this be?

masumarahim
Author

The video explains the concept in such easy-to-follow steps. A great video on the multiple imputation technique.

duallumni
Author

I have a large number of variables and SPSS does not seem to be able to do the imputation with all the variables at once. So, I did groups of variables separately. However, I get multiple imputed data files. How do you recommend combining the data files?

kimconsultants
Author

When writing a manuscript for a trial that has used multiple imputation to address missing data, what additional reporting should I include? Data pre and post imputation? Anything else?

mrflowers
Author

How are degrees of freedom reported after a t-test is performed using multiple imputation? I see that the number of df for the pooled data can be in the thousands, and it does not feel right to report such a high number when N = 50, for example. Any advice, or a paper that discusses this issue? Thanks!

AldoAguirreC
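
For readers wondering where such large df values come from: under Rubin's rules the classical pooled df depends on the number of imputations and on how much the estimates vary between imputations, not on N, so it can run into the thousands. Barnard and Rubin (1999) proposed a small-sample adjustment that caps it near the complete-data df. The Python sketch below only illustrates those two formulas; it is not SPSS's own code, and the function name `pooled_df` and all input numbers are made up.

```python
def pooled_df(m, u_bar, b, df_complete=None):
    """Degrees of freedom for a pooled estimate from m imputations.

    u_bar: average within-imputation variance (mean of squared SEs)
    b:     between-imputation variance of the estimates
    df_complete: df the test would have with no missing data (optional)
    """
    if b <= 0:
        raise ValueError("between-imputation variance must be positive")
    # Share of the total variance attributable to missing data.
    lam = (1 + 1 / m) * b / (u_bar + (1 + 1 / m) * b)
    df_old = (m - 1) / lam ** 2            # classical Rubin (1987) df
    if df_complete is None:
        return df_old
    # Barnard-Rubin (1999) small-sample adjustment.
    df_obs = (df_complete + 1) / (df_complete + 3) * df_complete * (1 - lam)
    return 1 / (1 / df_old + 1 / df_obs)


# Hypothetical inputs: 5 imputations, little between-imputation variance.
large_df = pooled_df(m=5, u_bar=1.0, b=0.01)
# Same inputs, but telling the formula the complete-data test has 48 df.
adjusted_df = pooled_df(m=5, u_bar=1.0, b=0.01, df_complete=48)
```

With these invented inputs the classical df is in the tens of thousands, while the adjusted df stays below 48. Whether a given SPSS version applies the adjustment is not something this thread establishes.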
Author

Thanks so much for your reply - sorry you misunderstood me, I've got 570 participants so I'll do an EM and see how I go. Thanks again, and thanks for doing the videos - I've just started my PhD and I'm sure I'll be tuning in quite a bit!

janecooper
Author

I saw in your earlier comment what to do about this issue, but I was not able to set a min or max value. However, I found out that you can adjust the parameter in the syntax. It worked, as I saw all the imputed values in the output. Unfortunately, in the Data View tab I couldn't see any imputed variables, nor the option in the upper right to switch between data files. So, what went wrong?

Could you help me out? Thanks in advance!

Sspecial_KK
Author

Hi, thank you for the very helpful video. I followed all the steps, but my output after running my first ANOVA only showed the 5 imputations, not the pooled figures. How do I get the pooled figures?

eligardner
Author

Question: if results and parameters are "pooled" (and not averaged), what is the specific calculation, e.g. for bivariate correlations or linear regression outputs?

seanicusvideo
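
For readers with the same question: pooling in multiple imputation is usually described by Rubin's rules. The point estimate is in fact the plain average across imputations; what is not simply averaged is the variance, which adds the average within-imputation variance to an inflated between-imputation variance. The Python sketch below is an illustration of that calculation, not SPSS's own code; the name `pool_rubin` and the example numbers are made up.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one parameter across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    q_bar = mean(estimates)                       # pooled point estimate
    u_bar = mean([se ** 2 for se in std_errors])  # within-imputation variance
    b = variance(estimates)                       # between-imputation variance (m - 1 denominator)
    total = u_bar + (1 + 1 / m) * b               # total variance
    return q_bar, math.sqrt(total)


# Made-up regression coefficients and standard errors from m = 3 imputations.
pooled_estimate, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The same recipe applies to any estimate that is roughly normally distributed. Correlations are commonly pooled on a transformed scale (Fisher's z) and converted back afterwards, which is an extra step not shown here.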
Author

Hello! Thank you so much for the video.
I have a question, however. From what I understood, you don't get one single database with missing values replaced; you should work with the pooled results. So, my question is whether there is any way to create a new single database to import into other programs (for instance Mplus or LISREL) and work on. I need to do that for CFA on my data...

Luhna
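
One possible route for this, sketched in Python under stated assumptions: SPSS stacks the original data and all imputations in one dataset and marks them with the variable `Imputation_` (0 for the original data, 1 and up for the imputations). Assuming that stacked file has been exported (for example as CSV) and read into a list of row dictionaries, the rows can be split into one dataset per imputation. The function name `split_by_imputation` and the toy rows are hypothetical.

```python
from collections import defaultdict


def split_by_imputation(rows, key="Imputation_"):
    """Group stacked rows into one list per imputation number."""
    groups = defaultdict(list)
    for row in rows:
        # Drop the marker column from each row once it has been used.
        data = {name: value for name, value in row.items() if name != key}
        groups[int(row[key])].append(data)
    return dict(groups)


# Toy stacked data: one case, original (0) plus two imputations.
stacked = [
    {"Imputation_": "0", "id": "1", "score": ""},
    {"Imputation_": "1", "id": "1", "score": "4.2"},
    {"Imputation_": "2", "id": "1", "score": "3.9"},
]
datasets = split_by_imputation(stacked)
first_imputation = datasets[1]
```

Each imputed dataset can then be written out and analysed separately in the other program, with the results pooled afterwards. Analysing only one imputed dataset would ignore the between-imputation uncertainty.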
Author

This is helpful. The use and purpose of the extra imputation history file could be explained in more detail. It was very nice to include some references! Thanks!

seanicusvideo
Author

This is a great presentation. I really enjoyed it. Unfortunately for me, as I tried to follow it to impute my missing data, I keep receiving a warning which says that the imputation model for some variables contains more than 100 parameters. Below is an example of such a warning: "An iteration history output dataset is requested, but cannot be written.
The imputation model for SYNC2 contains more than 100 parameters. No missing values will be imputed. Reducing the number of effects in the imputation model, by merging sparse categories of categorical variables, changing the measurement level of ordinal variables to scale, removing two-way interactions, or specifying constraints on the roles of some variables, may resolve the problem. Alternatively increase the maximum number of parameters allowed on the MAXMODELPARAM keyword of the IMPUTE subcommand.
Execution of this command stops."
This is repeated for quite a number of variables. Can someone help me understand how to handle this problem? Thank you. Juvenal Balisasa

Jbalisasa
Author

The raw data came from a dummy variable regression, so there are only 1s and 0s. Also, the experimental design was such that each respondent had their own design where they saw either all or just a subset of the variables. So I am looking to fill in the coefficients for the variables they did not see.

chavianddavid
Author

I understand your point. But by outcome variables I mean Dependent Variable(s)!

yaldaamir
Author

First off, thank you so much for posting this video...it was very well made and I look forward to exploring other videos you have. As a follow-up to enemenoff's question...what are the differences in MI for random vs. non-random missingness patterns? Did I miss that part in the video? Do you have a source I could visit? Thank you in advance!

chetanm