SAS Tutorial | How to Restructure Your Data Using Arrays and DO Loops

Do you get data that are “short and wide”? Such data sets have one observation per subject and multiple time points and variables within one row. Jennifer Waller, Professor at Augusta University, prefers “long and skinny” data sets. In this SAS How-To tutorial, Jennifer shows you how to restructure data sets by setting up arrays and then using DO loops on those arrays.

Download Data Files

Content Outline
0:00 – Welcome
1:04 – Why Jennifer likes a long and skinny dataset
2:31 – How to set up arrays and use DO loops on the arrays
7:34 – How to output each iteration of this DO loop (OUTPUT statement is key)
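
The approach outlined above (group the repeated columns into arrays, loop over them, and write one row per iteration with OUTPUT) can be sketched roughly as follows. This is a minimal illustration and not the code from the video: the data set names `wide` and `long`, the variables `sbp1-sbp3` and `dbp1-dbp3`, and the sample values are all made up for the example.

```sas
/* Hypothetical "short and wide" data: one row per subject, three visits */
data wide;
   input id sbp1-sbp3 dbp1-dbp3;
   datalines;
1 120 118 115 80 78 76
2 135 130 128 90 88 85
;
run;

/* Restructure to "long and skinny": one row per subject per visit */
data long;
   set wide;
   array sbp_arr{3} sbp1-sbp3;   /* group the repeated systolic columns  */
   array dbp_arr{3} dbp1-dbp3;   /* group the repeated diastolic columns */
   do visit = 1 to 3;
      sbp = sbp_arr{visit};      /* copy the value for this visit        */
      dbp = dbp_arr{visit};
      output;                    /* write one observation per iteration  */
   end;
   drop sbp1-sbp3 dbp1-dbp3;     /* the wide columns are no longer needed */
run;
```

Without the explicit OUTPUT statement inside the loop, the DATA step would write only one row per subject at the end of each iteration of the step, holding the values from the last visit.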

Comments
Author

Thanks for viewing and please share this with your SAS peeps. I welcome any comments/questions. Also, if you are not subscribed to the SAS Users YouTube channel, you should be. There are some great how-to videos. Thanks again and good luck coding!

jenniferwaller
Author

This is really good! I never understood how my professor taught this in the classroom, but this is very easy to understand! Thanks!

zijunzhang
Author

Excellent. Will use this to explain to the modelers I support.

PabloJNogueras
Author

Thank you. It gives me some ideas for the step I am working on now.

kiwiweiyt
Author

This is fantastic, Jennifer! Thank you!

rogerward
Author

Excellent! How do you restructure date, numerical and text data from long to short? Do you have any video?

md.barkatullah
Author

Hi there,
I have data where one original account number has four duplicate accounts under it. I want to check whether these accounts are sorted in ascending order using loops.

aenuguladeepankar
Author

Excellent! I was left with a doubt: Why did you request to drop until field "34"?

EderbalFilho
Author

Can you please help me create a vertical array and sum backward based on some condition?

ehsaas_ke_sath
Author

Great! Do I use arrays if I want to do the following?
I have COVID dataset like this
patient_id collection_date test_type test_result
1 3/1/2020 Antibody Positive
1 3/3/2020 PCR Positive
1 3/14/2020 PCR Negative
2 2/12/2020 Antibody Negative
2 4/10/2020 Antibody Positive
3 6/10/2020 Antibody Positive
3 6/15/2020 PCR Negative

I want to output to new table ONLY those patients that have Ab positive, but NO PCR done. So in this case, only patient_id 2 (second line showing Antibody Positive) should output because he had no PCR done.

abstract-thoughts
Author

What would the difference be between 1) creating the 4 arrays then dropping them, and 2) making the arrays temporary using the _temporary_ option in the array statements?

benhouck
Author

If there are 34 variables in each group, I don't understand why we iterate from 1 to 33 and not from 1 to 34 (i=0 to 33 instead of i=1 to 33). What happens if I were to look at that data set today? What would need to change in the code?

michaeltuchman
Автор

how to do rotations(converting variables into observations-observations into variables) by using array-syntax

raghucharan
Автор

high, I'm just wondering why I still succeeded when I had

array pos {*} pos1-pos6; rather than
array pos {6} pos1-pos6;

shenwaskijeff
Author

A fantastically beautiful woman and extremely useful information!

olgakozlova
Автор

Hi Team,
I am working on restructuring data from long to wide format, and in the data I have
data main(rename=visit=visitnum);
  do usubjid='101';
    do visit=10, 20, 30, 40, 50;
      paramcd='ABC';
      if visit=10 then aval=60;
      if visit=20 then aval=65;
      if visit=30 then aval=.;
      if visit=40 then aval=45;
      if visit=50 then aval=41;
      output;
    end;
  end;
run;

When I restructure it with the code below:

data trial1;
  set main;
  by usubjid paramcd;
  array visitn(5) visitn10 visitn20 visitn30 visitn40 visitn50;
  retain visitn;
  if first.paramcd then do;
    do i=10, 20, 30, 40, 50;
      visitn(i)=.;
    end;
  end;
  visitn(visitnum)=aval;
  if last.paramcd;
run;
I am getting a subscript error. How can I resolve this?
Your help is appreciated.

priyankaveerapaneni
Author

Why is a negative case considered a case? I thought negative means no virus was found, and therefore it wouldn't be considered a COVID-19 case. Please clarify. Thanks

felixamey