This particular example illustrate creation of a new wide format dataset from a long listed dataset.
Dataset 'test_long' consists of data values of 9 different variables recoded from 2 individual cases.
Here each variable name, to which the data values are belong are listed in 'VList' colum.
Their corresponding values for each individul are listed in 'DVal' colum.
We use Proc Transpose procedure to convert this data to its wide format of 9 variables of each individuals.
You would be easily generalised this to any no. of variables.
data test_long;
input Id_no $ VList $ DVal;
DATALINES;
0111 v1 222
0111 v2 225
0111 v3 228
0111 v4 231
0111 v5 234
0111 v6 237
0111 v7 240
0111 v8 243
0111 v9 246
0112 v1 302
0112 v3 303
0112 v4 304
0112 v5 305
0112 v6 306
0112 v7 307
0112 v8 308
0112 v9 309
;
run;
proc transpose data=test_long out=test_wide (drop= _NAME_) prefix=VList;
by id_no;
id VList;
var DVal;
run;
Proc print data=test_wide;
run;
The variable _NAME_ is a auto created variable of SAS, we drop it because no use with our data.
Now resulting dataset has 10 columns (for different variables) and 2 rows (for individual cases)
Resulting data :
Obs Id varv1 varv2 varv3 varv4 varv5 varv6 varv7 varv8 varv9
1 0111 222 225 228 231 234 237 240 243 246
2 0112 302 . 303 304 305 306 307 308 309
Assume dataset with 6 variables, we need to transpose last 3 columns of it to show the data in vertically.
This data is global death estimates for 2010-2050
Variables: Country_code, Country, Period, Age_0_4, Age_5_14, Age_15_plus
We want to integrate deaths shown in 3 different age columns into one variable and show in vertically.
Step 1: data compilation
Step 2: Assign all variables that need to make present in vertical to a array;
Step 3: output all array values using a do loop;
Step 4: Drop all unnecessary variables in the dataset;
/*Step 1: data compilation */
data World_death_h;
input Country_code $ Country $ Period $ Age_0_4 Age_5_14 Age_15_plus;
datalines;
900 WORLD 2010-2015 40622 7017 242828
900 WORLD 2015-2020 37473 6496 260780
900 WORLD 2020-2025 34165 5964 282418
900 WORLD 2025-2030 30974 5429 307458
900 WORLD 2030-2035 28052 4896 336518
900 WORLD 2035-2040 25426 4401 368495
900 WORLD 2040-2045 22986 3952 400712
900 WORLD 2045-2050 20706 3546 431274
;
run;
data World_death_v;
set World_death_h;
/* Step 2: */
array Death {3} Age_0_4 Age_5_14 Age_15_plus;
/* Step 3: output all array values;*/
do i=1 to 3;
Age_cat=i; /* This is new age category flag */
Death_T = Death[i]; /* This is integrated variable for deaths */
output ;
end;
/* Step 4: Drop all unnecessary variables*/
drop i Age_0_4 Age_5_14 Age_15_plus;
run;
Proc print data=World_death_v;
run;
/* Resultin data */
Obs Country_code Country Period Age_cat Death_T
1 900 WORLD 2010-201 1 40622
2 900 WORLD 2010-201 2 7017
3 900 WORLD 2010-201 3 242828
4 900 WORLD 2015-202 1 37473
5 900 WORLD 2015-202 2 6496
6 900 WORLD 2015-202 3 260780
7 900 WORLD 2020-202 1 34165
8 900 WORLD 2020-202 2 5964
9 900 WORLD 2020-202 3 282418
10 900 WORLD 2025-203 1 30974
11 900 WORLD 2025-203 2 5429
12 900 WORLD 2025-203 3 307458
13 900 WORLD 2030-203 1 28052
14 900 WORLD 2030-203 2 4896
15 900 WORLD 2030-203 3 336518
16 900 WORLD 2035-204 1 25426
17 900 WORLD 2035-204 2 4401
18 900 WORLD 2035-204 3 368495
19 900 WORLD 2040-204 1 22986
20 900 WORLD 2040-204 2 3952
21 900 WORLD 2040-204 3 400712
22 900 WORLD 2045-205 1 20706
23 900 WORLD 2045-205 2 3546
24 900 WORLD 2045-205 3 431274