SAS Notes

Compiled by : Wilson Suraweera @ CGHR
Contents
Random Numbers in SAS
Data Sub-setting
Selection  of  two Random Samples
Identification and Separation of duplicate records in SAS dataset
Reshape Long list variable dataset to Wide list dataset
Missing values replacement with Means of the respective Variables
Logistic Regression with SAS
SAS made easy using Proc SQL
SAS PROC SQL procedure to access external ODBC data sources
SAS String Data Handling
SAS Missing value arithmetic's
          Update_a_Table_using_Another_Table_in_SAS
Text analysis - an Epidemiological Case Study by WS - SAS Institute HUG -01 April2011
 
External Resources
SAS Knowledge Base - Glossary of SAS Procedures from SAS.com
SAS resources from - UCLA
Logistic regression
          Survival Analysis : Usage of Proc LifeTest and Proc PHREG
SAS Dinosaur - Old and New way of SAS programming
Paul Dicman's Web Page for SAS- This little old discusses SAS 8, but useful
Global Statements Dictionary - Alphabetical listing and Description of SAS Key words
SAS Study Blog
SAS Canada - User Groups
         

 

Method 1 : Reshape Long list variable dataset to wide list dataset using Proc TRANSPOSE

This particular example illustrate creation of a new wide format dataset from a long listed dataset.

Dataset 'test_long' consists of data values of 9 different variables recoded from 2 individual cases.
Here each variable name, to which the data values are belong are listed in 'VList' colum.
Their corresponding values for each individul are listed in 'DVal' colum.

We use Proc Transpose procedure to convert this data to its wide format of 9 variables of each individuals.
You would be easily generalised this to any no. of variables.

data test_long;
  input Id_no $	VList $ DVal;
   DATALINES; 
0111	v1	222
0111	v2	225
0111	v3	228
0111	v4	231
0111	v5	234
0111	v6	237
0111	v7	240
0111	v8	243
0111	v9	246
0112	v1	302
0112	v3	303
0112	v4	304
0112	v5	305
0112	v6	306
0112	v7	307
0112	v8	308
0112	v9	309
;
run;


proc transpose data=test_long out=test_wide (drop= _NAME_) prefix=VList;
   by id_no;
   id VList;
   var DVal;
   
run;
	 Proc print data=test_wide; 
	 run; 
	 
 The variable _NAME_ is a auto created variable of SAS, we drop it because no use with our data.
 	 
 Now resulting dataset has 10 columns (for different variables) and 2 rows (for individual cases) 

Resulting data :

Obs 	Id 	varv1 	varv2 	varv3 	varv4 	varv5 	varv6 	varv7 	varv8 	varv9 
1 	0111 	222 	225 	228 	231 	234 	237 	240 	243 	246 
2 	0112 	302 	. 	303 	304 	305 	306 	307 	308 	309 


Method 2: Reshape Wide list variable dataset to Long list dataset using DATA STEP

 
Assume dataset with 6 variables, we need to transpose last 3 columns of it to show the data in vertically.
    This data is global death estimates for 2010-2050
    Variables: Country_code, Country, Period, Age_0_4, Age_5_14, Age_15_plus
We want to integrate deaths shown in 3 different age columns into one variable and show in vertically.

Step 1: data compilation
Step 2: Assign all variables that need to make present in vertical to a array;
Step 3: output all array values using a do loop;
Step 4: Drop all unnecessary variables in the dataset;


 /*Step 1: data compilation */
	 data World_death_h;
	 input Country_code $ Country $ Period $ Age_0_4 Age_5_14 Age_15_plus;
	 datalines;
	   900	WORLD	2010-2015	  40622	  7017	  242828
	   900	WORLD	2015-2020	  37473	  6496	  260780
	   900	WORLD	2020-2025	  34165	  5964	  282418
	   900	WORLD	2025-2030	  30974	  5429	  307458
	   900	WORLD	2030-2035	  28052	  4896	  336518
	   900	WORLD	2035-2040	  25426	  4401	  368495
	   900	WORLD	2040-2045	  22986	  3952	  400712
	   900	WORLD	2045-2050	  20706	  3546	  431274
	;
	 run;

  data World_death_v;
      set World_death_h;
/* Step 2: */
	   array Death {3} Age_0_4 Age_5_14 Age_15_plus;
/* Step 3: output all array values;*/
       do i=1 to 3;
	      Age_cat=i;               /* This is new age category flag */
		  Death_T = Death[i];  /* This is integrated variable for deaths */
		  output ;
	   end;
/* Step 4: Drop all unnecessary variables*/
       drop i Age_0_4 Age_5_14 Age_15_plus;
  run;

	Proc print data=World_death_v;
	  run;

/* Resultin data  */
 
 Obs Country_code 	Country 	Period 	Age_cat 	Death_T 
 1 	900 		  WORLD 	2010-201 1 		40622 
 2 	900 		  WORLD 	2010-201 2 		7017 
 3 	900 		  WORLD	 	2010-201 3 		242828 
 4 	900 		  WORLD 	2015-202 1 		37473 
 5 	900 		  WORLD 	2015-202 2 		6496 
 6 	900 		  WORLD 	2015-202 3 		260780 
 7 	900 		  WORLD 	2020-202 1 		34165 
 8 	900 		  WORLD 	2020-202 2 		5964 
 9 	900 		  WORLD 	2020-202 3 		282418 
10 	900 		  WORLD 	2025-203 1 		30974 
11 	900 		  WORLD 	2025-203 2 		5429 
12 	900 		  WORLD 	2025-203 3 		307458 
13 	900 		  WORLD 	2030-203 1 		28052 
14 	900 		  WORLD 	2030-203 2 		4896 
15 	900 		  WORLD 	2030-203 3 		336518 
16 	900 		  WORLD 	2035-204 1 		25426 
17 	900 		  WORLD 	2035-204 2 		4401 
18 	900 		  WORLD 	2035-204 3 		368495 
19 	900 		  WORLD 	2040-204 1 		22986 
20 	900 		  WORLD 	2040-204 2 		3952 
21 	900 		  WORLD 	2040-204 3 		400712 
22 	900 		  WORLD 	2045-205 1 		20706 
23 	900 		  WORLD 	2045-205 2 		3546 
24 	900 		  WORLD 	2045-205 3 		431274 

 
 
 

Back to Top