SAS Notes

Compiled by : Wilson Suraweera @ CGHR
Contents
Random Numbers in SAS
Data Sub-setting
Selection  of  two Random Samples
Identification and Separation of duplicate records in SAS dataset
Reshape Long list variable dataset to Wide list dataset
Missing values replacement with Means of the respective Variables
Logistic Regression with SAS
SAS made easy using Proc SQL
SAS PROC SQL procedure to access external ODBC data sources
SAS String Data Handling
SAS Missing value arithmetic's
          Update_a_Table_using_Another_Table_in_SAS
Text analysis - an Epidemiological Case Study by WS - SAS Institute HUG -01 April2011
 
External Resources
SAS Knowledge Base - Glossary of SAS Procedures from SAS.com
SAS resources from - UCLA
Logistic regression
          Survival Analysis : Usage of Proc LifeTest and Proc PHREG
SAS Dinosaur - Old and New way of SAS programming
Paul Dicman's Web Page for SAS- This little old discusses SAS 8, but useful
Global Statements Dictionary - Alphabetical listing and Description of SAS Key words
SAS Study Blog
SAS Canada - User Groups
         

  1. Using Proc SQL

    SAS implements a comprehensive SQL language facility for data manipulation purposes. Using PROC SQL, the SAS structured query language procedure to insert, delete, modify and retrieve information from SAS data tables

    SAS does not allow SQL statements to be used in the DATA step. However, SAS provides PROC SQL which allows operations on SAS datasets with SQL. The SAS terms dataset, observation, and variable respectively correspond to the SQL terms table, row, and column.

    The syntax of PROC SQL is:
     

     PROC SQL < option(s)> ;
         ALTER alter-statement;
         CREATE create-statement;
         DELETE delete-statement;
         DESCRIBE describe-statement;
         DROP drop-statement;
         INSERT insert statement;
         RESET < option < option &gt...> ;
         SELECT select-statement;
         UPDATE update-statement;
         VALIDATE validate-statement;
    

     

    ALTER changes the attributes of columns or adds or drops columns
    CREATE creates tables
    DELETE removes rows from a table
    DESCRIBE display a view definition
    DROP deletes the table
    INSERT inserts a new row into the table
    RESET allows options to be changed during execution
    SELECT retrives data and outputs results
    UPDATE modifies columns in existing rows
    VALIDATE checks a query-expression for syntactic accuracy.

    Note that the DELETE statement has exactly the same function as the
    DELETE statement used in the DATA step. However, DROP in PROC SQL
    will delete the entire table(dataset) rather the columns(variables).
     

    Some of options which may be useful: INOBS=n restricts the number of rows processed from a source
    NUMBER|NONUMBER includes a column with the row number
    PRINT|NOPRINT turns printing for SELECT statements on or off
     

     

Learn More  : http://help.pop.psu.edu/help-by-software-package/sas/sql-with-sas


(1)  Proc SQL SELECT CASE-WHEN use for variable recoding purposes

 Following example illustrates how to use Proc sql procedure to create a new variable by using the algorithm,
 
          Select
              case
                 when
                 else
               end as 
               
   Here we have 2 variables called  'deathdate' and 'str_deathdate' in mortality SAS dataset called 'tem'. 
  'Deathdate' is the original death date in the moratality data. 'str_deathdate' is a secondary information  variable 
  to use where information of 'Deathdate' variable is missing.
  
  We create a new flag variable called  'flag_Misdate' to indicate 4 levels where we can make use of auxilary variable to 
  replace the missing information of the original variable and write all these information to a new SAs dataset called 'tem1'.
  
Example:
proc sql;
  	create table tem1  as
   		SELECT UniqNo, DeathDate, str_deathdate,
			CASE
			    WHEN (DeathDate is missing)   THEN '0'
				WHEN ((DeathDate is missing)  and (str_deathdate is not missing)) THEN '1'
				WHEN ((DeathDate is not missing)  and (str_deathdate is not missing)) THEN '2'
				ELSE '3'
			 END AS flag_Misdate 
	    FROM tem;
QUIT;
--------------------------------------------------------------------------------------------------

(2) Assign the Row numbers to the data


An undocumented function, MONOTONIC( )  can be used to assign line numbers to SAS data (version 9 and later) 

PROC SQL;

	CREATE  TABLE   Data_with_row_num  AS
	SELECT  *, MONOTONIC() AS ROW_NUM  FROM  Data_without_row_num;

QUIT;


Note: This result is very similar to datastep _N_ use for creation of ROW NUMBERS.

Data Step:

	DATA Data_with_row_num;

		SET Data_without_row_num;
		ROW_NUM=_N_;

	RUN;
 
 

Back to Top