This is a continuation of my previous blog post on SAS Data Studio and the Code transform. In this post, I will review some additional examples of using the Code transform in a SAS Data Studio data plan to help you prepare your data for analytic reports and/or models.
Create a Unique Identifier Example
The DATA step code below combines the _THREADID_ and the _N_ variables to create a UniqueID for each record.
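As a minimal sketch of that idea, the code might look like the following. The {{_dp_...}} placeholders are assumed here to be the input and output table variables that the Code transform supplies, and the 32-character length and the underscore delimiter are illustrative choices rather than requirements.

```sas
/* Sketch only: the {{_dp_...}} placeholders are assumed to be the     */
/* table and caslib variables provided by the Code transform.          */
data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}});
   length UniqueID $ 32;   /* illustrative length */
   set {{_dp_inputTable}} (caslib={{_dp_inputCaslib}});
   /* CATX trims blanks and joins the thread number and the row       */
   /* counter with an underscore, giving values of the form           */
   /* <thread>_<iteration>.                                           */
   UniqueID = catx('_', _threadid_, _n_);
run;
```

Keeping a delimiter between the two parts avoids collisions such as thread 1, row 12 and thread 11, row 2 both producing the same concatenated value.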
The variable _THREADID_ returns the number that is associated with the thread in which the DATA step is running in a server session. The variable _N_ is an internal system variable that counts the iterations of the DATA step as it automatically loops through the rows of an input data set. The _N_ variable is initially set to 1 and increases by 1 each time the DATA step loops past the DATA statement, which happens once for every row that it encounters in the input data. Because the DATA step is a built-in loop that iterates through each row in a table, the _N_ variable can be used as a counter variable in this case. In CAS, each thread keeps its own _N_ count, so _N_ alone can repeat across threads; combining it with _THREADID_ is what makes the identifier unique.
Cluster Records Example
The DATA step code below combines the _THREADID_ and the counter variables to create a unique ClusterNum for each BY group.
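A sketch of this approach is shown here under some assumptions: group_key is a hypothetical BY variable standing in for whatever column or columns define a cluster in your table, and the {{_dp_...}} placeholders are assumed to be the table variables that the Code transform supplies.

```sas
/* Sketch only: group_key is a hypothetical BY variable, and the       */
/* {{_dp_...}} placeholders are assumed Code transform variables.      */
data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}});
   length ClusterNum $ 32;   /* illustrative length */
   set {{_dp_inputTable}} (caslib={{_dp_inputCaslib}});
   by group_key;
   /* Sum statement: counter starts at 0, is retained across rows,    */
   /* and is incremented only on the first row of each BY group.      */
   if first.group_key then counter + 1;
   /* The counter is kept per thread, so it is prefixed with the      */
   /* thread number to keep cluster numbers distinct across threads.  */
   ClusterNum = catx('_', _threadid_, counter);
run;
```

Every row in the same BY group receives the same ClusterNum, because the counter changes only when a new group begins.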
This code uses FIRST.variable to increment the counter at the beginning of each new grouping. FIRST.variable and LAST.variable are temporary variables that CAS creates for each BY variable. CAS sets FIRST.variable to 1 when it is processing the first observation in a BY group, and sets LAST.variable to 1 when it is processing the last observation in a BY group; otherwise, each is 0. These values enable you to take different actions, based on whether processing is starting for a new BY group or ending for a BY group. For more information, refer to the topic
De-duplication Example
The DATA step code below outputs only the last record of each BY group, thereby de-duplicating the data set by writing out one record per grouping.
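One way this could be written is sketched here. As before, group_key is a hypothetical BY variable that identifies duplicate records, and the {{_dp_...}} placeholders are assumed to be the table variables that the Code transform supplies.

```sas
/* Sketch only: group_key is a hypothetical BY variable, and the       */
/* {{_dp_...}} placeholders are assumed Code transform variables.      */
data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}});
   set {{_dp_inputTable}} (caslib={{_dp_inputCaslib}});
   by group_key;
   /* An explicit OUTPUT statement turns off the automatic output at  */
   /* the end of each iteration, so only the last row of each BY      */
   /* group is written to the output table.                           */
   if last.group_key then output;
run;
```

Replacing last.group_key with first.group_key would keep the first record of each group instead; which record to keep depends on your data.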
Below are the de-duplication results on the data set used in the previous Cluster Records Example section.
Below is the resulting customers2.xlsx file in the Public CAS library.
For more information on the available action sets, refer to the SAS® Cloud Analytic Services 3.3: CASL Reference guide.