SAS Studio is the latest way you can access SAS. This newer interface allows users to reach SAS through a web browser, offering a number of unique ways that SAS can be optimized. At SAS Global Forum 2018, Lora Delwiche (SAS) and Susan J Slaughter (Avocet Solutions) gave the presentation, “SAS Studio: A New Way to Program in SAS.” This post reviews the paper, offering you insights of how to enhance your SAS Studio programming performance.

This new interface is a popular one, as it is included in Base SAS and used for SAS University Edition and SAS OnDemand for Academics. It can be considered a self-serving system, since you write programs in SAS Studio itself that are then processed through SAS and delivered results. Its ease of accessibility from a range of computers is putting it in high demand – which is why you should learn how to optimize its use.

### How to operate

A SAS server processes your coding and returns the results to your browser, in order to make the programs run successfully. By operating in Programmer mode, you are given the capabilities to view Code, Log, and Results. On the right side of the screen you can write your code, and the toolbar allows you to access the many different tools that are offered.

Libraries are used to access your SAS data sets, where you can also see the variables contained in each set. You can create your own libraries, and set the path for your folder through SAS Studio.

In order to view each data set, the navigation pane can also be used. Right click on the data set name and select “Open” to access files through this method. These datasets can be adjusted in a number of ways: columns can be shifted around by dragging the headings; column sizes can be adjusted; the top right corner has arrows to view more information; clicking on the column heading will sort that data.

In order to control your data easily, filters can be used. Filters are accessed by right-clicking the column heading and selecting the filter that best fits your needs.

### How to successfully code

A unique feature to SAS Studio is its code editor that will automatically format your code. Clicking on the icon will properly format each statement and put it on its own line. Additionally, syntax help pops up as you type to give you possible suggestions in your syntax, a tool that can be turned on or off through the Preferences window.

One tool that’s particularly useful is the snippet tool, where you can copy and paste frequently used code.

### Implementing and Results

After code is written, the Log tool can help you review your code, whereas Results will generate your code carried out after it has been processed. The Results tab will give you shareable items that can be saved or printed for analysis purposes.

### Conclusion

These insights offer just a glimpse of all of the capabilities in programming through SAS Studio. Through easy browser access, your code can be shared and analyzed with a few clicks.

### Other SAS Global Forum Programming Papers of Interest

SAS Studio: A new way to program in SAS was published on SAS Users.

As oil and water, hardware and software don't mix, but rather work hand-in-hand together to deliver value to us, their creators. But sometimes, we make mistakes, behave erratically, or deal with others who might make mistakes, behave erratically, or even take advantage of our technologies.

Therefore, it is imperative for developers, whether hardware or software engineers, to foresee unintended (probable or improbable) system usages and implement features that will make their creations foolproof, that is protected from misuse.

In this post I won’t lecture you about various techniques of developing foolproof solutions, nor will I present even a single snippet of code. Its purpose is to stimulate your multidimensional view of problems, to unleash your creativity and to empower you to become better at solving problems, whether you develop or test software or hardware, market or sell it, write about it, or just use it.

The anecdote I’m about to tell you originated in Russia, but since there was no way to translate this fictitious story exactly without losing its meaning, I attempted to preserve its essence while adapting it to the “English ear” with some help from Sir Arthur Conan Doyle. Well, sort of. Here goes.

## The Art of Deduction

Mr. Sherlock Holmes and Dr. Watson were traveling in an automobile in northern Russia. After many miles alone on the road, they saw a truck behind them. Soon enough, the truck pulled ahead, and after making some coughing noises, suddenly stopped right in front of them. Sherlock Holmes stopped their car as well.

Dr. Watson: What happened? Has it broken?

Holmes: I don’t think so. Obviously, it ran out of gas.

The truck driver got out of his cabin, grabbed a bucket hanging under the back of the truck and ran towards a ditch on the road shoulder. He filled the bucket with standing water from the ditch and ran back to his truck. Then, without hesitation, he carefully poured the bucketful of water into the gas tank. Obviously in full confidence of what he’s doing, he returned to the truck, started the engine, and drove away.

Dr. Watson (in astonishment): What just happened? Are Russian ditches filled with gasoline?

Holmes: Relax, dear Watson, it was ordinary ditch water. But I wouldn’t suggest drinking it.

Dr. Watson (still in disbelief): What, do their truck engines work on water, then?

Holmes: Of course not, it’s a regular Diesel engine.

Dr. Watson: Then how is that possible? If the truck was out of gas, how was it able to start back up after water was added to the tank?!

Who knew Sherlock Holmes had such engineering acumen!

Holmes: “Elementary, my dear Watson. The fuel intake pipe is raised a couple inches above the bottom of the gas tank. That produces the effect of seemingly running out of gas when the fuel falls below the pipe, even though there is still some gas left in the tank. Remember, oil and water don't mix.  When the truck driver poured a bucketful of water into the gas tank, that water – having a higher density than the Diesel fuel – settled in the bottom, pushing the fuel above the intake opening thus making it possible to pump it to the engine.”

After a long pause – longer than it usually takes to come to grips with reality – Dr. Watson whispered in bewilderment.

Dr. Watson: Я не понимаю, I don’t understand!

Then, still shaken, he asked the only logical question a normal person could possibly ask under the circumstances.

Dr. Watson: Why would they raise the fuel intake pipe from the tank bottom in the first place?

Holmes: Ah, Watson, it must be to make it foolproof. What if some fool decides to pour a bucket of water in the gas tank!

Are you developing foolproof solutions? was published on SAS Users.

Regardless of the environment in which you run SAS (whether it is SAS® Foundation, SAS® Studio, or SAS® Enterprise Guide®), SAS uses a default location on your host system as a working directory. When you do not specify the use of a different directory within your code, the default location is where SAS stores output.

Beginning with SAS® 9.4 TS1M4, you can use a new DATA step function, DLGCDIR, to change the location for your working directory. You can use this function in Microsoft Windows or UNIX/Linux environments.

Make sure that any directory that you specify with the DLGCDIR function is an existing directory that you have Write or Update access to.

### Finding Out What Your Current Directory Is

To determine what your current working directory in SAS is, submit the following code:

 data _null_; rc=dlgcdir(); put rc=; run;

The following sample code for Windows sets the working directory in SAS as the TEMP folder on your C: drive:

 data _null_; rc=dlgcdir("c:\temp"); put rc=; run;

This sample code (for a Linux environment) changes the working directory in SAS to /u/your/linux/directory:

 data _null_; rc=dlgcdir("/u/your/linux/directory"); put rc=; run;

### Changing Your Directory: Other Tips

The DLGCDIR function temporarily changes the working directory for the current SAS or client session. However, you can create an autoexec file that contains the DATA step code that uses the DLGCDIR function. The autoexec file then executes the code each time you invoke SAS.

In most situations, it is still recommended that you specify the intended target directory for the Output Delivery System (ODS) and in other SAS statements. For example, when you use the ODS HTML statement, you should specify the target directory with the PATH option, as shown here:

 ods html path="c:\temp" (url=none) file="sasoutput.html";

Similarly, with the ODS PDF statement, you should specify the target directory with the FILE option, as shown here:

 ods pdf file="c:\temp\sasoutput.pdf";

I hope you've found this post helpful.

Datasets can present themselves in different ways. Identical data can bet arranged differently, often as wide or tall datasets. Generally, the tall dataset is better. Learn how to convert wide data into tall data with PROC TRANSPOSE.

The post "Wide" versus "Tall" data: PROC TRANSPOSE v. the DATA step appeared first on SAS Learning Post.

Have you ever been working in the macro facility and needed a macro function, but you could not locate one that would achieve your task? With the %SYSFUNC macro function, you can access most SAS® functions. In this blog post, I demonstrate how %SYSFUNC can help in your programming needs when a macro function might not exist. I also illustrate the formatting feature that is built in to %SYSFUNC. %SYSFUNC also has a counterpart called %QSYSFUNC that masks the returned value, in case special characters are returned.
%SYSFUNC enables the execution of SAS functions and user-written functions, such as those created with the FCMP procedure. Within the DATA step, arguments to the functions require quotation marks, but because %SYSFUNC is a macro function, you do not enclose the arguments in quotation marks. The examples here demonstrate this.

%SYSFUNC has two possible arguments. The first argument is the SAS function, and the second argument (which is optional) is the format to be applied to the value returned from the function. Suppose you had a report and within the title you wanted to issue today’s date in word format:

 title "Today is %sysfunc(today(),worddate20.)";

The title appears like this:

 "Today is July 4, 2018"

Because the date is right-justified, there are leading blanks before the date. In this case, you need to introduce another function to remove the blank spaces. Luckily %SYSFUNC enables the nesting of functions, but each function that you use must have its own associated %SYSFUNC. You can rewrite the above example by adding the STRIP function to remove any leading or trailing blanks in the value:

 title "Today is %sysfunc(strip(%sysfunc(today(),worddate20.)))";

The title now appears like this:

 "Today is July 4, 2018"

The important thing to notice is the use of two separate functions. Each function is contained within its own %SYSFUNC.

Suppose you had a macro variable that contained blank spaces and you wanted to remove them. There is no macro COMPRESS function that removes all blanks. However, with %SYSFUNC, you have access to one. Here is an example:

 %let list=a b c; %put %sysfunc(compress(&list));

The value that is written to the log is as follows:

 abc

In this last example, I use %SYSFUNC to work with SAS functions where macro functions do not exist.

The example checks to see whether an external file is empty. It uses the following SAS functions: FILEEXIST, FILENAME, FOPEN, FREAD, FGET, and FCLOSE. There are other ways to accomplish this task, but this example illustrates the use of SAS functions within %SYSFUNC.

 %macro test(outf); %let filrf=myfile;   /* The FILEEXIST function returns a 1 if the file exists; else, a 0 is returned. The macro variable &OUTF resolves to the filename that is passed into the macro. This function is used to determine whether the file exists. In this case you want to find the file that is contained within &OUTF. Notice that there are no quotation marks around the argument, as you will see in all cases below. If the condition is false, the %ELSE portion is executed, and a message is written to the log stating that the file does not exist.*/   %if %sysfunc(fileexist(&outf)) %then %do;   /* The FILENAME function returns 0 if the operation was successful; else, a nonzero is returned. This function can assign a fileref for the external file that is located in the &OUTF macro variable. */   %let rc=%sysfunc(filename(filrf,&outf));   /* The FOPEN function returns 0 if the file could not be opened; else, a nonzero is returned. This function is used to open the external file that is associated with the fileref from &FILRF. */   %let fid=%sysfunc(fopen(&filrf));   /* The %IF macro checks to see whether &FID has a value greater than zero, which means that the file opened successfully. If the condition is true, we begin to read the data in the file. */   %if &fid > 0 %then %do;   /* The FREAD function returns 0 if the read was successful; else, a nonzero is returned. This function is used to read a record from the file that is contained within &FID. */   %let rc=%sysfunc(fread(&fid));   /* The FGET function returns a 0 if the operation was successful. A returned value of -1 is issued if there are no more records available. This function is used to copy data from the file data buffer and place it into the macro variable, specified as the second argument in the function. In this case, the macro variable is MYSTRING. */   %let rc=%sysfunc(fget(&fid,mystring));   /* If the read was successful, the log will write out the value that is contained within &MYSTRING. If nothing is returned, the %ELSE portion is executed. */   %if &rc = 0 %then %put &mystring; %else %put file is empty;   /* The FCLOSE function returns a 0 if the operation was successful; else, a nonzero value is returned. This function is used to close the file that was referenced in the FOPEN function. */   %let rc=%sysfunc(fclose(&fid)); %end;   /* The FILENAME function is used here to deassign the fileref FILRF. */   %let rc=%sysfunc(filename(filrf)); %end; %else %put file does not exist; %mend test; %test(c:\testfile.txt)

There are times when the value that is returned from the function used with %SYSFUNC contains special characters. Those characters then need to be masked. This can be done easily by using %SYSFUNC’s counterpart, %QSYSFUNC. Suppose we run the following example:

 %macro test(dte); %put &dte; %mend test;   %test(%sysfunc(today(), worddate20.))

The above code would generate an error in the log, similar to the following:

 1 %macro test(dte); 2 %put &dte; 3 %mend test; 4 5 %test(%sysfunc(today(), worddate20.)) MLOGIC(TEST): Beginning execution. MLOGIC(TEST): Parameter DTE has value July 20 ERROR: More positional parameters found than defined. MLOGIC(TEST): Ending execution.

The WORDDATE format would return the value like this: July 20, 2017. The comma, to a parameter list, represents a delimiter, so this macro call is pushing two positional parameters. However, the definition contains only one positional parameter. Therefore, an error is generated. To correct this problem, you can rewrite the macro invocation in the following way:

 %test(%qsysfunc(today(), worddate20.))

The %QSYSFUNC macro function masks the comma in the returned value so that it is seen as text rather than as a delimiter.

For a list of the functions that are not available with %SYSFUNC, see the “How to expand the number of available SAS functions within the macro language was published on SAS Users.

The release of SAS Viya 3.3 has brought some nice data quality features. In addition to the visual applications like Data Studio or Data Explorer that are part of the Data Preparation offering, one can leverage data quality capabilities from a programming perspective.

For the time being, SAS Viya provides two ways to programmatically perform data quality processing on CAS data:

• The Data Step Data Quality functions.
• The profile CAS action.

To use Data Quality programming capabilities in CAS, a Data Quality license is required (or a Data Preparation license which includes Data Quality).

### Data Step Data Quality functions

The list of the Data Quality functions currently supported in CAS are listed here and below:

They cover casing, parsing, field extraction, gender analysis, identification analysis, match codes and standardize capabilities.

As for now, they are only available in the CAS Data Step. You can’t use them in DS2 or in FedSQL.

To run in CAS certain conditions must be met. These include:

• Both the input and output data must be CAS tables.
• All language elements must be supported in the CAS Data Step.
• Others.

Let’s look at an example:

cas mysession sessopts=(caslib="casuser") ;   libname casuser cas caslib="casuser" ;   data casuser.baseball2 ; length gender $1 mcName parsedValue tokenNames lastName firstName varchar(100) ; set casuser.baseball ; gender=dqGender(name,'NAME','ENUSA') ; mcName=dqMatch(name,'NAME',95,'ENUSA') ; parsedValue=dqParse(name,'NAME','ENUSA') ; tokenNames=dqParseInfoGet('NAME','ENUSA') ; if _n_=1 then put tokenNames= ; lastName=dqParseTokenGet(parsedValue,'Family Name','NAME','ENUSA') ; firstName=dqParseTokenGet(parsedValue,'Given Name','NAME','ENUSA') ; run ; Here, my input and output tables are CAS tables, and I’m using CAS-enabled statements and functions. So, this will run in CAS, in multiple threads, in massively parallel mode across all my CAS workers on in-memory data. You can confirm this by looking for the following message in the log: NOTE: Running DATA step in Cloud Analytic Services. NOTE: The DATA step will run in multiple threads. I’m doing simple data quality processing here: • Determine the gender of an individual based on his(her) name, with the dqGender function. • Create a match code for the name for a later deduplication, with the dqMatch function. • Parse the name using the dqParse function. • Identify the name of the tokens produced by the parsing function, with the dqParseInfoGet function. • Write the token names in the log, the tokens for this definition are: Prefix,Given Name,Middle Name,Family Name,Suffix,Title/Additional Info • Extract the “Family Name” token from the parsed value, using dqParseTokenGet. • Extract the “Given Name” token from the parsed value, again using dqParseTokenGet. I get the following table as a result: Performing this kind of data quality processing on huge tables in memory and in parallel is simply awesome! ### The dataDiscovery.profile CAS action This CAS action enables you to profile a CAS table: • It offers 2 algorithms, one is faster but uses more memory. • It offers multiple options to control your profiling job: • Columns to be profiled. • Number of distinct values to be profiled (high-cardinality columns). • Number of distinct values/outliers to report. • It provides identity analysis using RegEx expressions. • It outputs the results to another CAS table. The resulting table is a transposed table of all the metrics for all the columns. This table requires some post-processing to be analyzed properly. Example: proc cas; dataDiscovery.profile / algorithm="PRIMARY" table={caslib="casuser" name="product_dim"} columns={"ProductBrand","ProductLine","Product","ProductDescription","ProductQuality"} cutoff=20 frequencies=10 outliers=5 casOut={caslib="casuser" name="product_dim_profiled" replace=true} ; quit ; In this example, you can see: • How to specify the profiling algorithm (quite simple: PRIMARY=best performance, SECONDARY=less memory). • How to specify the input table and the columns you want to profile. • How to reduce the number of distinct values to process using the cutoff option (it prevents excessive memory use for high-cardinality columns, but might show incomplete results). • How to reduce the number of distinct values reported using the frequencies option. • How to specify where to store the results (casout). So, the result is not a report but a table. The RowId column needs to be matched with A few comments/cautions on this results table: • DoubleValue, DecSextValue, or IntegerValue fields can appear on the output table if numeric fields have been profiled. • DecSextValue can contain the mean (metric #1008), median (#1009), standard deviation (#1022) and standard error (#1023) if a numeric column was profiled. • It can also contain frequency distributions, maximum, minimum, and mode if the source column is of DecSext data type which is not possible yet. • DecSext is a 192-bit fixed-decimal data type that is not supported yet in CAS, and consequently is converted into a double most of the time. Also, SAS Studio cannot render correctly new CAS data types. As of today, those metrics might not be very reliable. • Also, some percentage calculations might be rounded due to the use of integers in the Count field. • The legend for metric 1001 is not documented. Here it is: 1: CHAR 2: VARCHAR 3: DATE 4: DATETIME 5: DECQUAD 6: DECSEXT 7: DOUBLE 8: INT32 9: INT64 10: TIME A last word on the profile CAS action. It can help you to perform some identity analysis using patterns defined as RegEx expressions (this does not use the QKB). Here is an example: proc cas; dataDiscovery.profile / table={caslib="casuser" name="customers"} identities={ {pattern="PAT=-]? ?999[- ]9999",type="USPHONE"}, {pattern= "PAT=^99999[- ]9999$",type="ZIP4"}, {pattern= "PAT=^99999$",type="ZIP"}, {pattern= "[^ @]+@[^ @]+\.[A-Z]{2,4}",type="EMAIL"}, {pattern= "^(?i:A[LKZR]|C[AOT]|DE|FL|GA|HI|I[ADLN]|K[SY]|LA|M[ADEINOST]|N[CDEHJMVY]|O[HKR]|PA|RI|S[CD]|T[NX]|UT|V[AT]|W[AIVY])$",type="STATE"} } casOut={caslib="casuser" name="customers_profiled" replace="true"} ; quit ;

I hope this post has been helpful.

SAS Data Studio is a new application in SAS Viya 3.3 that provides a mechanism for performing simple, self-service data preparation tasks to prepare data for use in SAS Visual Analytics or other applications. It is accessed via the Prepare Data menu item or tile on SAS Home. Note: A user must belong to the Data Builders group in order to have access to this menu item.

In SAS Data Studio, you can either select to create a new data plan or open an existing one. A data plan starts with a source table and consists of transforms (steps) that are performed against that table. A plan can be saved and a target table can be created based on the transformations applied in the plan.

SAS Data Studio

In a previous blog post, I discussed the Data Quality transforms in SAS Studio.  This post is about the Code transform which enables you to create custom code to perform actions or transformations on a table. To add custom code using the Code transform, select the code language from the drop-down menu, and then enter the code in the text box.  The following code languages are available: CASL or DATA step.

Code Transform in SAS Data Studio

Each time you run a plan, the table and library names might change. To avoid errors, you must use variables in place of table and caslib names in your code within SAS Data Studio. Indicating variables in place of table and library names eliminates the possibility that the code will fail due to name changes.  Errors will occur if you use literal values. This is because session table names can change during processing.  Use the following variables:

• _dp_inputCaslib – variable for the input CAS library name.
• _dp_inputTable – variable for the input table name.
• _dp_outputCaslib – variable for the output CAS library name.
• _dp_outputTable –  variable for the output table name.

Note: For DATA step only, variables must be enclosed in braces, for example, data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}});.

The syntax of “varname”n is needed for variable names with spaces and/or special characters.  Refer to the Avoiding Errors When Using Name Literals help topic for more Information.  There are also several

CASL Code Example

The CASL code example above uses the ActionSet fedSQL to create a summary table of counts by the standardized State value.  The results of this code are pictured below.

Results from CASL Code Example

DATA Step Code Example

In this DATA step code example above, the BY statement is used to group all records with the same BY value. If you use more than one variable in a BY statement, a BY group is a group of records with the same combination of values for these variables. Each BY group has a unique combination of values for the variables.  On the CAS server, there is no guarantee of global ordering between BY groups. Each DATA step thread can group and order only the rows that are in the same data partition (thread).  Refer to the help topic

Results from DATA Step Code Example

For more information about DATA step, refer to the In my next blog post, I will review some more code examples that you can use in the Code transform in SAS Data Studio. For more information on SAS Data Studio and the Code transform, please refer to this SAS Data Studio Code Transform (Part 1) was published on SAS Users.

It is not laziness—it is efficiency!!! Programmers are often called lazy; we even call ourselves lazy. But we are not lazy, we are just being efficient. It makes no sense to type the same code over and over again or use more keystrokes than are absolutely necessary.

### Keyboard Macros

You might not have heard of keyboard macros. Or, perhaps, you do not know how they could help you. I am very fond of keyboard macros; let me show you why!

In SAS Technical Support, supporting the SAS® Output Delivery System (ODS) and Base SAS® procedures, I often use the same statements to set up test programs. For example, I want any style templates that I create to go into the Work directory. I also use the same data set name all of the time. I have created keyboard macros for the statements, data set names, and options that I use daily.

When I press Ctrl+Alt+w, the following is inserted into my program:

ods path(prepend) work.templat(update);

When I press Ctrl+Alt+p, the following is inserted into my program:

sashelp.class

How did I do that? I recorded a keyboard macro that contains the code that I want. Then, I assigned keys that insert the code when I press them.

Here are the steps for recording your very own keyboard macro in the SAS Enhanced Editor:

1.  Select Tools ► Keyboard Macros ► Record New Macro.

2.  Enter the code that you want to be your new keyboard macro. Consider typing slowly because any backspaces that you use are included in the recording.

3.  After you are done entering text, you need to tell SAS to stop recording. Select Tools ►Keyboard Macros ►Stop Recording.

4.  A pop-up dialog box appears that lets you give the new macro a name and assign the keys that you want to be associated with the macro. You can set the key combination that make sense to you. Just make sure that you do not use a combination that is already assigned to another macro.

Now, whenever you need to insert that piece of code, just use the keys that you assigned!

In SAS® Enterprise Guide®, you can find keyboard macros under Program ► Editor Macros, instead of the Tools drop-down menu. The recording and key assignment steps are the same in both applications.

You can also create keyboard macros that perform tasks.

The Macros selection opens a pop-up dialog box that contains a Create button.

Clicking Create opens another dialog box.

With the Categories option set to All, you can see all of the commands that are already available. Moving these over to the Keyboard macro contents section enables you to build a macro that performs a task that you need to accomplish on a regular basis.

For example, I have combined these commands to select a whole block of code, like from the PROC statement down to the RUN statement.

Keyboards macros are available in the Enhanced Editor in Display Manager SAS (DMS) and in SAS Enterprise Guide. They cannot be used with the Program Editor in DMS or in SAS® Studio.

You can export and import keyboard macros. The file created when you export has the .kmf extension. You can find the options for importing and exporting in the Macros dialog box. You can share your keyboard macros with your friends, or just to keep them as a backup copy in case you need to reinstall SAS.

For more information, see the Using "Keyboard Macros" section in "Using the Enhanced Editor."

### Function Keys

You have probably used the F8 key to submit your program, or the F4 key to recall your last program. Did you know that you can set or change those instructions?

In the Enhanced Editor, you can get the list of assigned keys by entering keys into the command bar or by selecting Keys under Tools ► Options.

I test a lot, which means that I am routinely clearing the log, the results viewer, and the output window. I have assigned an F key, F12, to clear everything and bring the focus back to the Enhanced Editor (see the commands in the screenshot below). I have to press only one key to clean everything up! I use the F12 key over and over again.

The keys that you assign in DMS are valid from both the Enhanced Editor and the Program Editor.

SAS Enterprise Guide includes a large number of commands by default. A lot of them already have keys assigned, but some do not. You can see the list of the commands and their assigned keys by selecting Enhanced Editor Keys under the Program drop-down menu.

Currently, it is not possible to modify the function keys in SAS Studio. However, a number of keys are already defined that you might find useful. You can see the function key shortcuts by clicking the question mark in the upper right, choosing SAS Studio help, and then selecting the option for Accessibility Features. Here are links to additional resources:

I highly recommend using keyboard macro and function keys. Why type the same thing over and over again? Increase your productivity by handing the repetitive tasks over to SAS.

Are you interested in using SAS Visual Analytics 8.2 to visualize a state by regions, but all you have is a county shapefile?  As long as you can cross-walk counties to regions, this is easier to do than you might think.

Here are the steps involved:

Step 1

Obtain a county shapefile and extract all components to a folder. For example, I used the US Counties shapefile found in this SAS Visual Analytics community post.

Note: Shapefile is a geospatial data format developed by ESRI. Shapefiles are comprised of multiple files. When you unzip the shapefile found on the community site, make sure to extract all of its components and not just the .shp. You can get more information about shapefiles from this Wikipedia article:  https://en.wikipedia.org/wiki/Shapefile.

Step 2

Run PROC MAPIMPORT to convert the shapefile into a SAS map dataset.

libname geo 'C:\Geography'; /*location of the extracted shapefile*/   proc mapimport datafile="C:\Geography\UScounties.shp" out=geo.shapefile_counties; run;

Step 3

Add a Region variable to your SAS map dataset. If all you need is one state, you can subset the map dataset to keep just the state you need. For example, I only needed Texas, so I used the State_FIPS variable to subset the map dataset:

proc sql; create table temp as select *, /*cross-walk counties to regions*/ case when name='Anderson' then '4' when name='Andrews' then '9' when name='Angelina' then '5' when name='Aransas' then '11', <……> when name='Zapata' then '11' when name='Zavala' then '8' end as region from geo.shapefile_counties /*subset to Texas*/ where state_fips='48'; quit;

Step 4

Use PROC GREMOVE to dissolve the boundaries between counties that belong to the same region. It is important to sort the county dataset by region before you run PROC GREMOVE.

proc sort data=temp; by region; run;   proc gremove data=temp out=geo.regions_shapefile nodecycle; by region; id name; /*name is county name*/ run;

Step 5

To validate that your boundaries resolved correctly, run PROC GMAP to view the regions. If the regions do not look right when you run this step, it may signal an issue with the underlying data. For example, when I ran this with a county shapefile obtained from Census, I found that some of the counties were mislabeled, which of course, caused the regions to not dissolve correctly.

proc gmap map=geo.regions_shapefile data=geo.regions_shapefile all; id region; choro region / nolegend levels=1; run;

Here’s the result I got, which is exactly what I expected:

Step 6

Add a sequence number variable to the regions dataset. SAS Visual Analytics 8.2 needs it properly define a custom polygon inside a report:

data geo.regions_shapefile; set geo.regions_shapefile; seqno=_n_; run;

Step 7

Load the new region shapefile in SAS Visual Analytics.

Step 8

In the dataset with the region variable that you want to visualize, create a new geography variable and define a new custom polygon provider.

Geography Variable:

Polygon Provider:

Step 9

Now, you can create a map of your custom regions:

If Necessity is the mother of Invention, then, perhaps, the father of Automation is Laziness. Automation is all about convenience, comfort, and productivity. Why do it yourself if you can devise something to do it for you!

In my previous post Running SAS programs in batch under Unix/Linux, we learned that in order to automate SAS jobs submissions, they are often run in batch mode. We also learned that we usually create batch scripts as a convenient way to run SAS programs in batch. To create a unique SAS log file generated with each batch submission, a typical batch script may look like follows:

#!/usr/bin/sh dtstamp=$(date +%Y.%m.%d_%H.%M.%S) pgmname="/sas/code/project1/program1.sas" logname="/sas/code/project1/program1_$dtstamp.log" /sas/SASHome/SASFoundation/9.4/sas $pgmname -log$logname

It will allow you to submit your SAS program /sas/code/project1/program1.sas in batch, and also capture SAS log file with a convenient date-time suffix in the same directory.

## SAS program to write batch scripts

But what if we are deploying multiple SAS programs? Well, then we would need to create a batch script for each of them. They will all look similar to each other, and that is when most human errors usually occur – when we do something similar, monotonously, over and over again.  Besides, I found working with the Unix Visual Editor (“vi editor”) is not quite a 21st century experience.

What would a normal SAS programmer do in such a situation? That’s right – we would write a SAS program to write a batch script file! Let’s do it. Let’s automate the automation.

In its simplest form, to replicate the above batch script example our SAS program would look like this:

filename b '/sas/code/project1/program1.sh';   data _null_; file b; input; put _infile_; datalines; #!/usr/bin/sh dtstamp=$(date +%Y.%m.%d_%H.%M.%S) pgmname="/sas/code/project1/program1.sas" logname="/sas/code/project1/program1_$dtstamp.log" /sas/SASHome/SASFoundation/9.4/sas $pgmname -log$logname ;

## Setting up batch file permissions

As we already know from my previous post, we need to assign certain permissions to our batch file in order to make it executable. For example, if you want to give yourself (Owner) and Group execution permissions then your script file permissions can be as:

-rwxr-x---, or 750 in octal representation.

In order to do that you can to add to your SAS code the following x-statement:

options noxwait; x 'chmod 750 /sas/code/project1/program1.sh';

Alternatively, you can use %SYSEXEC macro statement (no quoting for the OS command) or SYSTASK statement, or CALL SYSTEM routine (used within a data step).

When you create a batch script by running the above code in SAS Enterprise Guide (EG), you don’t have to leave the comfort of your SAS environment or even touch Unix vi editor. Moreover, you can even submit your SAS job in batch mode right from your SAS EG Program Editor.

However, all of these will work fine, unless XCMD System Option is disabled (NOXCMD).

## Assigning batch file permissions when XCMD System Option is disabled

ERROR: Shell escape is not valid in this SAS session.

Bummer! Have you ever seen this error message in SAS Enterprise Guide while trying to run SAS code with the X statement? It indicates that executing OS commands in the SAS environment is not allowed.

In many organizations, IT department policies do not allow enabling the SAS XCMD system option due to cyber-security concerns. This is usually done system-wide for the whole SAS Enterprise client-server installation via SAS configuration.  In this case, no operating system command is allowed to be executed from within SAS.

Of course, this substantially limits SAS’ automation power, but that is the goal and the price to pay for enhanced security.

Still, even without OS command execution at our disposal, we can set Unix script file permissions using FILENAME statement’s PERMISSION= option. Then our above filename statement will look like this:

filename b '/sas/code/project1/program1.sh' permission='A::u::rwx,A::g::r-x,A::o::---';

However, it is important to realize that your ability to fully control file permissions via FILENAME statement’s PERMISSION= option is still restricted by the Unix umask value set by your IT system administrator. But usually, it is not overly restrictive, at least for the purpose of creating executable files in the environments I have worked with.

The double benefit of the FILENAME statement’s PERMISSION= option is that it can be used for setting up file permissions in any SAS installation whether the XCMD system option is enabled or disabled.

## SAS macro to create batch script files

Let’s wrap all the above SAS code pieces into a SAS macro that writes batch scripts. Here is the macro code definition:

%macro write_shell(code); %let fdir = %substr(&code,1,%sysfunc(findc(&code,/,b)));   options dlcreatedir; libname _flib "&fdir"; libname _flib;   %let core = %substr(&code,1,%eval(%length(&code)-4)); filename _fout "&core..sh" permission='A::u::rwx,A::g::r-x,A::o::---'; data _null_; file _fout; put '#!/bin/sh' // 'now=$(date +%Y.%m.%d_%H.%M.%S)' / "pgmname=""&code""" / "logname=""&core._$now.log""" / 'sas $pgmname -log$logname' ; run; filename _fout; %mend write_shell;

The single macro parameter (code) represents full path name of your SAS code. And here is a macro invocation example:

  %write_shell(/sas/code/project1/program1.sas)

The assumption here is that the script file gets created in the same directory as the relevant SAS code and SAS logs for each of the batch runs. It will be assigned the same name as your SAS program, only with the .sh name extension. As you can see, we do some string parsing to derive directory name, script file name and SAS log file name from the single macro parameter representing full path name of your SAS code. As an added bonus, if a specified directory (/sas/code/project1/) does not yet exist, it will be created by this macro. DLCREATEDIR System Option (along with the two subsequent libname statements) are responsible for the directory creation.

If you want to create many script files for your multiple SAS programs, you just invoke the macro as many times. You can even go totally data-driven for mass script file creation.

## Do you find this useful?

Please let me know in the comments section below if you find this blog post useful. Thank you for reading! I also invite you to share your ideas and experiences on the topic.

Let SAS write batch scripts for you was published on SAS Users.