I often get asked for programming tips. Here, I share three of my favorite tips for beginners. Tip #1: COUNTC and CATS Functions Together The CATS function concatenates all of its arguments after it strips leading and trailing blanks. The COUNTC function counts characters. Together, they can let you operate [...]
You should play a little. Add dots. Add color. Your PROC REPORT output does not have to be boring. As a matter of fact, it can be both functional and appealing. Any Unicode value will do, but this blog shows how to use the Unicode value for a dot (filled [...]
The Base SAS DATA step has been a powerful tool for many years for SAS programmers. But as data sets grow and programmers work with massively parallel processing (MPP) computing environments such as Teradata, Hadoop or the SAS High-Performance Analytics grid, the data step remains stubbornly single-threaded. Welcome DS2 – [...]
SAS variables are variables in the statistics sense, not the computer programming sense. SAS has what many computer languages call “variables”; it just calls them “macro variables.” Knowing the difference between SAS variables and SAS macro variables will help you write more flexible and effective code.
According to Glassdoor, data scientist tops the list of the 50 Best Jobs in America. The rankings are determined by combining three factors: number of job openings, salary and overall job satisfaction rating. With a median base salary of $110,000, an abundance of unfilled positions and high job satisfaction, there’s no denying that data science is hot.
We all have different learning styles. Some learn best by seeing and doing; others by listening to lectures in a traditional classroom; still others simply by diving in and asking questions along the way. Traditional face-to-face classroom instruction, real-time classes over the Internet, or self-paced instruction with exercises, SAS Education [...]
In an earlier blog, I asked you to participate in CertMag/GoCertify’s Annual IT Salary survey. The response was fantastic and I’m happy to report that we made the list of the Top 75 IT certifications out of more than 900. This marks the first time SAS has been part of [...]
While SAS program development is usually done in an interactive SAS environment (SAS Enterprise Guide, SAS Display Manager, SAS Studio, etc.), when it comes to running SAS programs in a production or operations environment, it is routinely done in batch mode.
Why run SAS programs in batch mode?
First and foremost, this is done for automation, as a batch process does not require human participation at run time. It can be scheduled (using the operating system scheduler or other scheduling software) to run while we sleep, at any time of day and at any interval between consecutive runs.
Running SAS programs in batch mode streamlines SAS processing: it reduces the opportunity for human error and lets you submit multiple SAS jobs (programs) all at once or in a sequence that respects program and data dependencies.
SAS batch processing is also self-documenting, as it automatically generates and stores SAS logs and outputs.
Imagine the following scenario. Every night, a SAS batch process “wakes up” at 3 a.m. and runs an ETL process on a SAS Application server that extracts multiple tables from a database, transforms, combines, and loads them into a SAS datamart; then moves some data tables across the network and loads them into SAS LASR server, so when you are back to work in the morning your SAS Visual Analytics application has all its data refreshed and ready to roll. Of course, the process schedule can be custom-tailored to your particular needs; your batch jobs may run every 15 minutes, once a week, every first Friday of the month – you name it.
What is a batch script file?
To submit a single SAS program in batch mode manually, you could submit an OS command that looks something like the following (the first line is for Unix/Linux, the second for Windows):
sas /sas/code/proj1/job1.sas -log /sas/code/proj1/job1.log
"C:\Program Files\SASHome\SASFoundation\9.4\Sas.exe" -SYSIN c:\proj1\job1.sas -NOSPLASH -ICON -LOG c:\proj1\job1.log
However, submitting an OS command manually has too many drawbacks: it’s too much typing, it only submits one SAS program at a time, and most importantly – it is manual, which means it is prone to human error.
Usually, these OS commands are packaged into so-called batch files (shell scripts in Unix) that allow for sequential, parallel, and conditional execution of multiple OS commands. They can be run manually, automatically on a schedule, or called by other batch scripts.
In a Windows/DOS Operating System, these script files are called batch files and have .bat filename extensions. In Unix-like operating systems, such as Linux, these script files are called shell scripts and have .sh filename extensions.
Since Windows batch files are similar to, but slightly different from, Unix (and its open source cousin Linux) shell scripts, the examples below use Unix/Linux shell scripts only, in order to avoid any confusion. We are also going to use the terms Unix and Linux interchangeably.
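As a rough illustration of the sequential, parallel, and conditional execution that shell scripts allow, here is a minimal sketch. The job function is a hypothetical stand-in for a real command such as a SAS batch invocation; it is not part of SAS or of any script in this post.

```shell
#!/bin/sh
# job is a hypothetical stand-in for a real command (e.g. a SAS batch call).
job() {
    echo "job $1 done"
}

# Sequential: B starts only after A has finished.
run_sequential() {
    job A
    job B
}

# Parallel: '&' starts each job in the background; 'wait' blocks until all finish.
run_parallel() {
    job C &
    job D &
    wait
}

# Conditional: '&&' runs F only if E returned exit status 0.
run_conditional() {
    job E && job F
}

run_sequential
run_parallel
run_conditional
```

With parallel execution, the order in which the background jobs finish is not guaranteed, so it suits only jobs that do not depend on each other.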
Here is the typical content of a Linux shell script file to run a single SAS program:
#!/usr/bin/sh
dtstamp=$(date +%Y.%m.%d_%H.%M.%S)
pgmname="/sas/code/project1/program1.sas"
logname="/sas/code/project1/program1_$dtstamp.log"
/sas/SASHome/SASFoundation/9.4/sas $pgmname -log $logname
Note that the shell script syntax allows for some basic programming features such as a current date/time function, formatting, and variables. It also provides conditional processing similar to “if-then-else” logic. For detailed information on the shell scripting language, you may refer to a BASH shell script tutorial or to any other source covering the many dialects or flavors of shell scripting (C Shell, Korn Shell, etc.).
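The “if-then-else” logic can be used, for example, to react to the exit status of a SAS run. The sketch below assumes the commonly documented Unix convention in which SAS exits with 0 on success, 1 when warnings were issued, and 2 or higher on errors; please verify this against the documentation for your SAS release. The name describe_status is a hypothetical helper, not part of SAS.

```shell
#!/bin/sh
# describe_status is a hypothetical helper, not part of SAS.
# Assumption: exit status 0 = success, 1 = warnings, 2 or higher = errors.
describe_status() {
    rc=$1
    if [ "$rc" -eq 0 ]; then
        echo "success"
    elif [ "$rc" -eq 1 ]; then
        echo "warning"
    else
        echo "error"
    fi
}

# Possible use right after the sas command in a script like the one above:
#   /sas/SASHome/SASFoundation/9.4/sas "$pgmname" -log "$logname"
#   echo "SAS finished with: $(describe_status $?)"
describe_status 0
describe_status 1
describe_status 2
```

The special variable $? holds the exit status of the most recent command, so it has to be read immediately after the sas command, before any other command overwrites it.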
Let’s save the above shell script as the following file:
How to submit a SAS program via Unix script
In order to run this shell script we would submit the following Linux command:
Or, if we navigate to the directory first:
then we can submit an abbreviated Linux command
When run, this shell script not only executes a SAS program (program1.sas), but for every run it also creates and saves a uniquely named SAS log file. You may create the SAS log file in the same directory where the SAS code is stored, as specified in the shell script above, or specify another directory of your choice.
For example, it creates the following SAS log file:
The file name uniqueness is achieved by adding a date/time stamp suffix between the SAS program name and .log file name extension, in this particular case indicating that this SAS log file was created on December 6, 2017, at 09:15:20 (hours:minutes:seconds).
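To make the naming scheme concrete, here is a small sketch that builds the log file name from the program path and a date/time stamp. The function make_log_name and the directory /sas/logs are hypothetical names introduced only for this illustration; the time stamp is passed in as a fixed string so that the result is reproducible.

```shell
#!/bin/sh
# make_log_name is a hypothetical helper.
#   $1 = path of the SAS program
#   $2 = date/time stamp
#   $3 = optional log directory (defaults to the program's own directory)
make_log_name() {
    pgm=$1
    stamp=$2
    logdir=${3:-$(dirname "$pgm")}
    echo "$logdir/$(basename "$pgm" .sas)_$stamp.log"
}

# Fixed stamp for December 6, 2017, 09:15:20.
make_log_name /sas/code/project1/program1.sas 2017.12.06_09.15.20
# prints /sas/code/project1/program1_2017.12.06_09.15.20.log

# Same log, stored in a separate (hypothetical) directory.
make_log_name /sas/code/project1/program1.sas 2017.12.06_09.15.20 /sas/logs
# prints /sas/logs/program1_2017.12.06_09.15.20.log
```

In a real script, the stamp would come from $(date +%Y.%m.%d_%H.%M.%S), as in the shell script shown earlier.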
Unix script for submitting multiple SAS programs
Unix scripts may contain not only OS commands, but also calls to other Unix scripts. You can mix and match OS commands and script calls.
When scripts are created for each individual SAS program that you intend to run in a batch, you can easily combine them into a program flow by creating a flow script containing those single program scripts. For example, let’s create a script file /sas/code/project1/flow1.sh with the following contents:
/sas/code/project1/program1.sh /sas/code/project1/program2.sh /sas/code/project1/program3.sh
When submitted as
it will sequentially execute three scripts - program1.sh, program2.sh, and program3.sh, each of which will execute the corresponding SAS program - program1.sas, program2.sas, and program3.sas, and produce three SAS logs - program1.log, program2.log, and program3.log.
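A plain list of script calls keeps going even if an earlier step fails. If later programs depend on earlier ones, a variation such as the following sketch stops the flow at the first failing step. The function run_flow is a hypothetical helper, and the sketch assumes that each program script returns a non-zero exit status when its step should be treated as failed.

```shell
#!/bin/sh
# run_flow is a hypothetical helper: it runs each command given as an
# argument, in order, and stops at the first one with a non-zero exit status.
run_flow() {
    for script in "$@"; do
        echo "Running $script"
        if ! "$script"; then
            echo "FAILED: $script" >&2
            return 1
        fi
    done
    echo "Flow completed"
}

# Possible use with the three scripts from the flow above:
#   run_flow /sas/code/project1/program1.sh \
#            /sas/code/project1/program2.sh \
#            /sas/code/project1/program3.sh
```

Note that a script's exit status is that of its last command, so a program script ending with the sas command passes the SAS exit status on to the flow. If your SAS installation returns a non-zero status for warnings, this sketch would stop on warnings as well.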
Unix script file permissions
In order to be executable, Unix script files must have certain permissions. If you create the script file and only you need to execute it, the file permissions can be as follows:
-rwxr-----, or 740 in octal representation.
This means that you (the Owner of the script file) have Read (r), Write (w) and Execute (x) permissions, shown by the first three characters after the leading dash (rwx); the Group owning the script file has only Read (r) permission, shown by the next three characters (r--); Others have no permissions to the script file at all, shown by the last three characters (---).
If you want to give both yourself (Owner) and the Group execute permission, then your script file permissions can be as follows:
-rwxr-x---, or 750 in octal representation.
In this case, your group has Read (r) and Execute (x) permissions, shown by the middle three characters (r-x).
In Unix, file permissions are assigned using the chmod Unix command.
Note that in both examples above we do not give Others any permissions at all. Remember that file permissions are a security feature, and you should assign them at the minimum level necessary.
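The two permission settings described above can be applied with chmod as in the following sketch. A temporary file created with mktemp stands in for a real script file, so the example can be run safely anywhere.

```shell
#!/bin/sh
# A temporary file stands in for a real script such as a program script.
script=$(mktemp)

chmod 740 "$script"                        # owner rwx, group r--, others ---
perms_740=$(ls -l "$script" | cut -c1-10)
echo "$perms_740"                          # -rwxr-----

chmod 750 "$script"                        # owner rwx, group r-x, others ---
perms_750=$(ls -l "$script" | cut -c1-10)
echo "$perms_750"                          # -rwxr-x---

# Symbolic notation equivalent to 750.
chmod u=rwx,g=rx,o= "$script"
perms_sym=$(ls -l "$script" | cut -c1-10)
echo "$perms_sym"                          # -rwxr-x---

rm -f "$script"
```

Each octal digit is the sum of Read (4), Write (2) and Execute (1), which is why rwx is 7, r-x is 5, and r-- is 4.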
Conditional execution of scripts and SAS programs
Here is an example of a Unix script file that allows running multiple SAS programs and OS commands at different times.
#!/bin/sh

#1 extract data from a database
/sas/code/etl/etl.sh

#2 copy data to the Visual Analytics autoload directory
scp -B userid@sasAPPservername:/sas/data/*.sas7bdat userid@sasVAservername:/sas/config/.../AutoLoad

#3 run weekly, every Monday
dow=$(date +%w)
if [ $dow -eq 1 ]
then
   /sas/code/alerts_generation.sh
fi

#4 run monthly, first Friday of every month
dom=$(date +%d)
if [ $dow -eq 5 -a $dom -le 7 ]
then
   /sas/code/update_history.sh
   /sas/code/update_transactions.sh
fi
In this script, the following operators are used: -eq (equal), -le (less than or equal), -a (logical and).
As you can see, the script logic takes care of branching to execute different SAS programs when certain timing conditions are met. With such an approach, you would need to schedule only this single script to run at a specified time/interval, say daily at 3 a.m.
In this case, the script will “wake up” every morning at 3 a.m. and execute its component scripts either unconditionally, or conditionally.
If one of the included programs needs to run at a different, lower frequency (e.g. every Monday, or monthly on the first Friday of every month), the script logic will trigger those executions at the appropriate times.
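If cron is the operating system scheduler in use (an assumption; your site may use different scheduling software), a daily 3 a.m. run can be expressed as a single crontab entry. The script path /sas/code/daily_flow.sh is a hypothetical name for the script above. The sketch only builds and prints the entry; it does not change any crontab.

```shell
#!/bin/sh
# Crontab fields: minute hour day-of-month month day-of-week command
# /sas/code/daily_flow.sh is a hypothetical name for the conditional script.
cron_entry='0 3 * * * /sas/code/daily_flow.sh'
echo "$cron_entry"

# To install it, run 'crontab -e' and paste the printed line.
```

Here minute 0 and hour 3 fix the start time, while the three asterisks mean every day of the month, every month, and every day of the week. The weekly and monthly branching stays inside the script itself.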
In the above script example, steps #1 and #2 execute unconditionally every time the script runs (daily). Step #1 runs an ETL program to extract data from a database; step #2 copies the extracted data across the network from the SAS Application server to the SAS LASR Analytic server’s drop zone, from where they are automatically loaded (autoloaded) into LASR.
Step #3 runs conditionally every Monday ($dow -eq 1). Step #4 runs conditionally on the first Friday of every month ($dow -eq 5 -a $dom -le 7).
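The first-Friday test works because the first seven days of any month contain each weekday exactly once, so a Friday with a day of month of 7 or less must be the first one. The sketch below wraps both timing conditions in small functions so they can be checked with fixed values instead of the current date. The function names are hypothetical, and stripping the leading zero from the day of month is an extra precaution of this sketch, not something the script above does.

```shell
#!/bin/sh
# Hypothetical helpers; arguments are the outputs of 'date +%w' and 'date +%d'.
is_monday() {
    [ "$1" -eq 1 ]
}

is_first_friday() {
    dow=$1
    dom=${2#0}          # strip a leading zero, e.g. 07 -> 7
    [ "$dow" -eq 5 ] && [ "$dom" -le 7 ]
}

# December 1, 2017 was a Friday; December 8, 2017 was the second Friday.
is_first_friday 5 01 && echo "Dec 1, 2017: first Friday"
is_first_friday 5 08 || echo "Dec 8, 2017: not the first Friday"

# Real use: is_first_friday "$(date +%w)" "$(date +%d)"
```

Testing the conditions with fixed values like this is a convenient way to check the branching logic without waiting for the actual Monday or first Friday to arrive.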
For more information on how to format date for use in shell scripts please refer to this post.
Do you run your SAS programs in batch?
Please share your batch experiences in the comment section below. I am sure the rest of us will really appreciate it!
Finding a pattern like a phone number or national ID number embedded in text can be difficult and time consuming. The traditional DATA step has a family of functions (collectively referred to as PRX functions) that allow using Perl regular expressions in your SAS programs to make pattern search easier. [...]