October 15, 2014

In this post we dig deeper into the fourth recommended practice for securing the SAS-Hadoop environment through Kerberos authentication:

When configuring SAS and Hadoop jointly in a high-performance environment, ensure that all SAS servers are recognized by Kerberos.

Before explaining the complex steps in connecting to secure Hadoop within a SAS High-Performance Analytics environment such as SAS Visual Analytics, let’s start by reviewing a simpler connection from a standard SAS session through SAS/ACCESS for Hadoop.

Making connections in a standard SAS session

Here's a behind-the-scenes look at the steps involved in the connection. Click on the thumbnail to view the full diagram. This graphic depicts an environment where the SAS servers are configured to authenticate users against the same back-end directory server as the Hadoop instance. The setup is relatively straightforward.

Let’s say a SAS user (we’ll call this user Mary) logs into her machine, and the standard Windows logon procedure obtains the Ticket-Granting Ticket (TGT) from the main corporate Active Directory. This step happens on all domain machines: Kerberos is tied into the standard deployment of Active Directory. Also, this step is completely isolated from Mary’s access to SAS.

At some later point in the day (perhaps after grabbing her morning cup of coffee!), Mary may open SAS Enterprise Guide. As we know, starting the SAS session makes a connection to the SAS Metadata Server, and her credentials (username and password) in the connection profile are authenticated by the Metadata Server. As we noted in previous posts, the SAS servers have been configured to use the same directory server as Hadoop, so this authentication step uses Pluggable Authentication Modules (PAM) to validate our user.

Next, SAS Enterprise Guide initiates Mary’s Workspace Session by connecting to the Object Spawner. The Object Spawner runs a shell script as “Mary” and spawns the Workspace Session. In this step, as her credentials are authenticated, the PAM stack on the server obtains a Ticket-Granting Ticket (TGT) for Mary. This TGT is placed in her Ticket Cache, ready for later use.

To Mary, all of this has happened as SAS Enterprise Guide was opened. She has not been required to perform any special actions to get to this stage. All the “magic” has been taking place behind the curtain.

So Mary can submit her SAS code to connect to Hadoop using the standard LIBNAME statement with the required options. (Remember username and password are not valid when connecting using Kerberos. They specify the Kerberos security principals instead.) Also as discussed last time, the step for connecting to Hadoop in a SAS session can be moved behind the curtain by ensuring the principals are in the configuration file used to make the connection to Hadoop. The Hadoop client libraries then use the TGT to request the Service Tickets for HIVE and/or HDFS, and SAS makes the connection to Hadoop using the Service Tickets. Our SAS user Mary is authenticated on the Hadoop side by validating the Service Tickets provided in the connection.
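As a sketch of what such a statement can look like (the server name, port, and Hive service principal below are hypothetical, so check the SAS/ACCESS documentation for the exact options supported in your release):

```
/* No USER= or PASSWORD= here: Kerberos supplies the identity.   */
/* HIVE_PRINCIPAL= names the HiveServer2 Kerberos principal.     */
libname hdp hadoop
   server="hive.example.com"
   port=10000
   subprotocol=hive2
   hive_principal="hive/hive.example.com@EXAMPLE.COM";
```

Because the TGT already sits in Mary’s Ticket Cache, submitting this statement is all that is needed; the Hadoop client libraries handle the Service Ticket requests transparently.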

How connections are made in a SAS High-Performance Analytics session

So, this is the setup in a standard SAS session. What now happens if the environment uses something like SAS High-Performance Analytics or SAS Visual Analytics to make the connection to the secured Hadoop environment? Let’s look at the steps involved if our SAS user Mary wants to make a connection to SAS Visual Analytics. Click on the thumbnail to view a larger graphic showing these steps. Wow – that looks a bit more complicated under the covers! We’ll start with understanding how the connections are made and then look at some of the configuration options.

So, the starting points in this process are the same as before. We have the same steps occurring up to making the connection to Hadoop (step 7 in the diagram.) At this point, we’ll want to explore a little more detail about the connection that is made by the LIBNAME statement in Mary’s SAS code.

One detail that wasn’t necessary to know when connecting with a standard SAS session: when Mary connects to Hadoop from SAS, the XML configuration file is actually written to a temporary location in HDFS. This XML file will be used later by the distributed processes in the SAS High-Performance Analytics environment.

After submitting the LIBNAME statement to connect to Hadoop, Mary must now submit a PROC LASR or high-performance procedure statement to access the high-performance environment. Submitting these procedures initiates the SSH connection from the Workspace Session to the SAS High-Performance Analytics environment. Since Mary needs her SAS High-Performance Analytics session to be Kerberos-aware, the SSH connection must be made using Kerberos. At this point, the SSH client uses the TGT to obtain a Service Ticket for HOST on the HPA General or LASR Root Node. SAS then uses the SSH client to start the HPA General and passes details of how to connect to HDFS.
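For illustration, the kind of code Mary might submit at this point looks like the following (the port, path, and host name are invented, and statement options vary by release, so treat this as a sketch rather than a definitive example):

```
/* Start a LASR Analytic Server across the distributed environment */
proc lasr create port=10010 path="/tmp";
   performance host="lasr-root.example.com" nodes=all;
run;

/* Load a Hadoop table into the server via the earlier LIBNAME */
proc lasr add data=hdp.sales port=10010;
run;
```

It is the PERFORMANCE statement’s HOST= value that identifies the HPA General or LASR Root Node to which the Kerberos-aware SSH connection is made.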

The SSH Daemon (a server process running on the HPA General) generates a TGT for Mary on the HPA General as part of authenticating her credentials. This TGT on the HPA General is then used to request Service Tickets for all the worker nodes, and the parallel SSH connections are made to initialize the HPA Captains. With the HPA processes now running, the HPA General initiates the connection to Hadoop using the TGT—this time to request service tickets for HDFS and HIVE.

The HPA General connects to HDFS to retrieve the XML file placed there by the LIBNAME statement. Our SAS user Mary is authenticated using the Service Ticket for HDFS. The HPA General now submits a MapReduce job. This MapReduce job initiates the SAS Embedded Process (EP) running on the Hadoop nodes. The SAS Embedded Process connects first to the HPA General and then makes connections to the assigned HPA Captains using UNIX sockets. This step is not separately authenticated, since the two sets of processes have already been authenticated.

The SAS Embedded Process runs as a standard MapReduce job and has corresponding MapReduce tasks running on each node of the Hadoop environment. The MapReduce tasks connect as necessary to HDFS and HIVE using the standard Hadoop internal tokens. These tokens are used by the tasks of a MapReduce job rather than Kerberos tickets. More details about these internals of the Hadoop system can be found in the Hortonworks Technical Report:  Adding Security to Apache Hadoop. Each SAS Embedded Process then passes the data back and forth in parallel as required by the HPA processes.

Configuration requirements for SAS High-Performance Analytics

So while the diagram looks complicated, I believe we can distill this down into the following two requirements:

  1. The SAS Workspace Server must still have access to the user’s TGT.
  2. The HPA General or LASR Root Node must have access to the user’s TGT.

The simplest method of ensuring that both the SAS Workspace Server and the HPA General or LASR Root Node have access to the user’s TGT is to configure SSH to use Kerberos and to ensure the following options are set.

GSSAPIAuthentication yes
GSSAPICleanupCredentials yes
GSSAPIDelegateCredentials yes

Setting these options on all SAS High-Performance Analytics Environment machines should ensure all SSH connections made using Kerberos will have a valid TGT obtained for each open session. Remember for SSH to use Kerberos, the HOST Service Principal must be registered in the Kerberos Key Distribution Center and the HOST keytab must be available on each machine (normally stored in /etc/krb5.keytab).
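As an illustrative admin check (host name and principal below are hypothetical): the first two GSSAPI options belong in /etc/ssh/sshd_config on the server side, while GSSAPIDelegateCredentials is read by the SSH client, so it goes in ssh_config (or ~/.ssh/config) on the connecting machine. A session like this verifies the HOST keytab on a node:

```
$ klist -k /etc/krb5.keytab        # list the principals stored in the keytab
Keytab name: FILE:/etc/krb5.keytab
KVNO Principal
---- ------------------------------------------
   2 host/node1.example.com@EXAMPLE.COM
$ systemctl reload sshd            # pick up any sshd_config changes
```

If the host/ entry is missing from the keytab, the Kerberos SSH connections described above will fail before any SAS processing starts.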

If you have questions about configuring SAS High-Performance Analytics to access a Kerberos-authenticated Hadoop environment or have other suggestions, please share them in the comment area below.

tags: Hadoop, Kerberos, SAS Administrators, security
October 9, 2014

My previous posts covered understanding the fundamentals of Kerberos authentication and how we can simplify processes by placing SAS and Hadoop in the same realm. For SAS applications to interact with a secure Hadoop environment, we must address the third key practice:

Ensure Kerberos prerequisites are met when installing and configuring SAS applications that interact with Hadoop.

The prerequisites must be met during installation and deployment of SAS software, specifically SAS/ACCESS Interface to Hadoop for SAS 9.4.

1) Make the correct versions of the Hadoop JAR files available to SAS.

If you’ve installed other SAS/ACCESS products before, you’ll find installing SAS/ACCESS Interface to Hadoop is different. For other SAS/ACCESS products, you generally install the RDBMS client application and then make parts of this client available via the LD_LIBRARY_PATH environment variable.

With SAS/ACCESS to Hadoop, the client is essentially a collection of JAR files. When you access Hadoop through SAS/ACCESS to Hadoop, these JAR files are loaded into memory. The SAS Foundation interacts with Java through the jproxy process, which loads the Hadoop JAR files.

You will find the instructions for copying the required Hadoop JAR files and setting the SAS_HADOOP_JAR_PATH environment variable in the product documentation, including the SAS® 9.4 Hadoop Configuration Guide for Base SAS® and SAS/ACCESS®.

2) Make the appropriate configuration files available to SAS.

The configuration for the Hadoop client is provided via XML files. The cluster configuration changes when Kerberos is enabled for the Hadoop cluster, so you must remember to refresh the configuration files you give to SAS at that point. The XML files contain properties specific to security, and the files required depend on the version of MapReduce being used in the Hadoop cluster. When Kerberos is enabled, these XML configuration files must contain all the appropriate options for SAS and Hadoop to connect properly.

  • If you are using MapReduce 1, you need the Hadoop core, Hadoop HDFS, and MapReduce configuration files.
  • If you are using MapReduce 2, you need the Hadoop core, Hadoop HDFS, MapReduce 2, and YARN configuration files.

The files are placed in a directory available to SAS Foundation and this location is set via the SAS_HADOOP_CONFIG_PATH environment variable.  The SAS® 9.4 Hadoop Configuration Guide for Base SAS® and SAS/ACCESS® describes how to make the cluster configuration files available to SAS Foundation.
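As an illustration of what changes, once Kerberos is enabled the cluster’s core-site.xml typically carries security properties along these lines (a fragment only — the full set of required files and properties is described in the configuration guide):

```xml
<!-- core-site.xml fragment: present once Kerberos is enabled -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
```

If SAS is given configuration files captured before Kerberos was enabled, these properties will be missing and connections will fall back to the non-secure protocol and fail.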

3) Make the user’s Kerberos credentials available to SAS.

The SAS process will need to have access to the user’s Kerberos credentials for it to make a successful connection to the Hadoop cluster.  There are two different ways this can be achieved, but essentially SAS requires access to the user’s Kerberos Ticket-Granting-Ticket (TGT) via the Kerberos Ticket Cache.

Enable users to enter a kinit command interactively from the SAS server. My previous post Understanding Hadoop security described the steps required for a Hadoop user to access a Hadoop client:

  • launch a remote connection to a server
  • run a kinit command
  • then run the Hadoop client.

The same steps apply when you are accessing the client through SAS/ACCESS to Hadoop. You can make a remote SSH connection to the server where SAS is installed. Once logged into the system, you run the command kinit, which initiates your Kerberos credentials and prompts for your Kerberos password. This step obtains your TGT and places it in the Kerberos Ticket Cache. Once completed, you can start a SAS session and run SAS code containing SAS/ACCESS to Hadoop statements. This method provides access to the secure Hadoop environment, and SAS will interact with Kerberos to provide the strong authentication of the user.
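Put together, an interactive session of this kind might look like the following (host name, principal, and cache filename are hypothetical):

```
$ ssh sasserver.example.com     # connect to the server where SAS is installed
$ kinit                         # obtain the TGT; prompts for the password
Password for mary@EXAMPLE.COM:
$ klist                         # confirm the TGT is in the Ticket Cache
Ticket cache: FILE:/tmp/krb5cc_100001_vA8L2u
Default principal: mary@EXAMPLE.COM
$ sas mycode.sas                # run code with SAS/ACCESS to Hadoop statements
```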

However, in reality, how many SAS users run their SAS code by first making a remote SSH connection to the server where SAS is installed? Clearly, the SAS clients such as SAS Enterprise Guide or the new SAS Studio do not function in this way: these are proper client-server applications. SAS software does not directly interact with Kerberos. Instead, SAS relies on the underlying operating system and APIs to make those connections. If you’re running a client-server application, the interactive shell environment isn’t available, and users cannot run the kinit command. SAS clients need the operating system to perform the kinit step for users automatically. This requirement means that the operating system itself must be integrated with Kerberos, providing the user’s Kerberos password to obtain a Kerberos Ticket-Granting Ticket (TGT).

Integrate the operating system of the SAS server into the Kerberos realm for Hadoop. Integrating the operating system with Kerberos does not necessarily mean that the user accounts are stored in a directory server. You can configure Kerberos for authentication with local accounts. However, the user accounts must exist with all the same settings (UID, GID, etc.) on all of the hosts in the environment. This requirement includes the SAS server and the hosts used in the Hadoop environment.

Managing all these local user accounts across multiple machines creates considerable management overhead for the environment. As such, it makes sense to use a directory server such as LDAP to store the user details in one place. Then the operating system can be configured to use Kerberos for authentication and LDAP for user properties.

If SAS is running on Linux, you’d expect to use a PAM (Pluggable Authentication Module) configuration to perform this step, and the PAM should be configured to use Kerberos for authentication. This results in a TGT being generated as a user’s session is initialized.

The server where SAS code will be run must also be configured to use PAM, either through the SAS Deployment Wizard during the initial deployment or manually after the deployment is complete. Both methods update the sasauth.conf file in the <SAS_HOME>/SASFoundation/9.4/utilities/bin directory and set the value of methods to “pam”.

This step is not sufficient for SAS to use PAM.  You must also make entries in the PAM configuration that describe what authentication services are used when sasauth performs an authentication.  Specifically, the “account” and “auth” module types are required.  The PAM configuration of the host is locked down to the root user, and you will need the support of your IT organization to complete this step. More details are found in the Configuration Guide for SAS 9.4 Foundation for UNIX Environments.
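As a purely illustrative sketch (module names, control flags, and file layout vary by Linux distribution and site policy, so take this to your IT organization rather than copying it verbatim), a PAM service definition for sasauth that uses Kerberos might resemble:

```
# /etc/pam.d/sasauth -- illustrative only
auth     sufficient   pam_krb5.so
auth     required     pam_unix.so use_first_pass
account  sufficient   pam_krb5.so
account  required     pam_unix.so
```

The “auth” lines are what cause a TGT to be obtained as the credentials are validated; the “account” lines satisfy the account-checking phase that sasauth also performs.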

With this configuration in place, a Kerberos Ticket-Granting-Ticket should be generated as the user’s session is started by the SAS Object Spawner. The TGT will be automatically available for the client-server applications. On most Linux systems, this Kerberos TGT will be placed in the user’s Kerberos Ticket Cache, which is a file located, by default, in /tmp. The ticket cache normally has a name /tmp/krb5cc_<uid>_<rand>, where the last section of the filename is a set of random characters allowing for a user to log in multiple times and have separate Kerberos Ticket Caches.

Given that SAS does not know in advance what the full filename will be, the PAM configuration should define an environment variable KRB5CCNAME which points to the correct Kerberos Ticket Cache.  SAS and other processes use the environment variable to access the Kerberos Ticket Cache. Running the following code in a SAS session will print in the SAS log the value of the KRB5CCNAME environment variable:

%let krb5env=%sysget(KRB5CCNAME);
%put &KRB5ENV;

This should write something like the following to the SAS log (the exact cache filename will differ on your system):

43         %let krb5env=%sysget(KRB5CCNAME);
44         %put &KRB5ENV;
FILE:/tmp/krb5cc_100001_vA8L2u

Now that the Kerberos Ticket-Granting-Ticket is available to the SAS session running on the server, the end user is able to submit code using SAS/ACCESS to Hadoop statements that access a secure Hadoop environment.

In my next blog in the series, we will look at what happens when we connect to a secure Hadoop environment from a distributed High Performance Analytics Environment.

More information


tags: authentication, configuration, Hadoop, Kerberos, SAS Administrators, security
October 1, 2014

So, with the simple introduction in Understanding Hadoop security, configuring Kerberos with Hadoop alone looks relatively straightforward. Your Hadoop environment sits in isolation within a separate, independent Kerberos realm with its own Kerberos Key Distribution Center. End users can happily type commands as they log into a machine hosting the Hadoop clients. From the host machine they can run processing against the Hadoop services.

But how does SAS fit into this picture? Where will the SAS servers and clients be located in relation to the Hadoop Kerberos realm? This post provides more insight into the second of the four key practices for securing a SAS-Hadoop environment:

Simplify Kerberos setup by placing SAS and Hadoop within the same topological realm.

After reading this post, a coworker told me botanists have a term that fits this concept perfectly: monoecious, from the Greek meaning “one household”. Some trees like hollies and ginkgos have male and female flowers on separate plants, but for most plants, the connections of life are made much simpler by being monoecious, by ensuring the important elements are in close proximity. Here’s why that works for SAS-Hadoop-Kerberos too!

What happens if SAS and Hadoop are in different realms

It’s unlikely that many SAS and Hadoop environments will be installed at the same time. Often one or more already exists. If you have an existing SAS environment in your corporate realm and you’ve just followed the instructions from your Hadoop provider for configuring Kerberos, you’ll probably have the setup in Figure 1. SAS server and user authentication will happen in the corporate realm, while access to the Hadoop realm is governed by its own Kerberos Key Distribution Center.

However, the major piece missing from this environment is reflected in the green arrow at the top of the diagram: the trust relationship between the Corporate Domain and the new Hadoop Realm. A domain administrator must create these trusts by mapping users between the two realms. Without at least a one-way trust, SAS is not going to be able to interact with Hadoop at all. This topology will be one of the more complex arrangements: SAS administrators and their IT departments will need to set up all the required domain trusts represented by that little green arrow.

Once trusts are established, there are additional steps to ensure back-end Kerberos authentication for SAS processes running in the Corporate Realm. Ideally, to access Hadoop Services while running SAS processes, the operating system should be configured to perform the kinit step to obtain the correct Ticket Granting Ticket (TGT). Unless the operating system is given this capability, the SAS processes will be unable to request the Service Ticket and so will be unable to authenticate.

The simplest option for SAS administrators is to perform this step on the host running the SAS process as part of the session initialization. In this instance, the SAS session will be launched normally. For example, within a SAS Enterprise Guide session, the end user still enters a valid user name and password into the connection profile. This action sets up a back-end Kerberos authentication between the SAS process and the Hadoop services.


Placing SAS and Hadoop in the same realm

Now an alternative to setting up the domain trusts above would be to move the SAS Servers and SAS High Performance Analytics nodes into the same “household” as the Hadoop Key Distribution Center, as shown here in Figure 2. In this configuration, the end-user logs into the corporate realm and launches a SAS session by entering a user name and password into a SAS client. The same credentials used to start SAS Enterprise Guide, for example, are also valid in the Hadoop realm.

Authentication now takes place in the joint SAS-Hadoop realm without additional mapping required. The SAS servers and SAS High-Performance Analytics nodes can interact with the same Kerberos Key Distribution Center as the Hadoop services because all the components are within the same Kerberos realm.

This topology will greatly simplify the Kerberos setup for the SAS components. The Kerberos authentication within the Hadoop realm will be straightforward; the only complexity arises if the customer requires end-to-end Kerberos authentication, in which the SAS session itself is launched using Kerberos and Kerberos authentication flows from the user’s desktop through to the Hadoop services.



Where to find more information

SAS provides architecture documents that offer guidelines for ensuring your SAS-Hadoop environment is not only secure, but also offers faster response times.

tags: authentication, Hadoop, Kerberos, SAS Administrators, security
September 24, 2014

A challenge for you – do a Google search for “Hadoop Security” and see what types of results you get. You’ll find a number of vendor-specific pages talking about a range of projects and products attempting to address the issue of Hadoop security. What you’ll soon learn is that security is quickly becoming a big issue for all organizations trying to use Hadoop.

Many of you may already be planning or involved in Hadoop deployments involving SAS software. As a result, technical architects at SAS often get questions around security, particularly around end-point security capabilities. While there are many options for end-point and user security in the technology industry, the Hadoop Open Source Community is currently leading with a third-party authentication protocol called Kerberos.

Four key practices for securing a SAS-Hadoop environment

To access a fully operational and secure Hadoop environment, it is critical to understand the requirements, preparation and process around Kerberos enablement with key SAS products. There are four overall practices that help ensure your SAS-Hadoop connection is secure and that SAS performs well within the environment. Today’s post is the first in a series that will cover the details of each of these practices:

  1. Understand the fundamentals of Kerberos authentication and the best practices promoted by key Hadoop providers.
  2. Simplify Kerberos setup by placing SAS and Hadoop within the same topological realm.
  3. Ensure Kerberos prerequisites are met when installing and configuring SAS applications that interact with Hadoop.
  4. When configuring SAS and Hadoop jointly in a high-performance environment, ensure that all SAS servers are recognized by Kerberos.

What is Kerberos?

The Kerberos authentication protocol works via TCP/IP networks and acts as a trusted arbitration service, enabling:

  • a user account to access a machine
  • one machine to access different machines, data and data applications on the network.

Put in the simplest terms, Kerberos can be viewed as a ticket for a special event that is valid for that event only or for a certain time period. When the event is over or the time period elapses, you need a new ticket to obtain access.

How difficult is Hadoop and Kerberos configuration?

Creating a Kerberos Key Distribution Center is the first step in securing the Hadoop environment. At the end of this blog post, I’ve listed sources for standard instructions from the top vendors of Hadoop. Following their instructions will result in creating a new Key Distribution Center, or KDC, that is used to authenticate both users and server processes specifically for the Hadoop environment.

For example, with Cloudera 4.5, the management tools include all the required scripts to configure Cloudera to use Kerberos. Simply running these scripts after registering an administrator principal will result in Cloudera using Kerberos. This process can be completed in minutes after the Kerberos Key Distribution Center has been installed and configured.

How does Kerberos authenticate end-users?

A non-secure Hadoop configuration relies on client-side libraries to send the client-side credentials as determined from the client-side operating system as part of the protocol. While users are not fully authenticated, this method is sufficient for many deployments that rely on physical security. Authorization checks through ACLs and file permissions are still performed against the client-supplied user ID.

Once Kerberos is configured, Kerberos authentication is used to validate the client-side credentials. This means the client must request a Service Ticket valid for the Hadoop environment and submit this Service Ticket as part of the client connection. Kerberos provides strong authentication where tickets are exchanged between client and server and validation is provided by a trusted third party in the form of the Kerberos Key Distribution Center.

Step 1. The end user obtains a Ticket Granting Ticket (TGT) through a client interface.

Step 2. Once the TGT is obtained, the end-user client application requests a Hadoop Service Ticket. These two steps don’t always occur in this sequence because there are different mechanisms that can be used to obtain the TGT. Some implementations will require users to run a kinit command after accessing the machine running the Hadoop clients. Other implementations will integrate the Kerberos configuration in the host operating system setup. In this case, simply logging into the machine running the Hadoop clients will generate the TGT.

Step 3. Once the user has a Ticket Granting Ticket, the client requests the Service Ticket (ST) corresponding to the Hadoop Service the user is accessing. The ST is then sent as part of the connection to the Hadoop Service.

Step 4. The corresponding Hadoop Service must then authenticate the user by decrypting the ST using the Service Key exchanged with the Kerberos Key Distribution Center. If this decryption is successful the end user is authenticated to the Hadoop Service.

Step 5. Results are returned from the Hadoop service.
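After steps 1 through 3, the user’s Ticket Cache holds both the TGT and the Service Tickets. A klist listing on the client machine might look something like this (realm, principals, and times are hypothetical):

```
$ klist
Ticket cache: FILE:/tmp/krb5cc_100001
Default principal: mary@EXAMPLE.COM

Valid starting     Expires            Service principal
09/24/14 08:30:01  09/24/14 18:30:01  krbtgt/EXAMPLE.COM@EXAMPLE.COM
09/24/14 08:31:15  09/24/14 18:30:01  hive/hive.example.com@EXAMPLE.COM
09/24/14 08:31:16  09/24/14 18:30:01  nn/namenode.example.com@EXAMPLE.COM
```

The krbtgt/ entry is the TGT itself; the remaining entries are the Service Tickets obtained for the individual Hadoop services.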


Where to find more information about Hadoop and Kerberos

Providers such as Cloudera and Hortonworks have general guidelines around the role of Kerberos Authentication within their product offerings. You can access their suggested best practices via some of the key technology pages on their websites:

tags: Hadoop, Kerberos, SAS Administrators, SAS Professional Services, security
January 22, 2014

In the movie The Matrix Reloaded, our heroes and the KeyMaker frantically navigate from world to world through a series of doors and locks trying to escape the villains. Fortunately for our heroes, the KeyMaker always had the right key on his ring; he just had to know which key to use and take the time to find it, making for a very dramatic escape scene.

For security reasons, many platform solutions with multiple applications and servers require users to provide the correct key or credentials when accessed.  For years, the SAS Intelligence Platform has supported a single-key access called single sign-on (SSO) for its applications and servers.  Refer to Single Sign-On in the SAS Intelligence Platform for more information on SAS support for this feature.

With the latest release, SAS Data Management Platform is fully integrated with the SAS Intelligence Platform, and single sign-on is now available for applications  and servers that comprise the Data Management Platform, such as Data Management Studio and SAS Data Management Console.  Once single sign-on is configured, users can log on once to access the platform applications and features without providing additional credentials.  Using domain-enabled connections, database access can also be configured within a single sign-on platform deployment.

There are a lot of benefits to using single sign-on with the Data Management Platform.  A few that come to mind are:

  • Reduce the number of times re-entering the same credentials within the same platform
  • Limit the number of user and password combinations across the platform applications
  • Remove the need to store database credentials on the desktop or server using domain-enabled connections
  • Increase overall productivity and improve perceived ease-of-use for the platform features

Of course, there are always legitimate concerns when allowing a user to become the KeyMaker, with the so-called “keys to the castle” to access every application, report, or data file accessible from within the platform. Fortunately, the SAS Intelligence Platform has additional security settings such as roles for users and groups to limit access, even with single sign-on enabled. One other concern would be the negative impact on user productivity if the authentication system is unavailable to validate their credentials.

Had the KeyMaker in the movie been able to open the first door and have all of the other doors magically open, perhaps their escape from the villains would have been less dramatic. By contrast, opening all the doors in the Data Management Platform using single sign-on authentication can be very dramatic, so be sure to consider both the required level of security within your organization and the desired security level for users.

tags: data management, SAS Administrators, security, single sign-on
January 18, 2014

For several releases, SAS has supported a cryptographic hash function called MD5, or "message digest". In SAS 9.4 Maintenance 1, the new SHA256 function can serve the same purpose with a better implementation.

The job of a hash function is to take some input (of any type and of any size) and distill it to a fixed-length series of bytes that we believe should be unique to that input. As a practical example, systems use this to check the integrity of file downloads. You can verify that the published hash matches the actual hash after downloading.

Sometimes a hash is used to track changes in records within a database. You first calculate a hash value for each data record based on all of the fields. Periodically, you recheck those calculations. If a hash value changes for a data record, you know that some part of that record has changed since the last time you looked.

Here's another common use: storing passwords in a database. Because you can't (theoretically) reverse the hash process, you can use a hash function to verify that a supplied password is the same as a value you've stored, without having to store the original clear-text version of the password. It's not the same as encryption, because there is no decryption method that would compromise the original supplied password value.
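To make the password-verification idea concrete, here is a minimal sketch in SAS (the password string is invented, and a real system would also mix in a per-user salt before hashing, which this fragment omits for brevity):

```
data _null_;
   length stored supplied $32;           /* SHA256 yields a 32-byte digest */
   /* digest saved when the password was first registered */
   stored   = sha256("correct horse battery staple");
   /* digest of the password supplied at logon */
   supplied = sha256("correct horse battery staple");
   if stored = supplied then put "Password verified";
   else put "Password rejected";
run;
```

At no point does the clear-text password need to be stored; only the digests are compared.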

MD5 has known vulnerabilities, especially with regard to uniqueness. A malicious person can use a relatively low-powered computer to compute an input that produces an identical hash to one you've stored, thus compromising the algorithm's usefulness.

Enter the SHA256 algorithm. It's the same idea as MD5, but without the known vulnerabilities. Here's a program example:

data _null_;
  format hash $hex64.;
  hash = sha256("SHA256 is part of SAS 9.4m1!");
  put hash;
run;

Output (formatted as hexadecimal so as to be easier on the eyes than 256 ones-and-zeros):


As the name implies, it produces a value that is 256 bits (32 bytes) in size, as compared to 128 bits from MD5. Here's a useful article that compares the effectiveness of hash algorithms.

The SHA256 function was added to SAS 9.4 Maintenance 1. If you've been wanting to hash your data in SAS, but you've been pooh-poohing the MD5 function -- well, now is your chance!

tags: MD5, SAS 9.4, security, SHA256
November 20, 2013

If you’re not an expert on encryption, have no fear! SAS 9.4 has introduced ways to bring stronger encryption to your SAS deployment. The good news is that SAS/SECURE is now a part of Base SAS when you upgrade to SAS 9.4 and is not a separately licensed product anymore.

This is great news for our SAS administrators! But, what if you’re not an expert on encryption? Let’s take a look really quickly at the basics of encryption:

What is encryption?

Encryption refers to the process of protecting data. Encryption is the transformation of intelligible data (plaintext) into an unintelligible form (ciphertext) by means of a mathematical process. The ciphertext is translated back to plaintext when the appropriate key that is necessary for decrypting (unlocking) the ciphertext is applied. There are two primary forms of encryption:

  • Over-the-wire encryption protects data while it is in transit. Passwords in transit to and from SAS servers are encrypted or encoded.
  • On-disk encryption protects data at rest. Passwords in configuration files, metadata login passwords, and metadata internal account passwords are encrypted or encoded.

Cryptography refers to the science of encoding and decoding information to protect its confidentiality. Encryption is a type of cryptography.

Algorithm in encryption refers to the mathematical process that is applied to transform the plaintext into ciphertext. Examples of algorithms supported by SAS/SECURE include:

  • AES (Advanced Encryption Standard)
  • DES (Data Encryption Standard)
  • RC4 (a type of stream cipher, proprietary algorithm developed by RSA Data Security, Inc.).

AES is one of the most popular algorithms used in symmetric key cryptography and is newly available in SAS/SECURE with SAS 9.4. It is also the algorithm I will use in the examples below.

Why is SAS/SECURE important for SAS 9.4 users?

Now that you are an encryption expert, what can you do with it? Why should you be excited about SAS/SECURE being available with Base SAS in SAS 9.4? Here are a couple of key takeaways. SAS/SECURE brings:

  • a strong level of encryption to all SAS deployments running UNIX, Windows, or z/OS (except where prohibited by import restrictions).
  • a new encryption type for your stored passwords, SAS004 (AES encryption with 64-bit salt).

Please note that SAS/SECURE refers only to encryption, not to other security features such as authorization. For more, please read Encryption in SAS 9.4.

Encoding a password in Base SAS

The PWENCODE procedure enables you to encode passwords. Here is the syntax for PROC PWENCODE:
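In its simplest form (this is a sketch, not the complete option list), the procedure looks like this:

```sas
proc pwencode in='your-password' method=sas004;
run;
```

IN= supplies the password to encode, and the optional METHOD= option selects the encoding method; SAS004 is the AES-with-salt method mentioned above and requires SAS/SECURE.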

Encoded passwords can be used in place of plaintext passwords in SAS programs that access relational database management systems and various servers (such as SAS/CONNECT servers, SAS/SHARE servers, and SAS IOM servers such as the SAS Metadata Server).

  1. If you submit the following PROC PWENCODE statement:
  2. The log file shows these results. Notice that each character of the password is replaced by an X in the SAS log file.
  3. Plan to reuse. You have many options for reusing this encoded password. My favorite is creating a macro variable with the encoded password. Make sure to reference the macro variable in double quotes so that it resolves properly.
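Putting the three steps together, a sketch might look like this (the encoded string and the library connection details are placeholders, not real values):

```sas
proc pwencode in='MyPassw0rd' method=sas004;
run;
/* The encoded string appears in the SAS log, e.g. {SAS004}XXXXXXXX... */

/* Copy the logged value into a macro variable, and quote it when used */
%let dbpass={SAS004}XXXXXXXX;        /* placeholder for the logged value */
libname mydb oracle user=mary password="&dbpass" path=orapath;
```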

Protecting PDF output

PDF output is what many of our users tell me they use. Encryption of PDF files using ODS began in SAS 9.2. Since SAS/SECURE is now included in Base SAS 9.4, this has wider implications for more of our users. When your PDF file is not password-protected, any user can use Acrobat to view and edit it. You can encrypt and password-protect your PDF output files by specifying the PDFSECURITY= system option along with the PDFPASSWORD= system option. Here are the steps in the process:

  1. I start by viewing the security properties of a PDF file by opening the PDF file, right-clicking inside the document, selecting Document Properties from the menu, and then clicking Show Details. Here are my PDF properties before applying encryption:
  2. I can apply encryption and password protection to my ODS PDF file by simply adding an OPTIONS statement to my SAS program:
  3. Now when I try to open the PDF file, it prompts me for my password:
  4. Here are my PDF properties after applying encryption:
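The OPTIONS statement in step 2 can be as simple as this sketch (the password and file name are placeholders):

```sas
/* Encrypt and password-protect all subsequent PDF output */
options pdfsecurity=high pdfpassword=(open="Secret1");

ods pdf file="report.pdf";
proc print data=sashelp.class;
run;
ods pdf close;
```

Opening report.pdf now prompts for the OPEN= password; an OWNER= password can also be supplied to control permissions such as editing and printing.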

Using AES-encrypted data files

You must use both of the following options when you want to use AES encryption.

  • ENCRYPTKEY= data set option specifies a key value
  • ENCRYPT= data set option now supports AES encryption.

(Please note that AES encryption is not supported by the TAPE engine. You can use ENCRYPT=YES for TAPE engine encryption, which uses the SAS proprietary encryption algorithm that has been available with Base SAS since SAS 6.11.)

  1. To use encrypted AES data files, you must use SAS 9.4 or later AND SAS/SECURE software. To copy an encrypted AES data file, the output engine must support AES encryption. Also, and this is very important, if you forget to record the ENCRYPTKEY= value, you lose your data. SAS cannot assist you in recovering the ENCRYPTKEY= value. Please see this example DATA step for where to specify these options.
  2. The resulting message in the log file below displays a warning that I cannot open the file or recover the data without the encryption key.
  3. Then I can use the key to work with that data. You must use the ENCRYPTKEY= option whenever you create or access a SAS data set with AES encryption. This option only prevents access to the contents of the file. To protect the file from deletion or replacement, the file must also have an ALTER= password.
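Here is a sketch of those steps (the library name, key value, and passwords are placeholders):

```sas
/* 1. Create an AES-encrypted data set; losing the key means losing the data */
data mylib.salary(encrypt=aes encryptkey="Key2013!" alter="AlterPw1");
  set sashelp.class;
run;

/* 2. Reading it back requires the same ENCRYPTKEY= value */
proc print data=mylib.salary(encryptkey="Key2013!");
run;
```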

Please let me know how encryption in Base SAS 9.4 will be useful for you!



tags: encryption, SAS Administrators, security
7月 172013

I love tables. As a writer, there's nothing more satisfying to me than distilling complicated information into neat rows and columns. That's one of the features that caught my eye in SAS User ID and Password Usage Rules. The other is its potential usefulness for SAS administrators who manage SAS servers and clients on multiple platforms.

This five-page guide defines the terms user, internal account and external account and provides brief explanations. Two tables summarize the rules for user IDs and passwords for each type of account across Windows, Unix and Linux operating systems. There are plenty of references to more detailed SAS software and operating system guides.

A short post about a short paper! But I'd like your feedback on how useful you find this kind of summary information, especially the tables.

tags: SAS Administrators, security
5月 302013

Reading Jan Bigalke’s SAS Global Forum paper on “Hardening a SAS® Installation on a multi tier installation on Linux" reminded me of baking apple stack cake with my mother.  Neither is a simple project.  Both are time-consuming, and their success depends on how skillfully you handle each layer.

Data security is a global concern, and configuring SAS in a distributed computing environment with enhanced security and regulatory controls is a challenge SAS administrators must face more frequently. To meet today’s more stringent requirements, SAS administrators must understand the different technologies available for securing individual components of the architectural stack—options for all SAS components as well as options for any third-party components and tools.  In his most recent paper, Bigalke offers these suggestions and documents his approach for securing a multi-tier installation of SAS software in a Linux environment:

  • Understand the explicit security needs of the organization and the options available for meeting those needs. Bigalke based his configuration on the requirements of FIPS 140-2, the US government computer security standard.
  • Use single sign-on to minimize the need for providing user credentials. SAS Web applications and clients generally require users to enter credentials. 
  • Protect the Web components using reverse proxy and TLS/SSL signed certificates.  Web components are generally the most exposed, and these techniques will not only secure the connection but also be more convenient to the end-user.
  • Configure SAS clients, SAS metadata, Base SAS, and third-party data sources using appropriate authentication options. SAS 9.3 components that use WIP Services to connect to the SAS System offer direct LDAP authentication. You may also want to explore Java-based versus standard SAS-based functions for securing connections using TLS/SSL protocols.

 Other SAS Global Forum 2013 papers that cover security topics include:

For more information on security and configuration options, here’s a handful of recently published SAS configuration guides:


Image provided by Creative Commons

tags: papers & presentations, SAS Administrators, security