
February 8, 2018
 

In my last article, I worked with an example of using custom polygon data to create a regional geo map in SAS Visual Analytics 7.4. In this article, I will use almost the same example to illustrate the ease of implementing custom polygons to produce the same regional map in SAS Visual Analytics 8.2.

In this example, as in my last blog, the site has sales data for each sales region in the US and would like to display a geo map of the regions.

The six sales regions are:

Custom polygons in SAS Visual Analytics

We will again start with the MAPSGFK.US_STATES dataset, which contains the data required to overlay all states of the US on a VA region geomap and has these columns:

As in my last post, we will add the sales region (REGION) column and values using DATA step code, and then use PROC GREMOVE to remove the state boundaries, leaving only the region boundary points. For a look at that code, see my previous blog.

The following DATA step adds the necessary columns/values to the polygon dataset so that the data is in the form expected by VA. Note that the LAT and LONG columns are already unprojected, so we simply assign those values to Y and X; that way, our column names more closely match what we will see in the VA interface when creating the geographic data item. We also create a SEQUENCE column, required by VA 8.2, using the values of the automatic variable _N_.

data mydata.regions;
   set mydata.regions;
   sequence=_n_;   /* sequence column required by VA 8.2   */
   id=region;      /* the region value becomes the polygon ID */
   x=long;         /* LAT/LONG are already unprojected     */
   y=lat;
   keep ID SEQUENCE SEGMENT X Y;
run;

The polygon table, REGIONS,  now has the following columns.

The dataset containing the region and measure data, REGIONSALES, contains these columns:

Both datasets should be loaded into memory. Sign in to SAS Visual Analytics – Explore and Visualize Data and create a new report with data source REGIONSALES.
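If you prefer to load the tables programmatically rather than through the interface, here is a minimal sketch using PROC CASUTIL. It assumes an available CAS session, that both tables live in the MYDATA library, and that you are loading to the Public caslib; adjust names to your site.

cas mysess;                              /* start a CAS session                   */
proc casutil outcaslib="public";
   load data=mydata.regions casout="REGIONS" promote;          /* polygon table  */
   load data=mydata.regionsales casout="REGIONSALES" promote;  /* sales table    */
quit;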

Create a new Geography data item from REGION as shown below, also specifying a New Polygon Provider with values shown on the next several screen shots.  Give the new provider a name and label, and specify the CAS server, library, and table name.

Scroll down to add the ID, Sequence, Segment, latitude and longitude columns.

The new geography data item, after clicking OK:

Now create a Geo Map of type Regions as shown:

Creating a regional map with custom polygons in SAS Visual Analytics 8.2 was published on SAS Users.

February 8, 2018
 

By default, SAS Visual Analytics 7.4 supports country- and state-level polygons for regional geomaps. Custom shape files are now supported as well. This means that if a site has its own custom polygon data that defines custom regions, it's possible to create a region geomap that displays those regions.

Implementing the process requires completing some preparatory steps, specifically the execution of some SAS code, but the steps are explained in Appendix 2 of the SAS Visual Analytics 7.4: Administration Guide. The SAS program that completes the steps is provided for download at http://support.sas.com/rnd/datavisualization/vageo/va74polygons.sas.

Two examples using the program are provided in Appendix 2, for US counties and for German provinces. The instructions in Appendix 2 assume that the custom polygon data is provided in ESRI shape file format, which is likely the most common use case. The site will need access to a SAS programming environment and SAS/GRAPH software, and whoever completes the process will need access to the SAS Visual Analytics configuration directory and the ability to restart services, so an administrator-type person will be required.

One common request is to provide a regional geomap, where the regions are site-defined groups of states or provinces of a country. In this example problem, the site has sales data for each sales region in the US and would like to display a geo map of the regions.

Custom regional map in SAS Visual Analytics

For this type of region/province example, you will likely be able to use one of the maps already provided by SAS in the MAPSGFK library to produce your region boundaries. For more information on the datasets in the MAPSGFK library, see this paper.

The MAPSGFK.US_STATES dataset contains the data required to overlay all states of the US on a VA region geomap and has these columns:

The highlighted columns, STATECODE, LONG, and LAT, will be particularly useful, but first the sales region (REGION) column and values must be added using simple DATA step code. The unnecessary FIPS code column (STATE) can be dropped in the same DATA step. Note that the region values are assigned in upper case, as these will later be converted to ID values, which VA expects to be in upper case.

data regions;
   length region $ 12;
   drop state;
   set mapsgfk.us_states;
   if statecode in ('AK','HI','PR') then delete;
   else if statecode in ('WA','MT','OR','ID','WY')
      then region='NORTHWEST';
   else if statecode in ('CA','NV','UT','AZ','CO','NM')
      then region='SOUTHWEST';
   else if statecode in ('ND','SD','NE','MN','WI','MI','IA','IL','IN')
      then region='NORTHCENTRAL';
   else if statecode in ('KS','OK','TX','MO','AR')
      then region='SOUTHCENTRAL';
   else if statecode in ('ME','NH','VT','MA','RI','CT','NY','PA','NJ','OH','DE','MD','DC')
      then region='NORTHEAST';
   else if statecode in ('KY','WV','VA','TN','NC','MS','AL','LA','GA','SC','FL')
      then region='SOUTHEAST';
run;

The data is then sorted by the REGION values, a requirement of the SAS/GRAPH GREMOVE procedure, which is used to remove the internal state boundary data points, leaving the region boundary points only.

proc sort data=regions;
   by region;
run;

proc gremove data=regions out=mapscstm.regions1;
   by region;
   id statecode;
run;

To complete the process, since the LAT and LONG values are already in the form that VA needs (unprojected) and we are using a SAS dataset rather than the ESRI shape file format, we’ll only use a part of the code from the downloadable program mentioned at the beginning of the blog.

First, create a mapscstm directory under /SASHome/SASFoundation/9.4 to store the custom polygon dataset.  Make sure that the library is accessible to the SAS session by including a libname statement in the appserver_autoexec_usermods.sas file, found in config/Lev1/SASApp, and then restarting the Object Spawner.

Example:

libname MAPSCSTM "/SASHome/SASFoundation/9.4/mapscstm";

Tip:  Be sure to back up the original ATTRLOOKUP and CENTLOOKUP datasets before running any additional code, as you will be modifying the originals.
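For a quick way to take that backup, here is a minimal SAS sketch. It assumes a writable location for the hypothetical BACKUP libname, and that the lookup tables live in the VALIB library, as in the PROC SQL steps below.

libname backup "/sas/backup";      /* hypothetical backup directory   */
proc copy in=valib out=backup;
   select attrlookup centlookup;   /* copy both lookup datasets       */
run;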

To complete creation of the polygon dataset, you will need to execute only a part of the downloadable program to:
• Make sure that your polygon dataset has all of the columns expected by SAS Visual Analytics.
• Add the region attributes to the ATTRLOOKUP.
• Add the region center point locations to the CENTLOOKUP dataset.

%let REGION_LABEL=USRegions;            /* The label for the custom region    */
%let REGION_PREFIX=R1;                  /* unique ISO 2-letter code           */
%let REGION_ISO=000;                    /* unique ISO code                    */
%let REGION_DATASET=MAPSCSTM.REGIONS1;  /* Polygon data set to be created -
                                           be sure to use suffix "1"          */

Note that the downloadable program includes additional macro assignments and additional code, but since our data is already in the form of a SAS dataset, rather than ESRI shape file format, we won’t be using all of the code.

The following DATA step adds the necessary columns/values to the polygon dataset so that the data is in the form expected by VA. Note that the LAT and LONG columns are already in unprojected form, so we just assign the same values to X and Y. (VA doesn't actually use the X,Y columns from the polygon dataset.)

data &REGION_DATASET.;
   set &REGION_DATASET.;
   where density <= 3; 
   id=region;
   idname=region;
   x=long;  
   y=lat;
   ISO = "&REGION_ISO.";
   RESOLUTION = 1;
   LAKE = 0;
   ISOALPHA2 = "&REGION_PREFIX.";
   AdminType = "regions";
   keep ID SEGMENT IDNAME LONG LAT X Y ISO DENSITY RESOLUTION LAKE ISOALPHA2 AdminType;
   run;

Then PROC SQL steps are executed to add rows for the custom polygons to the ATTRLOOKUP and CENTLOOKUP datasets:

This step adds the USRegions row to ATTRLOOKUP:

proc sql;
   insert into valib.attrlookup
      values ( 
         "&REGION_LABEL.",         /* IDLABEL=State/Province Label */
         "&REGION_PREFIX.",        /* ID=SAS Map ID Value */
         "&REGION_LABEL.",         /* IDNAME=State/Province Name */
         "",                       /* ID1NAME=Country Name */
         "",                       /* ID2NAME */
         "&REGION_ISO.",           /* ISO=Country ISO Numeric Code */
         "&REGION_LABEL.",         /* ISONAME */
         "&REGION_LABEL.",         /* KEY */
         "",                       /* ID1=Country ISO 2-Letter Code */
         "",                       /* ID2 */
         "",                       /* ID3 */
         "",                       /* ID3NAME */
         0                         /* LEVEL (0=country level, 1=state level) */
         );
quit;

This step adds a row to ATTRLOOKUP for each individual region:

proc sql;
   insert into valib.attrlookup
      select distinct 
         IDNAME,            /* IDLABEL=State/Province Label */
         ID,                /* ID=SAS Map ID Value */
         IDNAME,            /* IDNAME=State/Province Name */
         "&REGION_LABEL.",  /* ID1NAME=Country Name */
         "",                /* ID2NAME */
         "&REGION_ISO.",    /* ISO=Country ISO Numeric Code */
         "&REGION_LABEL.",  /* ISONAME */
         trim(IDNAME) || "|&REGION_LABEL.",  /* KEY */
         "&REGION_PREFIX.",   /* ID1=Country ISO 2-Letter Code */
         "",                  /* ID2 */
         "",                  /* ID3 */
         "",                  /* ID3NAME */
         1                    /* LEVEL (1=state level) */
   from &REGION_DATASET.;
quit;

This step calculates and adds the central location point for each of the regions to the CENTLOOKUP dataset.   The site data contains only the 48 contiguous states (no Alaska or Hawaii). If Alaska and Hawaii had been included, a different algorithm would need to be used to calculate the central location.

proc sql;
   /* Add custom region */
   insert into valib.centlookup
      select distinct
         "&REGION_DATASET." as mapname,
         "&REGION_PREFIX." as ID,
         avg(x) as x,
         avg(y) as y
      from &REGION_DATASET.;
 
   /* Add custom provinces */
   insert into valib.centlookup
      select distinct
         "&REGION_DATASET." as mapname,
         ID as ID,
         avg(x) as x,
         avg(y) as y
      from &REGION_DATASET.
         group by id;
quit;

After executing the code above, you will need to restart the Web Application server, so that SAS Visual Analytics has access to the new polygons.
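On Linux, that restart typically looks something like the sketch below. The instance name SASServer1_1 is an assumption; use the server instance that hosts your SAS Visual Analytics web applications.

cd <SAS_CONFIG_ROOT>/Lev1/Web/WebAppServer/SASServer1_1/bin
./tcruntime-ctl.sh restart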

Code is also included in the downloadable program to create a dataset for validating your results. The validation dataset includes columns for the ID and IDNAME of the regions, in addition to two randomly calculated measures. In our case, we will instead just use our original REGIONSALES dataset containing the regional sales data.

1. Sign into SAS Visual Analytics and create a new exploration with data source REGIONSALES.
2. Create a Geo data item from Region: right-click Region, and select Geography > Subdivision (State, Province) Names. From the Country or Region drop-down list, select the USRegions region label.
3. Create a geo map visualization. Select Regions for the map style, Region for the Geography role, and salesamt for the Color role.

Your regions should display, similar to this:

You can also include the region data item in a hierarchy with the state data item to produce a drill-down region map:

Or a bubble or coordinate map:

I hope this example has been helpful to users of SAS Visual Analytics 7.4.  In my next blog, you will see that this process is tremendously simplified by new mapping features in SAS Visual Analytics 8.2.

Creating a custom regional map in SAS Visual Analytics 7.4 was published on SAS Users.

January 17, 2018
 

In this article, I want to give you an overview of the authentication options available with SAS Viya 3.3. SAS Viya 3.3, released in the second week of December 2017 and the second release with the new microservices architecture, presents more options for authentication than the previous releases. In future posts, we will delve into more detail on selected options.

Types of Deployment

Before we look at the options for authentication we need to define some terms to help us describe the type of environment. The first of these is the type of deployment. With SAS Viya 3.3 we can have two different types of deployment:

  1. Full Deployment
  2. Programming Only

As the name suggests, the full deployment is a deployment of all the different components that make up the ordered SAS Viya 3.3 product or solution. This includes the SAS Viya runtime engine, CAS (Cloud Analytic Services), the microservices, stateful services, and foundation components used by SAS® Studio.

The programming only deployment more closely resembles the deployment we saw in an earlier release; so, this includes CAS and all the parts for SAS Studio to function. A programming only deployment does not include the microservices and stateful services. The only interaction with CAS is via SAS Studio and the code end-users run within this.

Types of Interfaces

Following on from the type of deployment, we can classify the end-user interfaces used to access SAS Viya 3.3. The interface could be a visual interface or a programming interface. By visual interface, we mean all the SAS Viya 3.3 web applications, excluding SAS Studio. By programming interface, we mean SAS Studio. Equally, when we say a programming interface accesses CAS, we could also mean the Python, Lua, R, or Java interfaces.

Similarly, as of the fifth maintenance release of SAS 9.4 we can interact directly with CAS. Previously, this interaction was based around the use of SAS/CONNECT® and remote submitting code to the SAS Viya programming interface. With SAS 9.4 M5, we can now directly connect to CAS from the SAS foundation. So, a third type of interface for us to consider is the SAS 9.4 M5 client.
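For illustration, connecting from a SAS 9.4 M5 session to CAS can be as simple as the following sketch. The host name and port are assumptions; substitute your own CAS controller.

options cashost="cas-controller.example.com" casport=5570; /* hypothetical CAS controller */
cas mysess;                           /* start a CAS session from SAS 9.4 M5 */
libname mycas cas sessref=mysess;     /* surface CAS tables through a libref */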

Visual Interfaces Authentication

As we know, with SAS Viya 3.3 the way the end-user authenticates to the visual interfaces is via the SAS® Logon Manager. The SAS Logon Manager is accessed via the HTTP Proxy. The following picture summarizes the options for authenticating to the SAS Logon Manager in SAS Viya 3.3.

SAS Viya 3.3 authentication options

The first thing to point out and something to always remember is the following:

The identities microservice always must connect to an LDAP provider to obtain user and group information.

This LDAP provider could be Microsoft Active Directory or any other LDAP provider such as OpenLDAP.

So, what are our options for authenticating the users accessing SAS Logon Manager? We have five options with the SAS Viya 3.3:

  1. LDAP Provider (the default option)
  2. Kerberos or Integrated Windows Authentication
  3. OAuth/OpenID Connect
  4. SAML
  5. Multi-factor Authentication (new with SAS Viya 3.3)

Option 1, the LDAP provider, is the default authentication mechanism enabled out of the box with SAS Viya 3.3. The same connection details used by the identities microservice are used by SAS Logon Manager to authenticate the credentials the end-user enters in the logon form. From a security perspective, we need to be concerned about which network connections these end-user credentials will be sent over. First, we have the network connection between the browser and the HTTP proxy, which is secured by default with HTTPS in SAS Viya 3.3. Then we have the network connection between SAS Logon Manager and the LDAP provider; here we can use LDAPS to encapsulate the LDAP connection in standard TLS encryption.

Option 2, as shown in the diagram, is to configure SAS Logon Manager for Kerberos authentication. This provides the end-user with Single Sign-On from their desktop where the browser is running. This is sometimes referred to as Integrated Windows Authentication (IWA). This will enable the end-user to access the SAS Viya 3.3 visual interfaces without being prompted to enter any credentials. However, it is important to remember that the identities microservice will still be connecting to the LDAP provider. The Kerberos authentication option completely replaces the option to use the default LDAP provider for the SAS Logon Manager. Introduced with SAS Viya 3.3 is the option to delegate the credentials from SAS Logon Manager through to CAS; more on this option below.

Option 3 enables the SAS Logon Manager to be integrated with an alternative OAuth/OpenID Connect provider. This provider could be something internal to the customer's wider environment or could be external to the customer, such as Google Auth or Facebook. When the OAuth/OpenID Connect option is configured, it does not completely replace the default LDAP provider. Instead, when end-users access the SAS Logon Manager they are presented with both a link to authenticate using OAuth/OpenID Connect and the standard login form using the LDAP provider. The end-user can then select which to use. This option can provide single sign-on from the OAuth/OpenID Connect provider; for example, sign in to your Google account and access the SAS Viya 3.3 visual interfaces without further prompting for credentials. Custom code can be added to the SAS Logon Manager login form that automatically links to the external OAuth/OpenID Connect provider. This makes the single sign-on more seamless, since there is no need to select the link.

Option 4 supports configuring the SAS Logon Manager to be integrated with an external SAML Identity Provider. This SAML Identity Provider could be internal or external to the customer's wider environment. If it is internal, it could be something like Oracle Access Manager or Active Directory Federation Services, whilst if it's external it could be something like salesforce.com. Again, like option 3, the use of SAML does not completely replace the default LDAP provider. End-users accessing the SAS Logon Manager will be able to choose SAML authentication or the default LDAP provider. Also, this option provides single sign-on with the third-party SAML provider. Custom code can be added to the SAS Logon Manager login form that automatically links to the external SAML provider, making the single sign-on more seamless, since there is no need to select the link.

Option 5 supports the use of multi-factor authentication with SAS Logon Manager. This is a new option with SAS Viya 3.3 and requires the configuration of a third-party Pluggable Authentication Module (PAM). This PAM module is the part of the system that integrates with the multi-factor authentication provider, such as Symantec's VIP. The PAM module authenticates the end-user by causing the third party to push an out-of-band validation request to the end-user. Normally, this would be a push message to a smartphone application; approving the request forms the additional factor in the authentication of the end-user. When end-users enter their username and password in the SAS Logon Manager form, they are checked against the PAM provider. This means this option replaces the LDAP provider, just as with Kerberos.

For all five options listed above, the connection to CAS is performed using internal OAuth tokens generated by the SAS Logon Manager. In most cases the actual session started by the CAS Controller will now run on the operating system as the same user who launched the CAS operating system service. This account defaults to the name cas.

The exception to this is option 2, Kerberos with delegation. In this case, while an OAuth token is generated and initially used to connect to CAS, a second authentication takes place with the delegated Kerberos credentials. This means that the CAS session is started as the end-user and not as the user who launched the CAS operating system service.

Programming Interfaces Authentication

Now we’ve looked at the visual interfaces for SAS Viya 3.3, what about the programming interfaces or SAS Studio? Unlike SAS 9.4, SAS Studio with SAS Viya 3.3 is not integrated with the SAS Logon Manager. The following diagram illustrates the case with SAS Studio.

SAS Viya 3.3 authentication options

SAS Studio in the full deployment is integrated with the HTTP Proxy, so with SAS Viya 3.3 end-users do not directly connect to the SAS Studio web application. However, the username and password entered into SAS Studio are not passed to the SAS Logon Manager to authenticate. Instead the SAS® Object Spawner uses the PAM configuration on the host to validate the username and password. This could be a local account on the host or, depending on the PAM configuration, an account in an LDAP Provider. This authentication is sufficient to start the SAS® Workspace Server where the code entered in SAS Studio will be run.

When the SAS Workspace Server connects to CAS it uses the username and password that were used to start the SAS Workspace Server. The CAS Controller uses its own PAM configuration to validate the end-user’s credentials and launch the session process running as the end-user.

Since CAS is integrated into the visual components, and the username and password are passed from the SAS Workspace Server, the CAS Controller uses them to obtain an internal OAuth token from the SAS Logon Manager. This means that the username and password must be valid in the provider configured for the SAS Logon Manager otherwise CAS will not be able to obtain an OAuth token and the session launch will fail.

Therefore, it makes sense in such a deployment for all three components:

  1. PAM for SAS Studio (sasauth*)
  2. PAM for CAS (cas)
  3. SAS Logon Manager

to all use the same LDAP provider. If these three components are not sending the username and password entered in SAS Studio to the same place, we are likely to see errors when trying to connect.
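As a rough illustration of what this alignment can look like on a Red Hat-style system, both PAM service files might simply delegate to the system-wide stack that is itself pointed at the LDAP provider. Module stacks vary by distribution and site, so treat this only as a sketch:

# hypothetical contents of /etc/pam.d/sasauth and /etc/pam.d/cas
auth     include  password-auth
account  include  password-auth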

Programming Only Deployment

For a programming-only deployment, we have SAS Studio and CAS but we don't have any microservices or stateful services. So here all authentication is via the PAM configuration for SAS Studio and CAS. Since CAS knows there are no microservices, it does not attempt to obtain an internal OAuth token from the SAS Logon Manager. This is the same type of setup we had with SAS Viya 3.1.

SAS 9.4 Maintenance 5 Integration

There are three main ways in which SAS 9.4 Maintenance 5 can integrate with CAS. First, if the SAS 9.4 M5 session has access to a Kerberos credential for the end-user, then Kerberos can be used for the authentication. For example, if Kerberos is used by the end-user to access the SAS 9.4 M5 client, such as a web application or SAS Enterprise Guide, the authentication can be delegated all the way through to CAS. Kerberos will then be used to authenticate to SAS Viya Logon Manager and obtain an OAuth token.

Second, if the SAS 9.4 M5 session has access to the end-user's username and password, either from the cached credentials used to launch the session, from an authinfo file, or from SAS 9.4 Metadata, then these credentials can be used to authenticate to CAS. The username and password will be used to launch the CAS session and obtain an OAuth token from SAS Viya Logon Manager. This is like the programming approach we detailed above.

Finally, for SAS 9.4 Maintenance 5 sessions which are not running as the end-user, we also have a solution. These sessions could be SAS® Stored Process or Pooled Workspace Server sessions, or even a SAS token launched workspace server. For these sessions, we can leverage the SAS® 9.4 Metadata Server to generate a one-time password. This is the same way in which the SAS Stored Process itself is accessed. To be able to leverage the one-time password with CAS, additional configuration is required in SAS Viya Logon Manager: it must be configured with the URL of the SAS® 9.4 Web Infrastructure Platform, whose services will be used to validate the one-time password. All this means that CAS can be accessed from a SAS 9.4 metadata-aware connection where end-user operating system credentials are not available.

Conclusion

I hope that this overview has provided some context to the different types of authentication happening within and to a SAS Viya 3.3 deployment. Understanding the types of authentication available will be important for helping customers to select the best options for them. In future blog posts, we’ll look at the different new options in more detail.

SAS Viya 3.3 authentication options was published on SAS Users.

December 20, 2017
 

Running SAS programs in batch

While SAS program development is usually done in an interactive SAS environment (SAS Enterprise Guide, SAS Display Manager, SAS Studio, etc.), when it comes to running SAS programs in a production or operations environment, it is routinely done in batch mode.

Why run SAS programs in batch mode?

First and foremost, this is done for automation, as the batch process does not require human participation at the time of the run. It can be scheduled to run (using an operating system scheduler or other scheduling software) while we sleep, at any time of the day or at any time interval between two consecutive runs.

Running SAS programs in batch mode allows streamlining SAS processing by eliminating the possibility of human error and by submitting multiple SAS jobs (programs) all at once or in a sequence, while honoring program and/or data dependencies.

SAS batch processing also takes care of self-documenting, as it automatically generates and stores SAS logs and outputs.

Imagine the following scenario. Every night, a SAS batch process “wakes up” at 3 a.m. and runs an ETL process on a SAS Application server that extracts multiple tables from a database, transforms, combines, and loads them into a SAS datamart; then moves some data tables across the network and loads them into SAS LASR server, so when you are back to work in the morning your SAS Visual Analytics application has all its data refreshed and ready to roll. Of course, the process schedule can be custom-tailored to your particular needs; your batch jobs may run every 15 minutes, once a week, every first Friday of the month – you name it.

What is a batch script file?

To submit a single SAS program in batch mode manually, you could submit an OS command that looks something like the following:

Unix/Linux

sas /sas/code/proj1/job1.sas -log /sas/code/proj1/job1.log

DOS/Windows

"C:\Program Files\SASHome\SASFoundation\9.4\Sas.exe" -SYSIN c:\proj1\job1.sas -NOSPLASH -ICON -LOG c:\proj1\job1.log

However, submitting an OS command manually has too many drawbacks: it’s too much typing, it only submits one SAS program at a time, and most importantly – it is manual, which means it is prone to human error.

Usually, these OS commands are packaged into so-called batch files (shell scripts in Unix) that allow for sequential, parallel, as well as conditional execution of multiple OS line commands. They can be run either manually or automatically – on schedule, or called by other batch scripts.

In a Windows/DOS Operating System, these script files are called batch files and have .bat filename extensions. In Unix-like operating systems, such as Linux, these script files are called shell scripts and have .sh filename extensions.

Since Windows batch files are similar to, but slightly different from, Unix (and its open-source cousin Linux) shell scripts, in the examples below we are going to use Unix/Linux shell scripts only, in order to avoid any confusion. We will use the terms Unix and Linux interchangeably.

Here is the typical content of a Linux shell script file to run a single SAS program:

#!/usr/bin/sh
dtstamp=$(date +%Y.%m.%d_%H.%M.%S)                 # current date/time stamp
pgmname="/sas/code/project1/program1.sas"          # SAS program to run
logname="/sas/code/project1/program1_$dtstamp.log" # uniquely named SAS log
/sas/SASHome/SASFoundation/9.4/sas $pgmname -log $logname

Note that the shell script syntax allows for some basic programming features like a current datetime function, formatting, and variables. It also provides conditional processing similar to "if-then-else" logic. For detailed information on the shell scripting language, you may refer to the following BASH shell script tutorial or any other source covering one of the many dialects or flavors of shell scripting (C Shell, Korn Shell, etc.).

Let’s save the above shell script as the following file:
/sas/code/project1/program1.sh

How to submit a SAS program via Unix script

In order to run this shell script we would submit the following Linux command:
/sas/code/project1/program1.sh

Or, if we navigate to the directory first:
cd /sas/code/project1

then we can submit an abbreviated Linux command
./program1.sh

When run, this shell script not only executes a SAS program (program1.sas), but for every run it also creates and saves a uniquely named SAS log file. You may create the SAS log file in the same directory where the SAS code is stored, as specified in the shell script above, or specify another directory of your choice.

For example, it creates the following SAS log file:
/sas/code/project1/program1_2017.12.06_09.15.20.log

The file name uniqueness is achieved by adding a date/time stamp suffix between the SAS program name and .log file name extension, in this particular case indicating that this SAS log file was created on December 6, 2017, at 09:15:20 (hours:minutes:seconds).

Unix script for submitting multiple SAS programs

Unix scripts may contain not only OS commands, but also other Unix script calls. You can mix-and-match OS commands and other script calls.

When scripts are created for each individual SAS program that you intend to run in a batch, you can easily combine them into a program flow by creating a flow script containing those single program scripts. For example, let’s create a script file /sas/code/project1/flow1.sh with the following contents:

/sas/code/project1/program1.sh
/sas/code/project1/program2.sh
/sas/code/project1/program3.sh

When submitted as

/sas/code/project1/flow1.sh

it will sequentially execute three scripts - program1.sh, program2.sh, and program3.sh, each of which will execute the corresponding SAS program - program1.sas, program2.sas, and program3.sas, and produce three SAS logs - program1.log, program2.log, and program3.log.
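Shell scripts also make it easy to run some of the programs in parallel. A minimal sketch, assuming the same three scripts, runs the first two concurrently and starts the third only after both finish:

#!/usr/bin/sh
# run program1 and program2 in parallel
/sas/code/project1/program1.sh &
/sas/code/project1/program2.sh &
wait                                  # block until both background jobs finish
/sas/code/project1/program3.sh        # then run program3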

Unix script file permissions

In order to be executable, UNIX script files must have certain permissions. If you create the script file and want to execute it yourself only, the file permissions can be as follows:

-rwxr-----, or 740 in octal representation.

This means that you (the Owner of the script file) have Read (r), Write (w), and Execute (x) permissions; the Group owning the script file has only Read (r) permission; and Others have no permissions on the script file at all.

If you want to give yourself (Owner) and Group execution permissions then your script file permissions can be as:

-rwxr-x---, or 750 in octal representation.

In this case, your group has Read (r) and Execute (x) permissions.

In Unix, file permissions are assigned using the chmod Unix command.
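For example, either of the permission sets above can be applied like this:

chmod 740 /sas/code/project1/program1.sh             # -rwxr-----
chmod 750 /sas/code/project1/program1.sh             # -rwxr-x---
chmod u=rwx,g=rx,o= /sas/code/project1/program1.sh   # same as 750, in symbolic form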

Note that in both examples above we do not give Others any permissions at all. Remember that file permissions are a security feature, and you should assign them at the minimum level necessary.

Conditional execution of scripts and SAS programs

Here is an example of a Unix script file that allows running multiple SAS programs and OS commands at different times.

#!/bin/sh

#1 extract data from a database
/sas/code/etl/etl.sh

#2 copy data to the Visual Analytics autoload directory
scp -B userid@sasAPPservername:/sas/data/*.sas7bdat userid@sasVAservername:/sas/config/.../AutoLoad

#3 run weekly, every Monday
dow=$(date +%w)
if [ $dow -eq 1 ]
then
   /sas/code/alerts_generation.sh
fi

#4 run monthly, first Friday of every month
dom=$(date +%d)
if [ $dow -eq 5 -a $dom -le 7 ]
then
   /sas/code/update_history.sh
   /sas/code/update_transactions.sh
fi

In this script, the following logical operators are used: -eq (equal), -le (less or equal), -a (logical and).

As you can see, the script logic takes care of branching to execute different SAS programs when certain timing conditions are met. With such an approach, you would need to schedule only this single script to run at a specified time/interval, say daily at 3 a.m.

In this case, the script will “wake up” every morning at 3 a.m. and execute its component scripts either unconditionally, or conditionally.

If one of the included programs needs to run at a different, lesser frequency (e.g. every Monday, or monthly on first Friday of every month) the script logic will trigger those executions at the appropriate times.

In the above script example, steps #1 and #2 will execute every time (unconditionally) the script runs (daily). Step #1 runs an ETL program to extract data from a database; step #2 copies the extracted data across the network from the SAS Application server to the SAS LASR Analytic server's drop zone, from where it is automatically loaded (autoloaded) into LASR.

Step #3 will run conditionally every Monday ($dow -eq 1). Step #4 will run conditionally every first Friday of a month ($dow -eq 5 -a $dom -le 7).

For more information on how to format date for use in shell scripts please refer to this post.

Do you run your SAS programs in batch?

Please share your batch experiences in the comment section below. I am sure the rest of us will really appreciate it!

Running SAS programs in batch under Unix/Linux was published on SAS Users.

December 19, 2017
 

In my last article, Managing SAS Configuration Directory Security, we stepped through the process for granting specific users more access without opening up access to everyone. One example addressed how to modify security for autoload. There are several other aspects of SAS Visual Analytics that can benefit from a similar security model.

You can maintain a secure environment while still providing one or more select users the ability to:

  • start and stop a SAS LASR Analytic Server.
  • load data to a SAS LASR Analytic Server.
  • import data to a SAS LASR Analytic Server.

Requirements for these types of users fall into two areas: metadata and operating system.

The metadata requirements are very well documented and include:

  • an individual metadata identity.
  • membership in appropriate groups (for example: Visual Analytics Data Administrators for SAS Visual Analytics suite level administration; Visual Data Builder Administrators for data preparation tasks; SAS Administrators for platform level administration).
  • access to certain metadata (refer to the SAS Visual Analytics 7.3: Administration Guide for metadata permission requirements).

Operating System Requirements

Users who need to import data, load data, or start a SAS LASR Analytic Server need the ability to authenticate to the SAS LASR Analytic Server host and write access to some specific locations.

If the SAS LASR Analytic Server is distributed, users need:

If the compute tier (the machine where the SAS Workspace Server runs) is on Windows, users need the Log on as a batch job user right on the compute machine.

In addition, users need write access to the signature files directory, the path for the last action logs for the SAS LASR Analytic Server, and the PIDs directory in the monitoring path for the SAS LASR Analytic Server.

Signature Files

There are two types of signature files: server signature files and table signature files. Server signature files are created when a SAS LASR Analytic Server is started. Table signature files are created when a table is loaded into memory. The location of the signature files for a specific SAS LASR Analytic Server can be found on the Advanced properties of the SAS LASR Analytic Server in SAS Management Console.

SAS Configuration Directory Security for SAS Visual Analytics

On Linux, if your signature files are in /tmp you may want to consider relocating them to a different location.

Last Action Logs and the Monitoring Path

In the SAS Visual Analytics Administrator application, logs of interactive actions for a SAS LASR Analytic Server are written to the designated last action log path. The standard location is on the middle tier host in <SAS_CONFIG_ROOT>/Lev1/Applications/SASVisualAnalytics/VisualAnalyticsAdministrator/Monitoring/Logs. The va.lastActionLogPath property is specified in the SAS Visual Analytics suite-level properties. You can access the SAS Visual Analytics suite-level properties in SAS Management Console under the Configuration Manager: expand SAS Application Infrastructure, right-click Visual Analytics 7.3 to open the properties, and select the Advanced tab.

The va.monitoringPath property specifies the location of certain monitoring process ID files and logs. The standard location is on the compute tier in <SAS_CONFIG_ROOT>/Lev1/Applications/SASVisualAnalytics/VisualAnalyticsAdministrator/Monitoring/. This location includes two subdirectories: Logs and PIDs. You can override the default monitoring path by adding the va.monitoringPath extended attribute to the SAS LASR Analytic Server properties.

Host Account and Group

For activities like starting the SAS LASR Analytic Server you might want to use a dedicated account such as lasradm or assign the access to existing users. If you opt to create the lasradm account, you will need to also create the related metadata identity.

For group level security on Linux, it is recommended that you create a new group, for example sasusers, to reserve the broader access provided by the sas group to only platform level administrators. Be sure to include in the membership of this sasusers group any users who need to start the SAS LASR Analytic Server or that need to load or import data to the SAS LASR Analytic Server.

Since the last action log path, the monitoring path, and the autoload scripts location all fall under <SAS_CONFIG_ROOT>/Lev1/Applications/SASVisualAnalytics/VisualAnalyticsAdministrator, you can modify the ownership of this folder to get the right access pattern.
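As a sketch only (paths, owner, and group must be adjusted to your site; the lasradm account and sasusers group are the examples introduced above), the ownership change might look like:

cd <SAS_CONFIG_ROOT>/Lev1/Applications/SASVisualAnalytics
chown -R lasradm:sasusers VisualAnalyticsAdministrator   # give the group access
chmod -R 770 VisualAnalyticsAdministrator                # owner and group full access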

A similar pattern can also be applied to the back-end store location for the data provider library that supports reload-on-start.

Don’t forget to change the ownership of your signature files location too!

SAS Admin Notebook: Managing SAS Configuration Directory Security for SAS Visual Analytics was published on SAS Users.

December 7, 2017
 

authorization in SAS Viya

Authorization determines what a user can see and do in an application. An authorization system is used to define access control policies, and those policies are later enforced so that access requests are granted or denied. To secure resources in SAS Viya there are three authorization systems of which you need to be aware.

The General Authorization system secures folders within the SAS Viya environment and the content of folders, for example, reports or data plans. It can also secure access to functionality.

The CAS Authorization system secures CAS libraries, tables (including rows and columns) and CAS actions.

The File system authorization system secures files and directories on the OS, for example code or source tables for CAS libraries.

In this post, I will provide a high-level overview of the General and CAS authorization systems.  If you want to dig into more detail please see the SAS Viya Administration Guide Authorization section.

An important factor in authorization is the identity of the user making the request. In a visual deployment of SAS Viya the General and CAS authorization systems use the same identity provider, an LDAP server. The other common feature between the two authorization systems is that they are deny-based. In other words, access to resources is implicitly disallowed by default. This is important as it marks a change for those familiar with metadata authorization in SAS 9.4.

You can administer both general and CAS authorization from SAS Environment Manager. CAS authorization may also be administered from CAS Server Monitor and from the programming interfaces via the accessControl action set. In SAS Viya 3.3, command-line interfaces are available to set authorization for both systems.

General Authorization System

The general authorization system is used to administer authorization for folders, content and functionality. The general authorization system is an attribute based authorization system which determines authorization based on the attributes of the:

  • Subject: attributes that describe the user attempting the access (e.g., user, group, department, role, job title, etc.).
  • Action: attributes that describe the action being attempted (e.g., read, delete, view, update, etc.).
  • Resource (or object): attributes that describe the object being accessed (e.g., the object type, location, etc.).
  • Context (environment): attributes that deal with time, location, etc.

This attribute model supports Boolean logic, in which rules contain “IF, THEN” statements about who is making the request, the resource, the context and the action.

In the general authorization system, information about the requesting user, the target resource, and the environment can influence access. Each access request has a context that includes environmental data such as time and device type. Environmental constraints can be incorporated using conditions.

Permission inheritance in the general authorization system flows through a hierarchy of containers. Each container conveys settings to its child members. Each child member inherits settings from its parent container. Containers can be folders, or rest endpoints which contain functionality. For example, a folder will pass on its permissions to any children which can be additional sub-folders, or content such as reports or data plans.

In the general authorization system, precedence, the way in which permission conflicts are resolved, is very simple: the only factor that affects precedence is the type of rule (grant or prohibit). Prohibit rules have absolute precedence. If there is a relevant prohibit rule, effective access is always prohibited.

So, a deny anywhere in the object or identity hierarchy for the principal (user making the request) means that access is denied.

CAS Authorization

The CAS authorization system mostly focuses on data. It makes sense, therefore, that it is implemented in the style of a database management system (DBMS). DBMS-style authorization systems focus on securing access to data. The permissions relate to data, for example read, write, update, and select, plus some CAS-specific ones like promote and limited promote.

Permission inheritance in the CAS authorization system flows through a hierarchy of objects: CASLIB > table > rows/columns.

In the CAS authorization system, precedence, the way in which permission conflicts are resolved, is determined by where an access control is set and who the access control is assigned to. The precedence rules are:

  • Direct access controls have precedence over inherited settings.
  • Identity precedence from highest to lowest is user > groups > authenticated users.

To put this another way: if a direct access control is found on an object, it will determine access. If multiple direct access controls are found, a control for a user will be used over a control for a group, which will similarly be used over a control for all authenticated users. If no direct access control is found on, for example, a table, settings will be inherited from the CASLIB in a similar manner.

A final note on CASLIBs: there may be additional authorization that needs to be considered to provide access to data. For example, for a path-based CASLIB, if host-layer access requirements are not met, grants in the CAS authorization layer do not provide access.

This has been a brief look at the two authorization systems in a SAS Viya environment. The table below summarizes some of the information in this blog.

authorization in SAS Viya

As I noted at the start, you can find much more detail in the SAS Viya Administration Guide.

An introduction to authorization in SAS Viya was published on SAS Users.

December 7, 2017
 

In SAS Viya deployments, identities are managed by the environment's configured identity provider. In visual SAS Viya deployments, the identity provider must be an LDAP (Lightweight Directory Access Protocol) server. Initial setup of a SAS Viya deployment requires configuration to support reading the identity information (users and groups) from LDAP. SAS Viya 3.3 adds support for multi-tenancy, which has implications for the way users and groups must be stored in LDAP. For the SAS administrator, this means at least a rudimentary understanding of LDAP is required. In this blog post, I will review some key LDAP basics for the SAS Viya administrator.

A basic understanding of LDAP ensures SAS Viya administrators can speak the same language as the LDAP administrator.

What is LDAP?

LDAP is a lightweight protocol for accessing directory servers. A directory server is a hierarchical, object-oriented database. An LDAP server can be used to organize anything. One of the most common uses is as an identity management system, to organize user and group membership.

LDAP directories are organized in a tree manner:

  • A directory is a tree of directory entries.
  • An entry contains a set of attributes.
  • An attribute has a name, and one or more values.

LDAP basics for the SAS Viya administrator

Below is an entry for a user, Henrik. It has common attributes like:

  • uid: user ID
  • cn: common name
  • l: location
  • displayName: name to display

The attribute value pairs are the details of the entry.

The objectclass attribute is worth a special mention. Every entry has at least one objectclass attribute and often more than one. The objectclass is used to provide the rules for the object including required and allowed attributes. For example, the inetorgperson object class specifies attributes about people who are part of an organization, including items such as uid, name, employeenumber etc.

LDAP Tree

Let’s now look at the organization of the tree. DC is the “domain component.” You will often see examples of LDAP structures that use DNS names for the domain component, such as: dc=sas,dc=com. This is not required, but since DNS itself often implies organizational boundaries, it usually makes sense to use the existing naming structure. In this example the domain component is “dc=geldemo,dc=com”. The next level down is the organizational unit (ou).  An organizational unit is a grouping or collection of entries. Organizational units can contain additional organizational units.

But how do we find the objects we want in the directory tree? Every entry in a directory has a unique identifier, called the Distinguished Name (DN). The distinguished name is the full path to the object in the directory tree. For example, the distinguished name of Henrik is uid=Henrik,ou=users,ou=gelcorp,dc=viyademo,dc=com. The distinguished name is the path to the object from lowest to highest (yes, it seems backward to me too).
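Putting the pieces together, Henrik's entry might look like the following LDIF sketch (the surname, location, and other attribute values are hypothetical):

dn: uid=Henrik,ou=users,ou=gelcorp,dc=viyademo,dc=com
objectClass: inetOrgPerson
uid: Henrik
cn: Henrik Larsen
sn: Larsen
l: Copenhagen
displayName: Henrik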

LDAP Queries and Filters

Like any other database, LDAP can be queried, and it has its own particular syntax for defining queries. LDAP queries are Boolean expressions in the format

attribute operator value

for example:

uid = sasgnn

 

Attribute can be any valid LDAP attribute (e.g., name, uid, city, etc.) and value is the value that you wish to search for. The usual operators are available, including:

Using LDAP filters, you can link two or more Boolean expressions together using the “filter choices” and/or. Unusually, the LDAP “filter choices” are always placed in front of the expressions. The search criteria must be put in parentheses and then the whole term has to be bracketed one more time. Here are some examples of LDAP queries that may make the syntax easier to follow:

  • (sn=Jones): return all entries with a surname equal to Jones.
  • (objectclass=inetorgperson): return entries that use the inetorgperson object class.
  • (mail=*): return all entries that have the mail attribute.
  • (&(objectclass=inetorgperson)(o=Orion)): return all entries that use the inetorgperson object class and that have the organization attribute equal to Orion (people in the Orion organization).
  • (&(objectclass=GroupofNames)(|(o=Orion)(o=Executive))): return all entries that use the groupofNames object class and that have the organization attribute equal to Orion OR the organization attribute equal to Executive (groups in the Orion or Executive organizations).
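If you want to try such filters yourself, the OpenLDAP ldapsearch client accepts them directly. The host and bind DN below are hypothetical; substitute your own:

ldapsearch -H ldap://ldap.example.com -x \
   -D "cn=admin,dc=viyademo,dc=com" -W \
   -b "ou=gelcorp,ou=users,dc=viyademo,dc=com" \
   "(&(objectclass=inetorgperson)(o=Orion))"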

Why do I need to know this?

How will you apply this LDAP knowledge in SAS Viya? To enable SAS Viya to access your identity provider, you must update the SAS Identities service configuration. As an administrator, the most common items to change are:

  • BaseDN: the entry in the tree from which the LDAP server starts its search.
  • ObjectFilter: the filter used to identify and limit the users and groups returned.

There is a separate BaseDN and ObjectFilter for users and for groups.

To return users and groups to SAS Viya from our example LDAP server, we would set:

sas.identities.providers.ldap.group.BasedN=ou=gelcorp,ou=groups,dc=viyademo,dc=com

sas.identities.providers.ldap.users.BasedN=ou=gelcorp,ou=users,dc=viyademo,dc=com

 

This would tell SAS Viya to begin its search for users and groups at those locations in the tree.

The object filter will then determine what entries are returned for users and groups from a search of the LDAP tree starting at the BaseDN. For example:

sas.identities.providers.ldap.group.objectFilter:
(&(objectClass=GroupOfNames)(o=GELCorp LTD))

sas.identities.providers.ldap.users.objectFilter:
(&(objectClass=inetOrgPerson)(o=GELCorp LTD))

 

There are a lot of LDAP clients available that will allow you to connect to an LDAP server and view, query, edit, and update LDAP trees and their entries. In addition, the LDIF file format is a text format that includes data and commands, providing a simple way to communicate with a directory so as to read, write, rename, and delete entries.

This has been a high-level overview of LDAP. Here are some additional sources of information that may help.

Basic LDAP concepts

LDAP Query Basics

Quick Introduction to LDAP

How To Use LDIF Files to Make Changes to an OpenLDAP System

LDAP basics for the SAS Viya administrator was published on SAS Users.

11月 162017
 

Illustration by Dooder on Freepik

A few weeks ago I helped a customer set up a successful connection between SAS and a remote DB2 database using ODBC. At first, I thought this case would be as simple as plugging in a cable. Why? Because I already had a fully-functional testing environment with direct interaction using SAS/ACCESS Interface to DB2. Besides, I had other RDBMS connected via ODBC. Therefore, it was just a matter of setting up my ODBC driver manager for DB2. No big deal right? Oh boy, was I wrong! It was not easy at all. It took me two full days to make it work in my lab, and only after some in-depth research and a lot of trial and error.

DB2 can be challenging, especially when most of the documentation I found was either incomplete, outdated, scarce or even unclear. I don't want you to undergo the same frustration I felt. As a blogger and SAS user, it is my duty to make your SAS life a little easier. That's why I decided to publish this blog post to share the solution with the community. This guide provides you with the steps to take to set up a successful connection between SAS and a remote DB2 database via ODBC on Linux, without tears.

Table of Contents

Keep in mind this is an extensive guide. I documented all the technical issues I came across in my lab and provided detailed explanations to overcome them. If you are beginning your journey, I recommend you go through all the steps. However, for your convenience, if you want to jump right into a specific topic, use the following table of contents:

  • My Environment
  • Prerequisites
  • Setting up unixODBC with DB2
  • db2cli.ini came to the rescue
  • db2cli validate tool is your best friend
  • Do we need to use db2dsdriver.cfg?
  • SAS joins the party
  • Fixing the Segmentation Violation error
  • Creating a library that works
  • Wrapping Up

My Environment

All commands throughout this guide are based on this environment. You should replace the values with your own settings where appropriate.

Prerequisites

Before we begin, make sure you meet all of these SAS and DB2 requirements:

System Requirements

It may sound obvious, but you must verify that your SAS release supports your DB2 database version. For instance, the My Environment section shows that I'm using SAS 9.4 M4. This document indicates that I need DB2 10.5 or later. Considering I currently have DB2 11.1, I'm ready to go. To review the rest of the combinations, visit the SAS System Requirements page.

Another evident must-have is an ODBC driver manager already installed. As you noticed, I use unixODBC version 2.3.1. It is always recommended to have the latest version, which at the time of writing is 2.3.4; still, you should be fine with any of the 2.3.X releases. In the past, I've seen users face difficulties with unixODBC 2.2.X.
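
If you're not sure which unixODBC version you have, the odbcinst utility can tell you, and it also shows where the driver manager expects its configuration files. Typical output looks similar to the following (paths will vary by system):

[sas@muse ~]$ odbcinst -j
unixODBC 2.3.1
DRIVERS............: /etc/odbcinst.ini
SYSTEM DATA SOURCES: /etc/odbc.ini
FILE DATA SOURCES..: /etc/ODBCDataSources
USER DATA SOURCES..: /home/sas/.odbc.ini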

SAS/ACCESS Interface to ODBC licensed and installed

For this connection to work, it is essential to have SAS/ACCESS Interface to ODBC licensed and installed. First, to verify that this product is licensed, run this code in a SAS session:

proc setinit noalias;
run;

In the output log look for this line:

---SAS/ACCESS Interface to ODBC
                          06JAN2018

If the product is displayed, move on to the next step. If not, look for SAS/ACCESS Interface to DB2 instead. If you have the latter, consider yourself a lucky person. You don't have to continue with this guide because you can connect to DB2 directly. If you need assistance with this off-topic issue, send me a message, and I'll be glad to help you. If you don't have either one, you should get in touch with your sales representative.

Next, to verify that this product is installed, go to your !SASROOT/sasexe directory on your SAS server and list these files:

[sas@muse sasexe]$ ls -l *odb*
-r-xr-xr-x. 1 sas sas 630196 Nov  9  2016 sasiodbu
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb_u
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb_u
-r-xr-xr-x. 1 sas sas 102142 Nov  9  2016 sasodbcs
-r-xr-xr-x. 1 sas sas  71982 Nov  9  2016 sasodbdb
-r-xr-xr-x. 1 sas sas  71990 Nov  9  2016 sasodbdb_u
-r-xr-xr-x. 1 sas sas 202343 Nov  9  2016 sasodb
-r-xr-xr-x. 1 sas sas 201815 Nov  9  2016 sasodb_u
-r-xr-xr-x. 1 sas sas 443852 Nov  9  2016 tkeodbc.so

If you don't see the above list, then the product is not installed. However, if your SAS license includes it, use your SAS depot to install it on your server.

Choosing the right DB2 client package

IBM offers five different client packages. If you don't know which package to choose, you can't go wrong with IBM Data Server Client; it's the package I use, and it includes all the available client tools and libraries. Another good alternative is IBM Data Server Runtime Client, but you will have to create your DB2 instance and user yourself. I would not recommend IBM Data Server Driver for ODBC and CLI; during my research I found that people could not get it to connect to a remote DB2 database. For more information about each client package, please visit IBM data server client and driver types.
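
Once the client is installed, a quick way to confirm which code level you ended up with is the db2level command. The output below is only a sketch; the exact wording and level identifiers vary by release:

[db2inst@muse ~]$ db2level
DB21085I  This instance or install (instance name, where applicable:
"db2inst") uses "64" bits and DB2 code release "SQL11010" with level
identifier "0201010F".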

Testing your DB2 connection outside of SAS

This test is the most critical requirement before starting the actual configuration process. Take my advice for any connection you plan to configure in the future: first establish a successful connection between your database client and database server outside of SAS, using your database vendor's tools. This validation is a must, regardless of your RDBMS or SAS engine.

To test your connection to your remote DB2 server, run an equivalent command on your DB2 client:

[db2inst@muse ~]$ db2 connect to GSDB user db2inst using mypassword
 
   Database Connection Information
 
 Database server        = DB2/LINUXX8664 11.1.1
 SQL authorization ID   = DB2INST
 Local database alias   = GSDB

If you received a similar output, you may continue; otherwise, you will have to catalog your DB2 server as a TCP/IP node and then catalog the database on that node, as sketched below. Since the details are beyond the scope of this guide, please review these IBM documents: CATALOG NODE command and CATALOG DATABASE command.
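
For reference, cataloging typically looks like the commands below, run on the DB2 client. The node name db2node is an arbitrary label I made up; the host and port come from my environment and should be replaced with yours:

[db2inst@muse ~]$ db2 catalog tcpip node db2node remote redhat server 50000
[db2inst@muse ~]$ db2 catalog database GSDB at node db2node
[db2inst@muse ~]$ db2 terminate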

Setting up unixODBC with DB2

Considering that you have tested a successful connection between your database client and database server with the db2 command, it's time to set up a new one using an ODBC driver instead. Please bear with me. Things are about to get interesting.

As mentioned at the beginning, unixODBC is my ODBC driver manager of choice. I use it for all my RDBMS in my lab to test ODBC connections. Its usage is almost effortless, but one of the downsides is the documentation. The only official document available for DB2 is: Using IBM DB2 with unixODBC. Judge for yourself.

This driver manager has two configuration files: odbcinst.ini for database drivers and odbc.ini for database definitions. Typically, with most databases, those two files are all it takes to set up an ODBC connection. DB2 plays a different game, though. It uses two extra files: db2cli.ini for DB2 definitions and an XML-based file called db2dsdriver.cfg.

For now, let's take a closer look at the contents of both unixODBC files:

odbcinst.ini

[DB2]
Description     = IBM DB2 ODBC Driver
Driver          = /opt/ibm/db2/V11.1/lib32/libdb2.so
Driver64        = /opt/ibm/db2/V11.1/lib64/libdb2.so
FileUsage       = 1
DontDLClose     = 1

odbc.ini

[db2odbc]
Driver          = DB2
Description     = DB2 ODBC connection for SAS

At this point I was feeling confident and thought the connection would work, so I tested it out with the isql command, but to my surprise, I received this error:

[db2inst@muse ~]$ isql -v db2odbc db2inst mypassword
[08001][unixODBC][IBM][CLI Driver] SQL1013N  The database alias name or database name "DB2ODBC" could not be found.  SQLSTATE=42705
 
[ISQL]ERROR: Could not SQLConnect

db2cli.ini came to the rescue

As a result, I started looking through the online documentation. After a while, I discovered the existence of the db2cli.ini file. To get you started, IBM provides a sample configuration file called db2cli.ini.sample, located in the installation_path/cfg directory. The DB2 documentation recommends creating a db2cli.ini file based on that sample file and placing it in the same location. I followed that advice and created my file with the same data source name:

[db2odbc]
Database        = GSDB
Protocol        = TCPIP
Hostname        = redhat
ServiceName     = 50000

Note: For IBM, the ServiceName keyword is the port number. The default port number for the DB2 default instance is 50000. Review the correct port with your DBA.
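
If you don't have a DBA handy, you can usually read the port off the DB2 server itself; the instance's SVCENAME setting points at an entry in /etc/services. The service name and port below are illustrative for my environment:

[db2inst@redhat ~]$ db2 get dbm cfg | grep SVCENAME
 TCP/IP Service name                          (SVCENAME) = db2c_db2inst
[db2inst@redhat ~]$ grep db2c_db2inst /etc/services
db2c_db2inst    50000/tcp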

Having set up both unixODBC and DB2 files correctly, I was feeling lucky. Therefore, I attempted a connection again, but I received the same error. My frustration was on the rise.

db2cli validate tool is your best friend

I continued digging into the documentation until I found a powerful tool, db2cli, with a particularly useful parameter: validate. This command/parameter duo can help you verify the contents of your configuration files (db2cli.ini and db2dsdriver.cfg), list their locations, test your data source names and databases, and display invalid or duplicate keywords, among other functionality. It is basically your new best friend.

db2cli validate helped me identify why the DB2 ODBC driver was unable to find my data source name. I ran this command to examine my environment, and the output was eye-opening:

[db2inst@muse ~]$ db2cli validate -dsn db2odbc
 
===============================================================================
Client information for the current copy:
===============================================================================
 
Client Package Type       : IBM Data Server Client
Client Version (level/bit): DB2 v11.1.0.0 (s1606081100/64-bit)
Client Platform           : Linux/X8664
Install/Instance Path     : /opt/ibm/db2/V11.1
DB2DSDRIVER_CFG_PATH value: 
db2dsdriver.cfg Path      : /home/db2inst/sqllib/cfg/db2dsdriver.cfg
DB2CLIINIPATH value       : 
db2cli.ini Path           : /home/db2inst/sqllib/cfg/db2cli.ini
db2diag.log Path          : /home/db2inst/sqllib/db2dump/db2diag.log
 
===============================================================================
db2dsdriver.cfg schema validation for the entire file:
===============================================================================
 
Note: The validation utility could not find the configuration file 
db2dsdriver.cfg. The file is searched at 
"/home/db2inst/sqllib/cfg/db2dsdriver.cfg".
 
===============================================================================
db2cli.ini validation for data source name "db2odbc":
===============================================================================
 
Note: The validation utility could not find the configuration file db2cli.ini. 
The file is searched at "/home/db2inst/sqllib/cfg/db2cli.ini".
 
===============================================================================
db2dsdriver.cfg validation for data source name "db2odbc":
===============================================================================
 
Note: The validation utility could not find the configuration file 
db2dsdriver.cfg. The file is searched at 
"/home/db2inst/sqllib/cfg/db2dsdriver.cfg".
 
===============================================================================
The validation is completed.
===============================================================================

As you can see, the validation tool didn't find my db2cli.ini. That file contained my data source name; consequently, the DB2 ODBC driver didn't find it either. As previously indicated, I created that file in the installation_path/cfg directory, which in my case was /opt/ibm/db2/V11.1/cfg, but the DB2 ODBC driver was searching for it in /home/db2inst/sqllib/cfg. I didn't choose that path randomly; I just followed this IBM document. Nevertheless, if you encounter the same issue, the solution is fairly simple. You can either create a copy in the required path or point to the original file using a symbolic link. I chose the latter to avoid having duplicates. To fix this mismatch, execute this command as root or with sudo:

[root@muse ~]# ln -s /opt/ibm/db2/V11.1/cfg/db2cli.ini /home/db2inst/sqllib/cfg/db2cli.ini

Now you should have this symbolic link:

[db2inst@muse ~]$ ls -l /home/db2inst/sqllib/cfg/db2cli.ini 
lrwxrwxrwx. 1 root db2iadm 33 Sep 11 19:07 /home/db2inst/sqllib/cfg/db2cli.ini -> /opt/ibm/db2/V11.1/cfg/db2cli.ini

After this fix, I used my new friend again, and my environment changed. Below is a partial output:

[db2inst@muse ~]$ db2cli validate -dsn db2odbc
===============================================================================
db2cli.ini validation for data source name "db2odbc":
===============================================================================
 
[ Keywords used for the connection ]
 
Keyword                   Value
---------------------------------------------------------------------------
DATABASE                  GSDB
PROTOCOL                  TCPIP
HOSTNAME                  redhat
SERVICENAME               50000

At last, the DB2 ODBC driver was able to find the configuration file, read its contents, and validate the keywords. My frustration started to disappear.

It was time to see what unixODBC thought about this change. So I reran my isql command:

[db2inst@muse ~]$ isql -v db2odbc db2inst mypassword
+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL>

It finally worked!

Do we need to use db2dsdriver.cfg?

I managed to set up a successful ODBC connection using only three files: db2cli.ini, along with odbc.ini, and odbcinst.ini. Thus, the answer is no; you don't need to use this additional DB2 configuration file.

For our scenario, I consider it optional. Previously, I pointed out that it is an XML file, as opposed to db2cli.ini, which is a plain-text file. Both can hold the same four keywords I defined. Then why does DB2 provide two files? db2cli.ini is used to configure the behavior of CLI/ODBC applications, whereas db2dsdriver.cfg can be used for those as well as for other types of applications, including PHP, Perl, and .NET, among others. A rough db2dsdriver.cfg equivalent of my setup is sketched below.
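
For completeness, here is roughly what my db2cli.ini data source would look like expressed in db2dsdriver.cfg form. This is a sketch based on the documented schema, using the host, port, and database from my environment; validate it with db2cli validate before relying on it:

<configuration>
   <dsncollection>
      <dsn alias="db2odbc" name="GSDB" host="redhat" port="50000"/>
   </dsncollection>
   <databases>
      <database name="GSDB" host="redhat" port="50000"/>
   </databases>
</configuration>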

In short, this concludes the DB2 configuration process. Now it's time to cover the SAS territory.

SAS joins the party

So far we haven't touched any SAS files. We have only tested the DB2 connection first using the db2 command, then using unixODBC's isql command. These are mandatory steps before a successful connection with SAS.

Getting sasenv_local ready for DB2

If you have already connected SAS to other RDBMS in the past, you know that the most important file you need to care about is sasenv_local. It is located in the !SASROOT/bin directory. The code below shows a partial output with both ODBC and DB2 variables correctly defined:

# ODBC settings
export ODBCSYSINI=/etc
 
# DB2 settings
export INSTHOME=/opt/ibm/db2/V11.1
export DB2CLIINIPATH=/opt/ibm/db2/V11.1/cfg
 
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/lib64:${INSTHOME}/lib64

Let's analyze each of the above settings. First, SAS needs to know where to locate both unixODBC configuration files. The ODBCSYSINI variable instructs SAS to look for them in the /etc directory.

Second, as a good practice, I defined an environment variable called INSTHOME to reference my DB2 installation directory. This variable is optional but highly recommended. Then, the DB2CLIINIPATH variable tells SAS the location of db2cli.ini. If you decide to use db2dsdriver.cfg instead, then replace that variable with DB2DSDRIVER_CFG_PATH.

Finally, you have to inform SAS where to find unixODBC and DB2 libraries. SAS uses the LD_LIBRARY_PATH environment variable on Linux. This variable points to /usr/lib64 because it's the default path for all unixODBC libraries. I also appended the $INSTHOME/lib64 directory, since it's the place where DB2 has all its libraries.
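
A quick way to confirm that a SAS session actually picked up these variables is the %SYSGET macro function. A sketch from a -nodms session, assuming the sasenv_local values above:

  1? %put ODBCSYSINI=%sysget(ODBCSYSINI);
ODBCSYSINI=/etc
  1? %put DB2CLIINIPATH=%sysget(DB2CLIINIPATH);
DB2CLIINIPATH=/opt/ibm/db2/V11.1/cfg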

At this point, you might think that you are all set. I don't like being pessimistic, but there is a high possibility that your connection may not work. I didn't test all SAS and DB2 releases, but my customer and I experienced the same error when we tested our connection with a LIBNAME statement:

[sas@muse 9.4]$ ./sas -nodms
 
  1? libname db2lib odbc user=db2inst password=mypassword datasrc=db2odbc;
 
ERROR:  An exception has been encountered.
Please contact technical support and provide them with the following
       traceback information:
 
The SAS task name is [Line Mod]
Segmentation Violation
# More lines...

Moreover, if you run this code in SAS Studio or SAS Enterprise Guide, you might not even get a response; SAS will just crash or hang for a long time.

Fixing the Segmentation Violation error

After intense research, I was able to fix this error. We need to link two ODBC libraries that SAS provides to enable 32-bit capability. Change to your !SASROOT/sasexe directory and back up both files:

[sas@muse sasexe]$ mv sasodb sasodb.orig
[sas@muse sasexe]$ mv sasioodb sasioodb.orig

Create two symbolic links that point to the 32-bit libraries:

[sas@muse sasexe]$ ln -s sasodb_u sasodb
[sas@muse sasexe]$ ln -s sasioodb_u sasioodb

List your files, and the output should resemble the following:

[sas@muse sasexe]$ ls -l *odb*
-r-xr-xr-x. 1 sas sas 630196 Nov  9  2016 sasiodbu
lrwxrwxrwx. 1 sas sas     10 Sep 11 21:51 sasioodb -> sasioodb_u
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb.orig
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb_u
lrwxrwxrwx. 1 sas sas      8 Sep 11 21:52 sasodb -> sasodb_u
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb_u
-r-xr-xr-x. 1 sas sas 102142 Nov  9  2016 sasodbcs
-r-xr-xr-x. 1 sas sas  71982 Nov  9  2016 sasodbdb
-r-xr-xr-x. 1 sas sas  71990 Nov  9  2016 sasodbdb_u
-r-xr-xr-x. 1 sas sas 202343 Nov  9  2016 sasodb.orig
-r-xr-xr-x. 1 sas sas 201815 Nov  9  2016 sasodb_u
-r-xr-xr-x. 1 sas sas 443852 Nov  9  2016 tkeodbc.so

Now your LIBNAME statement may work, but unlike my customer, you can also encounter this error:

  1? libname db2lib odbc user=db2inst password=mypassword datasrc=db2odbc;
 
ERROR: Could not load /sas/Software/SASFoundation/9.4/sasexe/sasodb (35
       images loaded)
ERROR: libodbc.so.1: cannot open shared object file: No such file or directory
ERROR: The SAS/ACCESS Interface to ODBC cannot be loaded. The SASODB   code
       appendage could not be loaded.
ERROR: Error in the LIBNAME statement.

The error is self-explanatory. SAS couldn't find the file libodbc.so.1, which SASODB needs. To display all the required shared libraries, execute the ldd command:

[sas@muse sasexe]$ ldd sasodb
        linux-vdso.so.1 =>  (0x00007ffd4efe2000)
        libodbc.so.1 => not found
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f9aa741d000)
        libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f9aa71e6000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f9aa6fe2000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f9aa6cdf000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f9aa691e000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f9aa6708000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f9aa7882000)
        libfreebl3.so => /lib64/libfreebl3.so (0x00007f9aa6504000)

Indeed, that library was not found, so to fix this missing dependency, create a symbolic link using root or sudo:

[root@muse ~]# ln -s /lib64/libodbc.so.2.0.0 /lib64/libodbc.so.1
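
After creating the link, rerunning ldd against sasodb should show the dependency resolved. A sketch of the expected result (the load address will differ on your system):

[sas@muse sasexe]$ ldd sasodb | grep libodbc
        libodbc.so.1 => /lib64/libodbc.so.1 (0x00007f9aa7650000)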

Creating a library that works

Now that the previous errors are gone, run a final test: use the SAS command line to assign a library, and then use SAS Studio to display some data in a nice-looking way:

[sas@muse 9.4]$ ./sas -nodms
  1? libname db2lib odbc user=db2inst password=mypassword datasrc=db2odbc;
 
NOTE: Libref DB2LIB was successfully assigned as follows: 
      Engine:        ODBC 
      Physical Name: db2odbc

Note: Displaying actual data is an essential last step because if you are pointing to an incorrect DB2 ODBC driver (such as libdb2o.so instead of libdb2.so), you can still assign a library successfully, but you won't see any tables.

In SAS Studio I assigned the same library and ran a small query to retrieve some data:

proc print data=db2lib.department_lookup;
    var department_code department_en;
run;

Sample data in SAS Studio

Wrapping Up

I know it's a lot to digest. However, some users may not need to follow all the steps to succeed, whereas others starting from scratch may need to perform every step carefully. Whatever your situation might be, I hope you find this guide useful and easy to follow. Did you struggle like me? Share your experience in the comments below, or if you have any problems, don't hesitate to contact me.

Connecting SAS to a DB2 database via ODBC without tears was published on SAS Users.
