Tech

12月 072017
 

In SAS Viya deployments, identities are managed by the environment's configured identity provider. In visual SAS Viya deployments the identity provider must be an LDAP (Lightweight Directory Access Protocol) server. Initial setup of a SAS Viya deployment requires configuration to support reading the identity information (users and groups) from LDAP. SAS Viya 3.3 adds support for multi-tenancy, which has implications for the way users and groups must be stored in LDAP. For the SAS administrator, this means at least a rudimentary understanding of LDAP is required. In this blog post, I will review some key LDAP basics for the SAS Viya administrator.

A basic understanding of LDAP ensures SAS Viya administrators can speak the same language as the LDAP administrator.

What is LDAP?

LDAP is a lightweight protocol for accessing directory servers. A directory server is a hierarchical, object-oriented database. An LDAP server can be used to organize anything; one of the most common uses is as an identity management system, to organize users and group membership.

LDAP directories are organized in a tree manner:

  • A directory is a tree of directory entries.
  • An entry contains a set of attributes.
  • An attribute has a name, and one or more values.

LDAP basics for the SAS Viya administrator

Below is an entry for a user, Henrik. It has common attributes like:

  • uid: the user ID
  • cn: the common name
  • l: the location (locality)
  • displayName: the name to display

The attribute value pairs are the details of the entry.

The objectclass attribute is worth a special mention. Every entry has at least one objectclass attribute and often more than one. The objectclass is used to provide the rules for the object including required and allowed attributes. For example, the inetorgperson object class specifies attributes about people who are part of an organization, including items such as uid, name, employeenumber etc.
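As an illustration, here is a hypothetical LDIF rendering of such an entry. The DN matches the example tree used later in this post, and the attribute values are invented:

dn: uid=Henrik,ou=users,ou=gelcorp,dc=viyademo,dc=com
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
uid: Henrik
cn: Henrik
sn: Henrik
displayName: Henrik
l: Copenhagen
mail: Henrik@gelcorp.com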

LDAP Tree

Let’s now look at the organization of the tree. DC is the “domain component.” You will often see examples of LDAP structures that use DNS names for the domain component, such as: dc=sas,dc=com. This is not required, but since DNS itself often implies organizational boundaries, it usually makes sense to use the existing naming structure. In this example the domain component is “dc=geldemo,dc=com”. The next level down is the organizational unit (ou).  An organizational unit is a grouping or collection of entries. Organizational units can contain additional organizational units.

But how do we find the objects we want in the directory tree? Every entry in a directory has a unique identifier called the Distinguished Name (DN). The distinguished name is the full path to the object in the directory tree. For example, the distinguished name of Henrik is uid=Henrik,ou=users,ou=gelcorp,dc=viyademo,dc=com. The distinguished name is the path to the object from lowest to highest (yes, it seems backward to me too).

LDAP Queries and Filters

Like any other database, LDAP can be queried, and it has its own particular syntax for defining queries. LDAP queries are Boolean expressions in the format

attribute operator value

uid=sasgnn

 

Attribute can be any valid LDAP attribute (e.g., name, uid, city) and value is the value that you wish to search for. The usual comparison operators are available, including equality (=), greater than or equal (>=), less than or equal (<=), and approximate match (~=).

Using LDAP filters, you can link two or more Boolean expressions together with the "filter choices" & (AND) and | (OR). Unusually, the LDAP filter choices are always placed in front of the expressions. Each search criterion must be put in parentheses, and then the whole term has to be bracketed one more time. Here are some examples of LDAP queries that may make the syntax easier to follow:

  • (sn=Jones): return all entries with a surname equal to Jones.
  • (objectclass=inetorgperson): return all entries that use the inetorgperson object class.
  • (mail=*): return all entries that have the mail attribute.
  • (&(objectclass=inetorgperson)(o=Orion)): return all entries that use the inetorgperson object class and that have the organization attribute equal to Orion (people in the Orion organization).
  • (&(objectclass=GroupofNames)(|(o=Orion)(o=Executive))): return all entries that use the groupOfNames object class and that have the organization attribute equal to Orion OR the organization attribute equal to Executive (groups in the Orion or Executive organizations).
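If you have the standard OpenLDAP command-line client available, you can try filters like these yourself. The host name below is a placeholder, and the base DN matches this post's example tree; adjust both for your environment:

ldapsearch -x -H ldap://ldapserver:389 -b "ou=gelcorp,dc=viyademo,dc=com" "(&(objectclass=inetorgperson)(o=Orion))" cn mail

Here -x requests simple authentication, -H names the server, -b sets the entry where the search starts, and the trailing attribute list (cn mail) limits what is returned for each matching entry.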

Why do I need to know this?

How will you apply this LDAP knowledge in SAS Viya? To enable SAS Viya to access your identity provider, you must update the SAS Identities service configuration. As an administrator, the most common items to change are:

  • BaseDN: the entry in the tree from which the LDAP server starts its search.
  • ObjectFilter: the filter used to identify and limit the users and groups returned.

There is a separate BaseDN and ObjectFilter for users and for groups.

To return users and groups to SAS Viya from our example LDAP server, we would set:

sas.identities.providers.ldap.group.baseDN=ou=gelcorp,ou=groups,dc=viyademo,dc=com

sas.identities.providers.ldap.users.baseDN=ou=gelcorp,ou=users,dc=viyademo,dc=com

 

This would tell SAS Viya to begin its search for users and groups at those locations in the tree.

The object filter will then determine what entries are returned for users and groups from a search of the LDAP tree starting at the BaseDN. For example:

sas.identities.providers.ldap.group.objectFilter: 
(&(objectClass=GroupOfNames)(o=GELCorp LTD))

sas.identities.providers.ldap.users.objectFilter: 
(&(objectClass=inetOrgPerson)(o=GELCorp LTD))

 

There are a lot of LDAP clients available that will allow you to connect to an LDAP server and view, query, edit, and update LDAP trees and their entries. In addition, the LDIF file format is a text format that includes data and commands, providing a simple way to communicate with a directory in order to read, write, rename, and delete entries.
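For example, a minimal, hypothetical LDIF file that changes the mail attribute of the Henrik entry from earlier (the value and the admin bind DN are made up for illustration) could look like this:

dn: uid=Henrik,ou=users,ou=gelcorp,dc=viyademo,dc=com
changetype: modify
replace: mail
mail: Henrik@gelcorp.com

Such a file could then be applied with the standard OpenLDAP client, where -D supplies the bind DN of an account allowed to make the change and -W prompts for its password:

ldapmodify -x -D "cn=admin,dc=viyademo,dc=com" -W -f update_mail.ldif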

This has been a high-level overview of LDAP. Here are some additional sources of information that may help.

Basic LDAP concepts

LDAP Query Basics

Quick Introduction to LDAP

How To Use LDIF Files to Make Changes to an OpenLDAP System

LDAP basics for the SAS Viya administrator was published on SAS Users.

12月 052017
 

With SAS Data Management, you can set up SAS Data Remediation to manage and correct data issues. SAS Data Remediation allows user- or role-based access to data exceptions.

Last time I talked about how to register and use a Data Remediation service. This time we will look at how to use SAS Workflow together with SAS Data Remediation. SAS Workflow comes as part of SAS Data Remediation, and you can use it to control the flow of a remediation issue to ensure that the right people are involved at the appropriate steps.

SAS Workflow is very powerful and offers a lot of features to process data. You can, for example, design workflows to:

  • Process tasks (workflow steps) in parallel.
  • Follow different routes through a workflow depending on certain conditions.
  • Set a timer to automatically route to a different user if an assigned user isn’t available for a certain time.
  • Call Web Services (REST/SOAP) to perform some work automatically.
  • Send email notifications if a certain workflow step is reached.
    …and more.

Invoke an external URL via a workflow

In this article, we are going to look at a workflow feature which is specific to SAS Data Remediation.

With the latest version of Data Remediation there is a new feature that allows you to put a URL behind a workflow button and invoke it by clicking on the button in the Remediation UI. In the following we are going to take a closer look at this new feature.

When you have registered a Data Remediation service you can design a workflow, using SAS Workflow Studio, and link it to a Remediation service. Workflow Studio comes with some sample workflows. It's a good idea to take one of these sample workflows as a starting point and add additional steps to it as desired.

Workflow Task

In a workflow, you have tasks. A task is a single step of a workflow and is associated with data objects, statuses, policies and participants.

  • Data objects are fields to store values.
  • A status is used to transition from one task to the next.
  • Policies are rules that are executed at a certain task event; for example, calling a web service at the beginning of a task or sending an email at the end of a task.
  • Participants are users or groups who can execute a task; i.e., if you are not a user assigned to a task, you can't open an issue in Data Remediation to work on it.

When you add a new task to a workflow you must connect it to another task using a status. In the Data Remediation UI, a status will show up as a Workflow Button to transition from one task to another.

Assigning a URL to Status

You can also use a status on a task to call a URL instead of transitioning to the next task. To do this, you add a status to a task but don't use it to connect to another task.

At the task Open, four statuses are assigned, but only Associate, Cancel, and Reject connect to other tasks. The status Review is not connected, so it can be used to call a URL.

Right-click on the status Review and choose Edit to open a dialog box with an Attributes… button. Here, you need to add a key attribute with the name URL. The value of URL points to the fully qualified URL to be called:

http://MyServer/showAdditionalInformation/?recordid={Field1}

The URL can take parameters, in curly brackets (e.g. {Field1}), that point to the task's data objects. When the URL is called, the parameters are substituted with the corresponding data object values, so the URL call can be dynamic. For example, if Field1 holds the record ID 12345, the call above becomes http://MyServer/showAdditionalInformation/?recordid=12345.

Dynamic URL calls via parameters

When you create a Data Remediation issue you can pass up to three values from the issue record to Data Remediation, for example, the issue record key. When the workflow is called by Data Remediation, these values are copied from Data Remediation to the workflow's top-level fields: Field1, Field2, Field3.

To use these values in the workflow, for example to pass them to a URL, you need to create an appropriate data object (Field1) at the task where you want to call the URL. You also need to add a policy to the task to copy the value from the top-level data object to the task data object.

This will make the value available at the task and you can use it as a parameter in the URL call as shown above.

Link workflow to Data Remediation Service

When you have finished the workflow, you can save it and upload it to the workflow server.

Once it is uploaded, you can link the workflow to the Remediation service. In Data Remediation Administration, open the remediation service and go to Issue Types. Workflows are linked to issue types. They are not compulsory, but you can link one or more workflows to an issue type, depending on your business requirements.

At Issue Types, under Templates, select your workflow and link it to an issue type. You can make workflows mandatory for certain issue types by ticking the check box: “Always require a task template…” In this case Data Remediation expects one of the nominated workflows to be used when creating a remediation issue for the appropriate issue type.

Creating a remediation issue

You can now create a remediation issue for the appropriate issue type and assign a workflow to it by submitting the workflow name via the field "workflowName" in the JSON structure used to create the remediation issue. See "Improving data quality through SAS Data Remediation" for more about creating Data Remediation issues.

Call a URL via workflow button

When the remediation issue is created, you can open it in SAS Data Remediation, where you will see the Review button. When you click the Review button, Data Remediation calls the URL that you assigned to the URL key attribute for the status Review.

By using workflows with SAS Data Remediation, you can better control the process of addressing data issues. Being able to put a URL behind a workflow button and invoke it enhances your capabilities for resolving issues and improving data quality.

Process and control Data Remediation issues using SAS Workflow was published on SAS Users.

12月 042017
 

During my 35 years of using SAS® software, I have found the CNTLIN and CNTLOUT options in the FORMAT procedure to be among the most useful features that I routinely suggest to other SAS users. The CNTLIN option enables you to create user-defined formats from a SAS data set (input control data set). The CNTLOUT option enables you to create a SAS data set (output control data set) containing format details from an entry in a SAS format catalog.

In this blog post, I provide a few examples demonstrating how to use the CNTLIN option. I also mention how to use the CNTLOUT option to store your format information in case you need to move to a new operating environment.

You can store all the format details from a SAS format catalog in a CNTLOUT data set and later restore them in a format catalog in your new operating environment using the CNTLIN option. For details, see SAS Usage Note 22194: “How to use the CNTLOUT= and CNTLIN= options in PROC FORMAT to move formats from one platform to another.”
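As a minimal sketch of that round trip (the library references and the data set name ALLFMTS are arbitrary, and the step that physically moves the data set between systems is up to you):

 /* On the source system: write every format in the catalog to a data set */
 proc format library=work cntlout=allfmts;
 run;
 
 /* Move the ALLFMTS data set to the new environment, for example with
    PROC CPORT and PROC CIMPORT or another transfer method. */
 
 /* On the target system: rebuild the format catalog from that data set */
 proc format library=work cntlin=allfmts;
 run;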

A data set for the CNTLIN option contains variables that give specific information about ranges and values. At a minimum, the data set must contain the following variables:

FMTNAME specifies a character variable whose value is the format or informat name.
START specifies a variable that gives the range's starting value.
LABEL specifies a variable whose value is associated with a format or an informat.

For details about input and output control data sets, see the “FORMAT Procedure” section of Base SAS® 9.4 Procedures Guide, Seventh Edition.

Create a Numeric Format

The following simple example using the CNTLIN option creates a numeric format named respf:

 data test;                                         
    input response desc $20.;                       
 datalines;                                         
 1  Strongly Disagree                               
 2  Disagree                                        
 3  Neutral                                         
 4  Agree                                           
 5  Strongly Agree                                  
 ;                                                  
 run;                                               
 
 data crfmt;                                        
    set test;                                       
    start=response;                                 
    label=desc;                                     
    fmtname='respf';                                
 run;                                               
 
 proc format library=work cntlin=crfmt fmtlib;      
    select respf;                                   
 run;

Controlling Your Formats

Reveal Data Set Variables

To see the other variables that are included in data sets created by the CNTLIN and CNTLOUT options, use CNTLOUT to create a data set for the respf format created above:

 proc format library=work cntlout=outfmt;       
    select respf;                               
 run;                                      
 proc print data=outfmt;                        
 run;

Add Additional Ranges

To add another range to the respf format, you can use DATA step processing with the data set created by the CNTLOUT option. Then, re-create the format using the CNTLIN option:

data infmt;                                               
    set outfmt end=last;                                   
    output;                                                
    if last then do;                                       
       HLO='O';  /* indicates a special other range  */      
       label='NA';                                         
       output;                                             
    end;                                                   
 run;                                                     
 
 proc format library=work cntlin=infmt fmtlib;             
    select respf;                                          
 run;

Convert a State Name to Its Postal Abbreviation

One use for the CNTLIN option is to create a format that converts a state name to its 2-letter postal abbreviation. For example, this option can convert 'North Carolina' to 'NC'.  Because SAS does not have a function or format to convert state names to postal abbreviations, this is an excellent use of the CNTLIN option.

We can use data from the SASHELP.ZIPCODE data set to create a user-defined format using the CNTLIN option, as shown below:

proc sql noprint;                               
    create table crfmt as                        
    select distinct statename as start,          
           statecode as label,                   
           '$mystate' as fmtname                 
    from sashelp.zipcode;                        
 quit;                                           
 
 proc format library=work cntlin=crfmt fmtlib;   
    select $mystate;                             
 run;

Identify State Capitals

In a similar manner, we can use the MAPS.USCITY data set to create a user-defined format that identifies state capitals from the 2-letter state abbreviation. See the sample code and partial results below:

proc sql noprint;                                 
   create table crfmt as                          
   select distinct statecode as start,            
          city as label,                          
          '$mycity' as fmtname                    
   from maps.uscity                               
   where capital='Y';                             
 quit;                                            
 
proc format library=work cntlin=crfmt fmtlib;     
   select $mycity;                                
run;

Use External Data Sources

You can gather information from external data sources and read that information into a data set created by the CNTLIN option to create user-defined formats.

The following example uses ICD10 medical diagnosis codes. I downloaded a list of ICD10 codes and their descriptions into a Microsoft Excel file from the Centers for Medicare & Medicaid Services website. Then, I created a user-defined format from the first 25 records:
Note: You can also download the codes as a text file.

/* This code reads in the Excel file.   */                                                                  
proc import out=myicd10                                              
   datafile= "C:\Section111ValidICD10-2017.xlsx"   
   dbms=excelcs replace;                                                
   range="'Valid ICD10 2017 &amp; NF Exclude$'";                            
   scantext=yes;                                                        
   usedate=yes;                                                         
   scantime=yes;                                                        
run;                                                                    
 
 
data crfmt;                                         
   set myicd10 (obs=25);                         
   fmtname='$myicd';                                
   start=code;                                      
   label=short_description;                         
run;  
 
title1 'ICD10 Format';                                                      
title3 'FMTLIB results only display the first 40 characters of the label';  
proc format library=work cntlin=crfmt fmtlib;       
   select $myicd;                                   
run;

A more complicated example that uses other data set variables created by the CNTLIN option is included in the linked sample program in Sample 47312: “Create a user-defined format containing decile ranges from PROC UNIVARIATE results.”

If you can think of a scenario in which the CNTLIN option would be helpful, give it a try. If you have questions, you can ask via SAS Communities or contact us in SAS Technical Support.

Controlling your formats was published on SAS Users.

11月 292017
 

The CAS procedure (PROC CAS) enables us to interact with SAS Cloud Analytic Services (CAS) from the SAS client using CASL, the scripting language of CAS. CASL supports a variety of data types including INT, DOUBLE, STRING, TABLE, LIST, BLOB, and others. The result of a CAS action can be any of the data types mentioned above. In this article, we will use the CAS procedure to explore techniques for interacting with the results produced by CAS actions.

PROC CAS enables us to run CAS actions in SAS Viya; CAS actions are at the heart of how the CAS server receives requests, performs the relevant tasks and returns results back to the user. CAS actions representing common functionality constitute CAS action sets.  Consider the CAS action set TABLE. Within the TABLE action set, we will currently find 24 different actions relevant to various tasks that can be performed on a table. For example, the COLUMNINFO action within the TABLE action set will inform the user of the contents of a CAS in-memory table, including column names, lengths, data types, and so on. Let’s take a look at the following code to understand how this works:

proc cas;
	table.columnInfo  / table='HMEQ';
	simple.summary    / table={name='HMEQ' groupby='BAD'};
run;

In the code above, the table called 'HMEQ' is a distributed in-memory CAS table that contains data about applicants who were granted credit for a certain home equity loan. The categorical binary-valued variable 'BAD' identifies a client who has either defaulted on or repaid their home equity loan. Since PROC CAS is an interactive procedure, we can invoke multiple statements within a single run block. In the code above we have executed the COLUMNINFO action from the TABLE action set and the SUMMARY action from the SIMPLE action set within the same run block. Notice that we are able to obtain summary statistics of all the interval variables in the data set, grouped by the binary variable 'BAD', without having to first sort the data by that variable. Below is a snapshot of the resulting output.

PROC CAS

Fig1: Output from table.columninfo

Fig 2: Output from executing the summary action with a BY-group option

Let’s dig into this a little deeper by considering the following statements:

proc cas;
simple.summary result=S /table={name='hmeq'};
describe S;
run;

In the code snippet above, the result of the summary action is returned to the user in a variable 'S', which is a dictionary. How do we know? To find out, we have invoked the DESCRIBE statement, which displays the data type of a variable, to help us understand exactly what form this result, S, takes. Let's take a look at the log file:

The log file informs the user that 'S' is a dictionary with one entry of type table named "Summary." In the above example, the column names and attributes are shown in the log. If we want to print the entire summary table or a subset of it, we would use the code below:

proc cas;
simple.summary result=S /table={name='hmeq'};
print s["Summary"];
print s["summary", 3:5];
run;

The first PRINT statement fetches the entire summary table; the second PRINT statement fetches rows 3 through 5. We can also use WHERE expression processing to create a new table with only the rows that match the WHERE expression. The output of the second PRINT statement above is shown in the figure below:

The result of an action can also be more complex in nature: it could be a dictionary containing dictionaries, arrays, and lists, or a list containing lists, arrays, and tables. Therefore, it is important to understand these concepts through some simple cases. Let's consider another example where the result is slightly more complex.

proc cas;
simple.summary result=s /table={name='hmeq', groupby='bad'};
describe s;
print s["ByGroup1.Summary", 3:5]; run;

In the example above, we are executing a summary using a BY group on a binary variable. The log shows that the result in this case is a dictionary with three entries, all of which are tables. Below is a snapshot of the log file as well as the output of the PRINT statement looking at the summary for the first BY group, rows 3 through 5.

If we are interested in saving the output of the summary action as a SAS data set (sas7bdat file), we will execute the SAVERESULT statement. The code below saves the summary statistics of the first BY group in the work library as a SAS dataset.

proc cas;
simple.summary result=s /table={name='hmeq', groupby='bad'};
describe s;
print s["ByGroup1.Summary", 3:5]; 
saveresult s["ByGroup1.Summary"] dataout=work.data;
run;

A helpful function we can often use is findtable. This function will search the given value for the first table it finds.

proc cas;
simple.summary result=s /table={name='hmeq'};
val = findtable(s);
saveresult val dataout=work.summary;
run;

In the example above, I used findtable to locate the first table produced within the dictionary, S, and save it under ‘Val,’ which is then directly invoked with SAVERESULT statement to save the output as a SAS dataset.

Finally, we also have the option to save the output of a CAS action in another CAS Table. The table summary1 in the code below is an in-memory CAS table that contains the output of the summary action. The code and output are shown below:

proc cas;
simple.summary /table={name='hmeq'} casout={name='mylib.summary1'}; 
fetch /table={name='mylib.summary1'};
run;

In this post, we saw several ways of interacting with the results of a CAS action using the CAS procedure. Depending on what our end goal is, we can use any of these options to either view the results or save the data for further processing.

Interacting with the results of PROC CAS was published on SAS Users.

11月 212017
 

SAS Viya provides a robust, scalable, cloud-ready, distributed runtime engine. This engine is driven by CAS (Cloud Analytic Services), providing fast processing for many data management techniques that run distributed, i.e. using all threads on all defined compute nodes.

Why

PROC APPEND is a common technique used in SAS processes. This technique concatenates two data sets. However, PROC APPEND will produce an ERROR if the target CAS table exists prior to the PROC APPEND.

Simulating PROC APPEND

Figure 1. SAS Log with the PROC APPEND ERROR message

Now what?

How

To explain how to simulate PROC APPEND we first need to create two CAS tables. The first CAS table is named CASUSER.APPEND_TARGET. Notice the variables table, row and variable in figure 2.

Figure 2. Creating the CAS table we need to append rows to

The second CAS table is called CASUSER.TABLE_TWO and in figure 3 we can review the variables table, row, and variable.

Figure 3. Creating the table with the rows that we need to append to the existing CAS table

To simulate PROC APPEND we will use a DATA step. Notice on line 77 in figure 4 that we overwrite an existing CAS table, i.e. CASUSER.APPEND_TARGET. On line 78, the first table on the SET statement is CASUSER.APPEND_TARGET, followed by CASUSER.TABLE_TWO. When this DATA step runs, all of the rows in CASUSER.APPEND_TARGET are processed first, followed by the rows in CASUSER.TABLE_TWO. Also, note that we are not limited to two tables on the SET statement; we can use as many as we need to solve our business problems. A complete sketch of the program appears after Figure 5 below.

Figure 4. SAS log validating the DATA Step ran in CAS i.e. distributed

The result table created by the DATA Step is shown in figure 5.

Figure 5. Result table from simulating PROC APPEND using a DATA Step
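Since the program itself appears only in the figure screenshots, here is a minimal, hypothetical sketch of the whole flow. The table and column names (table, row, variable) follow the article, but the values, the session name, and the caslib setup are assumptions:

cas mysession;                              /* start a CAS session (name is arbitrary) */
libname casuser cas caslib=casuser;         /* surface the CASUSER caslib to SAS */
 
/* Stand-in for Figure 2: the CAS table we will append rows to */
data casuser.append_target;
   length table variable $ 16;
   do row=1 to 5;
      table='append_target';
      variable='original';
      output;
   end;
run;
 
/* Stand-in for Figure 3: the table holding the rows to append */
data casuser.table_two;
   length table variable $ 16;
   do row=1 to 5;
      table='table_two';
      variable='appended';
      output;
   end;
run;
 
/* Simulate PROC APPEND: overwrite the target table, reading its existing rows
   first and then the rows of the second table. Because all of the tables are
   CAS tables, the DATA step runs distributed in CAS. */
data casuser.append_target;
   set casuser.append_target casuser.table_two;
run;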

Conclusion

SAS Viya's CAS processing allows us to stage data for downstream consumption by leveraging robust SAS programming techniques that run distributed, i.e. fast. PROC APPEND is a common procedure used in SAS processes. To simulate PROC APPEND when using CAS tables as the source and target tables of the procedure, use a DATA step.

How to simulate PROC APPEND in CAS was published on SAS Users.

11月 202017
 

Many SAS users have inquired about SAS Cloud Analytic Services' (CAS) Distributed Network File System, or DNFS. (Learn more about CAS.)

The “NFS” in “DNFS”

Let’s start at the beginning. The “NFS” in DNFS stands for “Network File System” and refers to the ability to share files across a network. As the picture below illustrates, a network file system lets numerous remote hosts access another host’s files.

Understanding DNFS

NFS

There are numerous network file system protocols that can be used for file sharing, e.g. CIFS, Ceph, Lustre, but the most common on Linux is NFS. While NFS is the dominant file-sharing protocol, the "NFS" part of DNFS does not correspond to the NFS protocol. Currently, all the file systems that DNFS supports are based on NFS, but DNFS may support file systems based on other protocols in the future. So, it's best to think of the "NFS" part of "DNFS" as a generic "network file system" (clustered file system) and not the specific NFS protocol.

The “D” in “DNFS”

The "D" in DNFS stands for "Distributed," but it does not refer to the network file system. By definition, that is already distributed, since the file system is external to the machines accessing it. The "Distributed" in DNFS refers to CAS' ability to use a network file system in a massively parallel way. With a supported file system mounted identically on each CAS node, CAS can access (both read and write) the file system's CSV and SASHDAT files from every worker node in parallel.

This parallel file access is not an attribute of the file system, it is a capability of CAS. By definition, network file systems facilitate access at the file level, not the block level. With DNFS, CAS actively manages parallel block level I/O to the network file system, making sure file seek locations and block I/O operations do not overlap, etc.
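As a minimal sketch of what this looks like in practice (the caslib name, mount point, and file name below are assumptions, and an active CAS session is presumed), a DNFS-type caslib is defined against the shared mount point and tables are then loaded from it in parallel:

caslib dnfsdata datasource=(srctype="dnfs") path="/mnt/nfsshare/data";
 
proc casutil incaslib="dnfsdata" outcaslib="dnfsdata";
   load casdata="sales.sashdat" casout="sales";   /* each worker reads its own blocks */
quit;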

DNFS

 

DNFS as CAS Backing Store

Not only can CAS perform multi-machine parallel I/O from network file systems, it can also memory-map NFS SASHDAT files directly into CAS. Thus, SASHDAT files on DNFS act as both the CASlib data source and the virtual memory "backing store," often negating the need for CAS to utilize its CAS_DISK_CACHE.

Note 1: Data transformations on load, such as row filtering and field selection, as well as encryption can trigger CAS_DISK_CACHE usage. Since the data must be transformed (subset and/or decrypted), CAS copies the transformed data into CAS_DISK_CACHE to support CAS processing.

Note 2: It is possible to use DNFS atop an encrypting file system or hardware storage device. Here, the HDAT blocks are stored encrypted but transmitted to the CAS I/O layer decrypted. Assuming no other transformations, CAS_DISK_CACHE will not be used in this scenario.

DNFS Memory Mapping

Performance Considerations

DNFS-based CAS loading will only be as fast as the slowest component involved. The chosen NFS architecture (hardware and CAS connectivity) should support I/O throughput commensurate with the CAS installation and in line with the implementation's service level agreements. The picture below depicts a NetApp ONTAP clustered architecture as an example; a different file system technology might look a little different, but the same basic ideas apply.

DNFS w/ Multi Machine File System

As described earlier, CAS manages the parallel I/O operations. Requests from CAS are sent to the appliance and handled by the NFS metadata server. The storage device implementing the NFS protocol points CAS DNFS to the proper file and block locations on the NFS data servers, which pass the data to the CAS worker nodes directly.

Understanding DNFS was published on SAS Users.

11月 182017
 

Have you heard?  The ODS Destination for PowerPoint Has a New Option

It’s true.  The ODS destination for PowerPoint now has the STARTPAGE= option, which provides you with greater control and flexibility when creating presentations.

Added to the ODS POWERPOINT statement in SAS® 9.4 TS1M4, the STARTPAGE= option enables you to force the creation of a new slide between procedures and between ODS layout containers. Inserting a slide break between layout containers is one of the most impactful ways that you can use this option.

A new layout container does not automatically trigger a new slide within the presentation.  A new slide is started when the current slide is full.  That is the default, but the new STARTPAGE= option gives you the ability to start a new slide between containers even if the current slide is not full.

Examples

Shown below are four procedures placed within three layout containers.

  • The first PROC ODSTEXT step is placed in the first layout container.  Its purpose is to generate a slide of text, with that text roughly centered in the slide.
  • The second PROC ODSTEXT step is in the second container.  Its purpose is to provide useful information about the table and graph.
  • The PROC TABULATE and SGPLOT steps make up the third container.  They are the results of the analysis and, as such, need to be displayed side by side.

Default Behavior of ODS POWERPOINT, Without STARTPAGE=

Let’s look at the default behavior.  In this example, the STARTPAGE= option is not used.

ods powerpoint file='example1.pptx' options(backgroundimage="saslogo_pptx.png");
title;
ods layout gridded x=10% y=25%;
proc odstext;
	p "Have you heard?" /style=[just=c fontsize=42pt color=RoyalBlue];
	p "The STARTPAGE= option has been added to the ODS POWERPOINT statement!" /style=[just=c fontsize=24pt];
run;
ods layout end;
 
ods layout gridded rows=1 columns=1;
   ods region;
   proc odstext;
      p 'Table Shows Total Runs and Hits for Each League';
      p 'The Graph Contains One Bubble for Each Player.  The Size of Each Bubble Represents the Magnitude of the RBIs.';
   run;
ods layout end;
 
ods graphics / width=4.5in height=4in;
ods layout gridded columns=2 column_widths=(47% 47%) column_gutter=1pct;
   ods region;
   proc tabulate data=sashelp.baseball;
      class league;
      var nruns nhits nrbi;
      tables league='', sum='Totals'*(nruns nhits)*f=comma12.;
   run;
 
   ods region;
   proc sgplot data=sashelp.baseball;
      bubble x=nhits y=nruns size=nrbi/ group=league transparency=.3;
   run;
ods layout end;
ods powerpoint close;

Here is the resulting slide output:

ODS Destination for PowerPoint

Those results are not what we hoped they would be.  The output from the second PROC ODSTEXT step, which is to provide information about the table and graph, is on the first slide.  So is the graph!!!  And the graph does not look good because it is the wrong size.  The table is by itself on the second slide.  This is not the desired output at all.

Here Is Where STARTPAGE= Helps!

In this example, an ODS POWERPOINT statement with the STARTPAGE= option is added.  It is placed after the ODS LAYOUT END statement for the first container.

ods powerpoint file='example2.pptx' options(backgroundimage="saslogo_pptx.png");
title;
ods layout gridded x=10% y=25%;
proc odstext;
	p "Have you heard?" /style=[just=c fontsize=42pt color=RoyalBlue];
	p "The STARTPAGE= option has been added to the ODS POWERPOINT statement!" /style=[just=c fontsize=24pt];
run;
ods layout end;
 
ods powerpoint startpage=now; /* <---- Triggers a new slide */
 
ods layout gridded rows=1 columns=1;
   ods region;
   proc odstext;
      p 'Table Shows Total Runs and Hits for Each League';
      p 'The Graph Contains One Bubble for Each Player.  The Size of Each Bubble Represents the Magnitude of the RBIs.';
   run;
ods layout end;
 
ods graphics / width=4.5in height=4in;
ods layout gridded columns=2 column_widths=(47% 47%) column_gutter=1pct;
   ods region;
   proc tabulate data=sashelp.baseball;
      class league;
      var nruns nhits nrbi;
      tables league='', sum='Totals'*(nruns nhits)*f=comma12.;
   run;
 
   ods region;
   proc sgplot data=sashelp.baseball;
      bubble x=nhits y=nruns size=nrbi/ group=league transparency=.3;
   run;
ods layout end;
ods powerpoint close;

The STARTPAGE= option gave us exactly what we need.  The first slide contains just the text (from the first layout container).  The second slide contains more text along with the table and graph (from the second and third layout containers).

Use It Wisely!

The most important thing to know about using the STARTPAGE= option with layout containers is that it has to be placed between containers. It cannot be placed within a container, so you cannot put the statement between the ODS LAYOUT GRIDDED and ODS LAYOUT END statements.

For more information about the ODS destination for PowerPoint and all of its options, see The Dynamic Duo: ODS Layout and the ODS Destination for PowerPoint. Take a peek at it for more examples of using ODS Layout with the ODS destination for PowerPoint.

The ODS Destination for PowerPoint Has a New Option was published on SAS Users.

11月 162017
 
connect sas db2 via odbc

Illustration by Dooder on Freepik

A few weeks ago I helped a customer set up a successful connection between SAS and a remote DB2 database using ODBC. At first, I thought this case would be as simple as plugging in a cable. Why? Because I already had a fully-functional testing environment with direct interaction using SAS/ACCESS Interface to DB2. Besides, I had other RDBMS connected via ODBC. Therefore, it was just a matter of setting up my ODBC driver manager for DB2. No big deal right? Oh boy, was I wrong! It was not easy at all. It took me two full days to make it work in my lab, and only after some in-depth research and a lot of trial and error.

DB2 can be challenging, especially when most of the documentation I found was either incomplete, outdated, scarce or even unclear. I don't want you to undergo the same frustration I felt. As a blogger and SAS user, it is my duty to make your SAS life a little easier. That's why I decided to publish this blog post to share the solution with the community. This guide provides you with the steps to take to set up a successful connection between SAS and a remote DB2 database via ODBC on Linux, without tears.

Table of Contents

Keep in mind this is an extensive guide. I documented all the technical issues I came across in my lab, and also provided detailed explanations to overcome them. If you are beginning your journey, I recommend you to go through all the steps. However, for your convenience, if you want to jump right into a specific topic, use the following table of contents:

My Environment

My Environment

All commands throughout this guide are based on this table. You should replace the code with your settings when appropriate.

Prerequisites

Before we begin, make sure you meet all of these SAS and DB2 requirements:

System Requirements

It may sound obvious, but you must verify that your SAS release supports your DB2 database version. For instance, My Environment section shows that I'm using SAS 9.4 M4. This document recommends that I need DB2 10.5 or later. Considering I currently have DB2 11.1, I'm ready to go. To review the rest of the combinations visit the SAS System Requirements page.

Another evident must-have is an ODBC driver manager already installed. As you noticed, I use unixODBC version 2.3.1. It is always recommended to have the latest version, which nowadays is 2.3.4. Still, you should be fine if you decide to use any of the 2.3.X releases. In the past, I've seen users facing difficulties with unixODBC 2.2.X.

SAS/ACCESS Interface to ODBC licensed and installed

For this connection to work, it is essential to have SAS/ACCESS Interface to ODBC licensed and installed. First, to verify that this product is licensed, run this code in a SAS session:

proc setinit noalias;
run;

In the output log look for this line:

---SAS/ACCESS Interface to ODBC
                          06JAN2018

If the product is displayed, move on to the next step. If not, look for SAS/ACCESS Interface to DB2 instead. If you have the latter, consider yourself a lucky person. You don't have to continue with this guide because you can connect to DB2 directly. If you need assistance with this off-topic issue, send me a message, and I'll be glad to help you. If you don't have either one, you should get in touch with your sales representative.

Next, to verify that this product is installed, go to your !SASROOT/sasexe directory on your SAS server and list these files:

[sas@muse sasexe]$ ls -l *odb*
-r-xr-xr-x. 1 sas sas 630196 Nov  9  2016 sasiodbu
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb_u
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb_u
-r-xr-xr-x. 1 sas sas 102142 Nov  9  2016 sasodbcs
-r-xr-xr-x. 1 sas sas  71982 Nov  9  2016 sasodbdb
-r-xr-xr-x. 1 sas sas  71990 Nov  9  2016 sasodbdb_u
-r-xr-xr-x. 1 sas sas 202343 Nov  9  2016 sasodb
-r-xr-xr-x. 1 sas sas 201815 Nov  9  2016 sasodb_u
-r-xr-xr-x. 1 sas sas 443852 Nov  9  2016 tkeodbc.so

If you don't see the above list, then the product is not installed. Although, if your SAS license includes it, use your SAS depot to install it on your server.

Choosing the right DB2 client package

IBM offers five different client packages. If you don't know which package to choose, you can't go wrong with IBM Data Server Client; it's the package I use, and it includes all the client tools and libraries available. Another good alternative is IBM Data Server Runtime Client, but you will have to create your DB2 instance and user yourself. I would not recommend IBM Data Server Driver for ODBC and CLI. During my research I found that people could not get it to connect to a remote DB2 database. For more information about each client package, please visit IBM data server client and driver types.

Testing your DB2 connection outside of SAS

This test is the most critical requirement before starting the actual configuration process. Take my advice for any connection you plan to configure in the future: first, you must establish a successful connection between your database client and database server outside of SAS, using your database vendor's tools. This validation is a must, regardless of your RDBMS or SAS engine.

To test your connection to your remote DB2 server, run an equivalent command on your DB2 client:

[db2inst@muse ~]$ db2 connect to GSDB user db2inst using mypassword
 
   Database Connection Information
 
 Database server        = DB2/LINUXX8664 11.1.1
 SQL authorization ID   = DB2INST
 Local database alias   = GSDB

If you received a similar output, you may continue; otherwise, you will have to catalog your DB2 server as a TCP/IP node, and then catalog the database on that node. Since this is beyond the scope of this guide, please review these IBM documents: CATALOG NODE command and CATALOG DATABASE command.

Setting up unixODBC with DB2

Considering that you have tested a successful connection between your database client and database server with the db2 command, it's time to set up a new one using an ODBC driver instead. Please bear with me. Things are about to get interesting.

As mentioned at the beginning, unixODBC is my ODBC driver manager of choice. I use it for all my RDBMS in my lab to test ODBC connections. Its usage is almost effortless, but one of the downsides is the documentation. The only official document available for DB2 is: Using IBM DB2 with unixODBC. Judge for yourself.

This driver manager has two configuration files: odbcinst.ini for database drivers and odbc.ini for database definitions. Typically with most databases, those two files are all that it takes to set up an ODBC connection. DB2 plays a different game though. It uses two extra files: db2cli.ini for DB2 definitions and an XML based file called db2dsdriver.cfg.

For now, let's take a closer look at the contents of both unixODBC files:

odbcinst.ini

[DB2]
Description     = IBM DB2 ODBC Driver
Driver          = /opt/ibm/db2/V11.1/lib32/libdb2.so
Driver64        = /opt/ibm/db2/V11.1/lib64/libdb2.so
FileUsage       = 1
DontDLClose     = 1

odbc.ini

[db2odbc]
Driver          = DB2
Description     = DB2 ODBC connection for SAS

At this point I was feeling confident and thought the connection would work, so I tested it out with the isql command, but to my surprise, I received this error:

[db2inst@muse ~]$ isql -v db2odbc db2inst mypassword
[08001][unixODBC][IBM][CLI Driver] SQL1013N  The database alias name or database name "DB2ODBC" could not be found.  SQLSTATE=42705
 
[ISQL]ERROR: Could not SQLConnect

db2cli.ini came to the rescue

As a result, I started looking through the online documentation. After a while, I discovered the existence of the db2cli.ini file. To get you started, IBM provides a sample configuration file called db2cli.ini.sample, located in the installation_path/cfg directory. The DB2 documentation recommends creating a db2cli.ini file based on that sample file and placing it in the same location. I followed their advice and created my file with the same data source name:

[db2odbc]
Database        = GSDB
Protocol        = TCPIP
Hostname        = redhat
ServiceName     = 50000

Note: For IBM, the ServiceName keyword is the port number. The default port number for the DB2 default instance is 50000. Review the correct port with your DBA.

Having set up both unixODBC and DB2 files correctly, I was feeling lucky. Therefore, I attempted a connection again, but I received the same error. My frustration was on the rise.

db2cli validate tool is your best friend

I continued digging into the documentation until I found a powerful tool: db2cli with a particular parameter: validate. This command/parameter duo can help you verify the contents of your configuration files (db2cli.ini and db2dsdriver.cfg), list their location, test your data source names and databases, display invalid/duplicate keywords, among other functionalities. It basically is your new best friend.

db2cli validate helped me identify why the DB2 ODBC driver was unable to find my data source name. I ran this command to examine my environment, and the output was eye-opening:

[db2inst@muse ~]$ db2cli validate -dsn db2odbc
 
===============================================================================
Client information for the current copy:
===============================================================================
 
Client Package Type       : IBM Data Server Client
Client Version (level/bit): DB2 v11.1.0.0 (s1606081100/64-bit)
Client Platform           : Linux/X8664
Install/Instance Path     : /opt/ibm/db2/V11.1
DB2DSDRIVER_CFG_PATH value: 
db2dsdriver.cfg Path      : /home/db2inst/sqllib/cfg/db2dsdriver.cfg
DB2CLIINIPATH value       : 
db2cli.ini Path           : /home/db2inst/sqllib/cfg/db2cli.ini
db2diag.log Path          : /home/db2inst/sqllib/db2dump/db2diag.log
 
===============================================================================
db2dsdriver.cfg schema validation for the entire file:
===============================================================================
 
Note: The validation utility could not find the configuration file 
db2dsdriver.cfg. The file is searched at 
"/home/db2inst/sqllib/cfg/db2dsdriver.cfg".
 
===============================================================================
db2cli.ini validation for data source name "db2odbc":
===============================================================================
 
Note: The validation utility could not find the configuration file db2cli.ini. 
The file is searched at "/home/db2inst/sqllib/cfg/db2cli.ini".
 
===============================================================================
db2dsdriver.cfg validation for data source name "db2odbc":
===============================================================================
 
Note: The validation utility could not find the configuration file 
db2dsdriver.cfg. The file is searched at 
"/home/db2inst/sqllib/cfg/db2dsdriver.cfg".
 
===============================================================================
The validation is completed.
===============================================================================

As you can see, the validation tool didn't find my db2cli.ini. That file contained my data source name; consequently, the DB2 ODBC driver didn't find it either. As previously indicated, I created that file in the installation_path/cfg directory, which in my case was /opt/ibm/db2/V11.1/cfg, but the DB2 ODBC driver was searching for it in /home/db2inst/sqllib/cfg. I didn't choose the original path randomly; I just followed this IBM document. Nevertheless, if you encounter the same issue, the solution is fairly simple. You can either create a copy in the required path or point to the original file using a symbolic link. I chose the latter to avoid having duplicates. To fix this misunderstanding, just execute this command as root or with sudo:

[root@muse ~]# ln -s /opt/ibm/db2/V11.1/cfg/db2cli.ini /home/db2inst/sqllib/cfg/db2cli.ini

Now you should have this symbolic link:

[db2inst@muse ~]$ ls -l /home/db2inst/sqllib/cfg/db2cli.ini 
lrwxrwxrwx. 1 root db2iadm 33 Sep 11 19:07 /home/db2inst/sqllib/cfg/db2cli.ini -> /opt/ibm/db2/V11.1/cfg/db2cli.ini

After this fix, I used my new friend again, and my environment changed. Below is a partial output:

[db2inst@muse ~]$ db2cli validate -dsn db2odbc
===============================================================================
db2cli.ini validation for data source name "db2odbc":
===============================================================================
 
[ Keywords used for the connection ]
 
Keyword                   Value
---------------------------------------------------------------------------
DATABASE                  GSDB
PROTOCOL                  TCPIP
HOSTNAME                  redhat
SERVICENAME               50000

At last, the DB2 ODBC driver was able to find the configuration file, read its contents, and validate the keywords. My frustration started to disappear.

It was time to see what unixODBC thought about this change. So I reran my isql command:

[db2inst@muse ~]$ isql -v db2odbc db2inst mypassword
+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL>

It finally worked!

Do we need to use db2dsdriver.cfg?

I managed to set up a successful ODBC connection using only three files: db2cli.ini, along with odbc.ini, and odbcinst.ini. Thus, the answer is no; you don't need to use this additional DB2 configuration file.

For our scenario, I consider it optional. Previously, I pointed out that it is an XML file, as opposed to db2cli.ini, which is a plain text file. Both can contain the same four keywords I defined. Then why does DB2 provide two files? db2cli.ini is used to configure the behavior of CLI/ODBC applications, whereas db2dsdriver.cfg can be used for those as well as for other types of applications, including PHP, Perl, .NET, and others.
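For reference, a db2dsdriver.cfg equivalent of the db2cli.ini entry above would look roughly like the following. This is reconstructed from memory of IBM's sample file, so verify the exact schema against installation_path/cfg/db2dsdriver.cfg.sample before relying on it:

<configuration>
   <dsncollection>
      <dsn alias="db2odbc" name="GSDB" host="redhat" port="50000"/>
   </dsncollection>
   <databases>
      <database name="GSDB" host="redhat" port="50000"/>
   </databases>
</configuration>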

In short, this concludes the DB2 configuration process. Now it's time to cover the SAS territory.

SAS joins the party

So far we haven't touched any SAS files. We have only tested the DB2 connection first using the db2 command, then using unixODBC's isql command. These are mandatory steps before a successful connection with SAS.

Getting sasenv_local ready for DB2

If you have already connected SAS to other RDBMS in the past, you know that the most important file you need to care about is sasenv_local. It is located in the !SASROOT/bin directory. The code below shows a partial output with both ODBC and DB2 variables correctly defined:

# ODBC settings
export ODBCSYSINI=/etc
 
# DB2 settings
export INSTHOME=/opt/ibm/db2/V11.1
export DB2CLIINIPATH=/opt/ibm/db2/V11.1/cfg
 
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/lib64:${INSTHOME}/lib64

Let's analyze each of the above settings. First, SAS needs to know where to locate both unixODBC configuration files. The ODBCSYSINI variable instructs SAS to look for them in the /etc directory.

Second, as a good practice, I defined an environment variable called INSTHOME to reference my DB2 installation directory. This variable is optional but highly recommended. Then, the DB2CLIINIPATH variable tells SAS the location of db2cli.ini. If you decide to use db2dsdriver.cfg instead, then replace that variable with DB2DSDRIVER_CFG_PATH.

Finally, you have to inform SAS where to find unixODBC and DB2 libraries. SAS uses the LD_LIBRARY_PATH environment variable on Linux. This variable points to /usr/lib64 because it's the default path for all unixODBC libraries. I also appended the $INSTHOME/lib64 directory, since it's the place where DB2 has all its libraries.

At this point, you might think that you are all set. I don't like being pessimistic, but there is a high possibility that your connection may not work. I didn't test all SAS and DB2 releases, but my customer and I experienced the same error when we tested our connection with a LIBNAME statement:

[sas@muse 9.4]$ ./sas -nodms
 
  1? libname db2lib odbc user=db2inst password=mypassword datasrc=db2odbc;
 
ERROR:  An exception has been encountered.
Please contact technical support and provide them with the following
       traceback information:
 
The SAS task name is [Line Mod]
Segmentation Violation
# More lines...

Moreover, if you run this code in SAS Studio or SAS Enterprise Guide, you might not even get a response; SAS will just crash or hang for a long time.

Fixing the Segmentation Violation error

After intense research, I was able to fix this error. We need to link two ODBC libraries that SAS provides to enable 32-bit capability. Change to your !SASROOT/sasexe directory and back up both files:

[sas@muse sasexe]$ mv sasodb sasodb.orig
[sas@muse sasexe]$ mv sasioodb sasioodb.orig

Create two symbolic links that point to the 32-bit libraries:

[sas@muse sasexe]$ ln -s sasodb_u sasodb
[sas@muse sasexe]$ ln -s sasioodb_u sasioodb

List your files, and the output should resemble the following:

[sas@muse sasexe]$ ls -l *odb*
-r-xr-xr-x. 1 sas sas 630196 Nov  9  2016 sasiodbu
lrwxrwxrwx. 1 sas sas     10 Sep 11 21:51 sasioodb -> sasioodb_u
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb.orig
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb_u
lrwxrwxrwx. 1 sas sas      8 Sep 11 21:52 sasodb -> sasodb_u
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb_u
-r-xr-xr-x. 1 sas sas 102142 Nov  9  2016 sasodbcs
-r-xr-xr-x. 1 sas sas  71982 Nov  9  2016 sasodbdb
-r-xr-xr-x. 1 sas sas  71990 Nov  9  2016 sasodbdb_u
-r-xr-xr-x. 1 sas sas 202343 Nov  9  2016 sasodb.orig
-r-xr-xr-x. 1 sas sas 201815 Nov  9  2016 sasodb_u
-r-xr-xr-x. 1 sas sas 443852 Nov  9  2016 tkeodbc.so

Now your LIBNAME statement may work, but (unlike my customer) you might also encounter this error:

  1? libname db2lib odbc user=db2inst password=mypassword datasrc=db2odbc;
 
ERROR: Could not load /sas/Software/SASFoundation/9.4/sasexe/sasodb (35
       images loaded)
ERROR: libodbc.so.1: cannot open shared object file: No such file or directory
ERROR: The SAS/ACCESS Interface to ODBC cannot be loaded. The SASODB   code
       appendage could not be loaded.
ERROR: Error in the LIBNAME statement.

The error is self-explanatory. SAS couldn't find the file libodbc.so.1, which SASODB needs. To display all the required shared libraries, execute the ldd command:

[sas@muse sasexe]$ ldd sasodb
        linux-vdso.so.1 =>  (0x00007ffd4efe2000)
        libodbc.so.1 => not found
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f9aa741d000)
        libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f9aa71e6000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f9aa6fe2000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f9aa6cdf000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f9aa691e000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f9aa6708000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f9aa7882000)
        libfreebl3.so => /lib64/libfreebl3.so (0x00007f9aa6504000)

Indeed, that library was not found, so to fix this missing dependency, create a symbolic link using root or sudo:

[root@muse ~]# ln -s /lib64/libodbc.so.2.0.0 /lib64/libodbc.so.1

Creating a library that works

Now that the previous errors are gone, run a final test using the SAS command line to assign a library and then SAS Studio to display some data in a nice-looking way:

[sas@muse 9.4]$ ./sas -nodms
  1? libname db2lib odbc user=db2inst password=mypassword datasrc=db2odbc;
 
NOTE: Libref DB2LIB was successfully assigned as follows: 
      Engine:        ODBC 
      Physical Name: db2odbc

Note: Displaying actual data is an essential last step because if you are pointing to an incorrect DB2 ODBC driver (such as libdb2o.so instead of libdb2.so), you can still assign a library successfully, but you won't see any tables.
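Another quick way to confirm which tables the libref actually exposes is to query the SAS dictionary tables. This is a minimal sketch; it assumes the DB2LIB libref assigned in the example above:

proc sql;
    /* list the DB2 tables visible through the DB2LIB libref */
    select memname
        from dictionary.tables
        where libname='DB2LIB';
quit;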

In SAS Studio I assigned the same library and ran a small query to retrieve some data:

proc print data=db2lib.department_lookup;
    var department_code department_en;
run;

Sample data in SAS Studio

Wrapping Up

I know it's a lot to digest. However, some users may not need to follow every step to succeed, whereas others starting from scratch may need to perform all of them carefully. Whatever your situation might be, I hope you find this guide useful and easy to follow. Did you struggle like me? Share your experience in the comments below, or if you have any problems, don't hesitate to contact me.

Connecting SAS to a DB2 database via ODBC without tears was published on SAS Users.

11月 162017
 

As a SAS Viya user, you may be wondering whether it is possible to run data append and data update operations concurrently against a global Cloud Analytic Services (CAS) table from two or more CAS sessions. (Learn more about CAS.) How would this affect reports that are rendered while an append or update is running on a global CAS table? These questions are even more important for those using the programming interface to load and update data in CAS. This post discusses data append, data update, and concurrency in CAS.

Two or more CAS sessions can submit data append and data update requests to a CAS table at the same time, but only one of those processes can run against the same CAS table at any moment. The append and update processes execute serially, one after another, never concurrently. Whichever CAS session acquires the write lock on the global CAS table first prevails and appends or updates the data first; the other append and update processes must wait in a queue to acquire the write lock.
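For illustration, here is a minimal sketch of what each of the two sessions in the logs further below submits. It assumes the mydata libref and the caspath caslib already exist (the session, libref, caslib, and table names are taken from those logs), and each block is submitted from a separate SAS client session so that the two appends really do overlap in time:

/* Submitted from the first SAS client session */
cas mysession;                      /* create a named CAS session */
proc casutil sessref=mysession;
    /* append the SAS table to the global CAS table */
    load data=mydata.big_prdsale
         outcaslib="caspath" casout="big_PRDSALE" append;
quit;

/* Submitted at roughly the same time from a second SAS client session */
cas mysession1;
proc casutil sessref=mysession1;
    load data=mydata.big_prdsale1
         outcaslib="caspath" casout="big_PRDSALE" append;
quit;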

During the data append process, the appended data is not available to end users or reports until all rows are inserted and committed into the CAS table. While data append is running, users can still render reports against the CAS table using the original data, but excluding the appended rows.

Similarly, during the data update process, the updated data is not available to users or reports until the update process is complete. However, CAS lets you render reports using the original (non-updated) data, as the CAS table is always available for read processes. During the update, CAS makes additional in-memory copies of the blocks that contain the rows to be updated in order to perform the update statement. Once the update process is complete, these additional, now obsolete copies of the blocks are removed from CAS. Updating a global CAS table is an expensive operation in terms of CPU and memory usage. You have to factor in the additional overhead memory or CAS_CACHE space needed to support the updates; the space requirement depends on the number of rows affected by the update process.

At any given time, there can be only one active write process (append or update) against a global CAS table. However, there can be many concurrent active read processes against the same table. A global CAS table is always available for read processes, even when an append or update process is running against it.
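For example, while one session holds the write lock for an append or update, another session can still read from the same table. The following is a minimal sketch of such a read, reusing the caspath caslib and big_PRDSALE table from the logs below:

proc cas;
    /* fetch the first five rows of the global CAS table while
       another session appends to or updates it */
    table.fetch / table={caslib="caspath", name="big_PRDSALE"}, to=5;
quit;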

The following log example describes two simultaneous CAS sessions executing data appends to a CAS table. Both append processes were submitted to CAS within a few seconds of each other. Notice that the execution time for the second CAS session, MYSESSION1, is double the time it took the first CAS session to append the same amount of data to the CAS table. This shows that the appends executed one after another. The amount of memory used and the CAS_CACHE location also show that both processes ran serially rather than concurrently.

Log from simultaneous CAS session MYSESSION submitting APPEND

58 proc casutil ;
NOTE: The UUID '1411b6f2-e678-f546-b284-42b6907260e9' is connected using session MYSESSION.
59 load data=mydata.big_prdsale
60 outcaslib="caspath" casout="big_PRDSALE" append ;
NOTE: MYDATA.BIG_PRDSALE was successfully added to the "caspath" caslib as "big_PRDSALE".
61 quit ;
NOTE: PROCEDURE CASUTIL used (Total process time):
real time 49.58 seconds
cpu time 5.05 seconds

Log from simultaneous CAS session MYSESSION1 submitting APPEND

58 proc casutil ;
NOTE: The UUID 'a20a246e-e0cc-da4d-8691-c0ff2a222dfd' is connected using session MYSESSION1.
59 load data=mydata.big_prdsale1
60 outcaslib="caspath" casout="big_PRDSALE" append ;
NOTE: MYDATA.BIG_PRDSALE1 was successfully added to the "caspath" caslib as "big_PRDSALE".
61 quit ;
NOTE: PROCEDURE CASUTIL used (Total process time):
real time 1:30.33
cpu time 4.91 seconds

 

When the data append process from MYSESSION1 was submitted alone (with no simultaneous process), the execution time was around the same as for the first session, MYSESSION. This also shows that when two simultaneous append processes were submitted against the CAS table, one was waiting for the other to finish. At any one time, only one process was running the data APPEND action against the CAS table (no concurrent append).

Log from a lone CAS session MYSESSION1 submitting APPEND

58 proc casutil ;
NOTE: The UUID 'a20a246e-e0cc-da4d-8691-c0ff2a222dfd' is connected using session MYSESSION1.
59 load data=mydata.big_prdsale1
60 outcaslib="caspath" casout="big_PRDSALE" append ;
NOTE: MYDATA.BIG_PRDSALE1 was successfully added to the "caspath" caslib as "big_PRDSALE".
61 quit ;
NOTE: PROCEDURE CASUTIL used (Total process time):
real time 47.63 seconds
cpu time 4.94 seconds

 

The following log example describes two simultaneous CAS sessions submitting data updates on a CAS table. Both update processes were submitted to CAS within a few seconds of each other. Notice that the execution time for the second CAS session, MYSESSION1, is double the time it took the first session to update the same number of rows. The amount of memory used and the CAS_CACHE location also show that both processes ran one after another in a serial fashion. While the update process was running, memory and CAS_CACHE usage increased, which suggests that the update process makes copies of the rows and blocks to be updated. Once the update process completed, the memory and CAS_CACHE usage returned to normal.

When the data UPDATE action from MYSESSION1 was submitted alone (with no simultaneous process), the execution time was around the same as for the first CAS session.

Log from a simultaneous CAS session MYSESSION submitting UPDATE

58 proc cas ;
59 table.update /
60 set={
61 {var="actual",value="22222"},
62 {var="country",value="'FRANCE'"}
63 },
64 table={
65 caslib="caspath",
66 name="big_prdsale",
67 where="index in(10,20,30,40,50,60,70,80,90,100 )"
68 }
69 ;
70 quit ;
NOTE: Active Session now MYSESSION.
{tableName=BIG_PRDSALE,rowsUpdated=86400}
NOTE: PROCEDURE CAS used (Total process time):
real time 4:37.68
cpu time 0.05 seconds

 

Log from a simultaneous CAS session MYSESSION1 submitting UPDATE

57 proc cas ;
58 table.update /
59 set={
60 {var="actual",value="22222"},
61 {var="country",value="'FRANCE'"}
62 },
63 table={
64 caslib="caspath",
65 name="big_prdsale",
66 where="index in(110,120,130,140,150,160,170,180,190,1100 )"
67 }
68 ;
69 quit ;
NOTE: Active Session now MYSESSION1.
{tableName=BIG_PRDSALE,rowsUpdated=86400}
NOTE: PROCEDURE CAS used (Total process time):
real time 8:56.38
cpu time 0.09 seconds

 

The following memory usage snapshots from one of the CAS nodes show memory consumption before and during the CAS table update. Notice the values in the "used" and "buff/cache" columns.

Memory usage on a CAS node before starting a CAS table UPDATE

Memory usage on a CAS node during CAS table UPDATE

Summary

When simultaneous data append and data update requests are submitted against a global CAS table from two or more CAS sessions, they execute in a serial fashion (there is no concurrent process execution). To execute data updates on a CAS table, you need additional overhead memory or CAS_CACHE space. While the CAS table is going through a data append or data update process, it is still accessible for reading and rendering reports.

Concurrent data append and update to a global CAS table was published on SAS Users.