
This article shows how to simulate beta-binomial data in SAS and how to compute the density function (PDF). The beta-binomial distribution is a discrete compound distribution. The "binomial" part of the name means that the discrete random variable X follows a binomial distribution with parameters N (number of trials) and p, but there is a twist: The parameter p is not a constant value but is a random variable that follows the Beta(a, b) distribution.

The beta-binomial distribution is used to model count data where the counts are "almost binomial" but have more variance than can be explained by a binomial model. Therefore this article also compares the binomial and beta-binomial distributions.

### Simulate data from the beta-binomial distribution

To generate a random value from the beta-binomial distribution, use a two-step process. The first step is to draw p randomly from the Beta(a, b) distribution. Then you draw x from the binomial distribution Bin(p, N). The beta-binomial distribution is not natively supported by the RAND function in SAS, but you can call the RAND function twice to simulate beta-binomial data, as follows:

```
/* simulate a random sample from the beta-binomial distribution */
%let SampleSize = 1000;
data BetaBin;
a = 6; b = 4; nTrials = 10;            /* parameters */
call streaminit(4321);
do i = 1 to &SampleSize;
   p = rand("Beta", a, b);             /* p[i] ~ Beta(a,b)          */
   x = rand("Binomial", p, nTrials);   /* x[i] ~ Bin(p[i], nTrials) */
   output;
end;
keep x;
run;
```
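If you want to sanity-check the two-step process outside of SAS, the same compound draw is easy to sketch in Python. This is my illustration, not part of the original SAS program:

```python
import random

def rand_beta_binomial(a, b, n_trials, rng):
    """Two-step compound draw: p ~ Beta(a, b), then x ~ Binomial(n_trials, p)."""
    p = rng.betavariate(a, b)                              # step 1: random success probability
    return sum(rng.random() < p for _ in range(n_trials))  # step 2: count successes

rng = random.Random(4321)
sample = [rand_beta_binomial(6, 4, 10, rng) for _ in range(1000)]
sample_mean = sum(sample) / len(sample)
```

With a = 6, b = 4, and nTrials = 10, the sample mean should land near N·a/(a+b) = 6.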

The result of the simulation is shown in the following bar chart. The expected values are overlaid. The next section shows how to compute the expected values.

### The PDF of the beta-binomial distribution

The Wikipedia article about the beta-binomial distribution contains a formula for the PDF of the distribution. Since the distribution is discrete, some references prefer the term "PMF" (probability mass function) instead of PDF. Regardless, if X is a random variable that follows the beta-binomial distribution, then the probability that X=x is given by

P(X = x) = C(N, x) B(x + a, N − x + b) / B(a, b)

where C(N, x) is the binomial coefficient ("N choose x") and B is the complete beta function.

The binomial coefficients ("N choose x") and the beta function are defined in terms of factorials and gamma functions, which get big fast. For numerical computations, it is usually more stable to compute the log-transform of the quantities and then exponentiate the result. The following DATA step computes the PDF of the beta-binomial distribution. For easy comparison with the distribution of the simulated data, the DATA step also computes the expected count for each value in a random sample of size N. The PDF and the simulated data are merged and plotted on the same graph by using the VBARBASIC statement in SAS 9.4M3. The graph was shown in the previous section.

```
data PDFBetaBinom;           /* PMF function */
a = 6; b = 4; nTrials = 10;  /* parameters */
do x = 0 to nTrials;
   logPMF = lcomb(nTrials, x) +
            logbeta(x + a, nTrials - x + b) -
            logbeta(a, b);
   PMF = exp(logPMF);        /* probability that X=x */
   EX = &SampleSize * PMF;   /* expected value in random sample */
   output;
end;
keep x PMF EX;
run;

/* Merge simulated data and PMF. Overlay PMF on data distribution. */
data All;
merge BetaBin PDFBetaBinom(rename=(x=t));
run;

title "The Beta-Binomial Distribution";
title2 "Sample Size = &SampleSize";
proc sgplot data=All;
   vbarbasic x / barwidth=1 legendlabel='Simulated Sample';  /* requires SAS 9.4M3 */
   scatter x=t y=EX / legendlabel='Expected Value'
           markerattrs=GraphDataDefault(symbol=CIRCLEFILLED size=10);
   inset "nTrials = 10" "a = 6" "b = 4" / position=topleft border;
   yaxis grid;
   xaxis label="x" integer type=linear;  /* force TYPE=LINEAR */
run;
```
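As a cross-check of the log-domain computation, here is the same PMF in Python (my sketch, not part of the SAS program); `lgamma` plays the role of the LCOMB and LOGBETA functions:

```python
from math import lgamma, exp

def log_comb(n, x):
    """log of the binomial coefficient 'n choose x'."""
    return lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)

def log_beta(u, v):
    """log of the complete beta function B(u, v)."""
    return lgamma(u) + lgamma(v) - lgamma(u + v)

def beta_binomial_pmf(x, n, a, b):
    # exponentiate only at the end, for numerical stability
    return exp(log_comb(n, x) + log_beta(x + a, n - x + b) - log_beta(a, b))

a, b, n = 6, 4, 10
pmf = [beta_binomial_pmf(x, n, a, b) for x in range(n + 1)]
total = sum(pmf)                               # should be 1
mean  = sum(x * p for x, p in enumerate(pmf))  # should be N*a/(a+b) = 6
```

The probabilities should sum to 1, and the mean of the distribution should equal N·a/(a+b) = 6.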

### Compare the binomial and beta-binomial distributions

One application of the beta-binomial distribution is to model count data that are approximately binomial but have more variance ("thicker tails") than the binomial model predicts. The expected value of a Beta(a, b) distribution is a/(a + b), so let's compare the beta-binomial distribution to the binomial distribution with p = a/(a + b).

The following graph overlays the two PDFs for a = 6, b = 4, and nTrials = 10. The blue distribution is the binomial distribution with p = 6/(6 + 4) = 0.6. The pink distribution is the beta-binomial. You can see that the beta-binomial distribution has a shorter peak and thicker tails than the corresponding binomial distribution. The expected value for both distributions is 6, but the variance of the beta-binomial distribution is greater. Thus you can use the beta-binomial distribution as an alternative to the binomial distribution when the data exhibit greater variance than expected under the binomial model (a phenomenon known as overdispersion).

### Summary

The beta-binomial distribution is an example of a compound distribution. You can simulate data from a compound distribution by randomly drawing the parameters from some distribution and then using those random parameters to draw the data. For the beta-binomial distribution, the probability parameter p is drawn from a beta distribution and then used to draw x from a binomial distribution where the probability of success is the value of p. You can use the beta-binomial distribution to model data that have greater variance than expected under the binomial model.

The post Simulate data from the beta-binomial distribution in SAS appeared first on The DO Loop.


Have you heard?  The ODS Destination for PowerPoint Has a New Option

It’s true.  The ODS destination for PowerPoint now has the STARTPAGE= option, which provides you with greater control and flexibility when creating presentations.

Added to the ODS POWERPOINT statement in SAS® 9.4 TS1M4, the STARTPAGE= option enables you to force the creation of a new slide between procedures and between ODS layout containers.  Inserting a slide break between layout containers is one of the most impactful ways that you can use this option.

A new layout container does not automatically trigger a new slide within the presentation.  A new slide is started when the current slide is full.  That is the default, but the new STARTPAGE= option gives you the ability to start a new slide between containers even if the current slide is not full.

### Examples

Shown below are four procedures placed within three layout containers.

• The first PROC ODSTEXT step is placed in the first layout container.  Its purpose is to generate a slide of text, with that text roughly centered in the slide.
• The second PROC ODSTEXT step is in the second container.  Its purpose is to provide useful information about the table and graph.
• The PROC TABULATE and SGPLOT steps make up the third container.  They are the results of the analysis and, as such, need to be displayed side by side.

### Default Behavior of ODS POWERPOINT, Without STARTPAGE=

Let’s look at the default behavior.  In this example, the STARTPAGE= option is not used.

```
ods powerpoint file='example1.pptx' options(backgroundimage="saslogo_pptx.png");
title;
ods layout gridded x=10% y=25%;
proc odstext;
   p "Have you heard?" /style=[just=c fontsize=42pt color=RoyalBlue];
   p "The STARTPAGE= option has been added to the ODS POWERPOINT statement!"
     /style=[just=c fontsize=24pt];
run;
ods layout end;

ods layout gridded rows=1 columns=1;
ods region;
proc odstext;
   p 'Table Shows Total Runs and Hits for Each League';
   p 'The Graph Contains One Bubble for Each Player. The Size of Each Bubble Represents the Magnitude of the RBIs.';
run;
ods layout end;

ods graphics / width=4.5in height=4in;
ods layout gridded columns=2 column_widths=(47% 47%) column_gutter=1pct;
ods region;
proc tabulate data=sashelp.baseball;
   class league;
   var nruns nhits nrbi;
   tables league='', sum='Totals'*(nruns nhits)*f=comma12.;
run;

ods region;
proc sgplot data=sashelp.baseball;
   bubble x=nhits y=nruns size=nrbi / group=league transparency=.3;
run;
ods layout end;
ods powerpoint close;
```

Here is the resulting slide output:

Those results are not what we hoped they would be.  The output from the second PROC ODSTEXT step, which is supposed to provide information about the table and graph, is on the first slide.  So is the graph, and it does not look good because it is the wrong size.  The table is by itself on the second slide.  This is not the desired output at all.

### Here Is Where STARTPAGE= Helps!

In this example, an ODS POWERPOINT statement with the STARTPAGE= option is added.  It is placed after the ODS LAYOUT END statement for the first container.

```
ods powerpoint file='example2.pptx' options(backgroundimage="saslogo_pptx.png");
title;
ods layout gridded x=10% y=25%;
proc odstext;
   p "Have you heard?" /style=[just=c fontsize=42pt color=RoyalBlue];
   p "The STARTPAGE= option has been added to the ODS POWERPOINT statement!"
     /style=[just=c fontsize=24pt];
run;
ods layout end;

ods powerpoint startpage=now;   /* <---- Triggers a new slide */

ods layout gridded rows=1 columns=1;
ods region;
proc odstext;
   p 'Table Shows Total Runs and Hits for Each League';
   p 'The Graph Contains One Bubble for Each Player. The Size of Each Bubble Represents the Magnitude of the RBIs.';
run;
ods layout end;

ods graphics / width=4.5in height=4in;
ods layout gridded columns=2 column_widths=(47% 47%) column_gutter=1pct;
ods region;
proc tabulate data=sashelp.baseball;
   class league;
   var nruns nhits nrbi;
   tables league='', sum='Totals'*(nruns nhits)*f=comma12.;
run;

ods region;
proc sgplot data=sashelp.baseball;
   bubble x=nhits y=nruns size=nrbi / group=league transparency=.3;
run;
ods layout end;
ods powerpoint close;
```

The STARTPAGE= option gave us exactly what we need.  The first slide contains just the text (from the first layout container).  The second slide contains more text along with the table and graph (from the second and third layout containers).

### Use It Wisely!

The most important thing to know about using the STARTPAGE= option with layout containers is that it has to be placed between containers.  It cannot be placed within a container.  So you cannot put the statement between the ODS LAYOUT GRIDDED and ODS LAYOUT END statements.

For more information about the ODS destination for PowerPoint and all of its options, see the paper The Dynamic Duo: ODS Layout and the ODS Destination for PowerPoint, which also contains more examples of using ODS Layout with the ODS destination for PowerPoint.

The ODS Destination for PowerPoint Has a New Option was published on SAS Users.


Illustration by Dooder on Freepik

A few weeks ago I helped a customer set up a successful connection between SAS and a remote DB2 database using ODBC. At first, I thought this case would be as simple as plugging in a cable. Why? Because I already had a fully-functional testing environment with direct interaction using SAS/ACCESS Interface to DB2. Besides, I had other RDBMS connected via ODBC. Therefore, it was just a matter of setting up my ODBC driver manager for DB2. No big deal right? Oh boy, was I wrong! It was not easy at all. It took me two full days to make it work in my lab, and only after some in-depth research and a lot of trial and error.

DB2 can be challenging, especially when most of the documentation I found was either incomplete, outdated, scarce or even unclear. I don't want you to undergo the same frustration I felt. As a blogger and SAS user, it is my duty to make your SAS life a little easier. That's why I decided to publish this blog post to share the solution with the community. This guide provides you with the steps to take to set up a successful connection between SAS and a remote DB2 database via ODBC on Linux, without tears.

Keep in mind this is an extensive guide. I documented all the technical issues I came across in my lab and provided detailed explanations to overcome them. If you are beginning your journey, I recommend going through all the steps. However, for your convenience, if you want to jump right into a specific topic, use the following table of contents:

### My Environment

All commands throughout this guide are based on this table. You should replace the code with your settings when appropriate.

### Prerequisites

Before we begin, make sure you meet all of these SAS and DB2 requirements:

#### System Requirements

It may sound obvious, but you must verify that your SAS release supports your DB2 database version. For instance, the My Environment section shows that I'm using SAS 9.4 M4, and this document indicates that I need DB2 10.5 or later. Considering I currently have DB2 11.1, I'm ready to go. To review the rest of the combinations, visit the SAS System Requirements page.

Another evident must-have is an ODBC driver manager already installed. As you noticed, I use unixODBC version 2.3.1. It is always recommended to have the latest version, which nowadays is 2.3.4. Still, you should be fine if you decide to use any of the 2.3.X releases. In the past, I've seen users facing difficulties with unixODBC 2.2.X.

#### SAS/ACCESS Interface to ODBC licensed and installed

For this connection to work, it is essential to have SAS/ACCESS Interface to ODBC licensed and installed. First, to verify that this product is licensed, run this code in a SAS session:

```proc setinit noalias; run;```

In the output log look for this line:

```---SAS/ACCESS Interface to ODBC 06JAN2018```

If the product is displayed, move on to the next step. If not, look for SAS/ACCESS Interface to DB2 instead. If you have the latter, consider yourself a lucky person. You don't have to continue with this guide because you can connect to DB2 directly. If you need assistance with this off-topic issue, send me a message, and I'll be glad to help you. If you don't have either one, you should get in touch with your sales representative.

Next, to verify that this product is installed, go to your `!SASROOT/sasexe` directory on your SAS server and list these files:

```
[sas@muse sasexe]$ ls -l *odb*
-r-xr-xr-x. 1 sas sas 630196 Nov  9  2016 sasiodbu
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb_u
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb_u
-r-xr-xr-x. 1 sas sas 102142 Nov  9  2016 sasodbcs
-r-xr-xr-x. 1 sas sas  71982 Nov  9  2016 sasodbdb
-r-xr-xr-x. 1 sas sas  71990 Nov  9  2016 sasodbdb_u
-r-xr-xr-x. 1 sas sas 202343 Nov  9  2016 sasodb
-r-xr-xr-x. 1 sas sas 201815 Nov  9  2016 sasodb_u
-r-xr-xr-x. 1 sas sas 443852 Nov  9  2016 tkeodbc.so
```

If you don't see the above list, then the product is not installed. However, if your SAS license includes it, use your SAS depot to install it on your server.

#### Choosing the right DB2 client package

IBM offers five different client packages. If you don't know which package to choose, you can't go wrong with IBM Data Server Client; it's the package I use, and it includes all of the available client tools and libraries. Another good alternative is IBM Data Server Runtime Client, but you will have to create your DB2 instance and user yourself. I would not recommend IBM Data Server Driver for ODBC and CLI; during my research I found that people could not get it to connect to a remote DB2 database. For more information about each client package, please visit IBM data server client and driver types.

#### Testing your DB2 connection outside of SAS

This test is the most critical requirement before starting with the actual configuration process. Take my advice for any connection you plan to configure in the future. Firstly, you must establish a successful connection between your database client and database server outside of SAS, using your database vendor tools. This validation is a must, regardless of your RDBMS or SAS engine.

To test your connection to your remote DB2 server, run an equivalent command on your DB2 client:

```
[db2inst@muse ~]$ db2 connect to GSDB user db2inst using mypassword

   Database Connection Information

 Database server        = DB2/LINUXX8664 11.1.1
 SQL authorization ID   = DB2INST
 Local database alias   = GSDB
```

If you receive similar output, you may continue; otherwise, you will have to catalog your DB2 server as a TCP/IP node and then catalog the database on that node. Since this is beyond the scope of this guide, please review these IBM documents: CATALOG NODE command and CATALOG DATABASE command.

### Setting up unixODBC with DB2

Considering that you have tested a successful connection between your database client and database server with the `db2` command, it's time to set up a new one using an ODBC driver instead. Please bear with me. Things are about to get interesting.

As mentioned at the beginning, unixODBC is my ODBC driver manager of choice. I use it for all my RDBMS in my lab to test ODBC connections. Its usage is almost effortless, but one of the downsides is the documentation. The only official document available for DB2 is: Using IBM DB2 with unixODBC. Judge for yourself.

This driver manager has two configuration files: `odbcinst.ini` for database drivers and `odbc.ini` for database definitions. Typically with most databases, those two files are all that it takes to set up an ODBC connection. DB2 plays a different game though. It uses two extra files: `db2cli.ini` for DB2 definitions and an XML based file called `db2dsdriver.cfg`.

For now, let's take a closer look at the contents of both unixODBC files:

`odbcinst.ini`

```
[DB2]
Description = IBM DB2 ODBC Driver
Driver      = /opt/ibm/db2/V11.1/lib32/libdb2.so
Driver64    = /opt/ibm/db2/V11.1/lib64/libdb2.so
FileUsage   = 1
DontDLClose = 1
```

`odbc.ini`

```
[db2odbc]
Driver      = DB2
Description = DB2 ODBC connection for SAS
```
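As a side note, a tiny script can sanity-check that a data source name in `odbc.ini` resolves to a driver section in `odbcinst.ini`. This Python helper is purely illustrative (it is not part of unixODBC), but it mirrors what the driver manager does when it looks up a DSN:

```python
import configparser

# Minimal copies of the two unixODBC files shown above
odbcinst_ini = """
[DB2]
Description = IBM DB2 ODBC Driver
Driver64    = /opt/ibm/db2/V11.1/lib64/libdb2.so
"""
odbc_ini = """
[db2odbc]
Driver      = DB2
Description = DB2 ODBC connection for SAS
"""

def resolve_driver(dsn, odbc_text, odbcinst_text):
    """Follow a DSN's Driver entry to the shared library registered for it."""
    odbc, inst = configparser.ConfigParser(), configparser.ConfigParser()
    odbc.read_string(odbc_text)
    inst.read_string(odbcinst_text)
    driver_name = odbc[dsn]["Driver"]       # e.g. "DB2"
    return inst[driver_name]["Driver64"]    # path to the driver library

lib = resolve_driver("db2odbc", odbc_ini, odbcinst_ini)
```

If the DSN's `Driver` value does not match a section in `odbcinst.ini`, the lookup fails, which is one of the first things to check when `isql` cannot connect.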

At this point I was feeling confident and thought the connection would work, so I tested it out with the `isql` command, but to my surprise, I received this error:

```
[db2inst@muse ~]$ isql -v db2odbc db2inst mypassword
[08001][unixODBC][IBM][CLI Driver] SQL1013N  The database alias name or database
name "DB2ODBC" could not be found.  SQLSTATE=42705

[ISQL]ERROR: Could not SQLConnect
```

#### db2cli.ini came to the rescue

As a result, I started looking up online documentation. After a while, I discovered the existence of the `db2cli.ini` file. To get you started, IBM provides a sample configuration file called `db2cli.ini.sample`, located in the `installation_path/cfg` directory. The DB2 documentation recommends creating a `db2cli.ini` file based on that sample file and placing it in the same location. I followed that advice and created my file with the same data source name:

```
[db2odbc]
Database    = GSDB
Protocol    = TCPIP
Hostname    = redhat
ServiceName = 50000
```

Note: For IBM, the `ServiceName` keyword is the port number. The default port number for the DB2 default instance is 50000. Review the correct port with your DBA.

Having set up both unixODBC and DB2 files correctly, I was feeling lucky. Therefore, I attempted a connection again, but I received the same error. My frustration was on the rise.

#### db2cli validate tool is your best friend

I continued digging into the documentation until I found a powerful tool: `db2cli` with a particular parameter: `validate`. This command/parameter duo can help you verify the contents of your configuration files (`db2cli.ini` and `db2dsdriver.cfg`), list their location, test your data source names and databases, display invalid/duplicate keywords, among other functionalities. It basically is your new best friend.

`db2cli validate` helped me identify why the DB2 ODBC driver was unable to find my data source name. I ran this command to examine my environment, and the output was eye-opening:

```
[db2inst@muse ~]$ db2cli validate -dsn db2odbc

===============================================================================
Client information for the current copy:
===============================================================================

Client Package Type       : IBM Data Server Client
Client Version (level/bit): DB2 v11.1.0.0 (s1606081100/64-bit)
Client Platform           : Linux/X8664
Install/Instance Path     : /opt/ibm/db2/V11.1
DB2DSDRIVER_CFG_PATH value:
db2dsdriver.cfg Path      : /home/db2inst/sqllib/cfg/db2dsdriver.cfg
DB2CLIINIPATH value       :
db2cli.ini Path           : /home/db2inst/sqllib/cfg/db2cli.ini
db2diag.log Path          : /home/db2inst/sqllib/db2dump/db2diag.log

===============================================================================
db2dsdriver.cfg schema validation for the entire file:
===============================================================================

Note: The validation utility could not find the configuration file
db2dsdriver.cfg. The file is searched at
"/home/db2inst/sqllib/cfg/db2dsdriver.cfg".

===============================================================================
db2cli.ini validation for data source name "db2odbc":
===============================================================================

Note: The validation utility could not find the configuration file db2cli.ini.
The file is searched at "/home/db2inst/sqllib/cfg/db2cli.ini".

===============================================================================
db2dsdriver.cfg validation for data source name "db2odbc":
===============================================================================

Note: The validation utility could not find the configuration file
db2dsdriver.cfg. The file is searched at
"/home/db2inst/sqllib/cfg/db2dsdriver.cfg".

===============================================================================
The validation is completed.
===============================================================================
```

As you can see, the validation tool didn't find my `db2cli.ini`. That file contained my data source name; consequently, the DB2 ODBC driver didn't find it either. As previously indicated, I created that file in the `installation_path/cfg` directory, which in my case was `/opt/ibm/db2/V11.1/cfg`, but the DB2 ODBC driver was searching for it in `/home/db2inst/sqllib/cfg`. I didn't choose the original path randomly; I just followed this IBM document. Nevertheless, if you encounter the same issue, the solution is fairly simple. You can either create a copy in the required path or point to the original file with a symbolic link. I chose the latter to avoid having duplicates. To fix this mismatch, execute this command as root or sudo:

`[root@muse ~]# ln -s /opt/ibm/db2/V11.1/cfg/db2cli.ini /home/db2inst/sqllib/cfg/db2cli.ini`

Now you should have this symbolic link:

```
[db2inst@muse ~]$ ls -l /home/db2inst/sqllib/cfg/db2cli.ini
lrwxrwxrwx. 1 root db2iadm 33 Sep 11 19:07 /home/db2inst/sqllib/cfg/db2cli.ini -> /opt/ibm/db2/V11.1/cfg/db2cli.ini
```

After this fix, I used my new friend again, and my environment changed. Below is a partial output:

```
[db2inst@muse ~]$ db2cli validate -dsn db2odbc

===============================================================================
db2cli.ini validation for data source name "db2odbc":
===============================================================================

[ Keywords used for the connection ]

Keyword                   Value
---------------------------------------------------------------------------
DATABASE                  GSDB
PROTOCOL                  TCPIP
HOSTNAME                  redhat
SERVICENAME               50000
```

At last, the DB2 ODBC driver was able to find the configuration file, read its contents, and validate the keywords. My frustration started to disappear.

It was time to see what unixODBC thought about this change. So I reran my `isql` command:

```
[db2inst@muse ~]$ isql -v db2odbc db2inst mypassword
+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL>
```

It finally worked!

#### Do we need to use db2dsdriver.cfg?

I managed to set up a successful ODBC connection using only three files: `db2cli.ini`, along with `odbc.ini`, and `odbcinst.ini`. Thus, the answer is no; you don't need to use this additional DB2 configuration file.

For our scenario, I consider it optional. Previously, I pointed out that it is an XML file, as opposed to `db2cli.ini`, which is a plain-text file. Both can define the same four keywords I used. Then why does DB2 provide two files? `db2cli.ini` configures the behavior of CLI/ODBC applications, whereas `db2dsdriver.cfg` can be used for those as well as for other types of applications, including PHP, Perl, and .NET, among others.

In short, this concludes the DB2 configuration process. Now it's time to cover the SAS territory.

### SAS joins the party

So far we haven't touched any SAS files. We have only tested the DB2 connection first using the `db2` command, then using unixODBC's `isql` command. These are mandatory steps before a successful connection with SAS.

#### Getting sasenv_local ready for DB2

If you have already connected SAS to other RDBMS in the past, you know that the most important file you need to care about is `sasenv_local`. It is located in the `!SASROOT/bin` directory. The code below shows a partial output with both ODBC and DB2 variables correctly defined:

```
# ODBC settings
export ODBCSYSINI=/etc

# DB2 settings
export INSTHOME=/opt/ibm/db2/V11.1
export DB2CLIINIPATH=/opt/ibm/db2/V11.1/cfg

export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/lib64:${INSTHOME}/lib64
```

Let's analyze each of the above settings. First, SAS needs to know where to locate both unixODBC configuration files. The `ODBCSYSINI` variable instructs SAS to look for them in the `/etc` directory.

Second, as a good practice, I defined an environment variable called `INSTHOME` to reference my DB2 installation directory. This variable is optional but highly recommended. Then, the `DB2CLIINIPATH` variable tells SAS the location of `db2cli.ini`. If you decide to use `db2dsdriver.cfg` instead, then replace that variable with `DB2DSDRIVER_CFG_PATH`.

Finally, you have to tell SAS where to find the unixODBC and DB2 libraries. SAS uses the `LD_LIBRARY_PATH` environment variable on Linux. This variable points to `/usr/lib64` because it's the default path for all unixODBC libraries. I also appended the `$INSTHOME/lib64` directory, since it's where DB2 keeps all of its libraries.
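To make the effect of these settings concrete, here is a hypothetical Python illustration (mine, not part of SAS) of what `sasenv_local` accomplishes: the SAS process must inherit these variables, with the DB2 `lib64` directory appended to `LD_LIBRARY_PATH`:

```python
import os

insthome = "/opt/ibm/db2/V11.1"

# Build the environment a SAS session would need to inherit
env = dict(os.environ)
env["ODBCSYSINI"]    = "/etc"            # where unixODBC config files live
env["INSTHOME"]      = insthome          # DB2 installation directory
env["DB2CLIINIPATH"] = insthome + "/cfg" # location of db2cli.ini

# Append the unixODBC and DB2 library paths, keeping any existing value
env["LD_LIBRARY_PATH"] = ":".join(
    filter(None, [env.get("LD_LIBRARY_PATH"), "/usr/lib64", insthome + "/lib64"])
)

# subprocess.Popen(["./sas", "-nodms"], env=env) would then inherit these.
```

The key point is that the library paths are appended, not replaced, so any paths already configured for other products are preserved.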

At this point, you might think that you are all set. I don't like being pessimistic, but there is a high possibility that your connection may not work. I didn't test all SAS and DB2 releases, but my customer and I experienced the same error when we tested our connection with a LIBNAME statement:

```
[sas@muse 9.4]$ ./sas -nodms

1? libname db2lib odbc user=db2inst password=mypassword datasrc=db2odbc;

ERROR: An exception has been encountered.
Please contact technical support and provide them with the following
traceback information:

The SAS task name is [Line Mod]
Segmentation Violation
# More lines...
```

Moreover, if you run this code in SAS Studio or SAS Enterprise Guide, you might not even get a response; SAS will just crash or hang for a long time.

#### Fixing the Segmentation Violation error

After intense research, I was able to fix this error. We need to link two ODBC libraries that SAS provides to enable 32-bit capability. Change to your `!SASROOT/sasexe` directory and backup both files:

```
[sas@muse sasexe]$ mv sasodb sasodb.orig
[sas@muse sasexe]$ mv sasioodb sasioodb.orig
```

Create two symbolic links that point to the 32-bit libraries:

```
[sas@muse sasexe]$ ln -s sasodb_u sasodb
[sas@muse sasexe]$ ln -s sasioodb_u sasioodb
```

List your files, and the output should resemble the following:

```
[sas@muse sasexe]$ ls -l *odb*
-r-xr-xr-x. 1 sas sas 630196 Nov  9  2016 sasiodbu
lrwxrwxrwx. 1 sas sas     10 Sep 11 21:51 sasioodb -> sasioodb_u
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb.orig
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb_u
lrwxrwxrwx. 1 sas sas      8 Sep 11 21:52 sasodb -> sasodb_u
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb_u
-r-xr-xr-x. 1 sas sas 102142 Nov  9  2016 sasodbcs
-r-xr-xr-x. 1 sas sas  71982 Nov  9  2016 sasodbdb
-r-xr-xr-x. 1 sas sas  71990 Nov  9  2016 sasodbdb_u
-r-xr-xr-x. 1 sas sas 202343 Nov  9  2016 sasodb.orig
-r-xr-xr-x. 1 sas sas 201815 Nov  9  2016 sasodb_u
-r-xr-xr-x. 1 sas sas 443852 Nov  9  2016 tkeodbc.so
```

Now your `LIBNAME` statement may work. However, unlike my customer, you might also encounter this error:

```
1? libname db2lib odbc user=db2inst password=mypassword datasrc=db2odbc;

ERROR: Could not load /sas/Software/SASFoundation/9.4/sasexe/sasodb (35 images loaded)
ERROR: libodbc.so.1: cannot open shared object file: No such file or directory
ERROR: The SAS/ACCESS Interface to ODBC cannot be loaded. The SASODB code
       appendage could not be loaded.
ERROR: Error in the LIBNAME statement.
```

The error is self-explanatory. SAS couldn't find the file `libodbc.so.1`, which SASODB needs. To display all the required shared libraries, execute the `ldd` command:

```
[sas@muse sasexe]$ ldd sasodb
        linux-vdso.so.1 =>  (0x00007ffd4efe2000)
        libodbc.so.1 => not found
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f9aa741d000)
        libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f9aa71e6000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f9aa6fe2000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f9aa6cdf000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f9aa691e000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f9aa6708000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f9aa7882000)
        libfreebl3.so => /lib64/libfreebl3.so (0x00007f9aa6504000)
```

Indeed, that library was not found, so to fix this missing dependency, create a symbolic link using root or sudo:

`[root@muse ~]# ln -s /lib64/libodbc.so.2.0.0 /lib64/libodbc.so.1`

#### Creating a library that works

Now that the previous errors are gone, run a final test using the SAS command line to assign a library and then SAS Studio to display some data in a nice-looking way:

```
[sas@muse 9.4]$ ./sas -nodms
1? libname db2lib odbc user=db2inst password=mypassword datasrc=db2odbc;

NOTE: Libref DB2LIB was successfully assigned as follows:
      Engine:        ODBC
      Physical Name: db2odbc
```

Note: Displaying actual data is an essential last step because if you are pointing to an incorrect DB2 ODBC driver (such as `libdb2o.so` instead of `libdb2.so`), you can still assign a library successfully, but you won't see any tables.

In SAS Studio I assigned the same library and ran a small query to retrieve some data:

```
proc print data=db2lib.department_lookup;
   var department_code department_en;
run;
```

### Wrapping Up

I know it's a lot to digest. However, some users may not need to follow all steps to succeed, whereas others starting from scratch may require performing all steps carefully. Whatever your situation might be, I hope you find this guide useful and easy to follow. Did you struggle like me? Share your experience in the comments below or if you have any problems, don't hesitate to contact me.

Connecting SAS to a DB2 database via ODBC without tears was published on SAS Users.

Illustration by Dooder on Freepik

A few weeks ago I helped a customer set up a successful connection between SAS and a remote DB2 database using ODBC. At first, I thought this case would be as simple as plugging in a cable. Why? Because I already had a fully-functional testing environment with direct interaction using SAS/ACCESS Interface to DB2. Besides, I had other RDBMS connected via ODBC. Therefore, it was just a matter of setting up my ODBC driver manager for DB2. No big deal right? Oh boy, was I wrong! It was not easy at all. It took me two full days to make it work in my lab, and only after some in-depth research and a lot of trial and error.

DB2 can be challenging, especially when most of the documentation I found was either incomplete, outdated, scarce or even unclear. I don't want you to undergo the same frustration I felt. As a blogger and SAS user, it is my duty to make your SAS life a little easier. That's why I decided to publish this blog post to share the solution with the community. This guide provides you with the steps to take to set up a successful connection between SAS and a remote DB2 database via ODBC on Linux, without tears.

Keep in mind this is an extensive guide. I documented all the technical issues I came across in my lab and provided detailed explanations of how to overcome them. If you are beginning your journey, I recommend going through all the steps in order.

### My Environment

All commands throughout this guide are based on the settings in this section. Replace them with your own values where appropriate.

### Prerequisites

Before we begin, make sure you meet all of these SAS and DB2 requirements:

#### System Requirements

It may sound obvious, but you must verify that your SAS release supports your DB2 database version. For instance, the My Environment section shows that I'm using SAS 9.4 M4, and the SAS system requirements document indicates that I need DB2 10.5 or later. Considering I currently have DB2 11.1, I'm ready to go. To review the rest of the combinations, visit the SAS System Requirements page.

Another evident must-have is an ODBC driver manager already installed. As shown in My Environment, I use unixODBC version 2.3.1. It is always recommended to have the latest version, which as of this writing is 2.3.4; still, you should be fine with any of the 2.3.X releases. In the past, I've seen users face difficulties with unixODBC 2.2.X.

#### SAS/ACCESS Interface to ODBC licensed and installed

For this connection to work, it is essential to have SAS/ACCESS Interface to ODBC licensed and installed. First, to verify that this product is licensed, run this code in a SAS session:

```proc setinit noalias; run;```

In the output log look for this line:

```---SAS/ACCESS Interface to ODBC 06JAN2018```

If the product is displayed, move on to the next step. If not, look for SAS/ACCESS Interface to DB2 instead. If you have the latter, consider yourself a lucky person. You don't have to continue with this guide because you can connect to DB2 directly. If you need assistance with this off-topic issue, send me a message, and I'll be glad to help you. If you don't have either one, you should get in touch with your sales representative.

Next, to verify that this product is installed, go to your `!SASROOT/sasexe` directory on your SAS server and list these files:

```
[sas@muse sasexe]$ ls -l *odb*
-r-xr-xr-x. 1 sas sas 630196 Nov  9  2016 sasiodbu
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb_u
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb_u
-r-xr-xr-x. 1 sas sas 102142 Nov  9  2016 sasodbcs
-r-xr-xr-x. 1 sas sas  71982 Nov  9  2016 sasodbdb
-r-xr-xr-x. 1 sas sas  71990 Nov  9  2016 sasodbdb_u
-r-xr-xr-x. 1 sas sas 202343 Nov  9  2016 sasodb
-r-xr-xr-x. 1 sas sas 201815 Nov  9  2016 sasodb_u
-r-xr-xr-x. 1 sas sas 443852 Nov  9  2016 tkeodbc.so
```

If you don't see the above list, the product is not installed. If your SAS license includes it, however, use your SAS depot to install it on your server.

#### Choosing the right DB2 client package

IBM offers five different client packages. If you don't know which package to choose, you can't go wrong with IBM Data Server Client: it's the package I use, and it includes all the available client tools and libraries. Another good alternative is IBM Data Server Runtime Client, but you will have to create your DB2 instance and user yourself. I would not recommend IBM Data Server Driver for ODBC and CLI; during my research I found that people could not get it to connect to a remote DB2 database. For more information about each client package, please visit IBM data server client and driver types.

#### Testing your DB2 connection outside of SAS

This test is the most critical requirement before starting the actual configuration process, and it's advice worth taking for any connection you plan to configure in the future. First, you must establish a successful connection between your database client and database server outside of SAS, using your database vendor's tools. This validation is a must, regardless of your RDBMS or SAS engine.

To test the connection to your remote DB2 server, run a command like this on your DB2 client:

```
[db2inst@muse ~]$ db2 connect to GSDB user db2inst using mypassword

   Database Connection Information

 Database server        = DB2/LINUXX8664 11.1.1
 SQL authorization ID   = DB2INST
 Local database alias   = GSDB
```

If you received a similar output, you may continue; otherwise, you will have to catalog your DB2 server as a TCP/IP node and then catalog the database on that node. Since this is beyond the scope of this guide, please review these IBM documents: CATALOG NODE command and CATALOG DATABASE command.
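For orientation only, here is a sketch of the two CLP commands involved. This block merely prints the commands you would run on the DB2 client; the node name `mynode` is a hypothetical example, while the host, port, and database values are taken from this guide's environment. Check the IBM documents above for the full syntax and options.

```shell
# Print (not execute) the catalog commands you would run on the DB2 client.
# "mynode" is a made-up node alias; host/port/db come from this guide's setup.
node=mynode; host=redhat; port=50000; db=GSDB
printf 'db2 catalog tcpip node %s remote %s server %s\n' "$node" "$host" "$port"
printf 'db2 catalog database %s at node %s\n' "$db" "$node"
```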

### Setting up unixODBC with DB2

Considering that you have tested a successful connection between your database client and database server with the `db2` command, it's time to set up a new one using an ODBC driver instead. Please bear with me. Things are about to get interesting.

As mentioned at the beginning, unixODBC is my ODBC driver manager of choice. I use it for all my RDBMS in my lab to test ODBC connections. Its usage is almost effortless, but one of the downsides is the documentation. The only official document available for DB2 is: Using IBM DB2 with unixODBC. Judge for yourself.

This driver manager has two configuration files: `odbcinst.ini` for database drivers and `odbc.ini` for database definitions. Typically, with most databases, those two files are all it takes to set up an ODBC connection. DB2 plays a different game, though: it uses two extra files, `db2cli.ini` for DB2 definitions and an XML-based file called `db2dsdriver.cfg`.

For now, let's take a closer look at the contents of both unixODBC files:

`odbcinst.ini`

```
[DB2]
Description = IBM DB2 ODBC Driver
Driver      = /opt/ibm/db2/V11.1/lib32/libdb2.so
Driver64    = /opt/ibm/db2/V11.1/lib64/libdb2.so
FileUsage   = 1
DontDLClose = 1
```

`odbc.ini`

```
[db2odbc]
Driver      = DB2
Description = DB2 ODBC connection for SAS
```

At this point I was feeling confident and thought the connection would work, so I tested it out with the `isql` command, but to my surprise, I received this error:

```
[db2inst@muse ~]$ isql -v db2odbc db2inst mypassword
[08001][unixODBC][IBM][CLI Driver] SQL1013N  The database alias name or database
name "DB2ODBC" could not be found.  SQLSTATE=42705

[ISQL]ERROR: Could not SQLConnect
```

#### db2cli.ini came to the rescue

As a result, I started looking through the online documentation. After a while, I discovered the existence of the `db2cli.ini` file. To get you started, IBM provides a sample configuration file called `db2cli.ini.sample`, located in the `installation_path/cfg` directory. The DB2 documentation recommends creating a `db2cli.ini` file based on that sample file and placing it in the same location. I followed this advice and created my file with the same data source name:

```
[db2odbc]
Database    = GSDB
Protocol    = TCPIP
Hostname    = redhat
ServiceName = 50000
```

Note: For IBM, the `ServiceName` keyword is the port number. The default port number for the DB2 default instance is 50000. Review the correct port with your DBA.
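If you want to double-check the port yourself, the instance's service name is usually registered in `/etc/services` on the server. The snippet below is a self-contained sketch: the entry `db2c_db2inst1` and the sample line are assumptions for illustration; on a real system you would grep the actual file.

```shell
# Extract the port number from a services-style entry. The sample line is
# piped in so the example is self-contained; normally you'd grep /etc/services.
printf 'db2c_db2inst1 50000/tcp\n' | awk '{split($2, a, "/"); print a[1]}'
```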

Having set up both unixODBC and DB2 files correctly, I was feeling lucky. Therefore, I attempted a connection again, but I received the same error. My frustration was on the rise.

#### db2cli validate tool is your best friend

I continued digging into the documentation until I found a powerful tool: `db2cli` with a particular parameter: `validate`. This command/parameter duo can help you verify the contents of your configuration files (`db2cli.ini` and `db2dsdriver.cfg`), list their location, test your data source names and databases, display invalid/duplicate keywords, among other functionalities. It basically is your new best friend.

`db2cli validate` helped me identify why the DB2 ODBC driver was unable to find my data source name. I ran this command to examine my environment, and the output was eye-opening:

```
[db2inst@muse ~]$ db2cli validate -dsn db2odbc

===============================================================================
Client information for the current copy:
===============================================================================

Client Package Type       : IBM Data Server Client
Client Version (level/bit): DB2 v11.1.0.0 (s1606081100/64-bit)
Client Platform           : Linux/X8664
Install/Instance Path     : /opt/ibm/db2/V11.1
DB2DSDRIVER_CFG_PATH value:
db2dsdriver.cfg Path      : /home/db2inst/sqllib/cfg/db2dsdriver.cfg
DB2CLIINIPATH value       :
db2cli.ini Path           : /home/db2inst/sqllib/cfg/db2cli.ini
db2diag.log Path          : /home/db2inst/sqllib/db2dump/db2diag.log

===============================================================================
db2dsdriver.cfg schema validation for the entire file:
===============================================================================

Note: The validation utility could not find the configuration file
db2dsdriver.cfg. The file is searched at "/home/db2inst/sqllib/cfg/db2dsdriver.cfg".

===============================================================================
db2cli.ini validation for data source name "db2odbc":
===============================================================================

Note: The validation utility could not find the configuration file
db2cli.ini. The file is searched at "/home/db2inst/sqllib/cfg/db2cli.ini".

===============================================================================
db2dsdriver.cfg validation for data source name "db2odbc":
===============================================================================

Note: The validation utility could not find the configuration file
db2dsdriver.cfg. The file is searched at "/home/db2inst/sqllib/cfg/db2dsdriver.cfg".

===============================================================================
The validation is completed.
===============================================================================
```

As you can see, the validation tool didn't find my `db2cli.ini`. That file contained my data source name; consequently, the DB2 ODBC driver couldn't find it either. As previously indicated, I created that file in the `installation_path/cfg` directory, which in my case was `/opt/ibm/db2/V11.1/cfg`, but the DB2 ODBC driver was searching for it in `/home/db2inst/sqllib/cfg`. I didn't choose that path randomly; I just followed this IBM document. Nevertheless, if you encounter the same issue, the solution is fairly simple. You can either create a copy in the required path or point to the original file using a symbolic link. I chose the latter to avoid having duplicates. To fix this mismatch, execute this command as root or sudo:

`[root@muse ~]# ln -s /opt/ibm/db2/V11.1/cfg/db2cli.ini /home/db2inst/sqllib/cfg/db2cli.ini`

Now you should have this symbolic link:

```
[db2inst@muse ~]$ ls -l /home/db2inst/sqllib/cfg/db2cli.ini
lrwxrwxrwx. 1 root db2iadm 33 Sep 11 19:07 /home/db2inst/sqllib/cfg/db2cli.ini -> /opt/ibm/db2/V11.1/cfg/db2cli.ini
```

After this fix, I used my new friend again, and my environment changed. Below is a partial output:

```
[db2inst@muse ~]$ db2cli validate -dsn db2odbc

===============================================================================
db2cli.ini validation for data source name "db2odbc":
===============================================================================

[ Keywords used for the connection ]

Keyword                   Value
---------------------------------------------------------------------------
DATABASE                  GSDB
PROTOCOL                  TCPIP
HOSTNAME                  redhat
SERVICENAME               50000
```

At last, the DB2 ODBC driver was able to find the configuration file, read its contents, and validate the keywords. My frustration started to disappear.

It was time to see what unixODBC thought about this change. So I reran my `isql` command:

```
[db2inst@muse ~]$ isql -v db2odbc db2inst mypassword
+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL>
```

It finally worked!

#### Do we need to use db2dsdriver.cfg?

I managed to set up a successful ODBC connection using only three files: `db2cli.ini`, along with unixODBC's `odbc.ini` and `odbcinst.ini`. Thus, the answer is no; you don't need this additional DB2 configuration file.

For our scenario, I consider it optional. Previously, I pointed out that it is an XML file, as opposed to `db2cli.ini`, which is a plain-text file. Both can carry the same four keywords I defined. Then why does DB2 provide two files? `db2cli.ini` is used to configure the behavior of CLI/ODBC applications, whereas `db2dsdriver.cfg` can be used for those as well as for other types of applications, including PHP, Perl, and .NET, among others.
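For reference, here is a rough sketch of what the same data source might look like in `db2dsdriver.cfg`. The element names follow IBM's documented schema as I understand it; treat this as an assumption and check it with `db2cli validate` before relying on it.

```xml
<configuration>
   <dsncollection>
      <dsn alias="db2odbc" name="GSDB" host="redhat" port="50000"/>
   </dsncollection>
   <databases>
      <database name="GSDB" host="redhat" port="50000"/>
   </databases>
</configuration>
```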

In short, this concludes the DB2 configuration process. Now it's time to cover the SAS territory.

### SAS joins the party

So far we haven't touched any SAS files. We have only tested the DB2 connection, first using the `db2` command and then using unixODBC's `isql` command. These are mandatory steps before attempting a connection from SAS.

#### Getting sasenv_local ready for DB2

If you have already connected SAS to other RDBMS in the past, you know that the most important file you need to care about is `sasenv_local`. It is located in the `!SASROOT/bin` directory. The code below shows a partial output with both ODBC and DB2 variables correctly defined:

```
# ODBC settings
export ODBCSYSINI=/etc

# DB2 settings
export INSTHOME=/opt/ibm/db2/V11.1
export DB2CLIINIPATH=/opt/ibm/db2/V11.1/cfg

export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/lib64:${INSTHOME}/lib64
```

Let's analyze each of the above settings. First, SAS needs to know where to locate both unixODBC configuration files. The `ODBCSYSINI` variable instructs SAS to look for them in the `/etc` directory.

Second, as a good practice, I defined an environment variable called `INSTHOME` to reference my DB2 installation directory. This variable is optional but highly recommended. Then, the `DB2CLIINIPATH` variable tells SAS the location of `db2cli.ini`. If you decide to use `db2dsdriver.cfg` instead, then replace that variable with `DB2DSDRIVER_CFG_PATH`.
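For instance, a minimal sketch of that swap (the path is an assumption based on where `db2cli.ini` lives in this guide; adjust it to your installation):

```shell
# If you opt for db2dsdriver.cfg instead of db2cli.ini, swap the variable.
# The path below is assumed; point it at wherever your file actually lives.
export DB2DSDRIVER_CFG_PATH=/opt/ibm/db2/V11.1/cfg/db2dsdriver.cfg
echo "$DB2DSDRIVER_CFG_PATH"
```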

Finally, you have to inform SAS where to find unixODBC and DB2 libraries. SAS uses the `LD_LIBRARY_PATH` environment variable on Linux. This variable points to `/usr/lib64` because it's the default path for all unixODBC libraries. I also appended the `$INSTHOME/lib64` directory, since it's the place where DB2 has all its libraries.
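As a small aside, here is a sketch of that export with a defensive tweak: the `${VAR:+...}` expansion avoids a stray leading colon when `LD_LIBRARY_PATH` starts out empty (an empty path component is treated as the current directory by the loader).

```shell
# Append the unixODBC and DB2 library directories to LD_LIBRARY_PATH.
# ${LD_LIBRARY_PATH:+...} only emits the old value plus ":" if it was non-empty.
INSTHOME=/opt/ibm/db2/V11.1
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}/usr/lib64:$INSTHOME/lib64"
echo "$LD_LIBRARY_PATH"
```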

At this point, you might think that you are all set. I don't like being pessimistic, but there is a high possibility that your connection may not work. I didn't test all SAS and DB2 releases, but my customer and I experienced the same error when we tested our connection with a LIBNAME statement:

```
[sas@muse 9.4]$ ./sas -nodms

1? libname db2lib odbc user=db2inst password=mypassword datasrc=db2odbc;

ERROR: An exception has been encountered.
Please contact technical support and provide them with the following traceback
information:

The SAS task name is [Line Mod]
Segmentation Violation
# More lines...
```

Moreover, if you run this code in SAS Studio or SAS Enterprise Guide, you might not even get a response; SAS will just crash or hang for a long time.

#### Fixing the Segmentation Violation error

After intense research, I was able to fix this error. We need to link two ODBC libraries that SAS provides to enable 32-bit capability. Change to your `!SASROOT/sasexe` directory and back up both files:

```
[sas@muse sasexe]$ mv sasodb sasodb.orig
[sas@muse sasexe]$ mv sasioodb sasioodb.orig
```

Create two symbolic links that point to the 32-bit libraries:

```
[sas@muse sasexe]$ ln -s sasodb_u sasodb
[sas@muse sasexe]$ ln -s sasioodb_u sasioodb
```

List your files, and the output should resemble the following:

```
[sas@muse sasexe]$ ls -l *odb*
-r-xr-xr-x. 1 sas sas 630196 Nov  9  2016 sasiodbu
lrwxrwxrwx. 1 sas sas     10 Sep 11 21:51 sasioodb -> sasioodb_u
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb.orig
-r-xr-xr-x. 1 sas sas 603114 Nov  9  2016 sasioodb_u
lrwxrwxrwx. 1 sas sas      8 Sep 11 21:52 sasodb -> sasodb_u
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb
-r-xr-xr-x. 1 sas sas  59977 Nov  9  2016 sasodbcb_u
-r-xr-xr-x. 1 sas sas 102142 Nov  9  2016 sasodbcs
-r-xr-xr-x. 1 sas sas  71982 Nov  9  2016 sasodbdb
-r-xr-xr-x. 1 sas sas  71990 Nov  9  2016 sasodbdb_u
-r-xr-xr-x. 1 sas sas 202343 Nov  9  2016 sasodb.orig
-r-xr-xr-x. 1 sas sas 201815 Nov  9  2016 sasodb_u
-r-xr-xr-x. 1 sas sas 443852 Nov  9  2016 tkeodbc.so
```

Now your `LIBNAME` statement may work; however, unlike my customer, you may encounter this additional error:

```
1? libname db2lib odbc user=db2inst password=mypassword datasrc=db2odbc;

ERROR: Could not load /sas/Software/SASFoundation/9.4/sasexe/sasodb (35 images loaded)
ERROR: libodbc.so.1: cannot open shared object file: No such file or directory
ERROR: The SAS/ACCESS Interface to ODBC cannot be loaded. The SASODB code
       appendage could not be loaded.
ERROR: Error in the LIBNAME statement.
```

The error is self-explanatory. SAS couldn't find the file `libodbc.so.1`, which SASODB needs. To display all the required shared libraries, execute the `ldd` command:

```
[sas@muse sasexe]$ ldd sasodb
	linux-vdso.so.1 =>  (0x00007ffd4efe2000)
	libodbc.so.1 => not found
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f9aa741d000)
	libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f9aa71e6000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f9aa6fe2000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f9aa6cdf000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f9aa691e000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f9aa6708000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f9aa7882000)
	libfreebl3.so => /lib64/libfreebl3.so (0x00007f9aa6504000)
```

Indeed, that library was not found, so to fix this missing dependency, create a symbolic link using root or sudo:

`[root@muse ~]# ln -s /lib64/libodbc.so.2.0.0 /lib64/libodbc.so.1`

#### Creating a library that works

Now that the previous errors are gone, run a final test: assign a library from the SAS command line, then use SAS Studio to display some data in a nice-looking way:

```
[sas@muse 9.4]$ ./sas -nodms

1? libname db2lib odbc user=db2inst password=mypassword datasrc=db2odbc;

NOTE: Libref DB2LIB was successfully assigned as follows:
      Engine:        ODBC
      Physical Name: db2odbc
```

Note: Displaying actual data is an essential last step because if you are pointing to an incorrect DB2 ODBC driver (such as `libdb2o.so` instead of `libdb2.so`), you can still assign a library successfully, but you won't see any tables.

In SAS Studio I assigned the same library and ran a small query to retrieve some data:

```
proc print data=db2lib.department_lookup;
   var department_code department_en;
run;
```

### Wrapping Up

I know it's a lot to digest. However, some users may not need to follow every step to succeed, whereas others starting from scratch may need to perform all of them carefully. Whatever your situation might be, I hope you find this guide useful and easy to follow. Did you struggle like me? Share your experience in the comments below, and if you have any problems, don't hesitate to contact me.

Connecting SAS to a DB2 database via ODBC without tears was published on SAS Users.

As a SAS Viya user, you may be wondering whether it is possible to execute data append and data update concurrently to a global Cloud Analytic Services (CAS) table from two or more CAS sessions. (Learn more about CAS.) How would this impact the report view while data append or data update is running on a global CAS table? These questions are even more important for those using the programming interface to load and update data in CAS. This post discusses data append, data update, and concurrency in CAS.

Two or more CAS sessions can simultaneously submit data append and data update processes to a CAS table, but only one process at a time can run against the same CAS table. Multiple append and update processes execute serially, one after another, never concurrently. Whichever CAS session acquires the write lock on the global CAS table first prevails, appending or updating the data first. The other append and update processes wait in a queue to acquire the write lock.

During the data append process, the appended data is not available to end users or reports until all rows are inserted and committed into the CAS table. While data append is running, users can still render reports against the CAS table using the original data, but excluding the appended rows.

Similarly, during the data update process, the updated data is not available to users or reports until the update process is complete. However, CAS lets you render reports using the original (non-updated) data, as the CAS table is always available for the read process. During the data update process, CAS makes additional in-memory copies of the blocks containing the rows to be updated in order to perform the update statement. Once the update process is complete, the additional, now obsolete copies of the blocks are removed from CAS. Data updates to a global CAS table are an expensive operation in terms of CPU and memory usage. You have to factor in the additional overhead memory or CAS_CACHE space to support the updates. The space requirement depends on the number of rows being affected by the update process.
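To get a feel for that requirement, here is a back-of-the-envelope sketch. All numbers are hypothetical, and the real overhead is block-based (whole blocks are copied, not individual rows), so treat this strictly as a lower-bound illustration:

```shell
# Rough lower bound on extra memory if full copies of the affected rows are made.
# Both numbers are assumptions for illustration only.
rows_updated=86400        # rows touched by the update
row_width_bytes=200       # assumed average row width
echo "$(( rows_updated * row_width_bytes )) bytes"
```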

At any given time, there could be only one active write process (append/update) against a global CAS table. However, there could be many concurrent active read processes against a global CAS table. A global CAS table is always available for read processes, even when an append or update process is running on the same CAS table.

The following log example shows two simultaneous CAS sessions executing data appends to a CAS table. Both append processes were submitted to CAS within a few seconds of each other. Notice that the execution time for the second CAS session, MYSESSION1, is double the time it took the first CAS session to append the same amount of data to the CAS table. This shows that the appends were executing one after another. The amount of memory used and the CAS_CACHE location also show that both processes ran serially, one after the other.

### Log from simultaneous CAS session MYSESSION submitting APPEND

```
58   proc casutil ;
NOTE: The UUID '1411b6f2-e678-f546-b284-42b6907260e9' is connected using session MYSESSION.
59      load data=mydata.big_prdsale
60      outcaslib="caspath" casout="big_PRDSALE" append ;
NOTE: MYDATA.BIG_PRDSALE was successfully added to the "caspath" caslib as "big_PRDSALE".
61   quit ;
NOTE: PROCEDURE CASUTIL used (Total process time):
      real time           49.58 seconds
      cpu time            5.05 seconds
```

### Log from simultaneous CAS session MYSESSION1 submitting APPEND

```
58   proc casutil ;
NOTE: The UUID 'a20a246e-e0cc-da4d-8691-c0ff2a222dfd' is connected using session MYSESSION1.
59      load data=mydata.big_prdsale1
60      outcaslib="caspath" casout="big_PRDSALE" append ;
NOTE: MYDATA.BIG_PRDSALE1 was successfully added to the "caspath" caslib as "big_PRDSALE".
61   quit ;
NOTE: PROCEDURE CASUTIL used (Total process time):
      real time           1:30.33
      cpu time            4.91 seconds
```

When the data append process from MYSESSION1 was submitted alone (no simultaneous process), the execution time is around the same as for the first session MYSESSION. This also shows that when two simultaneous append processes were submitted against the CAS table, one was waiting for the other to finish. At one time, only one process was running the data APPEND action to the CAS table (no concurrent append).

### Log from a lone CAS session MYSESSION1 submitting APPEND

```
58   proc casutil ;
NOTE: The UUID 'a20a246e-e0cc-da4d-8691-c0ff2a222dfd' is connected using session MYSESSION1.
59      load data=mydata.big_prdsale1
60      outcaslib="caspath" casout="big_PRDSALE" append ;
NOTE: MYDATA.BIG_PRDSALE1 was successfully added to the "caspath" caslib as "big_PRDSALE".
61   quit ;
NOTE: PROCEDURE CASUTIL used (Total process time):
      real time           47.63 seconds
      cpu time            4.94 seconds
```

The following log example describes two simultaneous CAS sessions submitting data updates on a CAS table. Both update processes were submitted to CAS in a span of a few seconds. Notice the execution time for the second CAS session MYSESSION1 is double the time it took the first session to update the same number of rows. The amount of memory used and the CAS_CACHE location also shows that both processes were running one after another in a serial fashion. While the update process was running, memory and CAS_CACHE space increased, which suggests that the update process makes copies of to-be-updated data rows/blocks. Once the update process is complete, the space usage in memory/CAS_CACHE returned to normal.

When the data UPDATE action from MYSESSION1 was submitted alone (no simultaneous process), the execution time is around the same as for the first CAS session.

### Log from a simultaneous CAS session MYSESSION submitting UPDATE

```
58   proc cas ;
59      table.update /
60         set={
61            {var="actual",value="22222"},
62            {var="country",value="'FRANCE'"}
63         },
64         table={
65            caslib="caspath",
66            name="big_prdsale",
67            where="index in(10,20,30,40,50,60,70,80,90,100 )"
68         }
69      ;
70   quit ;
NOTE: Active Session now MYSESSION.
{tableName=BIG_PRDSALE,rowsUpdated=86400}
NOTE: PROCEDURE CAS used (Total process time):
      real time           4:37.68
      cpu time            0.05 seconds
```

### Log from a simultaneous CAS session MYSESSION1 submitting UPDATE

```
57   proc cas ;
58      table.update /
59         set={
60            {var="actual",value="22222"},
61            {var="country",value="'FRANCE'"}
62         },
63         table={
64            caslib="caspath",
65            name="big_prdsale",
66            where="index in(110,120,130,140,150,160,170,180,190,1100 )"
67         }
68      ;
69   quit ;
NOTE: Active Session now MYSESSION1.
{tableName=BIG_PRDSALE,rowsUpdated=86400}
NOTE: PROCEDURE CAS used (Total process time):
      real time           8:56.38
      cpu time            0.09 seconds
```

The following memory usage snapshot from one of the CAS nodes describes the usage of memory before and during the CAS table update. Notice the values for “used” and “buff/cache” columns before and during the CAS table update.

### Summary

When simultaneous data append and data update requests are submitted against a global CAS table from two or more CAS sessions, they execute in a serial fashion (no concurrent process execution). To execute data updates on a CAS table, you need additional overhead memory or CAS_CACHE space. While the CAS table is going through a data append or data update process, it remains accessible for rendering reports.

Concurrent data append and update to a global CAS table was published on SAS Users.

The concept of real-time customer experiences has been around for a while. However, the ability to deliver real-time customer experiences had been limited until recently. Real-time experiences are progressively and significantly influencing today’s customer journeys. Tomorrow, successful customer journeys will not happen at all without them. What is the meaning [...]

How Do You #RealTimeWithSAS? was published on Customer Intelligence Blog.