9月 102020
 

I previously wrote about the RAS algorithm, which is a simple algorithm that performs matrix balancing. Matrix balancing refers to adjusting the cells of a frequency table to match known values of the row and column sums. Ideally, the balanced matrix will reflect the structural relationships in the original matrix. An alternative method is the iterative proportional fitting (IPF) algorithm, which is implemented in the IPF subroutine in SAS/IML. The IPF method can balance n-way tables, n ≥ 2.

The IPF function is a statistical modeling method. It computes maximum likelihood estimates for a hierarchical log-linear model of the counts as a function of the categorical variables. It is more complicated to use than the RAS algorithm, but it is also more powerful.

This article shows how to use the IPF subroutine for the two-way frequency table from my previous article. The data are for six girl scouts who sold seven types of cookies. We know the row totals for the Type variable (the boxes sold for each cookie type) and the column totals for the Person variable (the boxes sold by each girl). But we do not know the values of the individual cells, which are the Type-by-Person totals. A matrix balancing algorithm takes any guess for the table and maps it into a table that has the observed row and column sums.

Construct a table that has the observed marginal totals

As mentioned in the previous article, there are an infinite number of matrices that have the observed row and column totals. Before we discuss the IPF algorithm, I want to point out that it is easy to compute one particular balanced matrix. The special matrix is the one that assumes no association (independence) between the Type and Person variables. That matrix is formed by the the outer product of the marginal totals:
     B[i,j] = (sum of row i)(sum of col j) / (sum of all counts)

It is simple to compute the "no-association" matrix in the SAS/IML language:

proc iml;
/* Known quantities: row totals and column totals */
u = {260, 214, 178, 148, 75, 67, 59}; /* total boxes sold, by type */
v = {272 180 152 163 134 100};        /* total boxes sold, by girl */
 
/* Compute the INDEPENDENT table, which is formed by the outer product
   of the marginal totals:
   B[i,j] = (sum of row i)(sum of col j) / (sum of all counts) */
B = u*v / sum(u);
 
Type   = 'Cookie1':'Cookie7';                  /* row labels    */
Person = 'Girl1':'Girl6';                      /* column labels */
print B[r=Type c=Person format=7.1 L='Independent Table'];

For this balanced matrix, there is no association between the row and column variables (by design). This matrix is the "expected count" matrix that is used in a chi-square test for independence between the Type and Person variables.

Because the cookie types are listed in order from most popular to least popular, the values in the balanced matrix (B) decrease down each column. If you order the girls according to how many boxes they sold (you need to exchange the third and fourth columns), then the values in the balanced matrix would decrease across each row.

Use the IPF routine to balance a matrix

The documentation for the IPF subroutine states that you need to specify the following input arguments. For clarity, I will only discuss two-way r x c tables.

  • dim: The vector of dimensions of the table: dim = {r c};
  • table: This r x c matrix is used to specify the row and column totals. You can specify any matrix; only the marginal totals of the matrix are used. For example, for a two-way table, you can use the "no association" matrix from the previous section.
  • config is an input matrix whose columns specify which marginal totals to use to fit the counts. Because the model is hierarchical, all subsets of specified marginals are included in the model. For a two-way table, the most common choices are the main effect models such as config={1} (model count by first variable only), config={2} (model count by second variable only), and config={1 2} (model count by using both variables). It is possible to fit the "fully saturated model" (config={1, 2}), but that model is overparameterized and results in a perfect fit, if the IPF algorithm converges.
  • initTable (optional): The elements of this matrix will be adjusted to match the observed marginal totals. In practice, this matrix is often based on a survey and you want to adjust the counts by using known marginal totals from a recent census of the population. Or it might be an educated guess, as in the Girl Scout cookie example. If you do not specify this matrix, a matrix of all 1s is used, which implies that you think all cell entries are approximately equal.

The following SAS/IML program defines the parameters for the IPF call.

/* Specify structure between rows and columns by asking each girl 
   to estimate how many boxes of each type she sold. This is the 
   matrix whose cells will be adjusted to match the marginal totals. */
/*   G1 G2 G3 G4 G5 G6          */
A = {75 45 40 40 40 30 ,  /* C1 */
     40 35 45 35 30 30 ,  /* C2 */
     40 25 30 40 30 20 ,  /* C3 */
     40 25 25 20 20 20 ,  /* C4 */
     30 25  0 10 10  0 ,  /* C5 */
     20 10 10 10 10  0 ,  /* C6 */
     20 10  0 10  0  0 }; /* C7 */
 
Dim = ncol(A) || nrow(A);  /* dimensions of r x c table */
Config = {1 2};            /* model main effects: counts depend on Person and Type */
call IPF(FitA, Status,     /* output values */
         Dim, B,           /* use the "no association" matrix to specify marginals */
         Config, A);       /* use best-guess matrix to specify structure */
print FitA[r=Type c=Person format=7.2],     /* the IPF call returns the FitA matrix and Status vector */
      Status[c={'rc' 'Max Diff' 'Iters'}];

The IPF call returns two quantities. The first output is the balanced matrix. For this example, the balanced matrix (FitA) is identical to the result from the RAS algorithm. It has a similar structure to the "best guess" matrix (A). For example, and notice that the cells in the original matrix that are zero are also zero in the balanced matrix.

The second output is the Status vector. The status vector provides information about the convergence of the IPF algorithm:

  • If the first element is zero, the IPF algorithm converged.
  • The second element gives the maximum difference between the marginal totals of the balanced matrix (FitA) and the observed marginal totals.
  • The third element gives the number of iterations of the IPF algorithm.

Other models

The IPF subroutine is more powerful than the RAS algorithm because you can fit n-way tables and because you can model the table in several natural ways. For example, the model in the previous section assumes that the counts depend on the salesgirl and on the cookie type. You can also model the situation where the counts depend only on one of the factors and not on the other:

/* counts depend only on Person, not on cookie Type */
Config = {1};   
call IPF(FitA1, Status, Dim, B, Config, A);
print FitA1[r=Type c=Person format=7.2], Status;
 
/* counts depend only on cookie Type, not on Person */
Config = {2};   
call IPF(FitA2, Status, Dim, B, Config, A);
print FitA2[r=Type c=Person format=7.2], Status;

Summary

In summary, the SAS/IML language provides a built-in function (CALL IPF) for matrix balancing of frequency tables. The routine incorporates statistical modeling and you can specify which variables to use to model the cell counts. The documentation several examples that demonstrate using the IPF algorthm to balance n-way tables, n ≥ 2.

The post Iterative proportional fitting in SAS appeared first on The DO Loop.

9月 092020
 

Heather Cartwright, General Manager of Microsoft Health, joined me to contemplate the new horizons in healthcare made possible by SAS and Microsoft’s new strategic partnership.  The four walls of the doctor’s office are disappearing. Following decades of planning for hypotheticals, this year health care organizations were compelled to make good on the promise of digital transformation. Meeting the [...]

The future of health care and life sciences with SAS Analytics on Azure  was published on SAS Voices by Mark Lambrecht

9月 092020
 

There's no question that we're all increasingly, and often exclusively, interacting with brands digitally. Consumers are now online through countless mechanisms – from laptops and mobile apps to AI-enabled voice assistants and sensor-based wearables. Engagement is diversifying in fascinating new ways. And when organizations can't see their customers interacting in [...]

SAS Customer Intelligence 360: Behavioral event tracking, targeting & engagement analysis was published on Customer Intelligence Blog.

9月 092020
 

Welcome back! Today, I continue with part 2 of my series on building custom applications on SAS Viya. The goal of this series is to discuss securely integrating custom-written applications into your SAS Viya platform.

In the first installment of this series, I outlined my experiences on a recent project. In that example, our team worked with a pre-owned auto mall to create a custom application for their buyers to use at auction, facilitating data-driven decisions and maximizing potential profits. Buyers use the app's UI to select/enter the characteristics of the specific car on which they'd like to bid. Those details are then sent via an HTTP REST call to a specific API endpoint in the customer's SAS Viya environment which exposes a packaged version of a detailed decision flow. Business rules and analytic models evaluate the car details as part of that flow. The endpoint returns a maximum bid recommendation and other helpful information via the custom applications's UI.

If you haven't already, I encourage you to read part 1 for more details.

If the concept of exposing endpoints to custom applications elicits some alarming thoughts for you, fear not. The goal of this post is to introduce the security-related considerations when integrating custom applications into your SAS Viya environment. In fact, that is the topic of the next two sections. Today we'll take a look at the security underpinnings of SAS Viya so we can start to flesh out how that integration works. In part 3, we'll dive into more specific decision points. To round things out, the series finishes with some examples of the concepts discussed in parts 1-3.

Looking a little deeper

So how does SAS Viya know it can trust your custom app? How does it know your app should have the authority to call specific API endpoints in your SAS Viya environment and what HTTP methods it should allow? Moreover, who should be able to make those REST calls, the app itself or only individual logged-in users of that app? To answer these questions, we need to understand the foundation of the security frameworks in SAS Viya.

Security in SAS Viya is built around OAuth 2.0 (authorization) and OpenID Connect (authentication), both of which are industry-standard frameworks. The details of how they work, separately and together, are out of the scope of this post. We will, however, delve into a few key things to be aware of within their implementation context for SAS Viya.

First, SAS Logon Manager is the centerpiece. When a client (application) requests access to endpoints and resources, they must first communicate with SAS Logon Manager which returns a scoped access token if it was a valid request. This is true both within SAS Viya (like when internal services hosting APIs communicate with each other) and for external applications like we’re talking about here. The client/application subsequently hands that token in with their request to the desired endpoint.

Workflow: applications requesting access to SAS Viya endpoints

The long and short of it is that we don’t want users of our custom applications to see or manipulate anything they shouldn’t, so these tokens need to be scoped. The authorization systems in SAS Viya need to be able to understand what the bearer of that specific token should be able to do. For a detailed overview of how SAS Viya services leverage OAuth and OpenID Connect tokens for authentication, I'd recommend reading this SAS Global Forum paper from SAS' Mike Roda.

Later in this series, we'll see examples of obtaining a scoped token and then using it to make REST calls to SAS Viya API endpoints, like the auto mall's custom app does to return recommendations for its users. For now, suffice it to say the scope of the token is the critical component to answering our questions regarding how SAS Viya returns specific responses to our custom application.

The glue that binds the two

We've established that native SAS Viya applications and services register themselves as clients to SAS Logon Manager. When a client requests access to an endpoint and/or resource, they must first communicate with SAS Logon Manager which returns a scoped access token if it was a valid request. If the request was unauthorized, it receives a 403 unauthorized (sometimes 401) response. Your SAS Viya environment's administrator has complete control over the rules that determine the authorization decision. We can register additional custom applications, like the auto mall's auction bidding app, to SAS Logon Manager so they too can request scoped tokens. This provides our custom applications with fine-grained access to endpoints and content within SAS Viya. But we’ve already hit a fork in the road. We need to understand the purpose of the application we’re looking to build, so we can decide if the access should be user-scoped or general-scoped.

In the pre-owned auto mall's custom application, a decision and modeling flow is exposed by way of a SAS Micro Analytics Score (MAS) module at a specific HTTP REST API endpoint. All the users of the application have the same level of access to the same endpoint by design; there was no need for differentiated access levels. This is considered an instance of general-scoped access. Abiding by a least-privilege approach, the app should only access certain data and endpoints in SAS Viya, that is, only what it needs to rather than having free reign. In this case, the appropriate level of access can be granted to the registered client (custom application) itself. The result is an access token from SAS Logon Manager scoped within the context of the application.

In other applications you might find it necessary to make more sensitive, or user-specific resources available. In this instance, you'll want to make all authorization decisions within the context of the individual logged-in user. For these situations, it is more prudent to redirect the user directly to SAS Logon Manager to log in and then redirect them back to your application. This results in SAS Logon Manager returning an access token scoped within the context of the logged-in user.

What's next?

Once you've decided which of these two scenarios most closely fits your custom application's needs, the next step is to decide on a flow (or grant) to provide the associated type of access token, general- or user-scoped. Here “flow” is a term used to describe the process for accessing a protected resource within the OAuth framework and that's what I'll cover in the next post in this series.

Building custom apps on top of SAS Viya, Part Two: Understanding the integration points was published on SAS Users.

9月 082020
 

Inclusion and diversity are strengths. The SAS company culture is built on a foundation of caring for one another and holding ourselves and others accountable. That includes recognition and accountability for where we can improve. Our work of inclusivity is evolving and continuous. We must frequently re-examine the way we speak, act [...]

Inclusive terminology: Knowing better, doing better  was published on SAS Voices by Gavin Day

9月 082020
 

Matrix balancing is an interesting problem that has a long history. Matrix balancing refers to adjusting the cells of a frequency table to match known values of the row and column sums. One of the early algorithms for matrix balancing is known as the RAS algorithm, but it is also called the raking algorithm in some fields. The presentation in this article is inspired by a paper by Carol Alderman (1992).

The problem: Only marginal totals are known

Alderman shows a typical example (p. 83), but let me give a smaller example. Suppose a troop of six Girl Scouts sells seven different types of cookies. At the end of the campaign, each girl reports how many boxes she sold. Also, the troop leader knows how many boxes of each cookie type were sold. However, no one kept track of how many boxes of each type were sold by each girl.

The situation is summarized by the following 7 x 6 table. We know the row totals, which are the numbers of boxes that were sold for each cookie type. We also know the column totals, which are the numbers of boxes that were sold by each girl. But we do not know the values of the individual cells, which are the type-by-girl totals.

Guessing the cells of a matrix

As stated, the problem usually has infinitely many solutions. For an r x c table, there are r*c cells, but the marginal totals only provide r + c linear constraints. Except for 2 x 2 tables, there are more unknowns than equations, which leads to an underdetermined linear system that (typically) has infinitely many solutions.

Fortunately, often obtain a unique solution if we provide an informed guess for the cells in the table. Perhaps we can ask the girls to provide estimates of the number of boxes of each type that they sold. Or perhaps we have values for the cells from the previous year. In either case, if we can estimate the cell values, we can use a matrix balancing algorithm to adjust the cells so that they match the observed marginal totals.

Let's suppose that the troop leader asks the girls to estimate (from memory) how many boxes of each type they sold. The estimates are shown in the following SAS/IML program. The row and column sums of the estimates do not match the known marginal sums, but that's okay. The program uses subscript reduction operators to compute the marginal sums of the girls' estimates. These are displayed next to the actual totals, along with the differences.

proc iml;
/* Known marginal totals: */
u = {260, 214, 178, 148, 75, 67, 59}; /* total boxes sold, by type */
v = {272 180 152 163 134 100};        /* total boxes sold, by girl */
 
/* We don't know the cell values that produce these marginal totals,
   but we can ask each girl to estimate how many boxes of each type she sold */
/*   G1 G2 G3 G4 G5 G6          */
A = {75 45 40 40 40 30 ,  /* C1 */
     40 35 45 35 30 30 ,  /* C2 */
     40 25 30 40 30 20 ,  /* C3 */
     40 25 25 20 20 20 ,  /* C4 */
     30 25  0 10 10  0 ,  /* C5 */
     20 10 10 10 10  0 ,  /* C6 */
     20 10  0 10  0  0 }; /* C7 */
 
/* notice that the row/col totals for A do not match the truth: */
rowTotEst = A[ ,+];        /* for each row, sum across all cols */
colTotEst = A[+, ];        /* for each col, sum down all rows   */
Items  = 'Cookie1':'Cookie7';                  /* row labels    */
Person = 'Girl1':'Girl6';                      /* column labels */
 
/* print known totals, estimated totals, and difference */
rowSums = (u || rowTotEst || (u-rowTotEst));
print rowSums[r=Items c={'rowTot (u)' 'rowTotEst' 'Diff'}];
colSums = (v // colTotEst // (v-colTotEst));
print colSums[r={'colTot (v)' 'colTotEst' 'Diff'} c=Person];

You can see that the marginal totals do not match the known row and column totals. However, we can use the guesses as an initial estimate for the RAS algorithm. The RAS algorithm will adjust the estimates to obtain a new matrix that DOES satisfy the marginal totals.

Adjusting a matrix to match marginal totals

The RAS algorithm can iteratively adjust the girls' estimates until the row sums and column sums of the matrix match the known row sums and column sums. The resulting matrix will probably not be an integer matrix. It will, however, reflect the structure of the girls' estimates in three ways:

  1. It will approximately preserve each girl's recollection of the relative proportions of cookie types that she sold. If a girl claims that she sold many of one type of cookie and few of another, that relationship will be reflected in the balanced matrix.
  2. It will approximately preserve the relative proportions of cookie types among girls. If one girl claims that she sold many more of one type of cookie than another girl, that relationship will be reflected in the balanced matrix.
  3. It will preserve zeros. If a girl claims that she did not sell any of one type of cookie, the balanced matrix will also contain a zero for that cell.

The RAS algorithm is explained in Alderman (1992). The matrix starts with a nonnegative r x c matrix. Let u be the observed r x 1 vector of row sums. Let v be the observed 1 x c vector of column sums. The main steps of the RAS algorithm are:

  1. Initialize X = A.
  2. Scale the rows. Let R = u / RowSum(X), where 'RowSum' is a function that returns the vector of row sums. Update X ← R # X. Here R#X indicates elementwise multiplication of each column of X by the corresponding element of R.
  3. Scale the columns. Let S = v / ColSum(X), where 'ColSum' is a function that returns the vector of column sums. Update X ← X # S. Here X#S indicates elementwise multiplication of each row of X by the corresponding element of S.
  4. Check for convergence. If RowSum(X) is close to u and ColSum(X) is close to v, then the algorithm has converged, and X is the balanced matrix. Otherwise, return to Step 2.

The RAS algorithm is implemented in the following SAS/IML program:

/* Use the RAS matrix scaling algorithm to find a matrix X that 
   approximately satisfies 
   X[ ,+] = u  (marginals constraints on rows)
   X[+, ] = v  (marginals constraints on cols)
   and X is obtained by scaling rows and columns of A */
start RAS(A, u, v);
   X = A;       /* Step 0: Initialize X */
   converged = 0;
   do k = 1 to 100 while(^Converged);
      /* Step 1: Row scaling: X = u # (X / X[ ,+]); */
      R = u / X[ ,+];
      X = R # X;
 
      /* Step 2: Column scaling: X = (X / X[+, ]) # v; */
      S = v / X[+, ];
      X = X # S;
 
      /* check for convergence: u=rowTot and v=colTot */
      rowTot = X[ ,+];   /* for each row, sum across all cols */
      colTot = X[+, ];   /* for each col, sum down all rows */
      maxDiff = max( abs(u-rowTot), abs(v-colTot) );
      Converged = (maxDiff < 1e-8);
   end;
   return( X );
finish;
 
X = RAS(A, u, v);
print X[r=Items c=Person format=7.1];

The matrix X has row and column sums that are close to the specified values. The matrix also preserves relationships among individual columns and rows, based on the girls' recollections of how many boxes they sold. The algorithm preserves zeros for the girls who claim they sold zero boxes of a certain type. It does not introduce any new zeros for cells.

You can't determine how close the X matrix is to "the truth," because we don't know the true number of boxes that each girl sold. The X matrix depends on the girls' recollections and estimates. If the recollections are close to the truth, then X will be close to the truth as well.

Summary

This article looks at the RAS algorithm, which is one way to perform matrix balancing on a table that has nonnegative cells. "Balancing" refers to adjusting the interior cells of a matrix to match observed marginal totals. The RAS algorithm starts with an estimate, A. The estimate does not need to satisfy the marginal totals, but it should reflect relationships between the rows and columns that you want to preserve, such as relative proportions and zero cells. The RAS algorithm iteratively scales the rows and columns of A to create a new matrix that matches the observed row and column sums.

It is worth mentioning the row scaling could be accumulated into a diagonal matrix, R, and the column scaling into a diagonal matrix, S. Thus, the balanced matrix is the product of these diagonal matrices and the original estimate, A. As a matrix equation, you can write X = RAS. It is this matrix factorization that gives the RAS algorithm its name.

The RAS is not the only algorithm for matrix balancing, but it is the simplest. Another algorithm, called iterative proportional fitting (IPF), will be discussed in a second article.

The post Matrix balancing: Update matrix cells to match row and column sums appeared first on The DO Loop.

9月 072020
 

Locale-specific SAS® format catalogs make reporting in multiple languages more dynamic. It is easy to generate reports in different languages when you use both the LOCALE option in the FORMAT procedure and the LOCALE= system option to create these catalogs. If you are not familiar with the LOCALE= system option, see the "Resources" section below for more information.

This blog post, inspired by my work on this topic with a SAS customer, focuses on how to create and use locale-specific informats to read in numeric values from a Microsoft Excel file and then transform them into SAS character values. I incorporated this step into a macro that transforms ones and zeroes from the Excel file into meaningful information for multilingual readers.

Getting started: Creating the informats

The first step is to submit the LOCALE= system option with the value fr_FR. For the example in this article, I chose the values fr_FR and en_US for French and English from this table of LOCALE= values. (That is because I know how to say “yes” and “no” in both English and French — I need to travel more!)

   options locale=fr_fr;

The following code uses both the INVALUE statement and the LOCALE option in PROC FORMAT to create an informat that is named $PT_SURVEY:

   proc format locale library=work;
      invalue $pt_survey 1='oui' 0='non'; run;

Now, toggle the LOCALE= system option and create a second informat using labels in a different language (in this example, it is English):
options locale=en_us;

   proc format locale library=work;
      invalue $pt_survey 1='yes' 0='no';
   run;

In the screenshot below, which shows the output from the DATASETS procedure, you can see that PROC FORMAT created two format catalogs using the specified locale values, which are preceded by underscore characters. If the format catalogs already exist, PROC FORMAT simply adds the $PT_SURVEY informat entry type to them.

   proc datasets memtype=catalog; 
   quit;

Before you use these informats for a report, you must tell SAS where the informats are located. To do so, specify /LOCALE after the libref name within the FMTSEARCH= system option. If you do not add the /LOCALE specification, you see an error message stating either that the $PT_SURVEY informat does not exist or that it cannot be found. In the next two OPTIONS statements, SAS searches for the locale-specific informat in the FORMATS_FR_FR catalog, which PROC FORMAT created in the WORK library:

   options locale=fr_fr;
   options fmtsearch=(work/locale);

If you toggle the LOCALE= system option to have the en_US locale value, SAS then searches for the informat in the other catalog that was created, which is the FORMATS_EN_US catalog.

Creating the Excel file for this example

For this example, you can create an Excel file by using the ODS EXCEL destination from the REPORT procedure output. Although you can create the Excel file in various ways, the reason that I chose the ODS EXCEL statement was to show you some options that can be helpful in this scenario and are also useful at other times.
Use the ODS EXCEL destination to create a file from PROC REPORT. I specify the TAGATTR= style attribute using “TYPE:NUMBER” for the Q_1 variable:

   %let  path=%sysfunc(getoption(WORK));
   filename temp "&path\surveys.xlsx"; 
   ods excel file=temp;
 
 
   data one;
      infile datalines truncover;
      input ptID Q_1;
      datalines;
   111 0
   112 1
   ;
   run;
 
   proc report data=one;
      define ptID / display style(column)={tagattr="type:String"};
      define Q_1 / style(column)={tagattr="type:Number"};
   run;
 
   ods excel close;

Now you have a file that looks like this screenshot when it is opened in Excel. Note that the data value for the Q_1 column is numeric:

The IMPORT procedure uses the DBSASTYPE= data set option to convert the numeric Excel data into SAS character values. Then I can apply the locale-specific character informat to a character variable.

As you will see below, in the macro, I use DBMS=EXCEL in PROC IMPORT to read the Excel file because my SAS and Microsoft Office versions are both 64-bit. (You might have to use the PCFILES LIBNAME Engine to connect to Excel through the SAS PC Files Server if you are not set up this way.)

Using the informats in a macro to create the multilingual reports

The final step is to run the macro with parameters to produce the two reports in French and English, using the locale-specific catalogs. When the macro is called, depending on the parameter value for the macro variable LOCALE, the LOCALE= system option changes, and the $PT_SURVEY informat from the locale-specific catalog is applied. These two tabular reports are produced:

Here is the full code for the example:

   %let  path=%sysfunc(getoption(WORK));
   filename temp "&path\surveys.xlsx";
   ods excel file=temp;
 
   data one;
      infile datalines truncover;
      input ptID Q_1;
      datalines;
   111 0
   112 1
   ;
   run;
 
   proc report data=one;
      define ptID / display style(column)={tagattr="type:String"};
      define Q_1 / style(column)={tagattr="type:Number"};
   run;
 
   ods excel close;
   options locale=fr_fr;
 
   proc format locale library=work;
      invalue $pt_survey 1='oui' 0='non';
   run;
 
   options locale=en_us;
 
   proc format locale library=work;
      invalue $pt_survey 1='yes' 0='no';
   run;
 
   /* Set the FMTSEARCH option */
   options fmtsearch=(work/locale);
 
   /* Compile the macro */
   %macro survey(locale,out);
      /* Set the LOCALE system option */
      options locale=&locale;
 
      /* Import the Excel file  */
      filename survey "&path\surveys.xlsx";
 
      proc import dbms=excel datafile=survey out=work.&out replace;
         getnames=yes;
         dbdsopts="dbsastype=(Q_1='char(8)')";
      run;
 
      data work.&out;
         set work.&out;
 
         /* Create a new variable for the report whose values are assigned by specifying the locale-specific informat in the INPUT function */
         newvar=input(Q_1, $pt_survey.);
         label newvar='Q_1';
      run;
 
      options missing='0';
 
      /*  Create the tabular report */
      proc tabulate data=&out;
         class ptID newvar;
 
         table ptID='Patient ID', newvar*n=' '/box="&locale";
      run;
 
   %mend survey;
 
   /* Call the macros */
   %survey(fr_fr,fr)
   %survey(en_us,en)

For a different example that does not involve an informat, you can create a format in a locale-specific catalog to print a data set in both English and Romanian. See Example 19: Creating a Locale-Specific Format Catalog in the Base SAS® 9.4 Procedures Guide.

Resources

For more information about the LOCALE option:

For more information about reading and writing Excel files:

For more information about creating macros and using the macro facility in SAS:

Using locale-specific format catalogs to create reports in multiple languages was published on SAS Users.

9月 042020
 

As SAS Viya adoption increases among customers, many discover that it fits perfectly alongside their existing SAS implementations, which can be integrated and kept running until major projects have been migrated over. Conversely, SAS Grid Manager has been deployed during the past years to countless production sites. Because SAS Viya provides distributed computing capabilities, customers wonder how it compares to SAS Grid Manager.

SAS® Grid Manager and SAS® Viya® implement distributed computing according to different computational patterns. They can complement each other in providing a highly available and scalable environment to process large volumes of data and produce rapid results. At a high level, the questions we get the most from SAS customers can be summarized in four categories:

  1. I have SAS Viya and SAS Grid Manager. How can I get the most value from using them together?
  2. I have SAS Viya. Can I get any additional benefits by also implementing SAS Grid Manager?
  3. I have SAS Grid Manager. Should I move to SAS Viya?
  4. I am starting a new project. Which platform should I use - SAS Viya or SAS Grid Manager?

To better understand how to get the most from both an architecture and an administration perspective, I answer these questions and more in my SGF 2020 paper SAS® Grid Manager and SAS® Viya®: A Strong Relationship, and its accompanying YouTube video:

I’ve also written a three-part series with more details:

SAS Grid Manager and SAS Viya: A Strong Relationship was published on SAS Users.

9月 042020
 

As SAS Viya adoption increases among customers, many discover that it fits perfectly alongside their existing SAS implementations, which can be integrated and kept running until major projects have been migrated over. Conversely, SAS Grid Manager has been deployed during the past years to countless production sites. Because SAS Viya provides distributed computing capabilities, customers wonder how it compares to SAS Grid Manager.

SAS® Grid Manager and SAS® Viya® implement distributed computing according to different computational patterns. They can complement each other in providing a highly available and scalable environment to process large volumes of data and produce rapid results. At a high level, the questions we get the most from SAS customers can be summarized in four categories:

  1. I have SAS Viya and SAS Grid Manager. How can I get the most value from using them together?
  2. I have SAS Viya. Can I get any additional benefits by also implementing SAS Grid Manager?
  3. I have SAS Grid Manager. Should I move to SAS Viya?
  4. I am starting a new project. Which platform should I use - SAS Viya or SAS Grid Manager?

To better understand how to get the most from both an architecture and an administration perspective, I answer these questions and more in my SGF 2020 paper SAS® Grid Manager and SAS® Viya®: A Strong Relationship, and its accompanying YouTube video:

I’ve also written a three-part series with more details:

SAS Grid Manager and SAS Viya: A Strong Relationship was published on SAS Users.

9月 022020
 

SAS offering free learning resources in celebration of programmers

For more than 40 years, SAS programmers have crafted software and solutions that transform the world. From statistics to data science, to analytics and artificial intelligence, people writing code have architected a new economy with incredible opportunities. SAS Programmer Week honors those people by offering free learning resources available for everyone, from students to early career professionals to SAS veterans.

Running from Sept. 7-11, SAS Programmer Week leads up to the international Day of the Programmer on Saturday, Sept. 12. Training resources will be available for free through a variety of YouTube and video tutorials, webinars, blogs and documentation.

There will be three different tracks for new, experienced and analytics-focused users, with new content released each day. The week culminates with SAS certification prep content that will have participants ready to pursue a valuable SAS credential.

For instance, Tech Republic named SAS as one of 7 data science certifications to boost your resume and salary. CIO Magazine puts SAS among the top 11 big data and analytics certifications for 2020.

Since SAS programmers are busy and may not have all day to engage with the materials, SAS Programmer Week is flexible. Participants can access the material when they want to learn a specific skill related to the day’s topic or consume the material in snippets when they have time.

Interested participants can visit the SAS Programmer Week website to register today, preview the materials and schedule, and jump-start their career journeys.

 

All hail the SAS programmer! was published on SAS Users.