March 27, 2020

Six editions is a lot! If you had told us, back when we wrote the first edition of The Little SAS Book, that someday we would write a sixth, we would have wondered how we could possibly find that much to say. After all, it is supposed to be The Little SAS Book, isn’t it? But the developers at SAS Institute are constantly hard at work inventing new and better ways of analyzing and visualizing data. And some of those ways turn out to be so fundamental that they belong even in a little book about SAS.

Interface independence

One of the biggest changes to SAS software in recent years is the proliferation of interfaces. SAS programmers have more choices than ever before. Previous editions contained some sections specific to the SAS windowing environment (also called Display Manager). We wrote this edition for all SAS programmers, whether you use SAS Studio, SAS Enterprise Guide, the SAS windowing environment, or run in batch. That sounds easy, but it wasn’t. SAS behaves differently in different interfaces, and some of these differences are fundamental. In particular, the default value of the system option that sets the rules for variable names depends on how you run SAS. So old sections had to be rewritten, and we added a whole new section showing how to use variable names containing blanks and special characters.
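As a minimal sketch of what such names look like in code: the option in question is presumably VALIDVARNAME= (an assumption; the post does not name it). When it is set to ANY, variable names that contain blanks or special characters are written as name literals, which are quoted strings followed by the letter n. The data set and variable names below are hypothetical.

```sas
/* Assumption: VALIDVARNAME= is the system option the post refers to. */
options validvarname=any;

/* Hypothetical data set with names that contain blanks and symbols */
data Sales;
   input 'Item Name'n $ 'Price ($)'n;
   datalines;
Coffee 3.50
Tea 2.75
;
run;

proc print data=Sales;
   var 'Item Name'n 'Price ($)'n;   /* name literals: quoted string + n */
run;
```

With VALIDVARNAME=V7, the same names would have to follow the traditional rules (letters, digits, and underscores only), so a program that runs cleanly in one interface might need name literals or renamed variables in another.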

New ways to read and write Microsoft Excel files

Previous editions already covered how to read and write Microsoft Excel files, but SAS developers have created new ways that are even better. This edition contains new sections about the XLSX LIBNAME engine and the ODS EXCEL destination.
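The two features might be used roughly as follows. This is only a sketch: the file paths are placeholders, and the XLSX engine typically requires a license for SAS/ACCESS Interface to PC Files.

```sas
/* XLSX LIBNAME engine: treat a workbook as a library (path is a placeholder) */
libname xl xlsx "/path/students.xlsx";

data xl.Class;               /* writes a sheet named Class */
   set sashelp.class;
run;

data work.FromExcel;         /* reads the sheet back into a SAS data set */
   set xl.Class;
run;

libname xl clear;            /* release the workbook */

/* ODS EXCEL destination: send procedure output to a workbook */
ods excel file="/path/report.xlsx" options(sheet_name="Heights");
proc means data=sashelp.class;
   var height;
run;
ods excel close;
```

The LIBNAME engine moves data, whereas the ODS destination captures formatted output such as tables and graphs.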

More PROC SQL

From the very first edition, The Little SAS Book has always covered PROC SQL. But it was in an appendix, and over time we noticed that most people ignore appendices. So for this edition, we removed the appendix and added new sections on using PROC SQL to:

  • Subset your data
  • Join data sets
  • Add summary statistics to a data set
  • Create macro variables with the INTO clause

For people who are new to SQL, these sections provide a good introduction; for people who already know SQL, they provide a model of how to leverage SQL in your SAS programs.
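To give a flavor of three of these tasks, here is a short sketch that uses the SASHELP.CLASS sample data. The table names and the macro variable name are hypothetical, and the code is not taken from the book.

```sas
proc sql noprint;
   /* Subset: keep only the students who are older than 13 */
   create table Teens as
      select name, sex, age, height
      from sashelp.class
      where age > 13;

   /* Add a summary statistic: SAS remerges the mean onto every row */
   create table WithMean as
      select name, height, mean(height) as MeanHeight
      from sashelp.class;

   /* Create a macro variable with the INTO clause */
   select mean(height) into :avgHeight trimmed
      from sashelp.class;
quit;

%put NOTE: Average height = &avgHeight;
```

When a query mixes a summary function with detail columns, the log contains a note about remerging summary statistics with the original data.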

Updates and additions throughout the book

Almost every section in this edition has been changed in some way. We added new options, made sure everything is up to date, and ran every example in every SAS interface, noting any differences. For example, PROC SGPLOT has some new options, the default ODS style for PDF has changed, and the LISTING destination behaves differently in different interfaces. Here’s a short list, in no particular order, of new or expanded topics in the sixth edition:

  • More examples with permanent SAS data sets, CSV files, or tab-delimited files
  • More log notes throughout the book showing what to look for
  • LIKE or sounds-like (=*) operators in WHERE statements
  • CROSSLIST, NOCUM, and NOPRINT options in PROC FREQ
  • Grouping data with a user-defined format and the PUT function
  • Iterative DO groups
  • DO WHILE and DO UNTIL statements
  • %DO statements

Even though we have added a lot to this edition, it is still a little book.  In fact, this edition is shorter than the last—by twelve pages! We think this is the best edition yet.

March 26, 2020

Excitement levels are high for the March 2020 release of SAS Customer Intelligence 360, which includes multiple years of research and development culminating in enhancements to the platform's underlying data model. The changes will introduce the unification of a comprehensive data model recording both: Customer behavior -- what users are [...]

SAS Customer Intelligence 360: Unified data model, marketing attribution and AutoML was published on Customer Intelligence Blog.

March 25, 2020

During an outbreak of a disease, such as the coronavirus (COVID-19) pandemic, the media shows daily graphs that convey the spread of the disease. The following two graphs appear frequently:

  • New cases for each day (or week). This information is usually shown as a histogram or needle plot. The graph is sometimes called a frequency graph.
  • The total number of cases plotted against time. Usually, this graph is a line graph. The graph is sometimes called a cumulative frequency graph.

An example of each graph is shown above. The two graphs are related and actually contain the same information. However, the cumulative frequency graph is less familiar and is harder to interpret. This article discusses how to read a cumulative frequency graph. The shape of the cumulative curve indicates whether the daily number of cases is increasing, decreasing, or staying the same.

For this article, I created five examples that show the spread of a hypothetical disease. The numbers used in this article do not reflect any specific disease or outbreak.

How to read a cumulative frequency graph

When the underlying quantity is nonnegative (as for new cases of a disease), the cumulative curve never decreases. It either increases (when new cases are reported) or it stays the same (if no new cases are reported).

When the underlying quantity (new cases) is a count, the cumulative curve is technically a step function, but it is usually shown as a continuous curve by connecting each day's cumulative total. A cumulative curve for many days (more than 40) often looks smooth, so you can describe its shape by using the following descriptive terms:

  • When the number of new cases is increasing, the cumulative curve is "concave up." In general, a concave-up curve is U-shaped, like this: ∪. Because a cumulative frequency curve is nondecreasing, it looks like the right side of the ∪ symbol.
  • When the number of new cases is staying the same, the cumulative curve is linear. The slope of the curve indicates the number of new cases.
  • When the number of new cases is decreasing, the cumulative curve is "concave down." In general, a concave-down curve looks like an upside-down U, like this: ∩. Because a cumulative frequency curve is nondecreasing, a concave-down curve looks like the left side of the ∩ symbol.
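The relationship between the two graphs can be sketched in a few lines of SAS. The counts below are made up for illustration. The sum statement accumulates the daily counts, and the DIF function recovers them as the day-over-day change (the slope) of the cumulative total.

```sas
/* Hypothetical daily counts; not data from any real outbreak */
data Cases;
   input Day NewCases @@;
   Cumul + NewCases;       /* sum statement: running total, retained across rows */
   Slope = dif(Cumul);     /* change since the previous day; missing on Day 1 */
   datalines;
1 2  2 4  3 7  4 7  5 7  6 4  7 2  8 0
;
run;

title "New cases per day";
proc sgplot data=Cases;
   needle x=Day y=NewCases;           /* frequency graph */
run;

title "Cumulative cases";
proc sgplot data=Cases;
   series x=Day y=Cumul / markers;    /* cumulative frequency graph */
run;
title;
```

For these numbers, the cumulative totals are 2, 6, 13, 20, 27, 31, 33, and 33. The curve is concave up while the daily counts rise (Days 1–3), linear while they hold at 7 (Days 3–5), concave down while they fall (Days 5–7), and flat on Day 8, when there are no new cases.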

A typical cumulative curve is somewhat S-shaped, as shown to the right. The initial portion of the curve (the red region) is concave up, which indicates that the number of new cases is increasing. The cumulative curve is nearly linear between Days 35 and 68 (the yellow region), which indicates that the number of new cases each day is approximately constant. After Day 68, the cumulative curve is concave down, which indicates that the number of daily cases is decreasing. Each interval can be short, long, or nonexistent.

The cumulative curve looks flat near Day 100. When the cumulative curve is exactly horizontal (zero slope), it indicates that there are no new cases.

Sometimes you might see a related graph that displays the logarithm of the cumulative cases. Near the end of this article, I briefly discuss how to interpret a log-scale graph.

Examples of frequency graphs

It is useful to look at the shape of the cumulative frequency curve for five different hypothetical scenarios. This section shows the cases-per-day frequency graphs; the cumulative frequency curves are shown in subsequent sections.

In each scenario, a population experiences a total of 1,000 cases of a disease over a 100-day time period. For the sake of discussion, suppose that the health care system can treat up to 20 new cases per day. The graphs to the right indicate that some scenarios will overwhelm the health care system whereas others will not. The five scenarios are:

  • Constant rate of new cases: In the top graph, the community experiences about 10 new cases per day for each of the 100 days. Because the number of cases per day is small, the health care system can treat all the infected cases.
  • Early peak: In the second graph, the number of new cases quickly rises for 10 days before gradually declining over the next 50 days. Because more than 20 new cases develop each day on Days 5–25, the health care system is overwhelmed on those days.
  • Delayed peak: In the third graph, the number of new cases gradually rises, levels out, and gradually declines. There are only a few days in which the number of new cases exceeds the capacity of the health care system. Epidemiologists call this scenario "flattening the curve" of the previous scenario. By practicing good hygiene and avoiding social interactions, a community can delay the spread of a disease.
  • Secondary outbreak: In the fourth graph, the first outbreak is essentially resolved when a second outbreak appears. This can happen, for example, if a new infected person enters a community after the first outbreak ends. To prevent this scenario, public health officials might impose travel bans on certain communities.
  • Exponential growth: In the fifth graph, the number of new cases increases exponentially. The health care system is eventually overwhelmed, and the graph does not indicate when the spread of the disease might end.

The graphs in this section are frequency graphs. The next sections show and interpret a cumulative frequency graph for each scenario.

Constant rate of new cases

In the first scenario, new cases appear at a constant rate. Consequently, the cumulative frequency chart looks like a straight line. The slope of the line is the rate at which new cases appear. For example, in this scenario, the number of new cases each day is approximately 10. Consequently, the cumulative curve has an average slope ("rise over run") that is close to 10.

Early peak

In the second scenario, new cases appear very quickly at first, then gradually decline. Consequently, the first portion of the cumulative curve is concave up and the second portion is concave down. In this scenario, the number of new cases dwindles to zero, as indicated by the nearly horizontal cumulative curve.

At any moment in time, you can use the slope of the cumulative curve to estimate the number of new cases that are occurring at that moment. Days when the slope of the cumulative curve is large (such as Day 10) correspond to days on which many new cases are reported. Where the cumulative curve is nearly horizontal (slope close to zero, such as after Day 60), there are very few new cases.

Delayed peak

In the third scenario, new cases appear gradually, level out, and then decline. This is reflected in the cumulative curve. Initially, the cumulative curve is concave up. It then straightens out and appears linear for 10–15 days. Finally, it turns concave down, which indicates that the number of new cases is trending down. Near the end of the 100-day period, the cumulative curve is nearly horizontal because very few new cases are being reported.

Secondary outbreak

In the fourth scenario, there are two outbreaks. During the first outbreak, the cumulative curve is concave up or down as the new cases increase or decrease, respectively. The cumulative curve is nearly horizontal near Day 50, but then goes through a smaller concave up/down cycle as the second outbreak appears. Near the end of the 100-day period, the cumulative curve is once again nearly horizontal as the second wave ends.

Exponential growth

The fifth scenario demonstrates exponential growth. Initially, the number of new cases increases very gradually, as indicated by the small slope of the cumulative frequency curve. However, between Days 60–70, the number of new cases begins to increase dramatically. The lower and upper curves are both increasing at an exponential rate, but the scale of the vertical axis for the cumulative curve (upper graph) is much greater than for the graph of new cases (lower graph). This type of growth is more likely in a population that does not use quarantines and "social distancing" to reduce the spread of new cases.

This last example demonstrates why it is important to label the vertical axis. At first glance, the upper and lower graphs look very similar. Both exhibit exponential growth. One way to tell them apart is to remember that a cumulative frequency graph never decreases. In contrast, if you look closely at the lower graph, you can see that some bars (Days 71 and 91) are shorter than the previous day's bar.

Be aware of log-scale axes

The previous analysis assumes that the vertical axis plots the cumulative counts on a linear scale. Scientific articles might display the logarithm of the total counts. The graph is on a log scale if the vertical axis says "log scale" or if the tick values are powers of 10 such as 10, 100, 1000, and so forth. If the graph uses a log scale:

  • A straight line indicates that new cases are increasing at an exponential rate. The slope of the line indicates how quickly cases will double, with steep lines indicating a short doubling time.
  • A concave-down curve indicates that new cases are increasing at a rate that is less than exponential. Log-scale graphs make it difficult to distinguish between a slowly increasing rate and a decreasing rate.
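If you want to see the straight-line behavior for yourself, the following sketch plots a hypothetical cumulative count that doubles every six days. It uses the TYPE=LOG and LOGBASE= options on the YAXIS statement. The data are invented for illustration.

```sas
/* Hypothetical cumulative counts that double every 6 days */
data Expo;
   do Day = 1 to 60;
      Cumul = round(5 * 2**(Day/6));
      output;
   end;
run;

title "Cumulative cases on a log scale";
proc sgplot data=Expo;
   series x=Day y=Cumul;
   yaxis type=log logbase=10 grid label="Cumulative cases (log scale)";
run;
title;
```

On a base-10 log axis, a line that rises by b log units per day corresponds to a doubling time of log10(2)/b ≈ 0.301/b days. In this sketch, b = log10(2)/6, so the doubling time is 6 days, as constructed.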

Summary

In summary, this article shows how to interpret a cumulative frequency graph. A cumulative frequency graph is provided for five scenarios that describe the spread of a hypothetical disease. In each scenario, the shape of the cumulative frequency graph indicates how the disease is spreading:

  • When the cumulative curve is concave up, the number of new cases is increasing.
  • When the cumulative curve is linear, the number of new cases is not changing.
  • When the cumulative curve is concave down, the number of new cases is decreasing.
  • When the cumulative curve is horizontal, there are no new cases being reported.

Although the application in this article is the spread of a fictitious disease, the ideas apply widely. Anytime you see a graph of a cumulative quantity (sales, units produced, number of traffic accidents, ...), you can use the ideas in this article to interpret the cumulative frequency graph and use its shape to infer the trends in the underlying quantity. Statisticians use these ideas to relate a cumulative distribution function (CDF) for a continuous random variable to its probability density function (PDF).

The post How to read a cumulative frequency graph appeared first on The DO Loop.

March 23, 2020

When you create a graph by using the SGPLOT procedure in SAS, usually the default tick locations are acceptable. Sometimes, however, you might want to specify a set of custom tick values for one or both axes. This article shows three examples:

  • Specify evenly spaced values.
  • Specify tick values that are not evenly spaced.
  • Specify custom labels for the ticks, including symbols such as π.

You can accomplish these tasks by using options on the XAXIS and YAXIS statements. The VALUES= option enables you to specify tick locations; the VALUESDISPLAY= option enables you to specify text strings to display on an axis.

Use the VALUES= option to specify tick locations

You can use the VALUES= option on the XAXIS or YAXIS statement to specify a set of values at which to display ticks. If you want to specify a set of evenly spaced values, you can use the (start TO stop BY increment) syntax. For example, the following call to PROC SGPLOT places the ticks on the Y axis at the locations -0.5, 0, 0.5, 1, ..., 3.

/* create data for the graph of the function y = x*exp(x) */
data Function;
do x = -5 to 1 by 0.05;
   y = x*exp(x);
   output;
end;
run;
 
ods graphics / width=480px height=360px;
title "Graph of y = x*exp(x)";
proc sgplot data=Function;
   series x=x y=y;
   yaxis grid values=(-0.5 to 3 by 0.5);   /* specify tick locations */
   xaxis grid;
run;

Notice that the Y axis has tick marks at the values that are specified by the VALUES= option.

Specify tick mark locations and labels

Sometimes you might want to highlight a specific value on a graph. For example, you might want to indicate where a graph has a maximum, minimum, or inflection point. The graph in the previous section has a minimum value at (x,y) = (-1, -1/e) ≈ (-1, -0.368). You can use the VALUES= option to specify a set of tick locations that include -0.368. Because few readers of the graph will associate -0.368 with the number -1/e, you might want to display the text string "-1/e" on the axis instead of -0.368.

The VALUESDISPLAY= option enables you to specify a text string for each value in the VALUES= list. The strings are used as labels for each tick mark. For example, the following call to PROC SGPLOT uses the VALUESDISPLAY= option to display the string "-1/e" at the location y = -0.368. To further emphasize that the value corresponds to the minimum value of the function, I use the DROPLINE statement to display line segments that extend from the point (-1, -0.368) to each axis.

%let eInv = %sysfunc(exp(-1)); /* e inverse = exp(-1) */
title "Graph of y = x*exp(x)";
proc sgplot data=Function;
   series x=x y=y;
   dropline x= -1 y= -&eInv / dropto=both lineattrs=(color=coral);
   /* Draw coordinate axes in dark grey */
   refline 0 / axis=y lineattrs=(color=darkgrey);
   refline 0 / axis=x lineattrs=(color=darkgrey);
   /* ticks on the Y axis include -1/e */
   yaxis grid  max=1.5
         values        = (-1   -&eInv  0   0.5   1   1.5 )   /* locations */
         valuesdisplay = ("-1" "-1/e" "0" "0.5" "1" "1.5");  /* strings to display */
   xaxis grid;
run;

Use symbols for tick labels

The previous example displays the string "-1/e" at a tick mark. You can also display symbols on the axes. Back in 2011, Dan Heath discussed how to specify symbols by using Unicode values. Since SAS 9.4M3, you can also use Unicode symbols in user-defined formats. The following example defines the Unicode symbol for pi (π) and uses the symbol in the VALUESDISPLAY= option to label tick marks at π/2, π, 3π/2, and 2π.

/* create the graph of y = sin(x) on [0, 2*pi] */
data Trig;
pi = constant('pi');  drop pi;
do x = 0 to 2*pi by pi/25;
   y = sin(x);
   output;
end;
run;
 
%let piChar = (*ESC*){Unicode pi};              /* define pi symbol */
title "Graph of y = sin(x)";
proc sgplot data=Trig noautolegend;
   series x=x y=y;
   refline 0 / axis=y lineattrs=(color=darkgrey);
   xaxis grid 
         values        = (0    1.57        3.14      4.71         6.28 )     /* 0, pi/2, pi, 3pi/2, 2pi */
         valuesdisplay = ("0" "&piChar/2" "&piChar" "3&piChar/2" "2&piChar");  
run;

In this example, I used the VALUES= and VALUESDISPLAY= options on the XAXIS statement. Consequently, the X axis displays the symbols π/2, π, 3π/2, and 2π.

In summary, you can use the VALUES= option to specify a list of locations for ticks on an axis. You can use the (start TO stop BY increment) notation to specify evenly spaced ticks. For unevenly spaced ticks, you can list the ticks manually, such as (1 5 10 25 50 100). If you want to display text instead of numbers, you can use the VALUESDISPLAY= option, which enables you to specify arbitrary strings. You can use Unicode symbols in the VALUESDISPLAY= list.
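For completeness, here is a minimal sketch of the unevenly spaced case that uses the tick list mentioned above. The data set and the function being plotted are hypothetical.

```sas
/* Hypothetical data: y = log10(x) on [1, 100] */
data Growth;
   do x = 1 to 100;
      y = log10(x);
      output;
   end;
run;

title "Unevenly spaced tick marks";
proc sgplot data=Growth;
   series x=x y=y;
   xaxis grid values=(1 5 10 25 50 100);   /* list each tick location manually */
   yaxis grid;
run;
title;
```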

The post Add custom tick marks to a SAS graph appeared first on The DO Loop.

March 20, 2020

When using SAS software, you might occasionally encounter a font-related issue. This post helps you debug the following five font issues:

  • Listing registered SAS fonts
  • Registering new fonts
  • Getting SAS SG procedures to use a new font
  • Circumventing an error indicating that the device driver cannot find any fonts
  • Resolving an error that references SASFont

1. How can I tell which fonts are registered to SAS?

To see which fonts are currently registered to SAS, submit the following code:


proc registry startat="\CORE\PRINTING\FREETYPE\FONTS" list levels=1;
run;

2. How can I register or add a new font to SAS?

If you want to use a new font with SAS, you first must register the font by using the FONTREG procedure. Here is example code:


proc fontreg mode=all msglevel=verbose;
fontfile "/path/fontname.ttf";
run;

For example, if you are running SAS on Windows and want to register the Arial font, which resides in C:\Windows\Fonts, submit the following code:


proc fontreg mode=all msglevel=verbose;
fontfile "C:\Windows\Fonts\arial.ttf";
run;

Note that the code above registers the new font in the user’s SASUSER directory. In this case, the font is registered only for the user who submits the PROC FONTREG code.

It is possible to register the font for all users. To accomplish this, the person submitting the code must be a SAS administrator who not only has Update access to the SASHELP directory but who also has exclusive Update access to SASHELP (so that no other users or processes can be using SAS at the time the code is run). The administrator must add the USESASHELP option to the PROC FONTREG statement. Here is example code:


proc fontreg mode=all msglevel=verbose usesashelp;
fontfile "/path/fontname.ttf";
run;

3. When running on UNIX systems, how do I get the SAS SG procedures (such as SGPLOT) to recognize and use a new font?

To get the SAS SG procedures to use a new font on UNIX systems, complete these steps:

  1. Register the new font to SAS using PROC FONTREG (as described above).
  2. Place a copy of the TrueType font file in the following UNIX directory:

!SASROOT/SASPrivateJavaRuntimeEnvironment/9.4/jre/lib/fonts

Note: In the directory above, !SASROOT is your default SAS install directory.
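After the font is registered and copied, you can request it by its family name in an SG procedure. The following sketch uses "Arial" only because that is the font registered in the earlier example; substitute the family name of the font that you registered.

```sas
/* Request a registered font by family name (Arial is only an example) */
proc sgplot data=sashelp.class;
   scatter x=age y=height;
   xaxis labelattrs=(family="Arial" size=11pt) valueattrs=(family="Arial");
   yaxis labelattrs=(family="Arial" size=11pt) valueattrs=(family="Arial");
run;
```

If the procedure cannot find the requested font, it typically falls back to a default font, so compare the rendered graph against what you expected.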

4. Why do I get an error indicating that the device driver cannot find any fonts?

In certain situations, the following error might be written to the SAS log:

Error: The <device driver> driver cannot find any fonts.

This error typically occurs when the SAS system option FONTSLOC is not set properly. First, check the current value of the FONTSLOC system option by submitting the following code:


proc options option=fontsloc;
run;

The directory that the FONTSLOC option points to is written to the log. With a typical install of SAS, the FONTSLOC system option should point to the following directory: !SASROOT\ReportFontsforClients\9.4

Note: In the directory above, !SASROOT is your default SAS install directory.

If the FONTSLOC system option is not set correctly, edit your SAS configuration file and modify the value for the FONTSLOC option.

If the FONTSLOC system option is set correctly, make sure that the directory that the FONTSLOC option points to exists. If the directory does exist, check the contents of this directory, which should contain numerous TrueType font files (with an extension of .ttf). If this directory is missing as part of your SAS install or exists but contains no .ttf font files, contact SAS Technical Support.

5. When invoking the SAS Display Manager System (DMS) on Windows, why do I receive an error that references SASFont?

In certain situations, when invoking SAS DMS on Windows, you might receive the following error:

Error: The SAS system could not get metrics for font “SASFont”

This error is typically caused by a Windows issue that cannot be directly addressed from SAS software. However, to circumvent this issue, modify your SAS configuration file and add the following statement to the top of your configuration file:

-FONT "Courier New" 10

After saving the modified configuration file, restart SAS for the change to take effect. Note that your SAS configuration file is typically named sasv9.cfg and resides in the following Windows directory:

!SASROOT\nls\en

Note: In the directory above, !SASROOT is your default SAS install directory.

Summary

While the information above might not address all your font issues, it should cover most of the more common font issues that you are likely to run into. For detailed information about the FONTREG procedure, consult the SAS online documentation for PROC FONTREG.

How to debug 5 common SAS® software font issues was published on SAS Users.

March 20, 2020
 

Watch list screening has been one of the rules with the highest false-positive rates. It has also been one of the pillars of know your customer (KYC) and anti-money laundering (AML) regulatory requirements since the beginning. It was introduced to prevent known criminals (or known high risk entities) from utilizing [...]

A practical guide to improve the effectiveness of watch list screening was published on SAS Voices by Nuth Ratanachu-ek

March 19, 2020
 

At SAS Press, we agree with the saying “The best things in life are free.” And one of the best things in life is knowledge. That’s why we offer free e-books to help you learn SAS or improve your skills. In this blog post, we will introduce you to one of our amazing titles that is absolutely free.

SAS Programming for R Users

Many data scientists today need to know multiple programming languages including SAS, R, and Python. If you already know basic statistical concepts and how to program in R but want to learn SAS, then SAS Programming for R Users by Jordan Bakerman was designed specifically for you! This free e-book explains how to write programs in SAS that replicate familiar functions and capabilities in R. This book covers a wide range of topics including the basics of the SAS programming language, how to import data, how to create new variables, random number generation, linear modeling, Interactive Matrix Language (IML), and many other SAS procedures. This book also explains how to write R code directly in the SAS code editor for seamless integration between the two tools.

The book is based on the free, 14-hour course of the same name offered by SAS Education available here. Keep reading to learn more about the differences between SAS and R.

SAS versus R

R is an object-oriented programming language. Results of a function are stored in an object and desired results are pulled from the object as needed. SAS revolves around the data table and uses procedures to create and print output. Results can be saved to a new data table.

Let’s briefly compare SAS and R in a general way. Look at the following table, which outlines some of the major differences between SAS and R.

Here are a few other things about SAS to note:

  • SAS has the flexibility to interact with objects. (However, the book focuses on procedural methods.)
  • SAS does not have a command line. Code must be run in order to return results.

SAS Programs

A SAS program is a sequence of one or more steps. A step is a sequence of SAS statements. There are only two types of steps in SAS: DATA and PROC steps.

  • DATA steps read from an input source and create a SAS data set.
  • PROC steps read and process a SAS data set, often generating an output report. "Procedure" is an umbrella term: procedures are what carry out the overall analysis. Think of a PROC step as a function in R.

Every step has a beginning and ending boundary. SAS steps begin with either of the following statements:

  • a DATA statement
  • a PROC statement

After a DATA or PROC statement, there can be additional SAS statements. These statements contain keywords that either request that SAS perform an operation or give information to the system. Think of them as additional arguments to a procedure. Statements always end with a semicolon!

SAS options are additional arguments and they are specific to SAS statements. Unfortunately, there is no rule to say what is a statement versus what is an option. Understanding the difference comes with a little bit of experience. Options can be used to do the following:

  • generate additional output like results and plots
  • save output to a SAS data table
  • alter the analytical method
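As a minimal sketch of these three uses (the procedures and the output data set name are illustrative choices, not examples from the book), the options below appear on the PROC statement itself:

```sas
/* SPEARMAN alters the analytical method: rank correlations are       */
/* computed instead of the default Pearson correlations.              */
/* OUTS= saves the Spearman correlations to a SAS data table.         */
/* work.spearman_corr is a made-up name for illustration.             */
proc corr data=sashelp.class spearman outs=work.spearman_corr;
   var height weight;
run;

/* NORMAL generates additional output: a table of tests for normality */
proc univariate data=sashelp.class normal;
   var height;
run;
```

Here VAR is a statement, while SPEARMAN, OUTS=, and NORMAL are options on the PROC statement.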

SAS detects the end of a step when it encounters one of the following statements:

  • a RUN statement (for most steps)
  • a QUIT statement (for some procedures)

Most SAS steps end with a RUN statement. Think of the RUN statement as the right parenthesis of an R function. The following table shows an example of a SAS program that has a DATA step and a PROC step. You can see that both SAS steps end with RUN statements, while the R functions begin and end with parentheses.
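Here is a minimal sketch of such a program, using the SASHELP.CLASS sample table that ships with SAS. The output data set name and the BMI calculation are illustrative assumptions, not taken from the book:

```sas
/* DATA step: read an input table and create a new SAS data set */
data work.class_bmi;
   set sashelp.class;
   /* In SASHELP.CLASS, height is in inches and weight is in pounds */
   bmi = 703 * weight / height**2;
run;

/* PROC step: process the data set and print a summary report */
proc means data=work.class_bmi mean min max;
   var bmi;
run;
```

Each step starts with its DATA or PROC statement and ends at its RUN statement, and every statement in between ends with a semicolon.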

If you want to learn more about this book or any other free e-books from SAS Press, visit https://support.sas.com/en/books/free-books.html. Subscribe to our newsletter to get the latest information on new books.

Free e-book: SAS Programming for R Users was published on SAS Users.

March 19, 2020
 

Digital transformation. Yup, I said it. It's over-hyped, but it's also real and powerful. While our customer-obsessed world is being liquefied from physical assets into virtual assets, and analog processes into digital processes - the world is turning into bits and bytes of data. As this trend evolves, the role [...]

SAS Customer Intelligence 360: Analytics as a guiding light was published on Customer Intelligence Blog.

March 18, 2020
 

Let’s be honest, there is a lot of SAS content available on the web. Sometimes it gets difficult to navigate through everything to find what you need, especially if you are looking for complimentary resources.

Training budgets can be limited or already used for the year, but you’re still interested in learning a new SAS product or diving deeper into a specific subject to facilitate any current projects you are working on. Or you’re a real over-achiever (go, you!) and you’re looking to expand your personal SAS skills outside of your day-to-day work.

You start asking, “How do I find what I need?”

Don’t worry, SAS has you covered!

SAS learn & support

Let’s start with a favorite resource (in a Customer Success Manager’s opinion) – SAS’ learn and support pages. SAS recently released updated learn and support pages for SAS products. These pages provide a great overview of SAS’ product offerings, and they provide resources for those who are new to SAS or those looking to expand their knowledge. The learn and support pages cover the most current product release, information on getting started, tutorials, training courses, books, and documentation for current and past releases.

Not sure how to locate the learn and support page for the SAS product you are using? Search the SAS Product Support A to Z page and select the product of your choice.

SAS documentation

Browsing the web for resources is a great way to find answers to your SAS questions. But as mentioned previously, it can sometimes get tricky to find what you are looking for.

A great place to start your search is on the SAS documentation site. You can use the search bar to enter what you are looking for, or browse by products, titles or system requirements.

What’s new in SAS

You may have heard the saying, “There are three ways to do anything in SAS.” (Or four, or five or six!) Which raises the question, “How do I know what I’m doing is the most efficient?”

One way to stay on top of the most efficient way to do things is to stay current with your SAS knowledge. Knowing what’s new in SAS helps users know and understand what new features and enhancements are available. When a SAS product release occurs, SAS provides documentation on what’s new.

To know what’s new in the SAS release you’re using, check out the What’s New documentation. The documentation is broken into two parts: SAS 9.4 and SAS Viya 3.5. You can use the ‘Version’ tab on the left-hand side of the page to select the version currently installed at your organization.

If you are not sure which version you are running, you can run PROC PRODUCT_STATUS. This procedure writes the version numbers of the installed SAS products to the log.

proc product_status;
run;
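If you need only the release of SAS itself rather than a list of every installed product, the automatic macro variables SYSVER and SYSVLONG offer a quicker check. This is a small sketch added for illustration:

```sas
/* Write the short and long SAS release identifiers to the log */
%put SAS release: &sysver;
%put SAS release (long form): &sysvlong;
```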

Another great resource to stay on top of what’s new from SAS is to check out SAS webinars. SAS offers live and on-demand webinars hosted by SAS experts. There are topics for every level of SAS user and every level of an organization, from SAS programmers to executives.

To attend a live webinar, select the webinar of your choice, register to attend, and you will be sent an email with the calendar invite.

If you’re interested in checking out an on-demand webinar, you can search by topic or industry to find a topic that fits what you’re looking for.

Looking for a webinar that focuses on a SAS tool? Check out the SAS Ask the Expert webinars. These are one-hour live and on-demand webinars for SAS users and administrators. The sessions cover a wide range of topics from what’s new in new releases of SAS products, to overviews on getting started, to tips and tricks that help take your SAS knowledge to the next level.

With SAS’ extensive catalog of webinars to choose from, you will be a SAS pro in no time!

SAS training and education

Did you know that SAS offers free e-learning for some of our training courses? These courses are self-paced and cover a wide range of topics. You get 180 days of access to each course, so you can work through the lessons at your own speed. It’s also very easy to get started!

Step 1: Select a course from the course library

Step 2: Sign into your SAS profile or create one

Step 3: Activate your product(s) and review the License Agreement

Step 4: Work through the course lessons

Step 5: Complete the course and receive your SAS digital Learn Badge and Course Completion Certificate

Leverage expertise worldwide

SAS recently released SAS Analytics Explorer. This is an interactive way to connect with other SAS professionals, expand your SAS knowledge, and access private SAS events and resources, all while earning points that can be exchanged for rewards.

Are you up for the challenge? No really, are you? The SAS Analytics Explorer has fun and educational challenges that allow you to showcase your SAS skills to climb the ranks in the network. Show off your SAS talent and get some cool rewards while you’re at it!

Interested in joining? Fill out the form on the bottom of the SAS Analytics Explorer page to request an invitation.

Don’t forget about the SAS Communities! Connect with other SAS professionals and experts to ask questions, assist other SAS professionals with their questions, connect with users, and see what’s going on at SAS.

You can also connect with SAS on our website using the chat feature. We love SAS users, and we are here to help you!

Tips and resources for making the most of your SAS experience was published on SAS Users.