May 15, 2020
 

This post is the second in our Young Data Scientists series, featuring the motivations, work and advice of the next generation of data scientists. Be sure to check back for future posts, or read the whole series by clicking on the image to the right.   Kai Woon Goh is [...]

Leading advancement in industry and society for Malaysia was published on SAS Voices by Jelena Stankovic

May 15, 2020
 
This blog demonstrates how to modify your ODS HTML code to make your column headers “sticky,” or fixed in position. Sticky headers are most beneficial when you have long tables on your web page and you want the column headers to stay in view while you scroll through the rest of the page. Sticky positioning is provided by the cascading style sheet (CSS) position property and its sticky value, which is defined in the CSS Positioned Layout Module Level 3. You might have seen this capability before it was standardized because it was supported by WebKit, which is a browser engine that Apple developed and is used primarily in the Safari browser. (In Safari, you use the position property with the value -webkit-sticky.) The position: sticky style property is supported in the latest versions of the major browsers, except for Internet Explorer. The FROZEN_HEADERS= option can be used with the TableEditor tagset; see the TableEditor tagset method below.

Before you start

Here is a brief explanation of the task that this blog helps you accomplish. Because the position: sticky style property is supported for the <TH> HTML tags within tables, it is easy to add the position: sticky style to the HTML tables that ODS HTML generates. When this CSS style attribute is added for the headers, the headers are fixed within the viewport, which is the viewable area. The content in the viewport is scrollable, as seen in the example output below.

In the past, JavaScript was the main tool for generating fixed headers that are compatible across browsers and devices. The position: sticky property now makes it easier to fix headers and various other elements, such as footers, within the viewport on the web page. This blog demonstrates how to make the <TH> tag or .header class sticky while the rest of the web page remains scrollable. The techniques here work for both desktop and mobile applications. There are multiple ways to add this style. Choose the method that is most convenient for you.

Method 1: Use the HEADTEXT= option

This example uses the position: sticky style property for the .header class, which is added to the HEADTEXT= option in the ODS HTML statement. The text of the HEADTEXT= option is written between the <HEAD> and </HEAD> tags, which is the header section of the web page. This method is very convenient. However, you are limited to 256 characters, which can be restrictive if you want to add other CSS style properties. The position style property is added by using the .header class name, which ODS HTML uses to add style attributes to the column headers. As the name suggests, cascading style sheets cascade, which enables rules with like names to be combined. In the following code example, the HEADTEXT= option adds a CSS rule with the .header class and the position: sticky property to the header section of the web page.

ods html path="c:\temp" file="sticky.html"
headtext="<style> .header {position: sticky;top:0}</style>";
 
proc print data=sashelp.cars;
run;
 
ods html close;

Here is what the output looks like:

Method 2: Use the STYLESHEET= option

You can also add the position: sticky property to the .header class from an external CSS file, which can be referenced in ODS HTML code by using the STYLESHEET= option with the (URL=) suboption. This method uses a CSS file as the basis for the formatting, unlike the first method above, which applied the default HTMLBLUE style for the destination.

Another item worth mentioning in this second example is the grouping of the CSS class selectors, which match the style element names used with ODS and the TEMPLATE procedure. For example, the .body, .systemtitle, .header, .rowheader, and .data class selectors are grouped into a single rule that sets the font-family style property. The same grouping is used for several of the other style properties below. The rule for the data cells adds some functionality worth discussing: it uses a pseudo-class selector, :nth-child(even), to apply a different background color to the even-numbered rows. In the example below, the class names and the template element names are the same. You should place the CSS style rules that are shown here in a file that is named sticky.css.

.body, .systemtitle, .header, .rowheader, .data {
   font-family: arial, sans-serif;
}
.systemtitle, .header, .rowheader {
   font-weight: bold;
}
.table, .header, .rowheader, .data {
   border-spacing: 0;
   border-collapse: collapse;
   border: 1px solid #606060;
}
.table tbody tr:nth-child(even) td {
   background-color: #e0e0e0;
   color: black;
}
.header {
   background-color: #e0e0e0;
   position: -webkit-sticky;
   position: sticky;
   top: 0;
}
.header, .rowheader, .data {
   padding: 5px 10px;
}

After you create that CSS file, you can use the ODS HTML statement with the STYLESHEET= option. In that option, the (URL=) suboption uses the sticky.css file as the basis for the formatting. If you omit the (URL=) suboption, ODS instead re-creates the CSS file from the template style that is currently in use.

ods html path="c:\temp" file="sticky.html"
   stylesheet=(url="sticky.css");
proc print data=sashelp.cars;
run; 
ods html close;

Here is what the output looks like:

The pseudo-class selector in the CSS file specifies that the <TD> cells in even-numbered rows are colored with a gray background. Also, the position: sticky property in the .header class fixes the position of the header within the viewport.

Method 3: Use the TableEditor tagset

A third method uses the TableEditor tagset, which enables sticky headers to be added by using options. Options are also applied to modify the style for the alternating even and odd rows as well as to have sortable headers.

/* Reference the TableEditor tagset from support.sas.com. */
filename tpl url "http://support.sas.com/rnd/base/ods/odsmarkup/tableeditor/tableeditor.tpl";
/* Insert the tagset into the search path for ODS templates. */
ods path(Prepend) work.templat(update);
%include tpl;
ods tagsets.tableeditor file="c:\output\temp.html" 
options(sticky_headers="yes"
sort="yes"
banner_color_even="#e0e0e0") style=htmlblue;
 
proc print data=sashelp.cars;
run;
 
ods tagsets.tableeditor close;

Here is what the output looks like:

In summary, this article describes how easy it is to add sticky headers to tables that are generated by using the ODS HTML destination. Adding fixed headers to any table allows the output to dynamically preserve the headers in the viewable area while scrolling through the table, allowing a much richer experience. Give it a try and let me know how it goes.

How to Add Sticky Headers with ODS HTML was published on SAS Users.

May 13, 2020
 

As we continue the fight against COVID-19 and reflect on the pandemic response, it’s in our nature to look for opportunities to work together and help others. Indeed, we’ve seen numerous acts of heroism alongside tragic stories of loss. In the research community, we are inspired by efforts to find [...]

Can data sharing accelerate research in the fight against COVID-19? was published on SAS Voices by Jeremy Racine

May 13, 2020
 

This article shows how to find local minima and maxima on a regression curve, which means finding points where the slope of the curve is zero. An example appears at the right, which shows locations where the loess smoother in a scatter plot has local minima and maxima. Except for simple cases like quadratic regression, you need to use numerical techniques to locate these values.
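To see why quadratic regression counts as a simple case, here is a short worked equation. The coefficient names b_0, b_1, and b_2 are notation introduced for this illustration only; they do not appear elsewhere in the article.

```latex
% Quadratic regression curve with estimated coefficients b_0, b_1, b_2
f(x) = b_0 + b_1 x + b_2 x^2, \qquad b_2 \neq 0
% Set the derivative to zero and solve for the unique extremum
f'(x) = b_1 + 2 b_2 x = 0
\quad\Longrightarrow\quad
x^{*} = -\frac{b_1}{2 b_2}
% The extremum is a minimum if b_2 > 0 and a maximum if b_2 < 0
```

For a loess curve or another nonparametric smoother, no closed-form expression like this is available, which is why the rest of the article relies on a numerical approximation.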

In a previous article, I showed how to use SAS to evaluate the slope of a regression curve at specific points. The present article applies that technique by scoring the regression curve on a fine grid of points. You can use finite differences to approximate the slope of the curve at each point on the grid. You can then estimate the locations where the slope is zero.

In this article, I use the LOESS procedure to demonstrate the technique, but the method applies equally well to any one-dimensional regression curve. There are several ways to score a regression model:

  • Most parametric regression procedures in SAS (GLM, GLIMMIX, MIXED, ...) support the STORE statement. The STORE statement saves a representation of the model in a SAS item store. You can use PROC PLM to score a model from an item store.
  • Some nonparametric regression procedures in SAS do not support the STORE statement but support a SCORE statement. PROC ADAPTIVEREG and PROC LOESS are two examples. This article shows how to use the SCORE statement to find points where the regression curve has zero slope.
  • Some regression procedures in SAS do not support either the STORE or SCORE statements. For those procedures, you need to use the missing value trick to score the model.

The technique in this article will not detect inflection points at which the slope is zero (for example, x = 0 for f(x) = x^3). At such a point, the curve has zero slope but no local min or max, so the slope does not change sign. Consequently, this article is really about "how to find a point where a regression curve has a local extremum," but I will use the slightly inaccurate phrase "find points where the slope is zero."

How to find locations where the slope of the curve is zero?

For convenience, I assume the explanatory variable is named X and the response variable is named Y. The goal is to find locations where a nonparametric curve (x, f(x)) has zero slope, where f(x) is the regression model. The general outline follows:

  1. Create a grid of points in the range of the explanatory variable. The grid does not have to be evenly spaced, but it is in this example.
  2. Score the model at the grid locations.
  3. Use finite differencing to approximate the slope of the regression curve at the grid points. If the slope changes sign between consecutive grid points, estimate the location between the grid points where the slope is exactly zero. Use linear interpolation to approximate the response at that location.
  4. Optionally, graph the original data, the regression curve, and the point along the curve where the slope is zero.

Example data

SAS distributes the ENSO data set in the SASHelp library. You can create a DATA step view that renames the explanatory and response variables to X and Y, respectively, so that it is easier to follow the logic of the program:

/* Create VIEW where x is the independent variable and y is the response */
data Have / view=Have;
set Sashelp.Enso(rename=(Month=x Pressure=y));
keep x y;
run;

Create a grid of points

After the data set is created, you can use PROC SQL to find the minimum and maximum values of the explanatory variable. You can create an evenly spaced grid of points for the range of the explanatory variable.

/* Put min and max into macro variables */
proc sql noprint;
  select min(x), max(x) into :min_x, :max_x 
  from Have;
quit;
 
/* Evaluate the model and estimate derivatives at these points */
data Grid;
dx = (&max_x - &min_x)/201;    /* choose the step size wisely */
do x = &min_x to &max_x by dx;   
   output;
end;
drop dx;
run;

Score the model at the grid locations

This is the step that will vary from procedure to procedure. You have to know how to use the procedure to score the regression model on the points in the Grid data set. The LOESS procedure supports a SCORE statement, so the call fits the model and scores the model on the Grid data set:

/* Score the model on the grid */
ods select none;    /* do not display the tables */
proc loess data=Have plots=none;
   model y = x;
   score data=Grid;  /* PROC LOESS does not support an OUT= option */
   /* Most procedures support an OUT= option to save the scored values.
      PROC LOESS displays the scored values in a table, so use ODS to
      save the table to an output data set */
   ods output ScoreResults=ScoreOut;
run;
ods select all;

If a procedure supports the STORE statement, you can use PROC PLM to score the model on the data. The SAS program that accompanies this article includes an example that uses the GAMPL procedure. The GAMPL procedure does not support the STORE or SCORE statements, but you can use the missing value trick to find zero derivatives.
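As a minimal sketch of the STORE/PROC PLM route, the following code fits a cubic polynomial with PROC GLM and scores it on the Grid data set. The cubic model, the item store name (CubicModel), and the output data set name (PlmScore) are assumptions chosen for illustration; they are not part of the program that accompanies this article.

```sas
/* Fit an illustrative cubic polynomial and save it to an item store.
   The model and the name CubicModel are hypothetical. */
proc glm data=Have plots=none;
   model y = x x*x x*x*x;
   store work.CubicModel;
quit;

/* Score the stored model on the grid. Naming the predicted values
   p_y matches the variable name in the loess ScoreResults table. */
proc plm restore=work.CubicModel;
   score data=Grid out=PlmScore predicted=p_y;
run;
```

Because the predicted values are named p_y, the PlmScore data set could be used in place of ScoreOut in the steps that follow.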

Find the locations where the slope is zero

This is the mathematical portion of the computation. You can use a backward difference scheme to estimate the derivative (slope) of the curve. If (x0, y0) and (x1, y1) are two consecutive points along the curve (in the ScoreOut data set), then the slope at (x1, y1) is approximately m = (y1 - y0) / (x1 - x0). When the slope changes sign between consecutive points, it indicates that the slope changed from positive to negative (or vice versa) between the points. If the slope is continuous, it must have been exactly zero somewhere on the interval. You can use a linear approximation to find the point, t, where the slope is zero. You can then use linear interpolation to approximate the point (t, f(t)) at which the curve is a local min or max.
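The two linear approximations in the previous paragraph can be written out explicitly. In this sketch, m_0 and m_1 are shorthand (introduced here) for the estimated slopes at x_0 and x_1, and they are assumed to have opposite signs.

```latex
% Model the slope as linear on [x_0, x_1]
m(t) \approx m_0 + \frac{m_1 - m_0}{x_1 - x_0}\,(t - x_0)
% Setting m(t) = 0 gives the estimated location of the extremum
t = x_0 - m_0\,\frac{x_1 - x_0}{m_1 - m_0}
% Linear interpolation of the curve at that location
f(t) \approx y_0 + \frac{y_1 - y_0}{x_1 - x_0}\,(t - x_0)
```

Because m_0 and m_1 have opposite signs, the ratio -m_0/(m_1 - m_0) lies strictly between 0 and 1, so the estimate t always falls inside the interval (x_0, x_1).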

You can use the following SAS DATA step to process the scoring data, approximate the slope, and estimate where the slope of the curve is zero:

/* Compute slope by using finite difference formula. */
data Deriv0;
set ScoreOut;
Slope = dif(p_y) / dif(x);      /* (f(x) - f(x-dx)) / dx */
xPrev = lag(x);  yPrev = lag(p_y);  SlopePrev = lag(Slope);
if n(SlopePrev) AND sign(SlopePrev) ^= sign(Slope) then do;
   /* The slope changes sign between this obs and the previous.
      Assuming linearity on the interval, find (t, f(t))
      where slope is exactly zero */
   t0 = xPrev - SlopePrev * (x - xPrev)/(Slope - SlopePrev); 
   /* use linear interpolation to find the corresponding y value:
      f(t) ~ y0 + (y1-y0)/(x1-x0) * (t - x0)       */
   f_t0 = yPrev + (p_y - yPrev)/(x - xPrev) * (t0 - xPrev);
   if sign(SlopePrev) > 0 then _Type_ = "Max";
   else _Type_ = "Min";
   output; 
end;
keep t0 f_t0 _Type_;
label f_t0 = "f(t0)";
run;
 
proc print data=Deriv0 label;
run;

The table shows that there are seven points at which the loess regression curve has a local min or max.

Graph the results

If you want to display the local extrema on the graph of the regression curve, you can combine the original data, the regression curve, and the local extrema in one data set. You can then use PROC SGPLOT to overlay the three layers. The resulting graph is shown at the top of this article.

data Combine;
merge Have                           /* data   : (x, y)     */
      ScoreOut(rename=(x=t p_y=p_t)) /* curve  : (t, p_t)   */
      Deriv0;                        /* extrema: (t0, f_t0) */
run;
 
title "Loess Smoother";
title2 "Red Markers Indicate Zero Slope for Smoother";
proc sgplot data=Combine noautolegend;
   scatter x=x y=y;
   series x=t y=p_t / lineattrs=GraphData2;
   scatter x=t0 y=f_t0 / markerattrs=(symbol=circlefilled color=red);
   yaxis grid; 
run;

Summary

In summary, if you can evaluate a regression curve on a grid of points, you can approximate the slope at each point along the curve. By looking for when the slope changes sign, you can find local minima and maxima. You can then use a simple linear estimator on the interval to estimate where the slope is exactly zero.

You can download the SAS program that performs the computations in this article.

The post Find points where a regression curve has zero slope appeared first on The DO Loop.

May 12, 2020
 

You’ve chosen the right class, added-to-cart, and hit submit.

You’re committed – now what?

Once you book a class with us, no matter the format, you can expect an email confirming your request within 24 hours. For instructor-led training courses, a reminder email is sent 3-5 days before the course is set to begin, providing access to the course notes and instructions on what will happen the first day. SAS Live Web course instructions include tasks to perform to ensure your system is set up properly. If you’re taking in-person, classroom training, you can expect an email with guidelines for the specific training center location, including the address and travel or parking tips.

For e-Learners, you can start right away! Your confirmation email will give you a link to your personal My Training page where you’ll log in to access your training – anytime, anywhere.

Depending on the course level, you may be asked if you’ve met all the prerequisites. Maybe you’ll even take a training assessment. We’re always available to answer your questions and want you to be 100% satisfied with your course, so reach out and we’ll be sure you’re in the correct class.

The time has come – class is starting.

First day jitters? Nah, we’ve got you covered. The reminder email you’ll receive has all the tools you need to get started. So, relax and just show up! SAS instructors are some of the best teachers in the business – and you can be assured they know their stuff. You’ll learn tips and tricks, even when they’re reviewing familiar content!

Live Web classes are as interactive as traditional classroom training. With our state-of-the-art technology, you’ll interact with the instructor and classmates throughout the course and have access to a virtual lab with the software and data. As you noticed when you registered, the class layout varies – sometimes you have full-day training and sometimes the class is split into half-day sessions over a longer time period. Always check the times to be sure you log in to the right time zone.

One of the greatest things about SAS instructors is their diversity – we really love to encourage uniqueness, so our classes vary a bit. Each instructor has their own way of breaking down the course, and much of it will depend on you, the students who make up the class. So, be ready to speak up and share what you know, what you don’t, and what you want to accomplish.

What remains the same across the board is the fact that you’ll undoubtedly walk away with several ah-ha moments. Expect lectures interspersed with mathematical details on the algorithms used in the demos. You’ll have quizzes and exercises that take it to the next level. Don’t worry, there’s always room for Q&A, and the instructors make themselves available 30 minutes before and after class to answer questions. And, we’re all human, so expect some breaks. Full-day classes will also have a lunch hour.

As you approach the end of your training, reflect on and realize the accomplishments you’ve achieved – including all the new SAS skills you have to show off!

Success! You finished the course.

But that doesn’t mean the fun ends!

That’s right, you’ve only scratched the surface – to really solidify your skills, you must use what you learned. Most classes have Extended Learning Pages, which you’ll get access to in your Thank you email after class. As you practice your newfound knowledge you may have questions. While you probably have someone at work who can assist, most instructors encourage students to email them when questions arise.

If you were part of an onsite course or just have a group of people working on similar tasks, it might be a good idea to schedule a mentoring session with an instructor. While this is not free, it’s invaluable to see SAS in action using your own data.

There are plenty of other great resources available free of charge, right at your fingertips.

  • SAS Communities is a great place to go for discussion boards – search for a topic or start your own thread.
  • Subscribe to the SAS Users YouTube channel. There are tons of amazing videos done by our subject matter experts and some renowned guests. New content is released every other Monday.
  • Find your path – with so many amazing instructors, we’re bound to have lots to offer. Check out all our learning paths and pick what’s right for you.

So, track your progress, earn Learn Badges and prepare for a globally recognized SAS Certification. Then, see where it leads.

What to expect when you take SAS training: Before, during and after was published on SAS Users.

May 11, 2020
 

I recently showed how to use linear interpolation in SAS. Linear interpolation is a common way to interpolate between a set of planar points, but the interpolating function (the interpolant) is not smooth. If you want a smoother interpolant, you can use cubic spline interpolation. This article describes how to use a cubic spline interpolation in SAS.

As mentioned in my previous post, an interpolation requires two sets of numbers:

  1. Data: Let (x1, y1), (x2, y2), ..., (xn, yn) be a set of n data points. These sample data should not contain any missing values. The data must be ordered so that x1 < x2 < ... < xn.
  2. Values to score: Let {t1, t2, ..., tk} be a set of k new values for the X variable. For interpolation, all values must be within the range of the data: x1 ≤ ti ≤ xn for all i. The goal of interpolation is to produce a new Y value for each value of ti. The scoring data is also called the "query data."

The following SAS DATA steps define the data for this example. The POINTS data set contains the sample data, which are shown as blue markers on the graph to the right. The SCORE data set contains the scoring data, which are shown as red tick marks along the horizontal axis.

/* Example data for 1-D interpolation */
data Points;  /* these points define the model */
input x y;
datalines;
0  1
1  3
4  5
5  4
7  6
8  3
10 3
;
 
data Score; /* these points are to be interpolated */
input t @@;
datalines;
2 -1 4.8 0 0.5 1 9 5.3 7.1 10.5 9
;

On the graph, the blue curve is the cubic spline interpolant. Every point that you interpolate will be on that curve. The red asterisks are the interpolated values for the values in the SCORE data set. Notice that points -1 and 10.5 are not interpolated because they are outside of the data range. The following section shows how to compute the cubic spline interpolation in SAS.

Cubic spline interpolation in SAS

A linear interpolation uses a linear function on each interval between the data points. In general, the linear segments do not meet smoothly: the resulting interpolant is continuous but not smooth. In contrast, spline interpolation uses a polynomial function on each interval and chooses the polynomials so that the interpolant is smooth where adjacent polynomials meet. For polynomials of degree k, you can match the first k – 1 derivatives at each data point.

A cubic spline is composed of piecewise cubic polynomials whose first and second derivatives match at each data point. Typically, the second derivatives at the minimum and maximum of the data are set to zero. This kind of spline is known as a "natural cubic spline" with knots placed at each data point.
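For readers who want the conditions written out, the following is a standard textbook formulation of the natural cubic spline for the data (x1, y1), ..., (xn, yn). The piecewise notation S_i and the coefficient names a_i, b_i, c_i, d_i are introduced here for illustration; they are not the internal representation that SAS uses.

```latex
\begin{align*}
S_i(x) &= a_i + b_i (x - x_i) + c_i (x - x_i)^2 + d_i (x - x_i)^3,
        & x_i \le x \le x_{i+1}, \quad i &= 1, \dots, n-1 \\
S_i(x_i) &= y_i, \qquad S_i(x_{i+1}) = y_{i+1},
        & i &= 1, \dots, n-1 \\
S_i'(x_{i+1}) &= S_{i+1}'(x_{i+1}), \qquad S_i''(x_{i+1}) = S_{i+1}''(x_{i+1}),
        & i &= 1, \dots, n-2 \\
S_1''(x_1) &= 0, \qquad S_{n-1}''(x_n) = 0
\end{align*}
```

The counts agree: there are 4(n-1) unknown coefficients, and the conditions supply 2(n-1) interpolation equations, 2(n-2) derivative-matching equations, and 2 end conditions, which also total 4(n-1).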

I have previously shown how to use the SPLINE call in SAS/IML to compute a smoothing spline. A smoothing spline is not an interpolant because it does not pass through the original data points. However, you can get interpolation by using the SMOOTH=0 option. Adding the TYPE='zero' option results in a natural cubic spline.
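As a rough sketch of those two options, the following call fits an interpolating natural cubic spline to the example data. It assumes that the SPLINE call accepts the SMOOTH= and TYPE= keywords in the same way that the SPLINEC call does; the matrix name fitted and the data set name SplineFit are hypothetical.

```sas
proc iml;
/* the sample data from the POINTS data set, typed in directly */
x = {0, 1, 4, 5, 7, 8, 10};
y = {1, 3, 5, 4, 6, 3, 3};

/* SMOOTH=0 forces the spline through every data point;
   TYPE="zero" sets the second derivative to zero at both ends */
call spline(fitted, x||y) smooth=0 type="zero";

/* fitted has two columns: evaluation locations and spline values */
create SplineFit from fitted[colname={'x' 'fit'}];
append from fitted;
close;
quit;
```

The SPLINE call chooses the evaluation locations itself, which is convenient for drawing the curve but not for scoring specific values of t.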

For more control over the interpolation, you can use the SPLINEC function ('C' for coefficients) to fit the cubic splines to the data and obtain a matrix of coefficients. You can then use that matrix in the SPLINEV function ('V' for value) to evaluate the interpolant at the locations in the scoring data.

The following SAS/IML function (CubicInterp) computes the spline coefficients from the sample data and then interpolates the scoring data. The details of the computation are provided in the comments, but you do not need to know the details in order to use the function to interpolate data:

/* Cubic interpolating spline in SAS.
   The interpolation is based on the values (x1,y1), (x2,y2), ..., (xn, yn).
   The X  values must be nonmissing and in increasing order: x1 < x2 < ... < xn
   The values of the t vector are interpolated.
*/
proc iml;
start CubicInterp(x, y, t);
   d = dif(x, 1, 1);                     /* check that x[i+1] > x[i] */
   if any(d<=0) then stop "ERROR: x values must be nonmissing and strictly increasing.";
   idx = loc(t>=min(x) & t<=max(x));     /* check for valid scoring values */
   if ncol(idx)=0 then stop "ERROR: No values of t are inside the range of x.";
 
   /* fit the cubic model to the data */
   call splinec(splPred, coeff, endSlopes, x||y) smooth=0 type="zero";
 
   p = j(nrow(t)*ncol(t), 1, .);       /* allocate output (prediction) vector */
   call sortndx(ndx, colvec(t));       /* SPLINEV wants sorted data, so get sort index for t */
   sort_t = t[ndx];                    /* sorted version of t */
   sort_pred = splinev(coeff, sort_t); /* evaluate model at (sorted) points of t */
   p[ndx] = sort_pred[,2];             /* "unsort" by using the inverse sort index */
   return( p );
finish;
 
/* example of cubic spline interpolation in SAS */
use Points; read all var {'x' 'y'}; close;
use Score; read all var 't'; close;
 
pred = CubicInterp(x, y, t);
create PRED var {'t' 'pred'}; append; close;
QUIT;

The visualization of the interpolation is similar to the code in the previous article, so the code is not shown here. However, you can download the SAS program that performs the cubic interpolation and creates the graph at the top of this article.
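For a quick look at the results without the downloadable program, a minimal sketch such as the following overlays the sample data and the interpolated values. It is not the author's program: it assumes that the POINTS and PRED data sets from the previous steps exist, the data set name Combine is hypothetical, and it does not draw the blue interpolating curve.

```sas
/* Assumes the Points and PRED data sets created above are in WORK */
data Combine;
   set Points PRED;     /* stack the sample data (x, y) and the predictions (t, pred) */
run;

proc sgplot data=Combine noautolegend;
   scatter x=x y=y    / markerattrs=(symbol=CircleFilled color=blue);
   scatter x=t y=pred / markerattrs=(symbol=Asterisk color=red);
   xaxis label="x";
   yaxis label="y";
run;
```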

Although cubic spline interpolation is slower than linear interpolation, it is still fast: The CubicInterp program takes about 0.75 seconds to fit 1000 data points and interpolate one million scoring values.

Summary

In summary, the SAS/IML language provides the computational tools for cubic spline interpolation. The CubicInterp function in this article encapsulates the functionality so that you can perform cubic spline interpolation of your data in an efficient manner.

The post Cubic spline interpolation in SAS appeared first on The DO Loop.

May 6, 2020
 

During this coronavirus pandemic, there are many COVID-related graphs and curves in the news and on social media. The public, politicians, and pundits scrutinize each day's graphs to determine which communities are winning the fight against coronavirus.

Interspersed among these many graphs is the oft-repeated mantra, "Flatten the curve!" As people debate whether to reopen businesses and schools, you might hear some people claim, "we have flattened the curve," whereas others argue, "we have not yet flattened the curve." But what is THE curve to which people are referring? And when does a flat curve indicate success against the coronavirus pandemic?

This article discusses "flattening the curve" in the context of three graphs:

  1. A "what if" epidemiological curve.
  2. The curve of cumulative cases versus time.
  3. A smoothing curve added to a graph of new cases versus time. Spoiler alert: This is the WRONG curve to flatten! You want this curve to be decreasing, not flat.

Before you read further, I strongly encourage you to watch the two-minute video "What does 'flattening the curve' actually mean?" by the Australian Academy of Science. It might be the best-spent two minutes of your day.

Flattening the "what if" epidemiological curve

In 2007, the US Centers for Disease Control and Prevention (CDC) published a report on pre-pandemic planning guidance, which discussed how "nonpharmaceutical interventions" (NPIs) can mitigate the spread of viral infections. Nonpharmaceutical interventions include isolating infected persons, social distancing, hand hygiene, and more. The report states (p. 28), "NPIs can delay and flatten the epidemic peak.... [Emphasis added.] Delay of the epidemic peak is critically important because it allows additional time for vaccine development and antiviral production."

This is the original meaning of "flatten the curve." It has to do with the graph of new cases versus time. A graph appeared on p. 18, but the following graph is Figure 1 in the updated 2017 guidelines:

FIGURE 1. Goals of community mitigation for pandemic influenza (CDC, 2007 and 2017)

This image represents a "what if" scenario. The purple curve on the left is a hypothetical curve that represents rapid spread. It shows what might happen if the public does NOT adopt public-health measures to slow the spread of the virus. This is "the curve" that we wish to flatten. The smaller curve on the right represents a slower spread. It shows what can happen if society adopts public-health interventions. The peak of the second curve has been delayed (moved to a later date) and reduced (fewer cases). This second curve is the "flattened curve."

Because the curve that we are trying to flatten (the rapid-spread curve) is unobservable, how can we measure success? One measure of success is if the number of daily cases remains less than the capacity of the healthcare system. Another is that the total number of affected persons is less than predicted under the no-intervention model.

If the public adopts measures that mitigate the spread, the observed new-cases-versus-time curve should look more like the slower-spread curve on the right and less like the tall rapid-spread curve on the left. Interestingly, I have heard people complain that the actual numbers of infections, hospitalizations, and deaths due to COVID-19 are lower than initially projected. They argue that these low numbers are evidence that the mathematical models were wrong. The correct conclusion is that the lower numbers are evidence that social distancing and other NPIs helped to slow the spread.

Flattening the curve of cumulative cases versus time

The previous section discusses the new-cases-by-time graph. Another common graph in the media is the cumulative number of confirmed cases. I have written about the relationship between the new-cases-by-time graph and the cumulative-cases-by-time graph. In brief, the slope of the cumulative-cases-by-time graph equals the height of the new-cases-by-time graph.
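The relationship can be checked on a small made-up example. The following sketch uses hypothetical daily counts (not real data) and hypothetical variable names. The sum statement accumulates the running total, and the DIF function computes the day-over-day change in that total.

```sas
/* Hypothetical daily counts; not real surveillance data */
data Cases;
   input Day NewCases @@;
   Cumul + NewCases;        /* sum statement: running total of new cases */
   Slope = dif(Cumul);      /* change in the cumulative count since the previous day */
datalines;
1 5  2 8  3 13  4 9  5 4  6 0  7 0
;

proc print data=Cases noobs;
run;
```

For every day after the first, Slope equals NewCases. (Slope is missing on Day 1 because there is no previous value.) On Days 6 and 7 there are no new cases, so the cumulative count stays at 39 and the cumulative curve is flat.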

For the rapid-spread curve, the associated cumulative curve is very steep, then levels out at a large number (the total number of cases). For the slower-spread curve, the associated cumulative curve is not very steep. It climbs gradually and eventually levels out at a smaller number of total cases.

When you read that some country (such as South Korea, Australia, or New Zealand) has flattened the curve, the curve that usually accompanies the article is the cumulative-cases-by-time graph. A cumulative curve that increases slowly means that the rate of new cases is small. A cumulative curve that has zero slope means that no new cases are being confirmed. This is definitely a good thing!

A hypothetical scenario is shown below. The upper graph shows the cumulative cases. The lower graph shows the new cases on the same time scale. After Day 60, there are very few new cases. Accordingly, the curve of cumulative cases is flat.

Thus, a cumulative curve that increases slowly and then levels out is analogous to the slower-spread epidemiological curve. When the cumulative curve becomes flat, it indicates that new cases are zero or nearly zero.

The wrong curve: Flattening new cases versus time

Unfortunately, not every curve that is flat indicates that conditions are improving. The primary example is the graph of new cases versus time. A constant rate of new cases is not good—although it is certainly better than an increasing rate. To defeat the pandemic, the number of new cases each day should be decreasing. For a new-cases-by-time curve, a downward-sloping curve is the goal, not a flat curve.

The following graph shows new cases by day in the US (downloaded from the New York Times data on 03May2020). I have added a seven-day rolling average to show a smoothed version of the data:

The seven-day average is no longer increasing. It has leveled out. However, this is NOT an example of a "flattened" curve. This graph shows that the average number of new cases was approximately constant (or perhaps slightly declining) for the last three weeks of April. The fact that the curve is approximately horizontal indicates that approximately 28,000 new cases of coronavirus are being confirmed every day.
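One way to compute a seven-day rolling average in SAS is the MOVAVE transformation in PROC EXPAND, which requires SAS/ETS. The following is a sketch under stated assumptions: the counts are made up (they are not the New York Times data), the names Daily, Smoothed, and MA7 are hypothetical, and this is not necessarily the method used to create the graph above.

```sas
/* Hypothetical daily counts; not the New York Times data */
data Daily;
   input Date :date9. NewCases @@;
   format Date date9.;
datalines;
01APR2020 250 02APR2020 270 03APR2020 320 04APR2020 330 05APR2020 260
06APR2020 300 07APR2020 310 08APR2020 320 09APR2020 340 10APR2020 330
11APR2020 300 12APR2020 270 13APR2020 260 14APR2020 270
;

/* METHOD=NONE: transform the observed values without fitting a curve to the series */
proc expand data=Daily out=Smoothed method=none;
   id Date;
   convert NewCases = MA7 / transout=(movave 7);   /* trailing 7-day moving average */
run;

proc sgplot data=Smoothed;
   needle x=Date y=NewCases;
   series x=Date y=MA7 / lineattrs=(thickness=2);
   yaxis label="New Cases";
run;
```

A trailing average uses only the current day and the six previous days, so the smoothed curve lags slightly behind changes in the raw counts.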

The cumulative graph for the same US data is shown below. The slope of the cumulative curve in April 2020 is approximately 28,000 cases per day. This cumulative curve is not flattening, but healthcare workers and public-health officials are all working to flatten it.

Sometimes graphs do not have well-labeled axes, so be sure to identify which graph you are viewing. For a curve of cumulative cases, flatness is "good": there are very few new cases. For a curve of new cases, a downward-sloping curve is desirable.

Summary

In summary, this article discusses "flattening the curve," a phrase that is often used without specifying what "the curve" actually is. In the classic epidemiological model, "the curve" is the hypothetical number of new cases versus time under the assumption that society does not adopt public-health measures to slow the spread. A flattened curve refers to the effect of interventions that delay and reduce the peak number of cases.

You cannot observe a hypothetical curve, but you can use the curve of cumulative cases to assess the spread of the disease. A cumulative curve that increases slowly and flattens out indicates a community that has slowed the spread. So, conveniently, you can apply the phrase "flatten the curve" to the curve of cumulative cases.

Note that you do not want the graph of new cases versus time to be flat. You want that curve to decrease towards zero.

You can download the SAS program used to create graphs in this article.



The post What does 'flatten the curve' mean? To which curve does it apply? appeared first on The DO Loop.
