Numerical Analysis

March 8, 2023
 

Happy Pi Day! Every year on March 14th (written 3/14 in the US), people in the mathematical sciences celebrate "all things pi-related" because 3.14 is the three-decimal approximation to π ≈ 3.14159265358979....

Modern computer methods and algorithms enable us to calculate 100 trillion digits of π. However, I think it is fascinating to learn about how mathematicians in the pre-computer age were able to approximate π to many decimal places through cleverness and manual paper-and-pencil calculations.

One of the best mathematicians of all time was Isaac Newton, who calculated 16 digits of π by hand back in 1666. Newton accomplished this by combining geometry with several mathematical techniques that were new at the time. These techniques included the new field of integral calculus (which Newton himself invented) and an early use of a truncated infinite series to approximate a function. Newton's formulation of the problem is understandable by any high-school geometry student. Although the manual calculations are tedious, they can be understood by anyone who has taken calculus. In this article, I use a computer to simplify the long calculations in Newton's approximation. The computer calculation avoids the infinite series and provides a 14-decimal approximation to π.

I learned about Newton's approximation from Matt Parker and the "Think Maths" group in the UK, who made a wonderful video about Newton's approximation. They also provide resources for math teachers who want to present this topic to their students.

The geometry of Newton's calculation

The figure at the right shows the geometry behind Newton's calculation of pi. First, draw a semicircle of radius 1/2 centered at the point C(1/2, 0) in the Cartesian plane. Then draw a vertical line segment from A(1/4, 0) until it hits the circle at B. High school geometry shows that the triangle ABC is a 30-60-90 triangle, and the line segment AB intersects the circle at B(1/4, sqrt(3)/4).

Newton knew the following facts:

  1. The angle OCB is 60 degrees. Therefore, the area of the sector OCB is (π/6) r^2, where r is the radius of the circle. Because the radius is r = 1/2, the area of the sector OCB is π/24.
  2. The area of the triangle ABC is easily found to be (1/2)*base*height = (1/2)(1/4)(sqrt(3)/4) = sqrt(3)/32. Newton could calculate this quantity to an arbitrary number of decimal places because the square root algorithm was known by the ancient Babylonians and by the Greeks.
  3. The area of the crosshatched region (denoted M) can be found by calculating a definite integral. The equation of the semicircle is \(f(x) = \sqrt{\tfrac{1}{4} - (x-\tfrac{1}{2})^2}\), which simplifies to \(f(x) = \sqrt{x} \sqrt{1-x}\). Therefore, the area can be calculated as \( M = \int_0^{1/4} \sqrt{x} \sqrt{1-x} \, dx\).

From the figure, we see that Area(OCB) = M + Area(ABC). Plugging in the formulas for these quantities, we get
π/24 = M + sqrt(3)/32. Multiplying both sides by 24 gives the equation that Newton used to approximate π:
      π = 24*M + 3 sqrt(3)/4

Newton approximated M to many decimal places, thus providing a decimal approximation for π. Newton used a truncated infinite series expansion to approximate M, but the next section shows an alternative approach: you can use numerical integration to approximate M.

A modern version of Newton's approximation

By using modern computational techniques, you can evaluate the integral (M) numerically. If you evaluate M to high precision, you get a high-precision approximation for π.

Most modern computer languages include a function to evaluate a definite integral on a closed interval. The SAS/IML language provides the QUAD subroutine, which enables you to compute M to many decimal digits by using numerical quadrature (integration). Here is the SAS/IML program that carries out Newton's approximation of pi:

/* Newton's approximation of pi. For details, see https://think-maths.co.uk/newton-pi */
proc iml;
/* integrand is f(x) = sqrt(0.25 - (x-0.5)##2)
                     = sqrt(x*(1-x)) */
start SemiCircle(x);
   return( sqrt(x - x##2) );
finish;
 
/* numerical approximation of M = area under semicircle on [0, 1/4] */
call quad(M, "SemiCircle", {0, 0.25}) EPS=1E-15;
 
/* AreaSector = pi/6 * r^2 = pi / 24 
   Newton used AreaSector = M + AreaTri, so
     pi / 24 = M + AreaTri
     pi = 24*(M + AreaTri)
*/
AreaTri = 1/2 * (1/4) * (sqrt(3) / 4); /* area of triangle ABC */
piApprox = 24*(M + AreaTri);
print piApprox[F=16.14 L="Newton's Approx"];

The program outputs an approximation to pi that is accurate to 14 decimal digits. This value is the standard double-precision floating-point approximation to pi. The program gives the same result as the SAS function constant('pi').
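If you don't have SAS/IML at hand, the same computation is easy to cross-check in any language. The following sketch uses only the Python standard library and a hand-rolled composite Simpson's rule (the function names are mine, not part of any library); the substitution x = u² removes the square-root behavior of the integrand at 0, so the rule converges rapidly:

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

# M = area under sqrt(x - x^2) on [0, 1/4]; substituting x = u^2 gives
# the smooth integrand 2*u^2*sqrt(1 - u^2) on [0, 1/2]
M = simpson(lambda u: 2 * u * u * math.sqrt(1 - u * u), 0.0, 0.5, 2000)

area_tri = math.sqrt(3) / 32      # area of triangle ABC
pi_approx = 24 * (M + area_tri)   # Newton's relation: pi = 24*(M + AreaTri)
print(pi_approx)
```

Running this prints a value that agrees with π to roughly 13–15 decimal digits, consistent with the SAS/IML result.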

Newton's approximation by using infinite series

Notice that my approximation only captures the value of pi to 14 decimal digits, whereas Newton achieved 16-digit accuracy. How did Newton get more accuracy? He didn't use a numerical method to evaluate the integral. Instead, he used the binomial expansion of the term sqrt(1 - x), which replaces the term by an infinite series. He kept 22 terms of the infinite series and integrated (by hand!) the resulting expression.

I am not going to repeat Newton's tour de force calculation, but you can watch a video in which the Think Maths group in the UK repeats Newton's calculation by hand using long division and extremely large pieces of paper! The group also created a document that shows the details. The main idea is to recall the binomial expansion:
\( (1+z)^{\frac {1}{2}} = 1 +{\tfrac {1}{2}}z -{\tfrac {1}{8}}z^{2} +{\tfrac {1}{16}}z^{3} -{\tfrac {5}{128}}z^{4} +{\tfrac {7}{256}}z^{5} -\cdots = \sum _{n=0}^{\infty }{\frac {(-1)^{n-1}(2n)!}{4^{n}(n!)^{2}(2n-1)}}z^{n} \)

In this case, we want to replace z by (-x), which changes the sign of terms that contain odd powers of z. The result is an infinite series in which each term (except the first) has a negative sign:

\( (1-x)^{\frac {1}{2}} = 1 -{ \tfrac {1}{2}}x -{\tfrac {1}{8}}x^{2} -{\tfrac {1}{16}}x^{3} -{\tfrac {5}{128}}x^{4} -{\tfrac {7}{256}}x^{5} -\cdots \)

Newton kept 22 terms of this infinite series and integrated the function term-by-term on the interval [0, 1/4]. The result is 22 fractions that can be evaluated by long division. However, the numerators and denominators of the higher-order terms are HUGE! For example, the 20th term is the fraction 119,409,675 / (41*2^75). Although, in principle, you could calculate Newton's approximation by hand (and, remember, he did!), it would not be a pleasant way to spend an evening.
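You can get a feel for Newton's arithmetic without repeating it by hand: exact rational arithmetic reproduces the 22 fractions. The following sketch (standard-library Python; a rough reconstruction, not Newton's own layout) uses the coefficient recurrence c_0 = 1, c_{n+1} = c_n (n − 1/2)/(n + 1) for the series (1−x)^{1/2} = Σ c_n x^n, along with the exact integral ∫₀^{1/4} c_n x^{n+1/2} dx = c_n / (4^{n+1}(2n+3)):

```python
from fractions import Fraction
import math

M = Fraction(0)
c = Fraction(1)                        # c_0 = 1 in (1-x)^(1/2) = sum c_n x^n
for n in range(22):                    # keep 22 terms, as Newton did
    M += c / (4**(n + 1) * (2*n + 3))  # exact integral of c_n * x^(n+1/2) on [0, 1/4]
    c *= Fraction(2*n - 1, 2*(n + 1))  # c_{n+1} = c_n * (n - 1/2)/(n + 1)

pi_approx = 24 * float(M) + 3 * math.sqrt(3) / 4
print(pi_approx)
```

The 22-term rational sum for M is exact; the only floating-point steps are the final conversion and the sqrt(3) term.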

Summary

In 1666, Newton computed 16 digits of π by constructing a geometric figure in which the value of pi is related to the numerical approximation of an integral. Newton approximated the integral by expanding a function in an infinite series, truncating the series at 22 terms, and evaluating each term by using long division. Although the formulation and general idea is accessible to calculus students, the calculations are long and tedious. Instead, you can use modern numerical methods to evaluate the integral, thus computing a 14-digit approximation to π.

References

The post How Newton calculated pi to 16 decimal places appeared first on The DO Loop.

March 1, 2023
 

A SAS programmer wanted to compute the distribution of X-Y, where X and Y are two beta-distributed random variables. Pham-Gia and Turkkan (1993) derive a formula for the PDF of this distribution. Unfortunately, the PDF involves evaluating a two-dimensional generalized hypergeometric function, which is not available in all programming languages. Hypergeometric functions are not supported natively in SAS, but this article shows how to evaluate the generalized hypergeometric function for a range of parameter values. By using the generalized hypergeometric function, you can evaluate the PDF of the difference between two beta-distributed variables.

The difference between two beta random variables

Before doing any computations, let's visualize what we are trying to compute. Let X ~ Beta(a1, b1) and Y ~ Beta(a2, b2) be two beta-distributed random variables. We want to determine the distribution of the quantity d = X-Y. One way to approach this problem is by using simulation: Simulate random variates X and Y, compute the quantity X-Y, and plot a histogram of the distribution of d. Because each beta variable has values in the interval (0, 1), the difference has values in the interval (-1, 1).

The following simulation generates 100,000 pairs of beta variates: X ~ Beta(0.5, 0.5) and Y ~ Beta(1, 1). Both X and Y are U-shaped on (0,1). The following simulation generates the differences, and the histogram visualizes the distribution of d = X-Y:

/* Case 2 from Pham-Gia and Turkkan, 1993, p. 1765 */
%let a1 = 0.5;
%let b1 = 0.5;
%let a2 = 1;
%let b2 = 1;
%let N = 1E5;
data BetaDiff;
call streaminit(123);
do i = 1 to &N;
   x = rand("beta", &a1, &b1);
   y = rand("beta", &a2, &b2);
   d = x - y;
   output;
end;
drop i;
run;
 
title "Distribution of X-Y";
title2 "X~beta(&a1,&b1); Y~beta(&a2,&b2)";
proc sgplot data=BetaDiff;
   histogram d;
run;

For these values of the beta parameters, the distribution of the differences between the two beta variables looks like an "onion dome" that tops many Russian Orthodox churches in Ukraine and Russia. For other choices of parameters, the distribution can look quite different. The remainder of this article defines the PDF for the distribution of the differences. The formula for the PDF requires evaluating a two-dimensional generalized hypergeometric function.

Appell's hypergeometric function

A previous article discusses Gauss's hypergeometric function, which is a one-dimensional function that has three parameters. For certain parameter values, you can compute Gauss's hypergeometric function by computing a definite integral. The two-dimensional generalized hypergeometric function that is used by Pham-Gia and Turkkan (1993) is called Appell's hypergeometric function (denoted F1 by mathematicians). Appell's function can be evaluated by computing a definite integral that looks very similar to the integral encountered in evaluating the 1-D function. Appell's F1 function has four parameters (a,b1,b2,c) and two variables (x,y). In statistical applications, the variables and parameters are real-valued. For the parameter values c > a > 0, Appell's F1 function can be evaluated by computing the following integral:
\(F_{1}(a,b_{1},b_{2},c;x,y)={\frac {1}{B(a, c-a)}} \int _{0}^{1}u^{a-1}(1-u)^{c-a-1}(1-x u)^{-b_{1}}(1-y u)^{-b_{2}}\,du\)
where B(s,t) is the complete beta function, which is available in SAS by using the BETA function. Both arguments to the BETA function must be positive, so evaluating the BETA function requires that c > a > 0. Notice that the integration variable, u, does not appear in the answer. Appell's hypergeometric function is defined for |x| < 1 and |y| < 1. Notice that the integrand is unbounded when either x → 1 or y → 1 (assuming b1 > 0 and b2 > 0).

The following SAS IML program defines a function that uses the QUAD function to evaluate the definite integral, thereby evaluating Appell's hypergeometric function for the parameters (a,b1,b2,c) = (2,1,1,3). A table shows the values of the function at a few (x,y) points. The graph shows a contour plot of the function evaluated on the region [-0.95, 0.9] x [-0.95, 0.9].

/* Appell hypergeometric function of 2 vars 
   F1(a,b1,b2; c; x,y) is a function of (x,y) with parms = a // b1 // b2 // c;
   F1 is defined on the domain {(x,y) | |x|<1 and |y|<1}.
   You can evaluate F1 by using an integral for c > a > 0, as shown at
   https://en.wikipedia.org/wiki/Appell_series#Integral_representations
*/
proc iml;
/* Integrand for the QUAD function */
start F1Integ(u)  global(g_parm, g_x, g_y);
   a=g_parm[1]; b1=g_parm[2]; b2=g_parm[3]; c=g_parm[4];
   x=g_x; y=g_y;
   numer = u##(a-1) # (1-u)##(c-a-1);
   denom = (1-u#x)##b1 # (1-u#y)##b2;
   return( numer / denom );
finish;
 
/* Evaluate the Appell F1 hypergeometric function when c > a > 0
   and |x|<1 and |y|<1
   Pass in parm = {a, b1, b2, c} and
   z = {x1 y1,
        x2 y2,
        ...
        xn yn}; */
start AppellF1(z, parm)  global(g_parm, g_x, g_y);
   /* transfer parameters to global symbols */
   g_parm = parm;
   a = parm[1]; b1 = parm[2]; b2 = parm[3]; c = parm[4];
   n = nrow(z);
   F = j(n, 1, .);
   if a <= 0 | c <= a then do;
      /* print error message or use PrintToLog function:
         https://blogs.sas.com/content/iml/2023/01/25/printtolog-iml.html */
      print "This implementation of the F1 function requires c > a > 0.",
            parm[r={'a','b1','b2','c'}];
      return( F );
   end;
   /* loop over the (x,y) values */
   do i = 1 to n;
      /* defined only when |x| < 1 */
      g_x = z[i,1]; g_y = z[i,2];
      if g_x^=. & g_y^=. & 
         abs(g_x)<1 & abs(g_y)<1 then do;  
         call quad(val, "F1Integ", {0 1}) MSG="NO";
         F[i] = val;
      end;
   end;
   return( F / Beta(a, c-a) );    /* scale the integral */
finish;
 
/* Example of calling AppellF1 */
parm = {2, 1, 1, 3};
xy = { 0.9  0,
       0.7  0.3,
      -0.5  0.2,
      -0.9 -0.5,
       0    0};
F1 = AppellF1(xy, parm);
print xy[L="" c={'x' 'y'}] F1;
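As a language-independent sanity check on the integral representation (the function names below are mine, and only the Python standard library is assumed), you can evaluate the same integral with a basic quadrature rule. For the parameters (a,b1,b2,c) = (2,1,1,3), the integrand reduces to u/((1−xu)(1−yu)), which is smooth on [0,1] whenever |x| < 1 and |y| < 1, and the scaling constant is 1/B(2,1) = 2:

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def appell_f1(x, y):
    """F1(2, 1, 1; 3; x, y) for |x| < 1 and |y| < 1 via the integral
    representation; the scale factor 2 is 1/B(2, 1)."""
    integrand = lambda u: u / ((1 - x * u) * (1 - y * u))
    return 2 * simpson(integrand, 0.0, 1.0, 4000)

print(appell_f1(0.9, 0.0), appell_f1(0.0, 0.0))
```

Two useful checks: F1 equals 1 when x = y = 0, and when y = 0 the function reduces to Gauss's 2F1(a,b1;c;x), for which closed forms are known at these parameter values.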

The PDF of beta differences

Pham-Gia and Turkkan (1993) derive the PDF of the distribution for the difference between two beta random variables, X ~ Beta(a1,b1) and Y ~ Beta(a2,b2). The PDF is defined piecewise. There are different formulas, depending on whether the difference, d, is negative, zero, or positive. The formulas use powers of d, (1-d), (1-d2), the Appell hypergeometric function, and the complete beta function. The formulas are specified in the following program, which computes the PDF. Notice that linear combinations of the beta parameters are used to construct the parameters for Appell's hypergeometric function.

/* Use Appell's hypergeometric function to evaluate the PDF
   of the distribution of the difference X-Y between 
   X ~ Beta(a1,b1) and Y ~ Beta(a2,b2)
*/
start PDFBetaDiff(_d, _parm);
   a1=_parm[1]; b1=_parm[2]; a2=_parm[3]; b2=_parm[4];
   n = nrow(_d)*ncol(_d);
   PDF = j(n, 1, .);
   if a1<=0 | b1<=0 | a2<=0 | b2<=0 then do;
      print "All parameters must be positive", a1 b1 a2 b2;
      return(PDF);
   end;
   A = Beta(a1,b1) * Beta(a2,b2);
   do i = 1 to n;
      d = _d[i];
      *print d a1 b1 a2 b2;
      if d = . then 
         PDF[i] = .;
      if abs(d) < 1 then do;
         /* Formulas from Pham-Gia and Turkkan, 1993 */
         if abs(d)<1e-14 then do;
            if a1+a2 > 1 & b1+b2 > 1 then 
               PDF[i] = Beta(a1+a2-1, b1+b2-1) / A;
            *print "d=0" (a1+a2-1)[L='a1+a2-1'] (b1+b2-1)[L='b1+b2-1'] (PDF[i])[L='PDF'];
         end;
         else if d > 0 then do;  /* (0,1] */
            parm = b1            //
                   a1+b1+a2+b2-2 //
                   1-a1          //
                   a2+b1;
            xy = 1-d || 1-d##2;
            F1 = AppellF1(xy, parm);
            PDF[i] = Beta(a2,b1) # d##(b1+b2-1) # (1-d)##(a2+b1-1) # F1 / A;
            *print "d>0" xy (PDF[i])[L='PDF'];
         end;
         else do;                /* [-1,0) */
            parm = b2            //
                   1-a2          //
                   a1+b1+a2+b2-2 //
                   a1+b2;
            xy = 1-d##2 || 1+d;
            F1 = AppellF1(xy, parm);
            PDF[i] = Beta(a1,b2) # (-d)##(b1+b2-1) # (1+d)##(a1+b2-1) # F1 / A;
            *print "d<0" xy (PDF[i])[L='PDF'];
         end;
      end;
   end;
   return PDF;
finish;
 
print "*** Case 2 in Pham-Gia and Turkkan, p. 1767 ***";
a1 = 0.5;
b1 = 0.5;
a2 = 1;
b2 = 1;
parm = a1//b1//a2//b2;
 
d = {-0.9, -0.5, -0.25, 0, 0.25, 0.5, 0.9};
PDF = PDFBetaDiff(d, parm);
print d PDF;

Notice that the parameters are the same as in the simulation earlier in this article. The following graph visualizes the PDF on the interval (-1, 1):

/* graph the distribution of the difference */
d = T(do(-0.975, 0.975, 0.025));
PDF = PDFBetaDiff(d, parm);
 
title "X-Y for X ~ Beta(0.5,0.5) and Y ~ Beta(1,1)";
title2 "Fig 1, p. 1766";
call series(d, PDF) grid={x y};

The PDF, which is defined piecewise, shows the "onion dome" shape that was noticed for the distribution of the simulated data. The following graph overlays the PDF and the histogram to confirm that the two graphs agree.
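The d = 0 branch of the formula makes a convenient spot check because it requires no hypergeometric evaluation at all. A quick sketch in Python (standard library only; math.lgamma gives a numerically stable complete beta function) shows that, for the Case 2 parameters, the PDF at 0 works out to exactly 1:

```python
import math

def beta(s, t):
    """Complete beta function, computed via log-gamma for stability."""
    return math.exp(math.lgamma(s) + math.lgamma(t) - math.lgamma(s + t))

# Pham-Gia and Turkkan's d = 0 formula (valid when a1+a2 > 1 and b1+b2 > 1)
a1, b1, a2, b2 = 0.5, 0.5, 1.0, 1.0
pdf0 = beta(a1 + a2 - 1, b1 + b2 - 1) / (beta(a1, b1) * beta(a2, b2))
print(pdf0)   # B(0.5, 0.5) = pi, so the value is pi / (pi * 1) = 1
```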

Not every combination of beta parameters results in a non-smooth PDF. For example, if you define X ~ beta(3,5) and Y ~ beta(2, 8), then you can compute the PDF of the difference, d = X-Y, by changing the parameters as follows:

/* Case 5 from Pham-Gia and Turkkan, 1993, p. 1767 */
a1 = 3;
b1 = 5;
a2 = 2;
b2 = 8;
parm = a1//b1//a2//b2;

If you rerun the simulation and overlay the PDF for these parameters, you obtain the following graph:

Summary

The distribution of X-Y, where X and Y are two beta-distributed random variables, has an explicit formula (Pham-Gia and Turkkan, 1993). Unfortunately, the PDF involves evaluating a two-dimensional generalized hypergeometric function, which is a complicated special function. Hypergeometric functions are not supported natively in SAS, but this article shows how to evaluate the generalized hypergeometric function for a range of parameter values, which enables you to evaluate the PDF of the difference between two beta-distributed variables.

You can download the following SAS programs, which generate the tables and graphs in this article:

References

The post The distribution of the difference between two beta random variables appeared first on The DO Loop.

February 27, 2023
 

In applied mathematics, there is a large collection of "special functions." These functions appear in certain applications, often as the solution to a differential equation, but also in the definition of probability distributions. For example, I have written about Bessel functions, the complete and incomplete gamma function, the complete and incomplete beta function, and the Lambert W function, just to name a few special functions that are encountered in statistics.

You can be blissfully ignorant of a special function for many years, then one day you encounter a problem in which the function is essential. This happened recently when a SAS programmer told me about a problem that requires computing with a two-dimensional hypergeometric function. I had never heard of the function, but I did some research to figure out how to compute the function in SAS. (For a mathematical overview, see Beukers, 2014.) This article shows how to use numerical integration in SAS to compute a one-dimensional hypergeometric function, which is called Gauss's hypergeometric function and denoted \(\,_2F_1\). A subsequent article will discuss a two-dimensional hypergeometric function, which is needed to compute the PDF of a certain probability distribution. I am not an expert on the hypergeometric function, but this article shares what I have learned.

Hypergeometric functions are sometimes defined as functions of complex numbers, but the applications in statistics involve only real numbers. Consequently, this article evaluates the functions for real arguments. The hypergeometric functions in this article converge when the magnitude of each variable is less than 1.

A one-dimensional hypergeometric function

For the definition of Gauss's hypergeometric function, see the Wikipedia article about the 1-D hypergeometric series. Hypergeometric functions are defined as an infinite series. The one-dimensional hypergeometric functions have three parameters, (a,b;c). The most useful hypergeometric series is denoted by \(\,_{2}F_{1}(a,b;c; x)\) or sometimes just as \(F(a,b;c; x)\), where |x| < 1. Interestingly, the convention for a hypergeometric function is to list the parameters first and the variable (x) last. The two parameters (a,b) are called the upper parameters; the parameter c is called the lower parameter. The \(\,_2F_1\) function is a special case of the generalized hypergeometric function \(\,_pF_q\), which has p upper parameters and q lower parameters.

Compute Gauss's hypergeometric function in SAS

An important special case of the parameters is c > b > 0 and a > 0. If c > b > 0, then you can write the hypergeometric function as a definite integral. The defining equation is
\(B(b,c-b) * \,_{2}F_{1}(a,b;c; x)=\int _{0}^{1}u^{b-1}(1-u)^{c-b-1} / (1-x u)^{a}\,du\)
where B(s,t) is the complete beta function, which is available in SAS by using the BETA function. Both arguments to the BETA function must be positive, so evaluating the BETA function requires that c > b > 0. Notice that the integration variable, u, does not appear in the answer. The definite integral is a function of the parameters (a,b,c) and of the variable x.

You can perform numerical integration in SAS by using the QUAD function in SAS IML software. Notice that when a > 0, the integrand becomes unbounded as x → 1. Because of this, the QUAD function will display some warnings in the log as it seeks an accurate solution. According to my tests, the value of this integral seems to be computed correctly.

The following SAS IML program uses the QUAD function to evaluate the definite integral by using the user-defined HG1 function. The GLOBAL statement enables you to pass values of the parameters (a,b,c) and the variable x. I use G_ as a prefix to denote global variables. After evaluating the integral for each value of x, the function Hypergeometric2F1 divides the integral by the beta function, thus obtaining the value of Gauss's hypergeometric function, \(\,_2F_1\).

/* 1-D hypergeometric function 2F1(a,b;c;x) */
proc iml;
/* Integrand for the QUAD function. Parameters are in the GLOBAL clause. */
start HG1(u)  global(g_parm, g_x);
   a=g_parm[1]; b=g_parm[2]; c=g_parm[3]; x=g_x;
   /* Note: integrand unbounded as x->1 */
   F = u##(b-1) # (1-u)##(c-b-1) / (1-x#u)##a;
   return( F );
finish;
 
/* Evaluate the 2F1 hypergeometric function when c > b > 0 and |x|<1 */
start Hypergeometric2F1(x, parm)  global(g_parm, g_x);
   /* transfer the parameters to global symbols for QUAD */
   g_parm = parm;
   a = parm[1]; b = parm[2]; c = parm[3]; 
   n = nrow(x)* ncol(x);
   F = j(n, 1, .);
   if b <= 0 | c <= b then do;
      /* print error message or use PrintToLog function:
         https://blogs.sas.com/content/iml/2023/01/25/printtolog-iml.html */
      print "This implementation of the F1 function requires c > b > 0.",
            parm[r={'a','b','c'}];
      return( F );
   end;
   /* loop over the x values */
   do i = 1 to n;
      if x[i]^=. & abs(x[i])<1 then do;  /* defined only when |x| < 1 */
         g_x = x[i];
         call quad(val, "HG1", {0 1}) MSG="NO";
         F[i] = val;
      end;
   end;
   return( F / Beta(b, c-b) );    /* scale the integral */
finish;
 
/* test the function */
a = 0.5;
b = 0.5;
c = 1;
parm = a//b//c;
 
x = {-0.5, -0.25, 0, 0.25, 0.5, 0.9};
F = Hypergeometric2F1(x, parm);
print x F[L='2F1(0.5,0.5;1;x)'];

The table shows a few values of Gauss's hypergeometric function for the parameters (a,b;c)=(0.5,0.5;1). You can visualize the function on the open interval (-1, 1) by using the following statements:

x = T( do(-0.99, 0.99, 0.01) );
F = Hypergeometric2F1(x, parm);
 
title "2F1(0.5,0.5; 1; x)";
call series(x, F) grid={x y} label={'x' '2F1(x)'};
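For an independent check of these values (standard-library Python only; names are mine), note that for (a,b,c) = (0.5, 0.5, 1) the substitution u = sin²(t) removes both endpoint singularities: the integral becomes 2∫₀^{π/2} (1 − x sin²t)^{−1/2} dt, and the scale factor is B(0.5, 0.5) = π:

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def hyp2f1_half(x):
    """2F1(0.5, 0.5; 1; x) for |x| < 1.  The substitution u = sin^2(t)
    turns the singular integrand into the smooth 2/sqrt(1 - x sin^2 t);
    the beta-function scale is B(0.5, 0.5) = pi."""
    g = lambda t: 2.0 / math.sqrt(1 - x * math.sin(t) ** 2)
    return simpson(g, 0.0, math.pi / 2, 2000) / math.pi

print(hyp2f1_half(0.0))
```

At these parameter values, 2F1(0.5, 0.5; 1; x) equals (2/π)K(√x), where K is the complete elliptic integral of the first kind, which provides another way to validate the numbers.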

Summary

Although I am not an expert on hypergeometric functions, this article shares a few tips for computing Gauss's hypergeometric function, \(\,_{2}F_{1}(a,b;c; x)\) in SAS for real arguments and for the special case c > b > 0. For this parameter range, you can compute the hypergeometric function as a definite integral. As shown in the next blog post, you can use this computation to compute the PDF of a certain probability distribution.

References

The post Compute hypergeometric functions in SAS appeared first on The DO Loop.

September 7, 2022
 

Monotonic transformations occur frequently in math and statistics. Analysts use monotonic transformations to transform variable values, with Tukey's ladder of transformations and the Box-Cox transformations being familiar examples. Monotonic functions also figure prominently in probability theory because the cumulative distribution function is a monotonic increasing function. For a continuous distribution that is strictly monotonic, the quantile function is also monotonic.

In a recent project, I needed to determine whether a certain function is monotonic increasing. Because the function is only known at a finite sequence of values, the reduced problem is to determine whether a sequence is increasing. I have previously shown that you can use the DIF function in SAS to test for an increasing sequence. This article shows how to apply the method to the problem of deciding whether a function is increasing.

What is a monotone sequence?

There are two types of monotonicity. In a weakly monotonic sequence, adjacent terms can be equal. That is, the difference between adjacent terms is allowed to be 0. In a strongly monotonic sequence (also called strictly monotonic), adjacent terms cannot be equal. For a strictly increasing sequence, the difference between adjacent terms is always positive. For a strictly decreasing sequence, the difference between adjacent terms is always negative. By itself, the term monotonic means the sequence is either increasing or decreasing.

Thus, there are four different tests for monotonicity. You can test whether a sequence is weakly increasing, strictly increasing, weakly decreasing, or strictly decreasing.

Test for monotone increasing in SAS

The following SAS/IML module evaluates a vector and tests whether the elements are increasing. It uses the DIF function to produce a vector that contains the difference between adjacent elements of the input vector. That is, if x = {x[1], x[2], x[3], ..., x[n]} is the input vector, then d = dif(x,1,1) is the vector {x[2]-x[1], x[3]-x[2], ..., x[n]-x[n-1]}. The second argument specifies the lag. The third argument is a flag that specifies whether the result is padded with missing values. (See the Appendix for more information.) By default, the function tests for a weakly increasing sequence.

proc iml;
/* Test whether a sequence of elements is monotonic increasing.
   Valid options are 
   strict=0 : (Default) Return 1 if a sequence is nondecreasing
   strict=1 : Return 1 if a sequence is strictly increasing
*/
start IsIncr(_x, strict=0);
   x = colvec(_x);
   if nrow(x)=1 then return(1);      /* one element is always monotonic! */
   d = dif(x,1,1);                   /* lag=1; delete initial missing value */
   if strict then 
      return( all(d > 0) );
   return( all(d >= 0) );
finish;
 
/* test whether sequences are increasing */
x = {0,2,2,2,6,7,9};
y = {0,1,3,4,6,7,9};
z = {0,1,3,4,2,7,9};
b1 = IsIncr(x);     /* test weakly increasing */
b2 = IsIncr(x, 1);  /* test strictly increasing */
b3 = IsIncr(y, 1);  /* test strictly increasing */
b4 = IsIncr(z);     /* test weakly increasing */
 
print b1 b2 b3 b4;

The IsIncr function is called four times:

  1. The first call returns the value 1 because the elements of x are weakly monotonic (nondecreasing).
  2. The second call returns the value 0 because the elements of x are not strictly increasing.
  3. The third call returns the value 1 because the elements of y are strictly increasing.
  4. The fourth call returns the value 0 because the elements of z are not monotonic.
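For comparison, the same pairwise-difference test is only a few lines in most languages. Here is a sketch in Python (the function name is mine) that mirrors IsIncr by comparing adjacent elements directly:

```python
def is_incr(seq, strict=False):
    """Return True if seq is monotonic increasing.
    By default, equal adjacent terms are allowed (weakly increasing)."""
    pairs = zip(seq, seq[1:])       # adjacent pairs, like DIF with lag 1
    if strict:
        return all(b > a for a, b in pairs)
    return all(b >= a for a, b in pairs)

x = [0, 2, 2, 2, 6, 7, 9]
z = [0, 1, 3, 4, 2, 7, 9]
print(is_incr(x), is_incr(x, strict=True), is_incr(z))   # True False False
```

As in the SAS/IML version, a single-element sequence is vacuously monotonic, because there are no adjacent pairs to compare.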

Test for monotone decreasing in SAS

If a sequence {x1, x2, x3, ...} is monotone increasing, then the sequence obtained by multiplying each element by -1 is monotone decreasing. Therefore, it is trivial to write a function that tests whether a sequence is decreasing: simply test whether the negative of the sequence is increasing! This is accomplished by the following SAS/IML function:

/* Test whether a sequence of elements is monotonic decreasing.
   strict=0 : (Default) Return 1 if a sequence is nonincreasing
   strict=1 : Return 1 if a sequence is strictly decreasing
*/
start IsDecr(x, strict=0);
   return IsIncr(-x, strict);
finish;
 
/* test whether sequence is increasing */
u = {9,8,7,7,6,2,0};
b5 = IsDecr(u);     /* test weakly decreasing */
b6 = IsDecr(u, 1);  /* test strictly decreasing */
print b5 b6;

The sequence is weakly decreasing but not strictly decreasing. The first call to the IsDecr function returns 1. The second call returns 0.

An application: Test whether a transformation is monotonic

I used the IsIncr function to address a general question: Can you detect whether an unknown univariate function is monotonic increasing?

In a recent project, I was modeling the cumulative distribution function (CDF) of an unknown continuous distribution. By definition, a CDF must be increasing, but for some values of the parameters, the model is not an increasing function. I needed to identify these "infeasible" parameter values in a quick and reliable manner.

Mathematically, an increasing function, F, has the property that for every increasing sequence, {x_i}, the image of that sequence, {F(x_i)}, is also increasing. Numerically, you can generate an increasing sequence in the domain and ask whether the image of that sequence is also increasing.

Here's an example. Suppose you want to test whether the following functions are increasing:
F1(x) = (5 - 4*x) log( x/(1-x) )
F2(x) = (5 - 5.2*x) log( x/(1-x) )
The functions are defined on the domain (0, 1). The following program defines an increasing sequence on (0,1) and tests whether the image of the sequence is also increasing for F1 and for F2:

start Func1(x);
   return (5 - 4*x)#log( x/(1-x) );
finish;
start Func2(x);
   return (5 - 5.2*x)#log( x/(1-x) );
finish;
 
dt = 0.005;
x = do(dt, 1-dt, dt);  /* increasing sequence on a fine grid */
y1 = Func1(x);         /* image of sequence under F1 */
y2 = Func2(x);         /* image of sequence under F2 */
b1 = IsIncr(y1);
b2 = IsIncr(y2);
print b1 b2;

The output indicates that the first function is increasing, but the second function is not. The following graph of the second function shows, indeed, that the function is not strictly increasing.

The accuracy of this method depends on choosing a sequence on a fine grid of points in the domain. It assumes that the derivative of the function is bounded so that the function cannot quickly increase and decrease within one of the small gaps between consecutive elements of the sequence.

Summary

This article shows a way to test whether a function is monotonic on an interval. A common application is to test for an increasing function, but it is equally easy to test for a decreasing function. You can test whether the function is weakly increasing or strictly increasing.

These numerical tests evaluate a function at a finite set of points. Consequently, they can rule out the hypothesis of monotonicity, but they do not prove that the function is increasing everywhere. Nevertheless, these tests work well in practice if you choose the input sequence on a fine grid in the domain. If you have a mathematical bound on the derivative of the function, you can say more.

Appendix: A useful option for the DIF function

By default, the DIF function in SAS/IML returns a vector that is the same size as the input vector. If k is the lag parameter, the first k elements are missing values. The differences between elements are contained in the elements k+1, k+2, and so forth. If you specify the value 1 for the third argument, the DIF function deletes the leading missing values. The result is a vector that has n – k elements, where n is the length of the input vector and k is the lag parameter.

For example, the following example passes in k=1 and deletes the leading missing value for the result:

t = {0,3,3,2,7};   /* input argument has 5 elements */
d = dif(t, 1, 1);  /* LAG=1; delete leading missing value */
print d;           /* result has 4 elements */

The post A test for monotonic sequences and functions appeared first on The DO Loop.

7月 182022
 

A colleague was struggling to compute a right-tail probability for a distribution. Recall that the cumulative distribution function (CDF) is defined as a left-tail probability. For a continuous random variable, X, with density function f, the CDF at the value x is
    F(x) = Pr(X ≤ x) = ∫ f(t) dt
where the integral is evaluated on the interval (-∞, x). That is, it is the area under the density function "to the left" of x. In contrast, the right-tail probability is the complement of this quantity. The right-tail probability is defined by
    S(x) = Pr(X ≥ x) = 1 – F(x)
It is the area under the density function "to the right" of x.

The right-tail probability is also called the right-side or upper-tail probability. The function, S, that gives the right-tail probability is called the survival distribution function (SDF). Similarly, a quantile that is associated with a probability that is close to 0 or 1 might be called an extreme quantile. This article discusses why a naive computation of an extreme quantile can be inaccurate. It shows how to compute extreme probabilities and extreme quantiles by using special functions in SAS: the SQUANTILE function for extreme quantiles and the SDF function for extreme probabilities.

Accurate computations of extreme probabilities

My colleague contacted me to discuss some issues related to computing an extreme quantile for p ≈ 1. To demonstrate, I will use the standard lognormal distribution, whose PDF is shown to the right. (However, this discussion applies equally well to other distributions.) When p is small, the lognormal quantile is close to 0. When p is close to 1, the lognormal quantile is arbitrarily large because the lognormal distribution is unbounded.

The following SAS DATA step uses the QUANTILE function to compute extreme quantiles in the left and right tails of the lognormal distribution. The quantiles are associated with the probabilities p=1E-k and p=1 – 1E-k for k=8, 9, 10, ..., 15.

%let Distrib = Lognormal;
data Have;
do pow = -8 to -15 by -1;
   p = 10**pow;                   
   Lx = quantile("&Distrib", p);   /* left-tail quantile: find x such that CDF(x) = p */
   Rp = 1 - p;                     /* numerically, 1-p = 1 for p < MACEPS */
   Rx = quantile("&Distrib", Rp);  /* right-tail quantile: find x such that CDF(x) = 1-p */
   output;
end;
format Rp 18.15;
run;
 
proc print noobs; run;

The second column shows the probability, p, and the Lx column shows the corresponding left-tail quantile. For example, the first row shows that 1E-8 is the probability that a standard lognormal random variate has a value less than 0.00365. The Rp column shows the probability 1 – p. The Rx column shows the corresponding quantile. For example, the first row shows that 1 – 1E-8 is the probability that a standard lognormal random variate has a value less than 273.691.

From experimentation, it appears that the QUANTILE function returns a missing value when the probability argument is greater than or equal to 1 – 1E-12. The SAS log displays a note that states
    NOTE: Argument 2 to function QUANTILE('Normal',1) at line 5747 column 9 is invalid.
This means that the function will not compute extreme-right quantiles. Why? Well, I can think of two reasons:

  • When p is very small, the quantity 1 – p does not have full precision. (This is called the "loss-of-precision problem" in numerical analysis.) In fact, if p is less than machine epsilon, the expression 1 – p EXACTLY EQUALS 1 in floating-point computations! Thus, we need a better way to specify probabilities that are close to 1.
  • For unbounded distributions, the CDF function is extremely flat for values in the right tail. Because the pth quantile is obtained by solving for the root of the function CDF(x) - p = 0, the computation of an extreme-right quantile is numerically imprecise for unbounded distributions.
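The first bullet is easy to demonstrate in any language that uses IEEE double-precision arithmetic. A minimal sketch in Python (not SAS):

```python
import sys

eps = sys.float_info.epsilon    # about 2.22e-16 for IEEE doubles
p = 1e-17                       # a probability smaller than machine epsilon
q = 1.0 - p                     # the subtraction rounds to exactly 1.0
print(q == 1.0)                 # True: the information in p is lost
```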

Compute extreme quantiles in SAS

The previous paragraph does not mean that you can't compute extreme-right quantiles. It means that you should use a function that is specifically designed to handle probabilities that are close to 1. The special function is the SQUANTILE function, which uses asymptotic expansions and other tricks to ensure that extreme-right quantiles are computed accurately. If x is the value returned by the SQUANTILE function, then 1 – CDF(x) = p. You can, of course, rewrite this expression to see that x satisfies the equation CDF(x) = 1 – p.

The syntax for the SQUANTILE function is almost the same as for the QUANTILE function. For any distribution, QUANTILE("Distrib", 1-p) = SQUANTILE("Distrib", p). That is, you add an 'S' to the function name and replace 1 – p with p. Notice that by specifying p instead of 1 – p, the right-tail probability can be specified more accurately.
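The benefit of parameterizing by the right-tail probability can be seen with a distribution whose quantile function has a closed form. For the standard exponential distribution (not the lognormal example above, but the same phenomenon), the left-tail quantile is Q(p) = -log(1-p) and the right-tail quantile is -log(p). The following Python sketch contrasts the naive computation Q(1-p) with the direct right-tail formula:

```python
import math

# Standard exponential distribution: CDF(x) = 1 - exp(-x), so the
# left-tail quantile is Q(p) = -log(1-p), and the right-tail quantile
# (the analog of SQUANTILE) is x = -log(p).
def quantile(p):
    return -math.log(1.0 - p)

def squantile(p):
    return -math.log(p)

p = 1e-15
naive  = quantile(1.0 - p)   # forms 1-p first: catastrophic cancellation
direct = squantile(p)        # keeps full precision: -log(1e-15) = 34.5387...
print(naive, direct)
```

The naive value is wrong in the fourth decimal place because the quantity 1-p cannot represent p to full precision; the direct formula is accurate to machine precision.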

The following SAS DATA step is a modification of the previous DATA step. It uses the SQUANTILE function to compute the extreme-right quantiles:

data Want;
do pow = -8 to -15 by -1;
   p = 10**pow;
   Lx = quantile("&Distrib", p);   /* find x such that CDF(x) = p */
   Rp = 1 - p;
   Rx = squantile("&Distrib", p);  /* find x such that 1-CDF(x)=p */
   output;
end;
format Rp 18.15;
drop pow;
run;
 
proc print noobs; run;

The table shows that the SQUANTILE function can compute quantiles that are far in the right tail. The first row of the table shows that the probability is 1E-8 that a lognormal random variate exceeds 273.69. (Equivalently, 1 – 1E-8 is the probability that the variate is less than 273.69.) The last row of the table shows that 1E-15 is the probability that a lognormal random variate exceeds 2811.14.

Compute extreme probabilities in SAS

When p is small, the "loss-of-precision problem" also affects the computation of right-tail probabilities. That is, the expression 1 – CDF("Distrib", x) might not be very accurate, because the subtraction loses precision when CDF(x) is close to 1. To counter that problem, SAS provides the SDF function, which computes the survival distribution function (SDF). For a value x, the SDF function directly computes the right-tail probability SDF(x) = 1 – CDF(x) = Pr(X > x), without explicitly forming the difference.
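The cancellation can be illustrated with the standard exponential distribution, whose CDF and SDF have closed forms (a Python sketch; the SAS functions are not used here):

```python
import math

x = 40.0
cdf   = 1.0 - math.exp(-x)   # exp(-40) ~ 4.25e-18 is below machine epsilon,
                             # so this expression rounds to exactly 1.0
naive = 1.0 - cdf            # catastrophic cancellation: exactly 0.0
sdf   = math.exp(-x)         # direct survival function: full precision
print(naive, sdf)
```

The naive right-tail probability underflows to exactly zero, whereas the direct survival-function computation retains full precision.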

The following DATA step computes several one-sided probabilities. For each value of x, the program computes the probability that a random lognormal variate is greater than x:

%let Distrib = lognormal;
data SDF;
do x = 500 to 2500 by 500;              /* extreme quantiles  */
   Rp       = 1 - cdf("&Distrib", x);   /* right-tail probability */
   RpBetter = sdf("&Distrib", x);       /* better computation of right-tail probability */
   output;
end;
run;
 
proc print noobs; run;

The differences between the probabilities in the Rp and RpBetter columns are not as dramatic as the differences for the quantiles in the previous section. However, because CDF(x) ≈ 1 when x is large, the subtraction 1 – CDF(x) loses relative precision, whereas the SDF function computes the right-tail probability directly. Consequently, you can see small differences between the columns, and the RpBetter column will tend to be more accurate when x is large.

Summary

This article shows two tips for working with extreme quantiles and extreme probabilities. When the probability is close to 1, consider using the SQUANTILE function to compute quantiles. When the quantile is large, consider using the SDF function to compute probabilities.

The post Tips for computing right-tail probabilities and quantiles appeared first on The DO Loop.

6月 272022
 

Recently, I needed to solve an optimization problem in which the objective function included a term that involved the quantile function (inverse CDF) of the t distribution, which is shown to the right for DF=5 degrees of freedom. I casually remarked to my colleague that the optimizer would have to use finite-difference derivatives because "this objective function does not have an analytical derivative." But even as I said it, I had a nagging thought in my mind. Is that statement true?

I quickly realized that I was wrong. The quantile function of ANY continuous probability distribution has an analytical derivative as long as the density function (PDF) is nonzero at the point of evaluation. Furthermore, the quantile function has an analytical expression for the second derivative if the density function is differentiable.

This article derives an exact analytical expression for the first and second derivatives of the quantile function of any probability distribution that has a smooth density function. The article demonstrates how to apply the formula to the t distribution and to the standard normal distribution. For the remainder of this article, we assume that the density function is differentiable.

The derivative of a quantile function

It turns out that I had already solved this problem in a previous article in which I looked at a differential equation that all quantile functions satisfy. In the appendix of that article, I derived a relationship between the first derivative of the quantile function and the PDF function for the same distribution. I also showed that the second derivative of the quantile function can be expressed in terms of the PDF and the derivative of the PDF.

Here are the results from the previous article. Let X be a random variable that has a smooth density function, f. Let w = w(p) be the pth quantile. Then the first derivative of the quantile function is
\(\frac{dw}{dp} = \frac{1}{f(w(p))}\)
provided that the denominator is not zero. The second derivative is
\(\frac{d^2w}{dp^2} = \frac{-f^\prime(w)}{f(w)} \left( \frac{dw}{dp} \right)^2 = \frac{-f^\prime(w)}{f(w)^3}\).

Derivatives of the quantile function for the t distribution

Let's confirm the formulas for the quantile function of the t distribution with five degrees of freedom. You can use the PDF function in SAS to evaluate the density function for many common probability distributions, so it is easy to get the first derivative of the quantile function in either the DATA step or in the IML procedure, as follows:

proc iml;
reset fuzz=1e-8;
DF = 5;
/* first derivative of the quantile function w = F^{-1}(p) for p \in (0,1) */
start DTQntl(p) global(DF);
   w = quantile("t", p, DF);  /* w = quantile for p */
   fw = pdf("t", w, DF);      /* density at w */
   return( 1/fw );
finish;
 
/* print the quantile and derivative at several points */
p = {0.1, 0.5, 0.75};
Quantile = quantile("t", p, DF);  /* w = quantile for p */
DQuantile= DTQntl(p);
print p Quantile[F=Best6.] DQuantile[F=Best6.];

This table shows the derivative (third column) for the quantiles of the t distribution. If you look at the graph of the quantile function, these values are the slope of the tangent lines to the curve at the given values of the p variable.

Second derivatives of the quantile function for the t distribution

With a little more work, you can obtain an analytical expression for the second derivative. It is "more work" because you must use the derivative of the PDF, and that formula is not built into most statistical software. However, derivatives are easy (if unwieldy) to compute, so if you can write down the analytical expression for the PDF, you can write down the expression for the derivative.

For example, the following expression is the formula for the PDF of the t distribution with ν degrees of freedom:

\( f(x)={\frac {\Gamma ({\frac {\nu +1}{2}})} {{\sqrt {\nu \pi }}\,\Gamma ({\frac {\nu }{2}})}} \left(1+{\frac {x^{2}}{\nu }}\right)^{-(\nu +1)/2} \)
Here, \(\Gamma\) is the gamma function, which is available in SAS by using the GAMMA function. If you take the derivative of the PDF with respect to x, you obtain the following analytical expression, which you can use to compute the second derivative of the quantile function, as follows:
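For reference, carrying out that differentiation on the PDF shown above gives the expression that the Dtpdf module below encodes:
\( f^\prime(x) = {\frac {\Gamma ({\frac {\nu +1}{2}})}{{\sqrt {\nu \pi }}\,\Gamma ({\frac {\nu }{2}})}} \, \frac{-(\nu +1)}{\nu } \, x \left(1+{\frac {x^{2}}{\nu }}\right)^{-(\nu +1)/2-1} \)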
/* derivative of the PDF for the t distribution */
start Dtpdf(x) global(DF);
   nu = DF;
   pi = constant('pi');
   A = Gamma( (nu +1)/2 ) / (sqrt(nu*pi) * Gamma(nu/2));
   dfdx = A * (-1-nu)/nu # x # (1 + x##2/nu)##(-(nu +1)/2 - 1); 
   return dfdx;
finish;
 
/* second derivative of the quantile function for the t distribution */
start D2TQntl(p) global(df);
   w = quantile("t", p, df);
   fw = pdf("t", w, df);
   dfdx = Dtpdf(w);
   return( -dfdx / fw##3 );
finish;
 
D2Quantile = D2TQntl(p);
print p Quantile[F=Best6.] DQuantile[F=Best6.] D2Quantile[F=Best6.];

Derivatives of the normal quantiles

You can use the same theoretical formulas for the normal quantiles. The second derivative, however, does not require that you compute the derivative of the normal PDF. It turns out that the derivative of the normal PDF can be written as the product of the PDF and the quantile function, which simplifies the computation. Specifically, if f is the standard normal PDF, then \(f^\prime(w) = -f(w) w\), and so the second derivative of the quantile function is
\(\frac{d^2w}{dp^2} = w \left( \frac{dw}{dp}\right)^2\).
This is shown in the following program:

/* derivatives of the quantile function for the normal distribution */
proc iml;
/* first derivative of the quantile function */
start DNormalQntl(p);
   w = quantile("Normal", p);
   fw = pdf("Normal", w);
   return 1/fw;
finish;
/* second derivative of the quantile function */
start D2NormalQntl(p);
   w = quantile("Normal", p);
   dwdp = DNormalQntl(p);
   return( w # dwdp##2 );
finish;
 
p = {0.1, 0.5, 0.75};
Quantile = quantile("Normal", p);
DQuantile = DNormalQntl(p);
D2Quantile = D2NormalQntl(p);
print p Quantile[F=Best6.] DQuantile[F=Best6.] D2Quantile[F=Best6.];
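You can cross-check these formulas without SAS by comparing them to centered finite-difference derivatives. The following Python sketch builds the standard normal CDF from math.erf, inverts it by bisection (a deliberately simple method, adequate for a numerical check), and verifies that dw/dp = 1/f(w) and that the second derivative equals w (dw/dp)²:

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_quantile(p, lo=-10.0, hi=10.0):
    # invert the CDF by bisection; plenty accurate for a numerical check
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if normal_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p = 0.75
w      = normal_quantile(p)
dwdp   = 1.0 / normal_pdf(w)         # first-derivative formula
d2wdp2 = w * dwdp ** 2               # second-derivative formula

# centered finite differences of the quantile function
h = 1e-6
fd  = (normal_quantile(p + h) - normal_quantile(p - h)) / (2 * h)
fd2 = (normal_quantile(p + h) - 2 * normal_quantile(p)
       + normal_quantile(p - h)) / h ** 2
print(dwdp, fd)                      # the two values agree closely
```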

Summary

In summary, this article shows how to compute the first and second derivatives of the quantile function for any probability distribution (with mild assumptions for the density function). The first derivative of the quantile function can be defined in terms of the quantile function and the density function. For the second derivative, you usually need to compute the derivative of the density function. The process is demonstrated by using the t distribution. For the standard normal distribution, you do not need to compute the derivative of the density function; the derivatives depend only on the quantile function and the density function.

The post The derivative of a quantile function appeared first on The DO Loop.

4月 252022
 

To a numerical analyst, numerical integration has two meanings. Numerical integration often means solving a definite integral such as \(\int_{a}^b f(s) \, ds\). Numerical integration is also called quadrature because it computes areas. The other meaning applies to solving ordinary differential equations (ODEs). My article about new methods for solving ODEs in SAS uses the words "integrate" and "integrator" a lot. This article caused a colleague to wonder about the similarities and differences between integration as a method for solving a definite integral and integration as a method for solving an initial-value problem for an ODE. Specifically, he asked whether you can use an ODE solver to solve a definite integral.

It is an intriguing question. This article shows that, yes, you can use an ODE solver to compute some definite integrals. However, just as a builder should use the most appropriate tools to build a house, so should a programmer use the best tools to solve numerical problems. Just because you can solve an integral by using an ODE does not mean that you should discard the traditional tried-and-true methods of solving integrals.

The most famous differential equation in probability and statistics

There are many famous differential equations in physics and engineering because Newton's Second Law is a differential equation. In probability and statistics, the most famous differential equation is the relationship between the two fundamental ways of looking at a continuous distribution. Namely, the derivative of the cumulative distribution function (CDF) is the probability density function (PDF). We usually write the relationship \(F^\prime(x) = f(x)\), where F is the CDF and f is the PDF. Thus, the CDF is the solution of a first-order differential equation that has the PDF on the right-hand side. If you know the PDF and an initial condition, you can solve the differential equation to find the CDF.

However, there is a problem. By integrating both sides of the differential equation, you obtain \(F(x) = \int_{-\infty}^x f(s) \, ds\). Because the lower limit of integration is -∞, you cannot solve the differential equation as an initial-value problem (IVP). An initial-value problem requires that you know the value of the function for some finite value of the lower limit.

However, you can get around this issue for some specific distributions. Some distributions have a lower limit of 0 instead of -∞. If the density at x=0 is finite, you can solve the ODE for those distributions.

You can also use an ODE to solve for the CDF of symmetric distributions. If f is the density of a symmetric distribution, the point of symmetry is the median of the distribution. So, if s0 is the point of symmetry, you can solve the ODE \(F^\prime(x) = f(x)\) for any value x ≥ s0 by using the initial condition \(F(s_0) = 0.5\). Thus, you can use an IVP to solve any integral of the form \(\int_{s_0}^b f(s) \, ds\). By symmetry, you can solve any integral of the form \(\int_a^{s_0} f(s) \, ds\). By combining these results, you can use an IVP to solve the integral on any finite-length interval [a, b].

This symmetry idea is shown in the graph to the right for the case of the standard normal distribution. The normal distribution is symmetric about the median s0=0. This means that the CDF passes through the point (0, 1/2). You can integrate the ODE to obtain the CDF curve for any positive value of x. The computations are shown in the next section.

Solve an integral to find the normal probability

I like to know the correct answer BEFORE I do any computation. The usual way to compute an integral in SAS is to use the QUAD subroutine in SAS/IML software. (QUAD is short for quadrature.) This subroutine is designed to compute one-dimensional integrals. To use the QUAD subroutine, you need to define the integrand and interval of integration. For example, the following statements define the integral for the normal PDF on the interval [0, 3]:

proc iml;
/* Use QUAD subroutine to integrate the standard normal PDF.
   Note: For this integrand, you can call the CDF function to compute the integral:
         CDF = cdf("Normal", 3) - 0.5; */
start NormalPDF(x);
   return pdf("Normal", x);   /* f(x) = 1/sqrt(2*pi) *exp(-x##2/2) */
finish;
 
/* find the area under pdf on [0, 3] */
call quad(Area, "NormalPDF", {0 3});
print Area;

The QUAD subroutine makes it simple to integrate any continuous function on a finite interval. In fact, the QUAD subroutine can also integrate over infinite intervals! And, if you specify the points of discontinuity, it can integrate functions that are not continuous at finitely many points.

Solve an ODE to find the normal probability

Just for fun, let's see if we can get the same answer by using the ODE subroutine to solve a differential equation. We want to compute the integral \(F(a) = \int_0^a \phi(s) \, ds\), where φ is the standard normal PDF. The equivalent formulation as a differential equation is \(F^{\prime}(x) = \phi(x)\), where F(0)=0.5. For this example, we want the solution at x=3.

The module that evaluates the right-hand side of the ODE must take two arguments. The first argument is the independent variable, which is x. The second argument is the state of the system, which is not used for this problem. You also need to specify the initial condition, which is F(0)=0.5. Lastly, you need to specify the limits of integration, which are [0, 3]. The following statements use the ODE subroutine in SAS/IML software to solve the IVP defined by F`(x)=PDF("Normal",x) and F(0)=0.5:

/* ODE is A` = 1/sqrt(2*pi) *exp(-x##2/2) 
   A(0) = 0.5  */
start NormalPDF_ODE(s, y);
   return pdf("Normal", s);
finish;
 
c = 0.5;                     /* initial value at x=0 */
x = {0 3};                  /* integrate from x=0 to x=3 */
h={1.e-7, 1, 1e-4};          /* (min, max, initial) step size for integrator */
call ode(result, "NormalPDF_ODE", c, x, h) eps=1e-7;
A = result - c;
print A;

Success! The ODE gives a similar numerical result for the integral of the PDF, which is the CDF at x=3.
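As an independent, language-agnostic check, the same initial-value problem can be integrated with a hand-rolled classical Runge-Kutta stepper (a Python sketch, not the SAS ODE subroutine) and compared against the erf-based normal CDF:

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def rk4_quadrature(f, y0, a, b, n):
    # classical 4th-order Runge-Kutta for y' = f(x). Because the
    # right-hand side does not depend on y, the two middle stages
    # coincide and the step reduces to Simpson's rule.
    h = (b - a) / n
    x, y = a, y0
    for _ in range(n):
        k1 = f(x)
        k2 = f(x + 0.5 * h)
        k4 = f(x + h)
        y += h / 6.0 * (k1 + 4.0 * k2 + k4)
        x += h
    return y

F3 = rk4_quadrature(normal_pdf, 0.5, 0.0, 3.0, 300)   # F(0)=0.5 -> F(3)
exact = 0.5 * (1.0 + math.erf(3.0 / math.sqrt(2.0)))  # Phi(3)
print(F3, exact)
```

The reduction to Simpson's rule is itself instructive: when the vector field does not depend on the state, an ODE integrator is just a quadrature rule in disguise.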

Integrating an ODE is a poor way to compute an integral

Although it is possible to use an ODE to solve some integrals on a finite domain, this result is more of a curiosity than a practical technique. A better choice is to use a routine designed for numerical integration, such as the QUAD subroutine. The QUAD subroutine can integrate on any domain, including infinite domains. For example, the following statements integrate the normal PDF on the intervals [-3, 3] and [0, ∞), respectively.

call quad(Area2, "NormalPDF", {-3 3});  /* area on [-3, 3] */
call quad(Area3, "NormalPDF", {0 .P});  /* area on [0, infinity) */
print Area2 Area3;

In addition, the QUAD subroutine can be used to compute double integrals by using iterated integrals. As far as I know, you cannot use an ODE to compute a double integral.

Whereas the ODE subroutine requires knowing an initial value on the curve, the QUAD routine does not. The ODE subroutine also requires separate calls to integrate forward and backward from the initial condition, whereas the QUAD routine does not.

Summary

The phrase "numerical integration" has two meanings in numerical analysis. The usual meaning is "quadrature," which means to evaluate a definite integral. However, it is also used to describe the numerical solution of an ODE. In certain cases, you can use an ODE to solve a definite integral on a bounded domain. However, the ODE requires special additional information such as knowing an "initial condition." So, yes, you can (sometimes) solve integrals by using a differential equation, but the process is more complicated than if you use a standard method for solving integrals.

The post Two perspectives on numerical integration appeared first on The DO Loop.

4月 112022
 

Many discussions and articles about SAS Viya emphasize its ability to handle Big Data, perform parallel processing, and integrate open-source languages. These are important issues for some SAS customers. But for customers who program in SAS and do not have Big Data, SAS Viya is attractive because it is the development environment in which SAS R&D adds new features, options, and algorithms. Since 2018, most new development has been in SAS Viya. This article discusses one new feature in the SAS IML product: new state-of-the-art algorithms for solving ordinary differential equations (ODEs) that are formulated as initial value problems (IVPs).

In statistics, differential equations arise in several areas, including compartment models in the field of pharmacokinetics. The example in this article is a compartment model, although it is from epidemiology, not pharmacokinetics.

Solving ODEs in SAS

The SAS/IML product introduced the ODE subroutine way back in SAS 6. The ODE subroutine solves IVPs by using an adaptive, variable-order, variable-step-size integrator. This integrator is very good for a wide assortment of problems. The integrator was state-of-the-art in the 1980s, but researchers have made a lot of progress in solving ODEs since then.

The new ODEQN subroutine was introduced in Release 2021.1.5 of SAS IML software. See the documentation for a full description of the features of the ODEQN subroutine. Briefly, it has a simple calling syntax and supports about 36 different integration methods, including the following:

  • BDF: backward differentiation formulas for stiff problems
  • Adams-Moulton method for non-stiff problems
  • Hybrid (BDF + Adams): Automatic detection of stiffness and switching
  • Explicit Runge-Kutta methods: Popular solvers in the sciences and engineering fields

Each class of solvers supports several "orders," which equate to low- or high-order Taylor series approximations.

The new ODEQN subroutine has several advantages over the previous solver, but the following features are noteworthy:

  • Self-starting: the subroutine automatically chooses an initial step size
  • Variable-order methods use higher-order approximations to take larger steps, when possible
  • Automatically detect stiffness (Hybrid method)
  • Support piecewise-defined vector fields
  • Support backward-in-time integration
  • BLAS-aware linear algebra scales to handle large systems

Example: A basic SIR model in epidemiology

Many introductory courses in differential equations introduce the SIR model in epidemiology. The SIR model is a simple model for the transmission and recovery of a population that is exposed to an infectious pathogen. More sophisticated variations of the basic SIR model have been used by modelers during the COVID-19 pandemic. The SIR model assumes that the population is classified into three populations: Susceptible to the disease, Infected with the disease, and Recovered from the disease.

In the simplest SIR model, there are no births and no deaths. Initially, almost everyone is susceptible except for a small proportion of the population that is infected. The susceptible population can become infected by interacting with a member of the infected population. Everyone in the infected population recovers, at which point the model assumes that they are immune to reinfection. The SIR model has two parameters. One parameter (b) controls the average rate at which a susceptible-infected interaction results in a new infected person. Another parameter (k) represents how quickly an infected person recovers and is no longer infectious.

The following SAS IML module defines the SIR equations, which model the proportion of the population that is susceptible, infected, and recovered. Initially, 99.99% of the population is susceptible, 0.01% is infected, and 0% is recovered. The new ODEQN subroutine in SAS Viya solves the system of equations for 100 units of time. The output of the routine is a 101 x 3 matrix: one row for each of the time points 0, 1, ..., 100. Each row contains the proportion of the population in each category at the corresponding time. Notice that the syntax of the ODEQN subroutine is quite simple: You specify the module that defines the equations, the initial conditions, and the time points at which you want a solution.

/* Program requires SAS IML in SAS Viya 2021.1.5 */
proc iml;
/* SIR model: https://en.wikipedia.org/wiki/Compartmental_models_in_epidemiology */
start SIR(t, y) global(b, k);
   S = y[1]; I = y[2]; R = y[3];
   dSdt = -b * S*I;
   dIdt =  b * S*I - k*I;
   dRdt =            k*I;
   return ( dSdt || dIdt || dRdt );
finish;
 
/* set parameters and initial conditions */
b = 0.5;      /* b param: transition rate from S to I */
k = 1/14;     /* k param: transition rate from I to R */
IC = {0.9999  0.0001   0};   /* S=99.99% I=0.01%  R=0 */
t = T(0:100);  /* compute spread for each day up to 100 days */
 
call odeqn(soln, "SIR", IC, t);  /* in SAS Viya only */

You can graph the solution of this system by plotting the time-evolution of the three populations:

/* write solution to SAS data set */
S = t || soln;
create SIROut from S[c={'Time' 'Susceptible' 'Infected' 'Recovered'}];
append from S;
close;
QUIT;
 
title "Solution to SIR Model";
title2 "b = 1/2; k = 1/14";
proc sgplot data=SIROut;
   series x=Time y=Susceptible;
   series x=Time y=Infected;
   series x=Time y=Recovered;
   xaxis grid; yaxis grid label="Percent of Group";
run;

For the parameters (b, k) = (0.5, 0.07), the infected population rapidly increases until Day 25 when almost 60% of the population is infected. By this time, the population of susceptible people is only about 10% of the population. For these parameters, nearly everyone in the population becomes infected by Day 60. The highly infectious (but not lethal) epidemic is essentially over by Day 80 when almost everyone has been infected, recovered, and is now immune.
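The same system is easy to reproduce outside SAS with a hand-rolled fourth-order Runge-Kutta stepper. This Python sketch is for cross-checking the qualitative behavior, not a replacement for the ODEQN subroutine; note that S + I + R = 1 is conserved and that almost no one remains susceptible by Day 100:

```python
def sir_rhs(y, b=0.5, k=1.0 / 14):
    # SIR vector field with the article's parameters b=0.5, k=1/14
    S, I, R = y
    return (-b * S * I, b * S * I - k * I, k * I)

def rk4_step(y, h):
    # one classical Runge-Kutta step for the autonomous SIR system
    def axpy(u, v, c):
        return tuple(ui + c * vi for ui, vi in zip(u, v))
    k1 = sir_rhs(y)
    k2 = sir_rhs(axpy(y, k1, h / 2))
    k3 = sir_rhs(axpy(y, k2, h / 2))
    k4 = sir_rhs(axpy(y, k3, h))
    return tuple(yi + h / 6 * (a + 2 * b_ + 2 * c + d)
                 for yi, a, b_, c, d in zip(y, k1, k2, k3, k4))

y = (0.9999, 0.0001, 0.0)      # S, I, R at t=0
h = 0.1
for _ in range(1000):          # integrate to t = 100 days
    y = rk4_step(y, h)
print(y)                       # S is tiny; R is close to 1
```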

Example: An SIR model for a change in behavior

One of the advantages of mathematical models is that you can simulate "what if" scenarios. For example, what if everyone in the population changes their behavior on Day 20 to reduce the transmission rate? This might include the adoption of non-pharmaceutical interventions such as mask-wearing, social distancing, and quarantining the infected. What if these behaviors change the effective rate of transmission from 0.5 (very high) to 0.1 (moderate)? You can modify the module that defines the ODE to reflect that parameter change at t=20. As mentioned earlier, the ODEQN subroutine can integrate piecewise-defined systems if you specify the time at which the system changes its behavior. The following call to the ODEQN subroutine uses the TDISCONT= option to tell the integrator that the system of ODEs changes at t=20.

%let TChange = 20;
start SIR(t, y) global(k);
   if t < &TChange then b = 0.5;   /* more spread */
   else                 b = 0.1;   /* less spread */
 
   S = y[1]; I = y[2]; R = y[3];
   dSdt = -b * S*I;
   dIdt =  b * S*I - k*I;
   dRdt =            k*I;
   return ( dSdt || dIdt || dRdt );
finish;
 
/* set parameters and initial conditions */
k = 1/14;     /* k param: transition rate from I to R */
IC = {0.9999  0.0001   0};   /* S=99.99% I=0.01%  R=0 */
t = T(0:100);  /* compute spread for each day up to 100 days */
call odeqn(soln, "SIR", IC, t) TDISCONT=&TChange;

If you plot the new solutions, you obtain the following graph:

In the new scenario, the increase in the infected population is reversed by the intervention on Day 20. From there, the proportion of infected persons decreases over time. Eventually, the disease dies out. By Day 100, 75% of the population has been infected and recovered, leaving 25% that are still susceptible to a future outbreak.

Summary

Although moving to SAS Viya offers many advantages for SAS customers who use Big Data and want to leverage parallel processing, Viya also provides new features for traditional SAS programmers. One example is the new ODEQN subroutine in SAS IML, which solves ODEs that are initial value problems by using state-of-the-art numerical algorithms. In addition to computational improvements in accuracy and speed, the ODEQN subroutine has a simple syntax and makes many decisions automatically, which makes the routine easier for non-experts to use.

The post New methods for solving differential equations in SAS appeared first on The DO Loop.

4月 062022
 

The graph to the right is the quantile function for the standard normal distribution, which is sometimes called the probit function. Given any probability, p, the quantile function gives the value, x, such that the area under the normal density curve to the left of x is exactly p. This function is the inverse of the normal cumulative distribution function (CDF).

Did you know that this curve is also the solution of a differential equation? You can write a nonlinear second-order differential equation whose solution is exactly the probit function! This means that you can compute quantiles by solving a differential equation.

This article derives the differential equation and solves it by using the built-in ODE subroutine in SAS/IML software.

A differential equation for quantiles

Let w(p) = PROBIT(p) be the standard normal quantile function. The w function satisfies the following differential equation:
\(\frac{d^2w}{dp^2} = w \left( \frac{dw}{dp}\right)^2\)
The derivation of this formula is shown in the appendix.

A differential equation looks scary, but it merely reveals a relationship between the derivatives of a function at every point in the domain. For example: at the point p=1/2, the value of the probit function is w(1/2) = 0, and the slope of the function is w'(1/2) = sqrt(2 π). From those two values, the differential equation tells you that the second derivative at p=1/2 is exactly 0·2π = 0.

The appendix shows how to use calculus to prove the relationship. You can also use the NLPFDD subroutine in SAS/IML to compute finite-difference derivatives at an arbitrary value of p. The derivatives satisfy the differential equation to within the accuracy of the computation.

Use the differential equation to compute quantiles

An interesting application of the ordinary differential equation (ODE) is that you can use it to compute quantiles. At the point p=1/2, the probit function has the value w(1/2) = 0, and the slope of the function is w'(1/2) = sqrt(2 π). Therefore, the probit function is the particular solution to the previous ODE for those initial conditions. Consequently, you can obtain the quantiles by solving an initial value problem (IVP).

I previously showed how to use the ODE subroutine in SAS/IML to solve initial value problems. For many IVPs, the independent variable represents time, and the solution represents the evolution of the state of the system, given an initial state. For this IVP, the independent variable represents probability. By knowing the quantile of the standard normal distribution at one location (p=1/2), you can find any larger quantile by integrating forward, or any smaller quantile by integrating backward.

The ODE subroutine solves systems of first-order differential equations, so you must convert the second-order equation into a system of two first-order equations. Let v = dw/dp = w`. Then the differential equation can be written as the following first-order system:

   w` = v
   v` = w * v**2

The following SAS/IML statements use the differential equation to find the quantiles of the standard normal distribution for p={0.5, 0.68, 0.75, 0.9, 0.95, 0.99}:

proc iml;
/* Solve first-order system of ODES with IC
   w(1/2) = 0
   v(1/2) = sqrt(2*pi)
*/
start NormQntl(t,y);
   w = y[1]; v = y[2];
   F = v  // w*v**2;
   return(F);
finish;
 
/* initial conditions at p=1/2 */
pi = constant('pi');
c = 0 || sqrt(2*pi);  
 
/* integrate forward in time to find these quantiles */
p = {0.5, 0.68, 0.75, 0.9, 0.95, 0.99};
 
/* ODE call requires a column vector for the IC, and a row vec for indep variable */ 
c = colvec(c);               /* IC */
t = rowvec(p);               /* indep variable */
h={1.e-7, 1, 1e-4};          /* (min, max, initial) step size for integrator */
call ode(result, "NormQntl", c, t, h) eps=1e-6;
Traj  = (c` // result`);     /* solutions in rows. Transpose to columns */
qODE = Traj[,1];             /* first column is quantile; second col is deriv */
 
/* compare with the correct answers */
Correct = quantile("Normal", p);
Diff = Correct - qODE;       /* difference between ODE solution and QUANTILE */
print p Correct qODE Diff;

The output shows the solution to the system of ODEs at the specified values of p. The solution is very close to the more exact quantiles as computed by the QUANTILE function. This is fairly impressive if you think about it. The ODE subroutine has no idea that it is solving for normal quantiles. The subroutine merely uses a general-purpose ODE solver to generate a solution at the specified values of p. In contrast, the QUANTILE routine is a special-purpose routine that is engineered to produce the best-possible approximations of the normal quantiles.
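For readers who want to experiment outside of SAS, the same computation can be sketched in Python, with SciPy's general-purpose solve_ivp integrator standing in for the ODE subroutine:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.stats import norm

def quantile_ode(p, y):
    """First-order system: w' = v, v' = w * v**2."""
    w, v = y
    return [v, w * v**2]

p_targets = [0.5, 0.68, 0.75, 0.9, 0.95, 0.99]
# initial conditions at p = 1/2: w(1/2) = 0, w'(1/2) = sqrt(2*pi)
y0 = [0.0, np.sqrt(2 * np.pi)]
sol = solve_ivp(quantile_ode, t_span=(0.5, 0.99), y0=y0,
                t_eval=p_targets, rtol=1e-9, atol=1e-12)
qODE = sol.y[0]                      # first row: the quantile values
qExact = norm.ppf(p_targets)         # reference values from the inverse CDF
print(np.max(np.abs(qODE - qExact)))  # small difference, as in the SAS output
```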

To get quantiles for p < 0.5, you can use the symmetry of the normal distribution. The quantiles satisfy the relation w(1-p) = -w(p).
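A quick numerical spot-check of the symmetry relation (again a Python/SciPy sketch):

```python
from scipy.stats import norm

p = 0.9
# by symmetry of the normal distribution, w(1-p) = -w(p)
print(norm.ppf(1 - p), -norm.ppf(p))
```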

Summary

A differential equation reveals the relationship between a function and its derivatives. It turns out that the quantile function of any smooth probability distribution satisfies an ordinary differential equation. This article shows the ODE for the standard normal distribution. You can solve the ODE in SAS to compute quantiles if you also know the value of the quantile function and its derivative at some point, such as at the median (p=1/2).

Appendix: Derive the quantile ODE

For the mathematically inclined, this section derives a differential equation for any smooth probability distribution. The equations are given in a Wikipedia article, but the article does not include a derivation.

Let X be a random variable that has a smooth CDF, F. The density function, f, is the first derivative of F. By definition, the quantile function, w, is the inverse of the CDF function. That means that for any probability, p, the CDF and quantile function satisfy the equation \(F(w(p)) = p\).

Take the derivative of both sides with respect to p and apply the chain rule:
\(F^\prime(w(p)) \frac{dw}{dp} = 1\)
Substitute f for the derivative and solve for dw/dp:
\(\frac{dw}{dp} = \frac{1}{f(w(p))}\)

Again, take the derivative of both sides:
\(\frac{d^2w}{dp^2} = \frac{-f^\prime(w(p))}{f(w(p))^2} \frac{dw}{dp}\).
You can use the relationship dw/dp = 1/f to obtain
\(\frac{d^2w}{dp^2} = \frac{-f^\prime(w)}{f(w)} \left( \frac{dw}{dp} \right)^2\).
In this derivation, I have not used any special properties of the distribution. Therefore, the equation holds for any sufficiently smooth probability distribution and for any value of p at which the equation is well-defined.
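As a concrete check of the general formula, consider the standard exponential distribution, for which every quantity has a closed form. The density is \(f(x) = e^{-x}\), the CDF is \(F(x) = 1 - e^{-x}\), and the quantile function is \(w(p) = -\ln(1-p)\). Direct differentiation gives \(\frac{dw}{dp} = \frac{1}{1-p}\) and \(\frac{d^2w}{dp^2} = \frac{1}{(1-p)^2}\). Because \(f^\prime = -f\), the ratio \(-f^\prime(w)/f(w)\) equals 1, so the general ODE predicts \(\frac{d^2w}{dp^2} = \left(\frac{dw}{dp}\right)^2 = \frac{1}{(1-p)^2}\), which matches.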

The ODE simplifies for some probability density functions. For example, the standard normal density is \(f(w) = \frac{1}{\sqrt{2\pi}} \exp(-w^2/2 )\) and thus
\(f^\prime(w) = \frac{1}{\sqrt{2\pi}} \exp(-w^2/2 )(-w) = -f(w) w\).
Consequently, for the standard normal distribution, the ODE simplifies to
\(\frac{d^2w}{dp^2} = w \left( \frac{dw}{dp}\right)^2\).
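If you want to double-check the key simplification f′(w) = −w f(w) symbolically, a few lines of SymPy (an illustration, not part of the original SAS computation) will do it:

```python
import sympy as sp

w = sp.symbols('w')
# the standard normal density
f = sp.exp(-w**2 / 2) / sp.sqrt(2 * sp.pi)
# f'(w) + w*f(w) should simplify to zero
residual = sp.simplify(sp.diff(f, w) + w * f)
print(residual)
```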

The post A differential equation for quantiles of a distribution appeared first on The DO Loop.

March 21, 2022
 

Statistical programmers need to access numerical constants that help us to write robust and accurate programs. Specifically, it is necessary to know when it is safe to perform numerical operations such as raising a number to a power without exceeding the largest number that is representable in finite-precision arithmetic. This article discusses five constants that every statistical programmer must know: PI, MACEPS, EXACTINT, BIG, and SMALL. In the SAS language, you can find the values of these constants by using the CONSTANT function in Base SAS.

The following table shows the values of the constants that are discussed in this article:

data Constants;
length Const $10;
format Value Best12.;
input Const @@;
Value = constant(Const);
datalines;
PI MACEPS EXACTINT 
BIG LOGBIG SQRTBIG SMALL
;
 
proc print data=Constants noobs; run;

Pi and other mathematical constants

The CONSTANT function provides values for various mathematical constants. Of these, the most important constant is π ≈ 3.14159..., but you can also obtain other mathematical constants such as the natural base (e) and the golden ratio (φ). To get an approximation for π, use pi = constant('pi'). The number π is essential for working with angles and trigonometric functions. For an example, see the article, "Polygons, pi, and linear approximations," which uses π to create regular polygons.

Machine epsilon

Arguably the most important constant in computer science is machine epsilon. This number enables you to perform floating-point comparisons such as deciding whether two numbers are equal or determining the rank of a numerical matrix. To get machine epsilon, use eps = constant('maceps').
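For comparison, here is how you might retrieve and use machine epsilon in Python (a sketch; constant('maceps') plays the analogous role in SAS, and both environments use 8-byte IEEE doubles):

```python
import sys

eps = sys.float_info.epsilon        # 2**-52 for IEEE double precision
print(eps)                          # about 2.22e-16

# typical use: tolerance-based comparison instead of exact equality
a, b = 0.1 + 0.2, 0.3
print(a == b)                       # False, because of binary rounding error
print(abs(a - b) <= 4 * eps)        # True: equal to within a small tolerance
```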

The largest representable integer

It is important to know the largest integer that can be represented exactly in double precision on your machine. This number is available as bigInt = constant('exactint'). It enables you to find the largest factorial that you can compute exactly (18!) and the largest row of Pascal's triangle (row 56) whose entries you can faithfully represent in double precision.
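The same facts are easy to confirm in any IEEE double-precision environment. Here is a Python sketch (8-byte doubles represent all integers exactly up to 2**53, which is the value that constant('exactint') reports):

```python
import math

exactint = 2.0**53
print(exactint == exactint + 1)     # True: 2**53 + 1 rounds back to 2**53

# 18! is the largest factorial that fits below 2**53 ...
assert math.factorial(18) < 2**53 < math.factorial(19)
# ... and row 56 is the last row of Pascal's triangle whose largest
# entry, C(56, 28), is below 2**53
assert math.comb(56, 28) < 2**53 < math.comb(57, 28)
```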

The largest representable double

Except for π, the constants I use the most are related to BIG ≈ 1.8E308. I primarily use LOGBIG and SQRTBIG, which you can compute as
logbig = constant('logbig');
sqrtbig = constant('sqrtbig');
These constants are useful for preventing overflow when you perform arithmetic operations on large numbers:

  • The quantity exp(x) can only be computed when x is less than LOGBIG ≈ 709.78.
  • The LOGBIG option supports a BASE suboption (BASE > 1), which you can use to ensure that raising a number to a power does not overflow. For example, constant('logbig', 2) returns log2(BIG) ≈ 1024, which tells you that 2**1023 is the largest power of 2 that does not exceed BIG.
  • The SQRTBIG option tells you whether you can square a number without overflowing.
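The same guard is easy to write in any language. Here is a Python sketch of the three constants and an overflow-safe exponential (safe_exp is a hypothetical helper for illustration, not a SAS or Python built-in):

```python
import math
import sys

BIG = sys.float_info.max            # about 1.8e308
LOGBIG = math.log(BIG)              # about 709.78
SQRTBIG = math.sqrt(BIG)            # about 1.34e154

def safe_exp(x):
    """Return exp(x), or clamp to BIG when exp(x) would overflow."""
    return math.exp(x) if x < LOGBIG else BIG

print(safe_exp(700))                # finite result
print(safe_exp(1000))               # clamped to BIG instead of overflowing
```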

The smallest representable double

What is the smallest positive floating-point number on your computer? It is given by small = constant('small'). There are also LOGSMALL and SQRTSMALL versions, which you can use to prevent underflow. I don't use these constants as frequently as their 'BIG' counterparts. In my experience, underflow is usually not a problem in SAS.
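A Python sketch of the corresponding 'small' quantities, including a glimpse of gradual underflow via subnormal numbers:

```python
import sys

small = sys.float_info.min          # 2**-1022, about 2.2e-308
print(small)

# dividing by 2 does not yet underflow to zero: the result is a
# subnormal number; true underflow to zero happens below 2**-1074
print(small / 2 > 0)                # True (subnormal)
print(small / 2**53 == 0.0)         # True: underflowed to zero
```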

Summary

This article discusses five constants that every statistical programmer must know: PI, MACEPS, EXACTINT, BIG, and SMALL. Whether you need mathematical constants (such as π) will depend on the programs that you write. The MACEPS constant is used to compare floating-point numbers. The other constants are used by computer scientists and numerical analysts to ensure that programs can correctly compute with very large (or very small) numbers without encountering floating-point overflow (or underflow).

The post Five constants every statistical programmer should know appeared first on The DO Loop.