Pi Day

3月 082023

Happy Pi Day! Every year on March 14th (written 3/14 in the US), people in the mathematical sciences celebrate "all things pi-related" because 3.14 is the three-decimal approximation to π ≈ 3.14159265358979....

Modern computer methods and algorithms enable us to calculate 100 trillion digits of π. However, I think it is fascinating to learn about how mathematicians in the pre-computer age were able to approximate π to many decimal places through cleverness and manual paper-and-pencil calculations.

One of best mathematicians of all time was Isaac Newton, who calculated 16 digits of π by hand way back in 1666. Newton accomplished this by combining geometry with several mathematical techniques that were new at the time. These techniques included the new field of integral calculus (which Newton himself invented), and an early use of a truncated infinite series to approximate a function. Newton's formulation of the problem is understandable by any high-school geometry student. Although the manual calculations are tedious, they can be understood by anyone who has taken calculus. In this article, I use a computer to simplify the long calculations in Newton's approximation. The computer calculation avoids the infinite series and provides a 14-decimal approximation to π.

I learned about Newton's approximation from Matt Parker and the "Think Maths" group in the UK, who made a wonderful video about Newton's approximation. They also provide resources for math teachers who want to present this topic to their students.

The geometry of Newton's calculation

The figure at the right shows the geometry behind Newton's calculation of pi. First, draw a semicircle of radius 1/2 centered at the point C(1/2, 0) in the Cartesian plane. Then draw a vertical line segment from A(1/4, 0) until it hits the circle at B. High school geometry shows that the triangle ABC is a 30-60-90 triangle, and the line segment AB intersects the circle at B(1/4, sqrt(3)/4).

Newton knew the following facts:

  1. The angle OCB is 60 degrees. Therefore, the area of the sector OCB is π/6 r2, where r is the radius of the circle. Because the radius is r = 1/2, the area of the sector OCB is π / 24.
  2. The area of the triangle ABC is easily found to be (1/2)*base*height = (1/2)(1/4)(sqrt(3)/4) = sqrt(3)/32. Newton could calculate this quantity to an arbitrary number of decimal places because the square root algorithm was known by the ancient Babylonians and by the Greeks.
  3. The area of the crosshatched region (denoted M) can be found by calculating a definite integral. The equation of the semicircle is \(f(x) = \sqrt{\tfrac{1}{4} - (x-\tfrac{1}{2})^2}\), which simplifies to \(f(x) = \sqrt{x} \sqrt{1-x}\). Therefore, the area can be calculated as \( M = \int_0^{1/4} \sqrt{x} \sqrt{1-x} \, dx\).

From the figure, we see that Area(OCB) = M + Area(ABC). Plugging in the formulas for these quantities, we get
π /24 = M + sqrt(3)/32. Multiplying both sides by 24 gives the equation that Newton used to approximate π:
      π = 24*M + 3 sqrt(3)/4

Newton approximated M to many decimal places, thus providing a decimal approximation for π. Newton used a truncated infinite series expansion to approximate M, but the next section shows an alternative approach: you can use numerical integration to approximate M.

A modern version of Newton's approximation

By using modern computational techniques, you can evaluate the integral (M) numerically. If you evaluate M to high precision, you get a high-precision approximation for π.

Most modern computer languages include a function to evaluate a definite interval on a closed interval. The SAS/IML language provides the QUAD subroutine, which enables you to compute M to many decimal digits by using numerical quadrature (integration). Here is the SAS/IML program that carries out Newton's approximation of pi:

/* Newton's approximation of pi. For details, see https://think-maths.co.uk/newton-pi */
proc iml;
/* integrand is f(x) = sqrt(0.25 - (x-0.5)##2)
                     = sqrt(x*(1-x)) */
start SemiCircle(x);
   return( sqrt(x - x##2) );
/* numerical approximation of M = area under semicircle on [0, 1/4] */
call quad(M, "SemiCircle", {0, 0.25}) EPS=1E-15;
/* AreaSector = pi/6 * r^2 = pi / 24 
   Newton used AreaSector = M + AreaTri, so
     pi / 24 = M + AreaTri
     pi = 24*(M + AreaTri)
AreaTri = 1/2 * (1/4) * (sqrt(3) / 4); /* area of triangle ABC */
piApprox = 24*(M + AreaTri);
print piApprox[F=16.14 L="Newton's Approx"];

The program outputs an approximation to pi that is accurate to 14 decimal digits. This value is the standard double-precision floating-point approximation to pi. The program gives the same result as the SAS function constant('pi').

Newton's approximation by using infinite series

Notice that my approximation only captures the value of pi to 14 decimal digits, whereas Newton achieved 16-digit accuracy. How did Newton get more accuracy? He didn't use a numerical method for integrating the integral. Instead, he used the binomial expansion of the term sqrt(1 - x), which replaces the term by an infinite series. He kept 22 terms of the infinite series and integrated (by hand!) the resulting expression.

I am not going to repeat Newton's tour de force calculation, but you can watch a video in which the Think Maths group in the UK repeats Newton's calculation by hand using long division and extremely large pieces of paper! The group also created a document that shows the details. The main idea is to recall the binomial expansion:
\( (1+z)^{\frac {1}{2}} = 1 +{\tfrac {1}{2}}z -{\tfrac {1}{8}}z^{2} +{\tfrac {1}{16}}z^{3} -{\tfrac {5}{128}}z^{4} +{\tfrac {7}{256}}z^{5} -\cdots = \sum _{n=0}^{\infty }{\frac {(-1)^{n-1}(2n)!}{4^{n}(n!)^{2}(2n-1)}}z^{n} \)

In this case, we want to replace z by (-x), which changes the sign of terms that contain odd powers of z. The result is an infinite series in which each term (except the first) has a negative sign:

\( (1-x)^{\frac {1}{2}} = 1 -{ \tfrac {1}{2}}x -{\tfrac {1}{8}}x^{2} -{\tfrac {1}{16}}x^{3} -{\tfrac {5}{128}}x^{4} -{\tfrac {7}{256}}x^{5} -\cdots \)

Newton kept 22 terms of this infinite series and integrated the function term-by-term on the interval [0, 1/4]. The result is 22 fractions that can be evaluated by long division. However, the numerator and denominator of the higher-order terms are HUGE! For example, the 20th term is the fraction 119,409,675 / (41*275). Although, in principle, you could calculate Newton's approximation by hand (and, remember, he did!), it would not be a pleasant way to spend an evening.


In 1666, Newton computed 16 digits of π by constructing a geometric figure in which the value of pi is related to the numerical approximation of an integral. Newton approximated the integral by expanding a function in an infinite series, truncating the series at 22 terms, and evaluating each term by using long division. Although the formulation and general idea is accessible to calculus students, the calculations are long and tedious. Instead, you can use modern numerical methods to evaluate the integral, thus computing a 14-digit approximation to π.


The post How Newton calculated pi to 16 decimal places appeared first on The DO Loop.

3月 142022

On this Pi Day, let's explore the "πth roots of unity." (Pi Day is celebrated in the US on 3/14 to celebrate π ≈ 3.14159....) It's okay if you've never heard of the πth roots of unity. This article starts by reviewing the better-known nth roots of unity. It then shows all the math you'll need to understand the πth roots of unity and how to visualize them!

The nth roots of unity

Recall that the nth root of unity for a positive integer, n, is a complex number, z, such that the nth power of z equals 1. In symbols, z satisfies the equation zn = 1.

For any positive integer, n, there are n roots of unity. The nth roots of unity are the complex numbers
\(\exp(2\pi i k /n) = \cos(2 \pi k/n) + i \sin(2 \pi k/n)\)
for k=0, 1, 2, ..., n. Geometrically, these points are the equally spaced points on the unit circle. You get them by tracing the image of the point 1 + 0*i as you rotate the circle by 2π/n radians. Some simple examples are:

  • n=2. The square roots of unity are the numbers 1 and -1. Geometrically, these points are represented in the Argand plane by the ordered pairs (1, 0) and (-1, 0). They are the orbit of the point (1,0) under rotations by π radians.
  • n=3. The cube roots of unity are the numbers 1, -1/2 + i sqrt(3)/2, and -1/2 - i sqrt(3)/2. These points are the orbit of the point (1,0) under rotations by 2π/3 radians.
  • n=4. The 4th roots of unity are the numbers 1, i, -1, and -i. The coordinates of these points on the unit circle are (1,0), (0, 1), (-1, 0), and (0, -1). These points are the orbit of the point (1,0) under rotations by π/2 radians.

The πth roots of unity

So, what is a πth root of unity? It is complex number, z, such that zπ = 1. Geometrically, it is one of the points on the orbit of the point (1,0) by rotations of 2*pi/pi = 2 radians. These complex numbers have the formula \(z = \exp(2 i n)\) for any integer, n. You can verify that these numbers are the πth roots of unity by showing that \(z^\pi = 1\), as follows:
\(\exp(2 i n)^\pi = \exp(2\pi i n) = \cos(2 \pi n) + i \sin(2 \pi n) = 1\) for all integers n.

Notice that the "circle map" x → exp(i x) sends the even integers to the πth roots of unity. Thus, you can also refer to the points as the image of the even integers under the mapping.

Geometrically, the πth roots are generated by rotating the unit circle by 2 radians or by -2 radians. For simplicity, the subsequent sections visualize the πth roots for only the positive integers, which correspond to counterclockwise rotations by 2 radians.

Visualize the πth roots of unity

Let's generate some positive integers and plot the complex points exp(2 i n) in the Argand plane. To get started, let's plot only the images of the first 25 even integers:

%let maxN = 355;
data Roots;
do n = 1 to &maxN;
   s = 2*n;
   xn = cos(s);   /* image of even integers under mapping to unit circle */
   yn = sin(s);
/* define a helpful macro */
%macro PlotCircle();
ellipseparm semimajor=1 semiminor=1 / xorigin=0 yorigin=0 outline lineattrs=(color=cxCCCCCC);
ods graphics / width=400px height=400px;
title "Wrap Positive Even Integers onto a Circle";
title2 "First 25 Even Integers";
proc sgplot data=Roots(obs=25) aspect=1 noautolegend;
   scatter x=xn y=yn / datalabel=s;
   refline 0 / axis=x;   refline 0 / axis=y;
   xaxis display=none;   yaxis display=none;

Geometrically, you can think of wrapping the real number line onto the circle like thread onto a spool. The points show the images of the even integers. You can see that 2 is mapped into the second quadrant, 4 is mapped to the third quadrant, and 6 is mapped to the fourth quadrant just below the point (1,0) because 6 radians is slightly less than 2π. This pattern repeats for the images of 8, 10, and 12. Each set of three points slightly lags the previous set of three points.

The image of 22 is very close to (-1, 0). This is because 22/7 is a close approximation of π, so 22 ≈ 7π. Similarly, the image of 44 ≈ 14π is close to the point (1, 0).

The first 22 images are nearly equally spaced on the circle, but not quite. You can see that the images of 46, 48, and 50 almost overlay the images of 2, 4, and 6, but are slightly separated. This pattern continues for the next set of 22 even integers, and the next, and the next. No image ever hits a previous image because we are rotating by 2 radians and there are 2π radians in a circle. (These numbers are not commensurable.) The following graph shows the image of a few hundred even integers onto the circle:

This graph visualizes the πth roots of unity. The images of all positive and all negative integers will never overlap or repeat. By the irrational rotation theorem (rescaled), the πth roots of unity are dense in the circle.

The Beatty sequence on the circle

In a previous article, I showed that every positive integer belongs to one of two disjoint sets. Each positive integer belongs to the Beatty sequence for π, or it belongs to the complementary sequence. The same is true for the even positive integers: some are elements of the Beatty sequence for π, and the rest are not. Let's see what happens if we color each point on the circle according to whether it is the image of the Beatty sequence:

Did you expect this result? Wow! The graph shows that all of the integers that belong to the Beatty sequence are mapped into two arcs on the unit circle. The red arc in the second quadrant is for radian angles in the interval (π-1, π). The red arc in the fourth quadrant is for radian angles in the interval (2π-1, 2π). These values can be deduced by using the definition of the Beatty sequence, and the fact that a Beatty number can be represented as floor(k π) = k π - f for some fractional part f in the interval (0, 1).

The graph shows that (1/π)th of the πth roots of unity belong to the Beatty sequence for π. And these numbers are all clumped together into two arcs on the circle.

Technically, the Beatty sequence is defined only for positive integers. Thus, I've illustrated this result for positive integers. However, you can extend the Beatty sequence to all non-zero integers in a natural way. If you do that, then the roots of unity that belong to the "negative Beatty sequence" are visualized by reflecting the circle across the horizontal axis. If you overlay the two circles, the extended red and blue markers interlace, and the result is not as striking.


There are infinitely many πth roots of unity. They are complex numbers of the form \(\exp(2 i n)\) for any integer, n. Geometrically, you can visualize these points as the rotations by ±2 radians of the point (1,0) in the Argand plane. The πth roots of unity are dense in the unit circle. If you restrict your attention to only positive rotations (the images of the positive integers), you can classify each root according to whether it is the image of an even integer in the Beatty sequence for π or in the complementary sequence. The points that are in the Beatty sequence are mapped to the arcs for angles (π-1, π) and (2π-1, 2π). These arcs are (1/π)th of the arclength of the unit circle.

You can download the SAS program that creates these images.

Further reading

The post The pi_th roots of unity appeared first on The DO Loop.

3月 142022

Irony of using square wheelsEvery year on March 14, we celebrate Pi Day in recognition of the famed mathematical constant π. This date in the format MM/DD is 3/14 which coincides with the first three digits of the π value 3.14.

As you all know, π represents the ratio of a circle's circumference to its diameter. Therefore, we can derive a circle's circumference C from its diameter d using the following famous formula:

C = πd (or C = 2πr using radius r).

While it is impossible to overstate the significance of π in its fundamental and legendary role of being a constant representing the circumference/diameter ratio, its usage for area and volume measurements is not as much of a “settled science” as you may think. In this post, I dare to debunk the notion of π being an inherent component of the omnipresent circle area and sphere volumes formulas:

A = πr2 = π/4 ⋅ d2

V = 4π/3 ⋅ r3 =  π/6 ⋅ d3

Believe it or not,  we can significantly simplify these formulas by getting rid of π altogether. Wanna know how? Just sit down, relax, open your mind and fasten your seat belt.

Misnomer alert!

(Definition: A misnomer is a wrong or inaccurate use of a name or term.)

Let’s start with some questionable and overreaching semantics in algebra. Have you ever noticed that we call a number to the power of 2 as the number squared? Similarly, we call a number to the power of 3 as the number cubed.  However, besides 2 and 3, we treat all other exponents Np equally by saying “N to the power of p”.

Here is how Wikipedia explains this exponentiation phenomenon:

“The expression b2 = bb is called "the square of b" or "b squared", because the area of a square with side-length b is b2.

Similarly, the expression b3 = bbb is called "the cube of b" or "b cubed", because the volume of a cube with side-length b is b3.

Wait a minute! Exponents b2 and b3 are mathematical operations from the Algebra Department. They have nothing to do with geometrical shapes and exist independently from geometry and area/volume calculations.

The same semantical problem persists with the reverse terms square root and cube root. All natural numbers N have their corresponding root operations, which we call their Nth root. So why do we need this suggestive verbal association of the area/volume measurement units to squares and cubes?

Moreover, the very word “area” itself is often replaced by square footage, square mileage, or square something thus further implying that areas and squares are synonymous. To me this is like calling a round wheel “a rolling square”.

Overall, this square/cube-centric language insidiously conditions our minds into thinking only in terms of squares and cubes regarding the units of area and volume measurement.

Why do we measure area of a circle in squares?

We all know from middle school’s elementary geometry that area of a circle of radius r is A = πr2 (or A = πd2/4 using diameter d since r = d/2). Many famous ancient mathematicians from Archimedes to Eudoxus of Cnidus to Hippocrates of Chios have proven this formula. There is also a multitude of its proofs using modern methods rooted in trigonometry and calculus.

However, all these methods have one funny thing in common. It is the presumption that an area must be measured in squares (square inches, square meters, square miles, etc.). Think about it for a minute: for thousands of years we have been measuring round areas by the number of squares with a side=1 covering that round area. C’mon! We invented the wheel, electricity, engines, airplanes, telephone, radio, TV, computers, spaceships, artificial intelligence and more, but we are still measuring round areas in squares!

It makes perfect sense measuring rectilinear areas in square units – they are a good fit for each other. But measuring round areas in squares?! What would you say if we were measuring square areas by the number of round coins covering them? How is that not as good as that the opposite?

Introducing circular units of areas

Defining units of measurement is totally arbitrary. All we need is a well-defined gauge. For example, we have a variety of length units such as foot (derived from the size of the human foot), inch (derived from the width of the human thumb), meter (derived from the Greek verb μετρέω  - measure, count) and so on. They are all convertible to each other, but their definitions are quite different.

When it comes to the units of area, it can be a square, a rectangle, a circle, an oval, a star, even a snowflake or any other two-dimensional shape of a specified size as long as its definition is clear and unambiguous.

Without further ado let’s define a circular unit of area as an area of a circle with the diameter d=1 unit of length (inch, foot, meter, etc.).

Measuring circle areas in circular units

In order to measure areas of different sized circles, we do not need to count the number of these single unit circles within the measured circle. We just place these two circles concentrically and expand or shrink our gauge circle until they align.  Here is a visual to illustrate this:

For a circle with the diameter d, we can express its area measured in the circular units Ac using the following formula:Ac(d) = d ⋅ d = d2

I would not call d2 as “d squared” here. The power of 2 is an indication of two-dimensional area while d is a one-dimensional length.

Let’s prove this formula. Suppose As(d) is a circle area with diameter d in square units and Ac(d) is the circle area in circular units.

As we know As(d) = πr2 = π (d/2)2 =  (π/4)d2. For the same circle its area in circular units is Ac(d).

For d=1, we have As(1) = πr2 = π (d/2)2 =  (π/4)d2 = π/4. The same circle area represents 1 circular area unit Ac(1)=1.

Therefore, Ac(d) =  Ac(1) d2 =  d2, that is we effectively eliminated π from the circle area calculation.

Measuring squares in circular units

Circle and square of equal size
At the same time, π is not the kind of number to easily get rid of. Similar to the law of conservation of energy, there is probably a law of conservation of π: whenever we get rid of it in one place, it appears in another place.

Let’s suppose we have a square with side length of x. Then its area in square units will be As(x) = x2. If we take a circle of the same area we get the following equation As(x) = x2 = (π/4)d2. From here d2 = (4/π)x2. Since d2 represents the number of circular units Ac we can say that the area of a square with side length of x in circular units is Ac(x) = (4/π)x2.

Notice that π-wise these two formulas for area, Circle_A=(π/4)d2 [square units]  and Square_A=(4/π)x2 [circular units], are literally upside-down version of each other.

Spheric units for 3D geometry

We can apply exact same approach to measuring volumes of three-dimensional objects. In the cubic unit’s paradigm, we define one cubic unit of volume as a cube with a side of one unit of length. Then the volume of a cube with a side of x is Cube_V=x3 [cubic units]  and the volume of a sphere with a diameter of d=2r is Sphere_V = (4/3)πr3 = (π/6)d3 [cubic units].

Alternatively, in the spheric unit’s paradigm we define one spheric unit of volume as a sphere with a diameter of one unit of length. Then the volume of a sphere with a diameter of d is Sphere_V=d3 [spheric units]  and the volume of a cube with a side of x is Cube_V=(6/π)x3 [spheric units].

Again, here I would not call d3 and x3 as “d cubed” and “x cubed” since the power of 3 is indicative of three-dimensional space while d and x are one-dimensional units of length.

Similarly to area calculations, π-wise these two formulas for volume, Sphere_V=(π/6)d3 [cubic units]  and Cube_V=(6/π)x3 [spheric units], are literally upside-down version of each other.

Is this practical?

If you still doubt the validity and/or practicality of my geometric deductions, check out circular mil units of area adopted for wire size measurement by the US National Electrical Code (NEC) and the Canadian Electrical Code (CEC).

A circular mil (cmil) is a unit of area equal to the area of a circle with a diameter of one mil (one thousandth of an inch or approximately 0.0254 mm).

While applied just to one particular unit of length – mil, it is exactly the same circular units of area concept as described in the previous sections.

As you can see, electrical engineers have figured out a use for circular units, perhaps because they are dealing with electrical wires, which are predominantly round in their cross section. Since electrical conductivity of wires is directly proportional to their cross-section area, it is just more convenient to use circular units (without π) for calculating the cross section and ultimately electrical conductivity. The same is true for the throughput capacity while transporting any substance via pipes (water, gas, oil, etc.) or moving blood through the blood vessels.

There are many other applications dealing with the circular shapes and spherical objects. In all these cases, it makes more sense to measure areas in circular units rather than squares and measure object volumes in spheric units rather than cubes. Take for example physics (gravitational, electrical and magnetic fields), astronomy (planets, stars, comets, and orbits), geology (areas around earthquake epicenters), telephony (areas of cellular tower coverage), etc.

Your thoughts, comments

I would like to hear from you. What do you think about all this? Please share your thoughts in the Comments section below.

ACCESS FREE SAS® SOFTWARE | SAS® OnDemand for Academics

Calculating circle areas and sphere volumes without π was published on SAS Users.

3月 092022

Did you know that you can use π to partition the positive integers into two disjoint groups? It's not hard. One group is generated by the integer portions of multiples of π. The FLOOR function gives the integer portion of a positive number, so you can write integer that are generated from π as Bn = {floor(n*π)} for n=1,2,3,.... This is called the Beatty sequence for π. The first few numbers in the Beatty sequence for π are 3, 6, 9, 12, 15, 18, 21, 25, 28, 31, 34, 37, 40, 43, 47, 50, 53, ....

The second group contains all the positive integers that are not in the Beatty sequence: 1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, 24, 26, .... A remarkable fact is that the second group of integers is also the Beatty sequence for some number! In fact, it is the Beatty sequence for the number π/(π-1). So, the positive integers are partitioned into two mutually disjoint groups by using the Beatty sequence for π and the complementary Beatty sequence for π/(π-1). These two sequences generate all positive integers, and no integer is in both sequences. For example, the number 2020 appears only in the Beatty sequence for π whereas 2022 appears in the complementary sequence.

It turns out that the only properties of π that are needed for this result is the fact that π is irrational and π > 1, a result that is known as Beatty's Theorem. This article uses SAS to illustrate the Beatty sequence for π and its complementary sequence.

The Beatty sequence

For any irrational number r > 1, the Beatty sequence for r is integer portion of the sequence r, 2*r, 3*r, .... You can compute the Beatty sequence as Bn = {floor(n*r)} for n=1,2,3,.... The complementary sequence is generated by the irrational number c = r/(r-1) in the same way. The complementary sequence is Cn = {floor(n*c)} for n=1,2,3,....

With these definitions, Beatty's theorem (also called Rayleigh's theorem) states that for any irrational number, r > 1, the Beatty sequence and the complementary sequence are disjoint and generate all positive integers. For every positive integer, y, either y is an element of the Beatty sequence or y is an element of the complementary sequence. Thus, the Beatty sequence and its complementary sequence partition the integers into two disjoint parts.

For a formal proof, see the Wikipedia article or excellent "Cut the Knot" web site.

The Beatty sequence for pi

Let's illustrate the theorem by using r = π and a finite number of positive integers. The following example uses the first 355 terms of the Beatty and complementary sequences, which cover the first 520 positive integers. Why 355 terms? Because 355/113 is part of the continued fraction expansion of π, and is an excellent approximation of π.

First, let's visualize the individual sequences. Each sequence is the restriction of a line to the postive integers. The Beatty sequence Bn = {floor(π n)} is obtained from the line through the origin with slope π; the complementary sequence Cn is obtained from the line with slope π/(π - 1) ≈ 1.4669.... The following SAS DATA step generates values for each sequence, then sorts the sequence values. The values are graphed and color-coded according to whether they belong to the Beatty sequence ('B') or the complementary sequence ('C').

%let maxN = 355;            /* because pi ~ 355/113 */
data Beatty;
pi = constant('pi');
Sequence = 'B';             /* slope = pi */
do n = 1 to &maxN;
   s = floor(pi*n);         /* the Beatty sequence */
Sequence = 'C';             /* slope = pi/(pi-1) */
do n = 1 to &maxN;
   s = floor(pi/(pi-1)*n);  /* the complementary sequence */
/* find the lessor of the maximum values of the sequences */
MB = floor(pi* &maxN);
MC = floor(pi/(pi-1)* &maxN);         
call symputx('maxS', min(MB, MC));    /* for maxN=355, maxS=520 */
drop MB MC pi;
%put &=maxS;   /* display value in SAS log */
proc sort data=Beatty;
   by s;
ods graphics / width=400px height=560px;
title "Beatty and Complementary Sequences for pi";
proc sgplot data=Beatty;
   scatter x=n y=s / group=Sequence markerattrs=(symbol=CircleFilled);
   xaxis grid;
   yaxis max=&maxS grid LABEL="Sequence Value";

The red line has slope π, and the blue line has slope π/(π-1). The domain of these functions contains the integers in [1, 355]. The range of the red line is the Beatty sequence for π; the range of the blue line is the complementary sequence.

Because of the scale, it is difficult to determine how the range of the two linear functions interlace. Let's print out a few values of the interlaced sequences:

proc print data=Beatty(obs=12) noobs;
   var s Sequence;

The output shows that the first two integers are from the complementary sequence whereas the third is from the Beatty sequence for π. I call this the "two out, one in" pattern. This pattern continues for the first 12 observations. Now, clearly this pattern cannot continue forever, or else 1/3 of the positive integers would be in the Beatty sequence. However, we know that (1/π)th of the integers belong to the Beatty sequence by considering the limit of the ratio n/Bn as n → ∞. Thus, slightly less than 1/3 of the integers are in the Beatty sequence. Let's plot the first 60 positive integers and visualize whether each integer belongs to the Beatty sequence or the complementary sequence, as follows:

data SeqViz;
set Beatty(where=(s<=60));
z = 1;     /* for plotting the points as a strip plot */
ods graphics / width=640px height=150px;
title "Beatty and Complementary Sequences for pi";
title2 "First 60 Integers";
proc sgplot data=SeqViz noautolegend;
   where s<=60;
   yaxis display=none;
   xaxis grid gridattrs=(color=CXF0F0F0) values=( 0 to 21 by 3
                      25 to 43 by 3
                      47 to 60 by 3 ) valueshint;
   scatter x=s y=z / group=Sequence datalabel=Sequence
           datalabelpos=top datalabelattrs=(size=12)
           markerattrs=(symbol=SquareFilled size=12);

The graph shows that every so often there are three (rather than two) consecutive elements of the complementary sequence. This first happens for the integers 22, 23, and 24. The fact that the "extra" element of the complementary sequence appears at 22 is not random. It is related to the fact that 22/7 is a good approximation to π. After that, the pattern returns to "two out, one in" until the next multiple of 22, which is 44, when another triplet of integers belong to the complementary sequence.

The pattern deviates again for other numerators that are associated with the continued fraction expansion of π, such as 355/113. The next graph shows the membership of integers in the range [302, 361]. The pattern of "two out, one in" is broken at 308 = 22*14 and at 330=22*15, but then the pattern deviates again at 355, which is not a multiple of 22.

Check that the Beatty and complementary sequences are disjoint

We have asserted that the Beatty and complementary sequences are disjoint and that their union is the set of all positive integers. You can't use numerical computations to prove this fact, but you can verify that it is true for the finite set of numbers that we have generated. The following SAS/IML program checks that the two sequences do not intersect and that their union covers all positive integers (up to 520).

/* Check union and intersection properties of Beatty and complementary sequences */
proc iml;
use Beatty;
   read all var 's' into B where(Sequence='B');
   read all var 's' into C where(Sequence='C');
/* is there any intersection? */
intersect = xsect(B, C);
if IsEmpty(intersect) then 
   print "B and C do not intersect";
else print "The intersection is:", intersect;
/* use the UNION function */
max = min( max(B), max(C) );
union = union(B, C);
union = union[ , 1:max];  /* only test for values in [1, max] */
if all(union = 1:max) then 
   print "All integers are found";
   print "Some integer is missing";

The program verifies (for positive integers less than 520) that the Beatty and complementary sequences do not have any intersection. Furthermore, each integer is either an element of the Beatty sequence for π or it is an element of the complementary sequence.

I want to point out a cool SAS/IML syntax: The IF-THEN statement can be used to test whether one vector equals another vector. In the program, the statement all(union = 1:max) returns true is every element of the vector union is equal to the corresponding element of the vector {1,2,3,...,max}.


This article describes Beatty's theorem, which is an interesting result in mathematics. Given any irrational number, r > 1, the Beatty sequence for r and the complementary sequence partition the positive integers into two disjoint groups: the Beatty sequence for r and the complementary sequence. The article illustrates the theorem by using r = π and shows some connections with the continued fraction expansion of π.

The post The Beatty sequence for pi appeared first on The DO Loop.

3月 102021

This is my Pi Day post for 2021. Every year on March 14th (written 3/14 in the US), geeky mathematicians and their friends celebrate "all things pi-related" because 3.14 is the three-decimal approximation to pi. Most years I write about lower-case pi (π), which is the ratio of a circle's circumference to its diameter. But this year I thought it would be fun to write about upper-case pi (\(\Pi\)), which is the mathematical notation for a product of terms.

Just as \(\Sigma\) is used to express a summation, so \(\Pi\) is used to express a product. Did you know that there are infinite products that converge to pi? A famous one is the Wallis product, which converges to π/2:
\(\pi / 2 = \prod_{n=1}^{\infty} \left( \frac{2n}{2n-1} \cdot \frac{2n}{2n+1} \right) = \left( \frac{2}{1}\cdot \frac{2}{3} \right) \cdot \left( \frac{4}{3}\cdot \frac{4}{5} \right) \cdot \left( \frac{6}{5}\cdot \frac{6}{7} \right) \cdot \left( \frac{8}{7}\cdot \frac{8}{9} \right) \cdot \;\cdots \)

Implement Wallis's formula for pi

Suppose you wanted to implement Wallis's formula on a computer. How would you do it? The simplest way is to literally translate the formula into a loop that multiplies the first N terms in the product. You can plot the partial products against the number of terms to see how quickly (or slowly!) the product converges, as shown in the following SAS DATA step:

ods graphics / reset;
%let N = 1000;
data WallisProd;
prod = 1;
do n = 1 to &N;
   term = (2*n/(2*n-1)) * (2*n/(2*n+1));
   prod = prod * term;                  
title "Convergence of Wallis Product to pi/2";
proc sgplot data=WallisProd;
   where n >= 100;
   scatter x=n y=prod;
   refline &pi2 / axis=y noclip lineattrs=(color=red) label="pi/2";
   xaxis grid; yaxis grid;

The graph indicates that the convergence of the Wallis product is very slow. In fact, you can prove that the n_th term of the product has an asymptotic error of size π/(8n) as n → ∞. Because of this, the Wallis formula is not a good choice for computing the digits of pi. It takes about 400,000 terms to get six-digit accuracy!

But let's not focus on π, let's focus on the product, \(\Pi\). Although a direct translation of the product formula is simple, it is not necessarily the best way to compute the product, computationally speaking. Pi Day seems like a great time to review an alternative computational method that is often useful when you compute with products.

Products and log transformations in statistics

The Wallis product is well behaved because the individual terms are all close to 1. However, many products that you encounter in computational statistics are not so well behaved. For example, in statistical modeling, you encounter the likelihood function. Given a set of data (x1, x2,..., xn) and a probability density function (f) that you believe models the data, the likelihood of the data is the product \(\prod_{i=1}^{n} f(x_{i})\). In this formulation, every term in the product is less than 1. If the data contain outliers, there might be individual probabilities that are tiny. Therefore, a naive computation of the product can result in a very small number. In fact, the computation might encounter an underflow error, especially for large data sets.

A computational technique that avoids this potential problem is to apply a log transformation, which converts the product into a summation of the log of the terms. After you have computed the sum, you exponentiate it to obtain the product. This technique requires that all terms in the product be positive, which is true in most applications.

In symbols, if you have a product of positive values, \(y = \prod_{i} a_{i}\), you can log-transform it to \(\log(y) = \sum_{i} \log(a_{i})\). The product is mathematically equivalent to \(y = \exp\left( \sum_{i} \log(a_{i}) \right)\).

Let's apply the log transformation to the Wallis formula for π/2. The following SAS DATA step computes a sum of log-transformed terms. After each step, the EXP function gives the value of the product at that step.

data WallisSum;
do n = 1 to &N;
   logterm = 2*log(2*n) - log(2*n-1) - log(2*n+1);  /* log transform of terms */
   logprod + logterm;                               /* summation */
   prod = exp(logprod);                             /* exp(sum) */

The graph of the partial products is the same graph as in the previous section. The numerical difference between the two computations is less than 3E-14. However, in general, it is usually safer to sum the log-transformed terms because the method avoids numerical overflow and underflow issues that are common when you compute a product in a naive way.

So, this Pi Day, don't just celebrate lower-case pi, π = 3.14159265.... Take a moment to think about upper-case pi (\(\Pi\)), which stands for a product of numbers. For many products, a numerically stable way to compute the product to log-transform the problem, compute a summation, and then exponentiate the sum. And that is a numerical technique that is worth celebrating, not just on Pi Day but every day!

Further reading

The post Pi and products appeared first on The DO Loop.

3月 112020
Regular polygons approximate a circle

Recently, I saw a graphic on Twitter by @neilrkaye that showed the rapid convergence of a regular polygon to a circle as you increase the number of sides for the polygon. The author remarked that polygons that have 40 or more sides "all look like circles to me." That is, a regular polygon with a large number of sides is visually indistinguishable from a circle.

I had two reactions to the graphic. The first was "Cool! I want to create a similar figure in SAS!" (I have done so on the right; click to enlarge.) The second reaction was that the idea is both mathematically trivial and mathematically sophisticated. It is trivial because it is known to Archimedes more than 2,000 years ago and is readily understood by using high school math. But it is sophisticated because it is a particular instance of a mathematical limit, which is the foundation of calculus and the mathematical field of analysis.

The figure also demonstrates the fact that you can approximate a smooth object (a circle, a curve, a surface, ...) by a large number of small, local, linear approximations. It is not an exaggeration to say that that concept is among the most important concepts in applied mathematics. It is used in numerical methods to solve nonlinear equations, integrals, differential equations, and more. It is used in numerical optimization. It forms the foundations of computer graphics. A standard technique in computational algorithms that involves a nonlinear function is to approximate the function by the first term of the Taylor expansion—a process known as linearization.

An approximation of pi

Archimedes used this process (local approximation by a linear function) to approximate pi (π), which is the ratio of the circumference of a circle to its diameter. He used two sets of regular polygons: one that is inscribed in the unit circle and the other that circumscribes the unit circle. The circumference of an inscribed polygon is less than the circumference of the circle. The circumference of a circumscribed polygon is greater than the circumference of the circle. They converge to a common value, which is the circumference of the circle. If you apply this process to a unit circle, you approximate the value 2π. Archimedes used the same construction to approximate the area of the unit circle, which is π.

You can use Archimedes's ideas to approximate π by using trigonometry and regular polygons, as shown in the following paragraph.

Suppose that a regular polygon has n sides. Then the central angle between adjacent vertices is θ = 2π/n radians. The following diagram illustrates the geometry of inscribed and circumscribed polygons. The right triangle that is shown is half of the triangle between adjacent vertices. Consequently,

  • The half-length of a side is b, where b = cos(θ/2) for the inscribed polygon and b = tan(θ/2) for the circumscribed polygon.
  • The height of the triangle is h, where h = |sin(θ/2)| for the inscribed polygon and h = 1 for the circumscribed polygon.
  • The circumfernce of the regular polygon is 2 n b
  • The area of the regular polygon is 2 n ( b h/2 ).

Approximate pi by using Archimedes's method

Although Archimedes did not have access to a modern computer, you can easily write a SAS DATA step program to reproduce Archimedes's approximations to the circumference and area of the unit circle, as shown below:

/* area and circomference */
data ApproxCircle;
pi = constant('pi');           /* Hah! We must know pi in order to evaluate the trig functions! */
south = 3*pi/2;                /* angle that is straight down */
do n = 3 to 100;
   angle = 2*pi/n;
   theta = south + angle/2;    /* first vertex for this n-gon */
   /* Circumference and Area for circumscribed polygon */
   b = tan(angle/2);
   h = 1;
   C_Out = 2*n * b;
   A_Out = 2*n * 0.5*b*h;
   /* Circumference and Area for inscribed polygon */
   b = cos(theta);
   h = abs( sin(theta) );
   C_In = 2*n * b;
   A_In = 2*n * 0.5*b*h;
   /* difference between the circumscribed and inscribed circles */
   CDiff = C_Out - C_In;
   ADiff = A_Out - A_In;
label CDiff="Circumference Difference" ADiff="Area Difference";
keep n C_In C_Out A_In A_Out CDiff ADiff;
/* graph the circumference and area of the n-gons as a function of n */
ods graphics / reset height=300px width=640px;
%let curveoptions = curvelabel curvelabelpos=min;
title "Circumference of Regular Polygons";
proc sgplot data=ApproxCircle;
   series x=n y=C_Out / &curveoptions;
   series x=n y=C_In  / &curveoptions;
   refline 6.2831853 / axis=y;              /* reference line at 2 pi */
   xaxis grid values=(0 to 100 by 10) valueshint;
   yaxis values=(2 to 10) valueshint label="Regular Polygon Approximation";
title "Area of Regular Polygons";
proc sgplot data=ApproxCircle;
   series x=n y=A_Out / &curveoptions;
   series x=n y=A_In  / &curveoptions;
   refline 3.1415927 / axis=y;              /* reference line at pi */
   xaxis grid values=(0 to 100 by 10) valueshint;
   yaxis  values=(2 to 10) valueshint label="Regular Polygon Approximation";

As you can see from the graph, the circumference and area of the regular polygons converge quickly to 2π and π, respectively. After n = 40 sides, the curves are visually indistinguishable from their limits, which is the same result that we noticed visually when looking at the grid of regular polygons.

The DATA step also computes the difference between the measurements of the circumscribed and inscribed polygons. You can print out a few of the differences to determine how close these estimates are to the circumference and area of the unit circle:

proc print data=ApproxCircle noobs;
   where n in (40, 50, 56, 60, 80, 100);
   var n CDiff ADiff;

The table shows that the difference between the circumference of the circumscribed and inscribed polygons is about 0.02 when n = 40. For n = 56, the difference is less than 0.01, which means that the circumference of a regular polynomial approximates the circumference of the unit circle to two decimal places when n ≥ 56. If you use a regular 100-gon, the circumference is within 0.003 of the circumference of the unit circle. Although it is not shown, it turns out you need to use 177 sides before the difference is within 0.001, meaning that a 177-gon approximates the circumference of the unit circle to three decimal places.

Similar results hold for the area of the polygons and the area of a unit circle.

In conclusion, not only does a regular n-gon look very similar to the circle when n is large, but you can quantify how quickly the circumference and areas of an n-gon converges to the values 2 π and π, respectively. For n=56, the polygon values are accurate to two decimal places; for n=177, the polygon values are accurate to three decimal places.

Approximating a smooth curve by a series of discrete approximations is the foundation of calculus and modern numerical methods. The idea had its start in ancient Greece, but the world had to wait for Newton, Leibnitz, and others to provide the mathematical machinery (limits, convergence, ...) to understand the concepts rigorously.

The post Polygons, pi, and linear approximations appeared first on The DO Loop.

3月 132019

It's time to celebrate Pi Day! Every year on March 14th (written 3/14 in the US), math-loving folks celebrate "all things pi-related" because 3.14 is the three-decimal approximation to the mathematical constant, π. Although children learn that pi is approximately 3.14159..., the actual definition of π is the ratio of a circle's circumference to its diameter. Equivalently, it is distance around half of the unit circle. (The unit circle has a unit radius, so its diameter is 2.) The value for pi, therefore, depends on the definition of a circle.

But we all know what a circle looks like, don't we? How can there be more than one circle?

Generalizing the circle

A circle is defined as the locus of points in the plane that are a given distance from a given point. This definition depends on the definition of a "distance," and it turns out that there are infinitely many ways to measure the distance between two points in the plane. The Euclidean distance between two points is the most familiar distances, but there are other definitions. For two points a = (x1, y1) and b = (x2, y2), you can define the "Lp distance" between a and b by the formula
Dp = ( |x1 – x2|p + |y1 – y2|p )1/p
This definition defines a distance metric for every value of p ≥ 1. If you set p=2 in the formula, you get the usual L2 (Euclidean) distance. If you set p=1, you get the L1 metric, which is known as the "taxicab" or "city block" distance.

You might think that the Euclidean distance is the only relevant distance, but it turns out that some of these other distances have practical applications in statistics, machine learning, linear algebra, and many fields of applied mathematics. For example, the 2-norm (L2) distance is used in least-squares regression whereas the 1-norm (L1) distance is used in robust regression and quantile regression. A combination of the two distances is used for ridge regression, LASSO regression, and "elastic net" regression.

Here's the connection to pi: If you can define infinitely many distance formulas, then there are infinitely many unit circles, one for each value of p ≥ 1. And if there are infinitely many circles, there might be infinitely many values of pi. (Spoiler alert: There are!)

Would the real circle please stand up?

You can easily solve for y as a function of x and draw the unit circle for a representative set of values for p. The following graph was generated by the SAS step and PROC SGPLOT. You can download the SAS program that generates the graphs in this article.

The L1 unit circle is a diamond (the top half is shown), the L2 unit circle is the familiar round shape, and as p gets large the unit circle for the Lp distance approaches the boundary of the square defined by the four points (±1, ±1). For more information about Lp circles and metrics, see the Wikipedia article "Lp Space: The p-norm in finite dimensions."

Here comes the surprise: Just as each Lp metric has its own unit circle, each metric has its own numerical value for pi, which is the length of the unit semicircle as measured by that metric.

π(p): The length of the unit semicircle for the Lp distance metric

So far, we've only used geometry, but it's time to use a little calculus. This presentation is based on Keller and Vakil (2009, p. 931-935), who give more details about the formulas in this section.

For a curve that is represented as a graph (y as a function of x), you can obtain the length of the curve by integrating the arclength. In Calculus 2, the arclength formula is derived for Euclidean distance, but it is straightforward to give the formula for the Lp distance:
s(p) = ∫ (1 + |dy/dx|p)1/p dx

To obtain a value for pi in the Lp metric, you can integrate the arclength for the upper half of the Lp unit circle. Equivalently, by symmetry, you can integrate one-eighth of the unit circle and multiply by 4. A convenient choice for the limits of integration is [0, 2-1/p] because 2-1/p is the x value where the 45-degree line intersects the unit circle for the Lp metric.

Substituting for the derivative gives the following formula (Keller and Vakil, 2009, p. 932):
π(p) = 4 ∫ (1 + u(x))1/p dx, where u(x) = |x-p - 1|1-p and the interval of integration is [0, 2-1/p].

A pi for each Lp metric

For each value of p, you get a different value for pi. You can use your favorite numerical integration routine to approximate π(p) by integrating the formula for various values of p ≥ 1. I used SAS/IML, which supports the QUAD function for numerical integration. The arclength computation for a variety of values for p is summarized by the following graph. The graph shows the computation of π(p), which is the length of the semicircle in the Lp metric, versus values of p for p in [1, 11].

The graph shows that the L1 value for pi is 4. The value decreases rapidly as p approaches 2 and reaches a minimum value when p=2 and the value of pi is 3.14159.... For p > 2, the graph of π(p) increases slowly. You can show that π(p) asymptotically approaches the value 4 as p approaches infinity.

On Pi Day, some places have contests to see who can recite the most digits of pi. I encourage you to enter the contest and say "Pi, in the L1 metric, is FOUR point zero, zero, zero, zero, ...." If they refuse to give you the prize, tell them to read this article! 😉

Reflections on pi

One the one hand, this article shows that there is nothing special about the value 3.14159.... For an Lp metric, the ratio of the circumference of a circle to its diameter can be any value between π and 4. On the other hand, the graph shows that π is the unique minimizer of the graph. Among an infinitude of circles and metrics, the well-known Euclidean distance is the only Lp metric for which pi is 3.14159....

If you ask me, our value of π is special, without a doubt!


Download the SAS program that creates the graphs in this article.

The post The value of pi depends on how you measure distance appeared first on The DO Loop.

3月 122018

Welcome to my annual Pi Day post. Every year on March 14th (written 3/14 in the US), geeky mathematicians and their friends celebrate "all things pi-related" because 3.14 is the three-decimal approximation to pi.

Pi is a mathematical constant that never changes. Pi is the same value today as it was in ancient Babylon and Greece. The timeless constancy of pi is a comforting presence in a world of rapid change.

Abramowitz and Stegun, Handbook of Mathematical Functions

But even though the value of pi does not change, our knowledge about pi does change and grow. I was reminded of this recently when I opened my worn copy of the Handbook of Mathematical Functions (more commonly known as "Abramowitz and Stegun," the names of its editors). When the 1,046-page Handbook was published in 1964, it was the premier reference volume for applied mathematicians and mathematical scientists. Interestingly, pi is not even listed in the index! It does appear on p. 3 under "Mathematical Constants," which gives a 25-digit approximation of many mathematical constants such as pi, e, and sqrt(2).

How to define pi?

Fast forward to the age of the internet. In 2010, the Handbook was transformed into an expanded online, searchable, interactive web site. The new Handbook is called The NIST Digital Library of Mathematical Functions. This is very exciting because the Handbook is now available (for free!) to everyone!

If you search for pi in the online Digital Library, you find that the editors chose to define pi as the value of the integral

This seems to be a strange way to define pi. Pi is the ratio of the circumference and diameter of a circle, and upon first glance that formula doesn't seem related to a circle. A more geometric choice would be an integrand such as sqrt(1 + t2), which connects pi to the area under the unit circle.

Of course, the integral in the Digital Library is equal to pi, but it is not obvious. You might recall from calculus that the antiderivative of 1/(1+t2) is arctan(t). Therefore the expression is just a complicated way to write 4 arctan(1). Ah! This makes more sense because arctan(1) is equal to π/4. In fact, before SAS introduced the CONSTANT function, SAS programmers used to define pi by using the computation pi = 4*ATAN(1). Nevertheless, I think expressing arctan(1) as an integral is unnecessarily obtuse.

Using the Cauchy distribution to define pi

I am not enamored with the editors' choice of an integral to define pi, but if I were to use that integrand to define pi, I would use a variation that has applications in probability and statistics. Statisticians sometimes use the Cauchy distribution, which is a fat-tailed distribution that has the interesting mathematical property that the distribution has no mean (expected) value! (Mathematicians say that "the first moment does not exist.") Researchers in robust statistical methods sometimes use Cauchy-distributed errors to generate extreme outliers in simulated data.

The Cauchy probability density function (PDF) is 1/π 1/(1+t2), which means that the integral of the PDF on the interval [-∞, ∞] is 1. Equivalently, the integral of 1/(1+t2) on the interval [-∞, ∞] is π:

This definition of pi seems more natural than the integral on [0, 1]. I could make other suggestions (such as the integral of arccos on [-1, 1]), but I think I'll stop here.

The purpose of this post is to celebrate pi, which is so ubiquitous and important that it can be defined in numerous ways. A secondary purpose is to highlight the availability of the NIST Digital Library of Mathematical Functions, which is an online successor of the venerable Handbook of Mathematical Functions. I am thrilled with the availability of this amazing resource, regardless of how they define pi!

To complete this Pi Day post, I leave you with a pi-ku. A pi-ku is like a haiku, but each line contains syllables the number of syllables in the decimal expansion of pi. A common structure for a pi-ku is 3-1-4. The following pi-ku celebrates the new Digital Library:

Handbook of
Functions? Online!

The post Pi, special functions, and distributions appeared first on The DO Loop.

3月 132017

It is time for Pi Day, 2017! Every year on March 14th (written 3/14 in the US), geeky mathematicians and their friends celebrate "all things pi-related" because 3.14 is the three-decimal approximation to pi. This year I use SAS software to show an amazing fact: you can find your birthday (or any other date) within the first 10 million digits of pi!

Patterns within the digits in pi

Mathematicians conjecture that the decimal expansion of pi exhibits many properties of a random sequence of digits. If so, you should be able to find any sequence of digits within the decimal digits of pi.

If you want to search for a particular date, such as your birthday, you need to choose a pattern of digits that represents the date. For example, Pi Day was first celebrated on 14MAR1988. You can represent that date in several ways. This article uses the MMDDYY representation, which is 031488. You could also use a representation such as 31488, which drops the leading zero for months or days less than 10. Or use the DDMMYY convention, which is 140399.

Can you find your birthday within the digits of pi?
Click To Tweet

In 2015 I showed how to use SAS software to download the first ten million digits of pi from an internet site. The program then uses PROC PRINT to print six consecutive digits of pi beginning at the 433,422th digit:

/* read data over the internet from a URL */
filename rawurl url "http://www.cs.princeton.edu/introcs/data/pi-10million.txt"
                /* proxy='http://yourproxy.company.com:80' */ ;
data PiDigits;
   infile rawurl lrecl=10000000;
   input Digit 1. @@;
   Position = _n_;
/* Pi Day "birthday" 03/14/88 represented as 031488 */
proc print noobs data=PiDigits(firstobs=433422 obs=433427);
   var Position Digit;

Look at that! The six-digit pattern 031488 appears in the decimal digits of pi! This location also contains the alternative five-digit representation 31488, but you can find that five-digit sequence much earlier, at the 19,466th digit:

/* Alternative representation: Pi Day birthday = 31488 */
proc print noobs data=PiDigits(firstobs=19466 obs=19470);
   var Position Digit;

How did I know where to look for these patterns? Read on to discover how to use SAS to find a particular pattern digits within the decimal expansion of pi.

Finding patterns within the digits in pi

Last week I showed how to use SAS to search for a particular pattern within a long sequence of digits. Let's use that technique to search for the six-digit Pi Day "birthday," pattern 031488. The following call to PROC IML in SAS defines a function that implements the search algorithm. The program then reads in the first 10 million digits of pi and conducts the search for the pattern:

proc iml;
/* FindPattern: Finds a specified pattern within a long sequence of digits.
   Input: target : row vector of the target pattern, such as {0 3 1 4 8 8}
          digits : col vector of the digits in which to search
   Prints the number of times the pattern appears and the first location of the pattern.
   See https://blogs.sas.com/content/iml/2017/03/10/find-pattern-in-sequence-of-digits.html
start FindPattern(target, digits);
   p = ncol(target);               /* length of target sequence */
   D = lag(digits, (p-1):0);       /* columns shift the digits */
   D = D[p:nrow(digits),];         /* delete first p rows */
   X = (D=target);                 /* binary matrix */
   /* sum across columns. Which rows contain all 1s? */
   b = (X[,+] = p);                /* b[i]=1 if i_th row matches target */
   NumRepl = sum(b);               /* how many times does target appear? */
   if NumRepl=0 then FirstLoc = 0;  else FirstLoc = loc(b)[1];
   result = NumRepl // FirstLoc;
   labl = "Pattern = "  + rowcat(char(target,1));  /* convert to string */
   print result[L=labl F=COMMA9. rowname={"Num Repl", "First Loc"}];
/* read in 10 million digits of pi */
use PiDigits;  read all var {"Digit"};  close;
target = {0 3  1 4  8 8};  /* six-digit "birthday" of Pi Day */
call FindPattern(target, Digit);
target = {3  1 4  8 8};    /* five-digit "birthday" */
call FindPattern(target, Digit);

Success! The program shows the starting location for each pattern within the digits of pi. The starting locations match the values of the FIRSTOBS= option that was used in PROC PRINT in the previous section.

Search for your birthday within the digits of pi

You can use this program to search for your birthday, your anniversary, or any other special date. (If you prefer to use the SAS DATA step, see the comments of my previous article.) If you don't have SAS, don't despair! I got the idea for this article from a nifty web page on PBS.org that contains an applet that you can use to find your birthday among the digits of pi.

The PBS applet does not require any special software. However, I noticed that it gives slightly different answers from the SAS program I wrote. One trivial difference is that the applet starts with the "3" digit of pi, whereas the SAS program starts with the "1" in the tenths decimal place. So the two programs should give locations that differ by one place. Another difference is that the applet appears to always represent months and days that are less than 10 as a one-digit value, so that the PBS applet represents 02JAN2003 as "1203" rather than "010203." However, I have observed (but cannot explain) that the PBS applet seems to consistently report a location that is three digits more than the SAS-reported location. For example, the applet reports 02JAN2003 (1203) as occurring at the 60,875th digit, whereas the SAS program reports the location as the 60,872th digit.

Some unique dates within the digits of pi

We know that the Pi Day "birthday" date appears, but what about other dates? I wrote a SAS program that searches for all six-digit MMDDYY representation of dates from 01JAN1900 to 21DEC1999. I verified all dates are contained in the first 10 million digits of pi except for one. The date 01DEC1954 (120154) is the only date that does not appear!

I also discovered some other interesting properties while searching for dates (in the MMDDYY format) within the first 10 million digits of pi:

  1. First appearance: The first date to appear is 28JUN1962 (062862), which appears in the 71st decimal location.
  2. Latest (first) appearance: The date 23NOV1960 (112360) does not appear until the 9,982,545th location.
  3. Rarest: The date 01DEC1954 (120154) is the only date that does not appear. (But the five-digit representation (12154) does appear.)
  4. Second rarest: There are 15 dates that only appear one time.
  5. Most frequent: The date 22JUL1982 (072282) appears 25 times.
  6. Distribution of appearances: Most dates appear between seven and 12 times. The following graph shows the distribution of the number of times that each date appears.
Distribution of the number of times that each date MMDDYY appears in first 10M digits of pi

If you want to discover other awesome facts, you can explore the data yourself. You can download the results (in CSV format) of the exhaustive search. If you want to see how I searched the set of all MMDDYY patterns, you can download the SAS program that I used to create the analyses in this article.

The post Find your birthday in the digits of pi appeared first on The DO Loop.

3月 142016

Math lovers, do you know what day it is? It's Pi Day, which we celebrate every year on March 14 because the date 3-14 matches the first three digits of pi, 3.14. This year, I'm celebrating with poetry, combining my love of math with my love of language. Word Spy explains that a pi-ku is […]

Let's celebrate Pi Day with a pi-ku was published on SAS Voices.