An anonymous commenter expressed a desire to see how one might use SAS to draw a bubble plot with bubbles in three colors, corresponding to a fourth variable in the data set (x, y, z for bubble size, and the category variable). In previous entries we discussed bubble plots and showed how to make the bubbles print in two colors depending on a fourth *dichotomous* variable.

The SAS approach used there cannot be extended to a fourth variable with many values, so we show here a different approach to generating this output. The R version below is a trivial extension of the code demonstrated earlier.

**SAS**

We'll start by making some data: 20 observations in each of 3 categories.

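    data testbubbles;
      do cat = 1 to 3;            /* three categories */
        do i = 1 to 20;           /* 20 observations per category */
          abscissa = normal(0);   /* x coordinate */
          ordinate = normal(0);   /* y coordinate */
          z = uniform(0);         /* determines bubble size */
          output;
        end;
      end;
    run;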

Our approach will be to make an `annotate` data set using the `annotate macros` (section 5.2). The `%slice` macro easily draws filled circles. Check its documentation for full details on the parameters it needs in the on-line help: SAS Products; SAS/GRAPH; The Annotate Facility; Annotate Dictionary. Here we note that the 5th parameter is the radius of the circle, chosen here as an arbitrary function of z that makes pleasingly sized circles. Other parameters reflect color density, arc, and starting angle, which could be used to represent additional variables.

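    %annomac;   /* make the annotate macros available */

    data annobub1;
      set testbubbles;
      %system(2,2,3);   /* use data-value coordinates */
      /* full circle (0 to 360 degrees) with radius sqrt(3*z) */
      %slice(abscissa, ordinate, 0, 360, sqrt(3*z), green, ps, 0);
    run;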

Unfortunately, due to a quirk of the macro facility, I don't think the color can be changed conditionally in the preceding step. Instead, we need a new data step to do this.

    data annobub2;
      set annobub1;
      /* cat=1 keeps the green assigned in the previous step */
      if cat=2 then color="red";
      if cat=3 then color="blue";
    run;

Now we're ready to plot. We use the `symbol` (section 5.2.2) statement to tell `proc gplot` not to plot the data, add the annotate data set, and suppress the legend, as the default legend will not look correct here. An appropriate legend could be generated with a `legend` statement.

    symbol1 i=none r=3;
    proc gplot data=testbubbles;
      plot ordinate * abscissa = cat / annotate = annobub2 nolegend;
    run;
    quit;

The resulting plot is shown above. Improved axes are demonstrated throughout the book and in many previous blog posts.

**R**

The R approach merely requires passing three colors to the `bg` option in the `symbols()` function. To mimic SAS, we'll start by defining some data, then generate the vector of colors needed.

    cat = rep(c(1, 2, 3), each=20)
    abscissa = rnorm(60)
    ordinate = rnorm(60)
    z = runif(60)
    plotcolor = ifelse(cat==1, "green", ifelse(cat==2, "red", "blue"))

The nested calls to the `ifelse` function (section 1.11.2) allow vectorized conditional tests with more than two possibilities. Another option would be to use a `for` loop (section 1.11.1), but this would be avoiding one of the strengths of R. In this example, I suppose I could have defined the `cat` vector with the color values as well, and saved some keystrokes.
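A further vectorized alternative, sketched here rather than taken from the book, is to index a vector of colors by the category number. The name `catcolors` is made up for this illustration, and the trick assumes the categories are coded 1, 2, 3 as in the data above.

```r
# Same category coding as in the example data
cat = rep(c(1, 2, 3), each=20)

# Hypothetical lookup vector: position k holds the color for category k
catcolors = c("green", "red", "blue")

# Indexing by cat repeats each color once per observation
plotcolor = catcolors[cat]

# For comparison, the nested ifelse() version used in this entry
nested = ifelse(cat==1, "green", ifelse(cat==2, "red", "blue"))
```

Both vectors should hold the same 60 colors; the lookup version needs only a longer `catcolors` vector if more categories are added.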

With the data generated and the color vector prepared, we need only call the `symbols()` function.

    symbols(abscissa, ordinate, circles=z, inches=1/5, bg=plotcolor)
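Two optional refinements are sketched below; neither is part of the original code. First, `symbols()` treats the `circles` values as radii, whereas the SAS code above uses a radius of `sqrt(3*z)`, which makes bubble area proportional to z. Passing `sqrt(z)` gives the same kind of scaling in R. Second, a call to `legend()` can label the colors. The seed, the name `catcolors`, and the legend labels are assumptions made for this illustration.

```r
set.seed(42)  # assumption: fixed seed only so the sketch is reproducible

cat = rep(c(1, 2, 3), each=20)
abscissa = rnorm(60)
ordinate = rnorm(60)
z = runif(60)

# Hypothetical lookup vector of colors, one per category
catcolors = c("green", "red", "blue")
plotcolor = catcolors[cat]

# circles= gives radii, so sqrt(z) makes bubble area proportional to z
symbols(abscissa, ordinate, circles=sqrt(z), inches=1/5, bg=plotcolor)

# pch=21 is a filled circle; pt.bg sets its fill color
legend("topright", legend=paste("cat =", 1:3), pch=21, pt.bg=catcolors)
```

Whether radius or area should track z is a matter of taste; area scaling tends to keep the largest bubbles from dominating the plot.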

The resulting plot is shown below.