When creating a statistical graphic such as a line plot or a scatter plot, it is sometimes important to preserve the aspect ratio of the data. For example, if the ranges of the X and Y variables are equal, it can be useful to display the data in a square region. This is important when you want to visualize the distance between points, as in certain multivariate statistics. It is also important if you are plotting polygons and want a square to look like a square.

This article presents two ways to create ODS statistical graphics in SAS in which the scale of the data is accurately represented in the plot. They are the ASPECT= option in PROC SGPLOT and the OVERLAYEQUATED layout in the Graph Template Language (GTL).

### Data scale versus physical measurements of a graph

Usually, the data coordinates are not used to compute the width and height of a graph. By default, the size of an ODS statistical graphic in SAS is a certain number of pixels on the screen or a certain number of centimeters for graphs written to other ODS destinations (such as PDF or RTF). When you request a scatter plot, the minimum and maximum values of each coordinate are used to determine the range of the axes. However, the physical dimensions of the axes (in pixels or centimeters) depend on the titles, labels, tick marks, legends, margins, font sizes, and many other features.

For example, the following data set has two variables. The X and Y variables both have a minimum value of -1 and a maximum value of 1. Therefore the range of each variable is 2. The default graph has a 4:3 ratio of width to height, so when you create a scatter plot, the physical lengths (in pixels) of the X and Y axes are not equal:

    data S;    /* X and Y both range over [-1, 1], so XRange = YRange = 2 */
    input x y @@;
    datalines;
    -1 -1  -0.75 -0.5  -0.5 -0.75  -0.25 0  0 0.5
    0.25 -0.25  0.5 0.75  0.75 0.25  1 1
    ;

    ods graphics / reset;    /* use default width and height */
    title "Default Graph: 640 x 480 pixels";
    title2 "Aspect Ratio 4:3";
    proc sgplot data=S;
       scatter x=x y=y;
       xaxis grid;
       yaxis grid;
    run;

The graph area occupies 640 x 480 pixels. However, because titles, axis labels, and tick values take up space, the region that contains the data (also called the *wall area*) is about 555 pixels wide and 388 pixels tall, which is clearly not square. Each cell in the grid represents a square with side length 0.5, but the cells do not appear square on the screen because of the aspect ratio of the graph.

### Setting the aspect ratio

Prior to SAS 9.4, PROC SGPLOT did not enable you to set the aspect ratio of the wall area. You had to use trial and error to adjust the width of the graph until the wall area was approximately square. For example, you could start the process by submitting ODS GRAPHICS / WIDTH=400px HEIGHT=400px;.

However, in SAS 9.4 you can use the ASPECT= option on the PROC SGPLOT statement to tell PROC SGPLOT to make the wall area (data region) square, as follows:

    title "Graph: 640 x 480 pixels";
    title2 "Aspect Ratio 1:1";
    proc sgplot data=S aspect=1;    /* set physical dimensions of axes equal */
       scatter x=x y=y;
       xaxis grid;
       yaxis grid;
    run;

Although the graph size has not changed, the wall area (which contains the data) is now square. The wall area is approximately 370 pixels in both directions.

Notice that the graph has a lot of white space to the left and right of the wall area. You can reduce the width of the graph to get rid of the extra space.
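
As a minimal sketch of that adjustment, the following statements shrink the graph to 480 x 480 pixels before drawing the same plot. The 480px value is an illustrative choice rather than a computed optimum; you might need a different width depending on your titles and fonts.

```sas
/* Sketch: reduce the graph width so the square wall area has less padding.
   The 480px value is illustrative, not a computed optimum. */
ods graphics / width=480px height=480px;
title "Graph: 480 x 480 pixels";
title2 "Aspect Ratio 1:1";
proc sgplot data=S aspect=1;
   scatter x=x y=y;
   xaxis grid;
   yaxis grid;
run;
ods graphics / reset;    /* restore the default size for later graphs */
```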

This technique also works for other aspect ratios. The ASPECT= value is the ratio of the wall height to the wall width, so if the range of the Y variable is twice the range of the X variable, you can use ASPECT=2 to make the wall area twice as high as it is wide.
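
For instance, here is a hedged sketch of a non-square case. The data set `T` is hypothetical: it stretches the Y values in `S` so that Y spans 4 units while X spans 2 units. The MIN= and MAX= options are included only to make the axis ranges explicit.

```sas
/* Hypothetical data set T: stretch Y so that YRange = 4 while XRange = 2 */
data T;
   set S;
   y = 2*y;
run;

title "Aspect Ratio 2:1 (Height:Width)";
proc sgplot data=T aspect=2;    /* wall height = 2 x wall width */
   scatter x=x y=y;
   xaxis grid min=-1 max=1;     /* X spans 2 units */
   yaxis grid min=-2 max=2;     /* Y spans 4 units */
run;
```

With these settings, one unit in the vertical direction should occupy about the same physical length as one unit in the horizontal direction.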

Be aware that this technique works because the range of the X variable equals the range of the Y variable, and the margins in the wall area (set by using the OFFSETMIN= and OFFSETMAX= options) are also equal. If your X and Y ranges are not exactly equal, read on.
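
If you prefer not to rely on the default margins, one option is to set the axis ranges and offsets explicitly so that they match on both axes. The following is only a sketch; the offset value 0.05 is an arbitrary illustrative choice.

```sas
/* Sketch: force identical ranges and identical padding on both axes.
   The 0.05 offsets are illustrative; any equal values serve the same purpose. */
proc sgplot data=S aspect=1;
   scatter x=x y=y;
   xaxis grid min=-1 max=1 offsetmin=0.05 offsetmax=0.05;
   yaxis grid min=-1 max=1 offsetmin=0.05 offsetmax=0.05;
run;
```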

### Setting the range of the axes

In practice, the range of the X axis might not exactly equal the range of the Y axis. In that case, use the MIN= and MAX= options on the XAXIS and YAXIS statements to set the ranges of each variable to a common range. For example, in principal component analysis, the principal component scores are often plotted on a common scale. The following call to PROC PRINCOMP creates variables PRIN1, PRIN2, and PRIN3 that contain the principal component scores for numerical variables in the Sashelp.Iris data set:

    proc princomp data=Sashelp.Iris N=3 out=OutPCA noprint;
       var SepalWidth SepalLength PetalWidth PetalLength;
    run;

    proc means data=OutPCA N min max mean std;
       var Prin:;
    run;

You can see that the ranges of the three variables are not equal. However, you can use the ASPECT=1 option to graph the scores in such a way that one centimeter in the horizontal direction represents the same number of units as one centimeter in the vertical direction. The MIN= and MAX= options are used so that the ranges of the X and Y axes are equal:

    ods graphics / width=480px height=480px;
    title "Principal Component Scores";
    title2 "Aspect Ratio 1:1";
    proc sgplot data=OutPCA aspect=1;
       scatter x=Prin1 y=Prin2 / group=Species;
       xaxis grid min=-2.8 max=3.3;    /* values=(-3 to 3) valueshint; */
       yaxis grid min=-2.8 max=3.3;    /* values=(-3 to 3) valueshint; */
    run;

In spite of titles, legends, and labels, the wall area is a square. The width of the graph was reduced so that there is less blank space to the left and right of the wall area.

Notice the comments in the call to PROC SGPLOT. The comments indicate how you can explicitly set values for the axes, if you want. If you uncomment the syntax, the VALUES= option sets the tick values. The VALUESHINT option tells PROC SGPLOT that these values are merely "hints": the tick values should not be used to extend the length of an axis beyond the range of the data.
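
Rather than reading the common range off the PROC MEANS output, you could compute it. The following sketch is one possible approach, not the only one: it uses PROC SQL to find a range that covers both score variables, rounds outward to one decimal place, and stores the result in macro variables. The names `CommonMin` and `CommonMax` are hypothetical.

```sas
/* Sketch: compute one range that covers both Prin1 and Prin2.
   CommonMin and CommonMax are hypothetical macro variable names. */
proc sql noprint;
   select floor(10*min(min(Prin1), min(Prin2)))/10,   /* round down to 0.1 */
          ceil (10*max(max(Prin1), max(Prin2)))/10    /* round up to 0.1   */
      into :CommonMin trimmed, :CommonMax trimmed
      from OutPCA;
quit;

proc sgplot data=OutPCA aspect=1;
   scatter x=Prin1 y=Prin2 / group=Species;
   xaxis grid min=&CommonMin max=&CommonMax;
   yaxis grid min=&CommonMin max=&CommonMax;
run;
```

Rounding outward keeps every observation inside the common range, at the cost of a slightly larger margin around the data.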

### Automating the process with GTL

I like PROC SGPLOT, but if you are running a version of SAS prior to 9.4, you can still obtain equated axes. However, you need to use the GTL and PROC SGRENDER. The trick is to use the OVERLAYEQUATED layout, rather than the usual OVERLAY layout. The OVERLAYEQUATED layout ensures that the aspect ratio of the wall area matches the aspect ratio of the data ranges. The following example uses the output from the PROC PRINCOMP analysis in the previous section:

    proc template;                        /* scatter plot with equated axes */
    define statgraph ScatterEquateTmplt;
    dynamic _X _Y _Title;                 /* dynamic variables */
    begingraph;
       entrytitle _Title;                 /* specify title at run time (optional) */
       layout overlayequated /            /* a unit on X and a unit on Y span the same number of pixels */
          xaxisopts=(griddisplay=on)      /* put X axis options here */
          yaxisopts=(griddisplay=on);     /* put Y axis options here */
          scatterplot x=_X y=_Y;          /* specify variables at run time */
       endlayout;
    endgraph;
    end;
    run;

    proc sgrender data=OutPCA template=ScatterEquateTmplt;
       dynamic _X='Prin1' _Y='Prin2' _Title="Equated Axes";
    run;

The output is not shown, but is similar to the graph in the previous section. The nice thing about using the GTL is that it supports the EQUATETYPE= option, which enables you to specify how to handle axis ranges that are not equal.
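
As a hedged illustration of that option, the following sketch defines a variant of the template that specifies EQUATETYPE=SQUARE, which, as I understand the GTL documentation, requests axes of equal length that share a common range. The template name `ScatterSquareTmplt` is hypothetical. Check the documentation for your SAS release for the full list of EQUATETYPE= values and their exact behavior.

```sas
proc template;                            /* variant with EQUATETYPE=SQUARE */
define statgraph ScatterSquareTmplt;      /* hypothetical template name */
dynamic _X _Y _Title;
begingraph;
   entrytitle _Title;
   layout overlayequated / equatetype=square
      xaxisopts=(griddisplay=on)
      yaxisopts=(griddisplay=on);
      scatterplot x=_X y=_Y;
   endlayout;
endgraph;
end;
run;

proc sgrender data=OutPCA template=ScatterSquareTmplt;
   dynamic _X='Prin1' _Y='Prin2' _Title="Equated Axes: EQUATETYPE=SQUARE";
run;
```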

In summary, there are two ways to make sure that the physical dimensions of the data area (wall area) of a graph accurately represent distances in the data coordinate system. You can use the GTL and the OVERLAYEQUATED layout, as shown in this section, or you can use the ASPECT= option in PROC SGPLOT if you have SAS 9.4. Although it is not always necessary to equate the X and Y axes, it is nice that SAS supports it when you need it.

The post Size matters: Preserving the aspect ratio of the data in ODS graphics appeared first on The DO Loop.