OAuth2

November 29, 2018
 

If your work environment is like ours here at SAS, you're seeing more of your data and applications move to the cloud. It's not yet a complete replacement for having local files on your desktop machine, but with cloud storage and apps -- like Microsoft OneDrive -- I can now access my work documents from any browser and any device, including my smartphone. I can now update my spreadsheets while waiting in the dentist's office. Oh joy.

For those of us who use SAS to read and create Microsoft Excel documents, cloud-based files can add an extra wrinkle when we automate the process. It also adds some exciting possibilities! The Microsoft Office 365 suite offers APIs to discover, fetch, and update our documents using code. In this article, I'll show you how to use SAS programs to reach into your Microsoft OneDrive cloud to read and update your files. Note: All of this assumes that you already have a Microsoft Office 365 account -- perhaps provisioned by your IT support team -- and that you're using it to manage your documents.

Before I go on, I have to give major credit to Joseph Henry, the SAS developer who maintains PROC HTTP. Joseph did the heavy lifting for putting together the code and examples in this article. He also regularly adds new features to PROC HTTP that make it a more natural fit for calling REST APIs that require special authentication flows, such as OAuth2.

Using SAS with Microsoft OneDrive: an overview

Microsoft Office 365 uses an OAuth2-style authentication flow to grant access and permissions to third-party apps. When we're writing SAS programs to access Microsoft OneDrive, we're also writing a third-party app. This requires several setup steps, a few of which cannot be automated. Fortunately, these need to be done just once, or at least infrequently. Here's an outline of the steps:

  1. Register a new client application at the Microsoft Application Registration Portal. (You will need to sign in with your Office 365 credentials, which might be your primary organization credentials if you have single sign-on with Active Directory.)
  2. Using your browser while you are signed into Office 365, navigate to a special web address to obtain an authorization code for your application.
  3. With your authorization code in hand, plug this into a SAS program (PROC HTTP step) to retrieve an OAuth2 access token (and a refresh token).
  4. With the access token, you can now use PROC HTTP and the Microsoft Office 365 APIs to retrieve your OneDrive folders and files, download files, upload files, and replace files.

You'll have to complete Step 1 just once for your application or project. Steps 2 and 3 can be done just once, or at least just occasionally. The access token is valid for a limited time (usually 1 hour), but you can always exchange the refresh token for a new valid access token. This refresh token step can be automated in your program, usually run just once per session. Occasionally that refresh token can be revoked (and thus made invalid) when certain events occur (such as you changing your account password). When that happens, you'll need to repeat steps 2 and 3 to get a new set of access/refresh tokens.

Step 1: Register your application

Visit the Microsoft Application Registration portal to register your new app. You'll sign in with your Microsoft Office 365 credentials.

Microsoft Application Registration portal

Note: At the time of this writing, Microsoft was testing a new web experience for application developers. I opted for the "preview experience" with the hope that my screenshots would have a longer shelf life for future readers.

Click New Registration to get started. This presents you with a form where you can complete the details that define your app. Mainly, you're giving it a name and defining its scope. You'll probably want to limit its use to just your organization (your company) unless you're collaborating with colleagues who work elsewhere.

"Register an application" form

When you create an app, you'll receive a Client ID (unique to your app) and Tenant ID (unique to your organization). You'll need these values to obtain your authorization code and tokens later. The application portal provides a sort of control center for all aspects of your app. (Note: I masked out my client ID and tenant ID in this screenshot.)

Details for my sample application

Specifying your app permissions

Your app will need specific permissions in order to function. In my example, I want my SAS program to read documents from my OneDrive, and also add new docs and update existing docs. The permissions I need are:

  • Files.ReadWrite.All: Allows the app to read, create, update and delete all files that you can access.
  • User.Read: Allows you to sign in to the app with your organizational account and let the app read your profile.

To add these to your app, click the API Permissions tab in the control center. To be clear, these are not permissions that your app will automatically have. These are the permissions that will be requested when you "sign into" the app for the first time, and that you'll have to agree to in order for the app to run.

Adding permissions that the app needs

Creating a configuration file

There are a few app-specific values that we'll need to reference throughout the SAS programs we're writing. I decided to create a configuration file for these settings rather than hard-code them into my SAS statements. This will make it easier for other people to reuse my code in their own applications.

I created a file named conf.json that looks like this (but with different tenant_id and client_id values):

{
  "tenant_id": "206db638-6adb-41b9-b20c-95d8d04abcbe",
  "client_id": "8fb7804a-8dfd-40d8-bf5b-d02c2cbc56f3",
  "redirect_uri": "https://login.microsoftonline.com/common/oauth2/nativeclient",
  "resource" : "https://graph.microsoft.com"
}

By "externalizing" the IDs specific to my account/instance, I can use SAS code to read the values at run time. Note: This code, like all of the code in this article, uses features from SAS 9.4 Maintenance 5.

/*
  Set the variables that will be needed through the code
  We'll need these for authorization and also for runtime 
  use of the service.
 
  Reading these from a config.json file so that the values
  are easy to adapt for different users or projects.
*/
 
%if %symexist(config_root) %then %do;
  filename config "&config_root./config.json";
  libname config json fileref=config;
  data _null_;
   set config.root;
   call symputx('tenant_id',tenant_id,'G');
   call symputx('client_id',client_id,'G');
   call symputx('redirect_uri',redirect_uri,'G');
   call symputx('resource',resource,'G');
  run;
%end;
%else %do;
  %put ERROR: You must define the CONFIG_ROOT macro variable.; 
%end;

Step 2: Obtain an authorization code

Now that I've defined the application, it's time to "sign into it" and grant it the permission to read and manage content in OneDrive. This step needs to be completed from a web browser while I am signed into my Microsoft Office 365 account. The web address is very long...but we can use a SAS program to generate it for us.

/* location of my config file */
%let config_root=/folders/myfolders/onedrive;
 
%include "&config_root./onedrive_config.sas";
 
/* Run this line to build the authorization URL */
%let authorize_url=https://login.microsoftonline.com/&tenant_id./oauth2/authorize?client_id=&client_id.%nrstr(&response_type)=code%nrstr(&redirect_uri)=&redirect_uri.%nrstr(&resource)=&resource.;
options nosource;
%put Paste this URL into your web browser:;
%put -- START -------;
%put &authorize_url;
%put ---END ---------;
options source;

This produces these output lines in the SAS log:

 Paste this URL into your web browser:
 -- START -------
https://login.microsoftonline.com/206db638-6adb-41b9-b20c-95d8d04abcbe/oauth2/authorize?client_id=8fb7804a-8dfd-40d8-bf5b-d02c2cbc56
f3&response_type=code&redirect_uri=https://login.microsoftonline.com/common/oauth2/nativeclient&resource=https://graph.microsoft.com
 ---END ---------

Copy and paste the URL (all on one line, no spaces) into the address bar of your web browser. When you press Enter, you'll be prompted to grant the required permissions:

Once you click Accept, the browser will redirect to what looks like a blank page, but the URL contains the authorization code that we need:

Copy the value that appears after the code= in the URL. It's going to be a very long string -- over 700 characters. We'll need that value for the next step.

Step 3: Obtain an access token

My colleague Joseph wrote a few convenient utility macros that can help manage the access token and refresh token within your SAS session. These macros include:

  • %get_token - get the initial access and refresh tokens, given an authorization code. Remember, an access token will expire in about 60 minutes. But the refresh token can be used to get a renewed access token.
  • %refresh - exchange a valid refresh token for a new access token
  • %process_token_file - read/update an external token file so that these values persist beyond your current SAS session.

I'm not going to walk through the macro code in this article, but the SAS programs are straightforward and well-documented. See "How to get this example code" at the end of this article.
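That said, if you're curious what %get_token is doing for you, it boils down to a single PROC HTTP POST against the Azure AD token endpoint for your tenant. Here's a rough sketch of that exchange -- this is not the actual macro code (see onedrive_macros.sas for the real thing), and &auth_code here stands for the authorization code you copied in Step 2:

/* A rough sketch of the exchange that %get_token performs for you --      */
/* NOT the actual macro code (see onedrive_macros.sas for the real thing). */
filename tokresp temp;
proc http
 url="https://login.microsoftonline.com/&tenant_id./oauth2/token"
 method="POST"
 ct="application/x-www-form-urlencoded"
 /* keep the in= value all on one line */
 in="client_id=&client_id.%str(&)code=&auth_code.%str(&)redirect_uri=&redirect_uri.%str(&)grant_type=authorization_code%str(&)resource=&resource."
 out=tokresp;
run;
 
/* The JSON response contains the access_token and refresh_token values */
libname toklib json fileref=tokresp;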

With these macros in place, we can paste the (very long) authorization code we retrieved in the previous step into a macro variable. Then we can run the %get_token macro to generate the tokens and store them in a local file.

%let config_root=/folders/myfolders/onedrive;
 
%include "&config_root./onedrive_config.sas";
%include "&config_root./onedrive_macros.sas";
 
filename token "&config_root./token.json";
%let auth_code=AQABAAIAAAC5una0EUFgTIF8ElaxtWjTqwohjyfG; * and much more;
 
/*
  Now that we have an authorization code we can get the access token
  This step will write the tokens.json file that we can use in our
  production programs.
*/
%get_token(&client_id.,&auth_code,&resource.,token,tenant=&tenant_id);

Running this step will create a new file, token.json, in your designated config folder. Here's a screenshot of what my version looks like right now:

It's very important that you keep this file secure. With the information in this file (your refresh token) and your conf.json file (with your client ID and tenant ID), anyone can use these code techniques to impersonate you and access your Microsoft Office 365 data. There are techniques for storing these files such that only you can see them.
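How you lock these files down depends on your host. As one example -- a minimal sketch assuming a UNIX or Linux SAS server where the XCMD system option is allowed -- you can restrict the token file so that only your account can read it:

/* Restrict the token file to your account only (UNIX/Linux, requires XCMD) */
x "chmod 600 &config_root./token.json";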

Using Microsoft Office 365 APIs to access OneDrive from SAS

Whew! It took nearly 1,500 words to get this far, so thanks for sticking with me. The good news is that these steps take much longer to describe than to actually execute. Plus, creating apps is fun! (Right?)

From the screenshots I've shared, you probably already noticed that these services are running on Microsoft Azure, which is Microsoft's cloud platform for applications. For the remainder of this article, I'll be using methods from the Microsoft Graph API. This REST-based API provides access to almost all of Microsoft's hosted services. For my examples, I'll be using methods within the Files component of the API: Drives and Drive Items (folders and files).

Initializing and refreshing the access token in a new session

Now that we have the access and refresh tokens, we can get down to business with some actual OneDrive interactions. Here's how to initialize your SAS session with the tokens.

%let config_root=/folders/myfolders/onedrive;
 
%include "&config_root./onedrive_config.sas";
%include "&config_root./onedrive_macros.sas";
 
/*
  Our json file that contains the oauth token information
*/
filename token "&config_root./token.json";
 
%process_token_file(token);
 
/* If this is first use for the session, we'll likely need to refresh  */
/* the token.  This will also call process_token_file again and update */
/* our token.json file.                                                */
%refresh(&client_id.,&refresh_token.,&resource.,token,tenant=&tenant_id.);
 
/*
  At this point we have a valid access token and we can start using the API.
*/

If all goes well, we'll have our access token, and it will be stored in a macro variable named &access_token. It's going to be another long and illegible (>700 characters) value.

(Ever hear of the "infinite monkey theorem"? That a monkey hitting a typewriter for an infinite amount of time will almost surely produce any given text, such as the complete works of Shakespeare? Well, that monkey is not going to produce this access token. Plus, who has a typewriter anymore?)
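If you want a quick sanity check that the refresh worked before you start calling the API, you can peek at the token's length. This is a trivial check, but it catches an empty macro variable:

/* Optional sanity check: make sure the access token is populated */
%put NOTE: access_token length is %length(%superq(access_token));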

Retrieving the top-level drive identifier

We'll need to explore the OneDrive system from the top-down, using code. First, we need the identifier for the root drive. It's possible for you to have multiple root drives, and if that's the case for you, you'll need to modify this code a bit. This code queries the service for your drives, and stores the identifier for just the first drive in a macro variable. We'll need that identifier later to retrieve a list of top-level items.

/*
First we need the ID of the "drive" we are going to use.
To list the drives the current user has access to, you can do this:
*/
filename resp TEMP;
/* Note: oauth_bearer option added in 9.4M5 */
proc http url="https://graph.microsoft.com/v1.0/me/drives/"
     oauth_bearer="&access_token"
     out = resp;
run;
 
libname jresp json fileref=resp;
 
/*
 I only have access to 1 drive, but if you have multiple you can filter 
 the set with a where clause on the name value.
 
 This creates a data set with the one record for the drive.
*/
data drive;
 set jresp.value;
run;
 
/* store the ID value for the drive in a macro variable */
proc sql noprint;
 select id into: driveId from drive;
quit;

Note that this code uses the new OAUTH_BEARER option in PROC HTTP -- a convenient addition when working with OAuth2-compliant APIs. This is shorthand -- and more intuitive syntax -- for placing "Authorization: Bearer TOKEN-VALUE" in the HTTP headers.
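For comparison, here's roughly what the same request looks like without the shorthand, using an explicit HEADERS statement -- a sketch of the equivalent call:

/* The same request without OAUTH_BEARER: set the Authorization header yourself */
filename resp temp;
proc http url="https://graph.microsoft.com/v1.0/me/drives/"
     out=resp;
     headers "Authorization"="Bearer &access_token.";
run;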

Retrieve a list of top-level folders/files

With the drive identifier in hand, I can use the /children verb on the Microsoft Graph API to get a list of all of the top-level objects in that drive. These represent the folders and files that are at the root.

/*
 To list the items in the drive, use the /children verb with the drive ID
*/
filename resp TEMP;
proc http url="https://graph.microsoft.com/v1.0/me/drives/&driveId./items/root/children"
     oauth_bearer="&access_token"
     out = resp;
run;
 
libname jresp json fileref=resp;
 
/* Create a data set with the top-level paths/files in the drive */
data paths;
 set jresp.value;
run;

Here's what I'm keeping in my OneDrive right now. It's not too disorganized, is it?
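(If you'd rather see that list in your SAS output than in a screenshot, a quick PROC PRINT of the paths data set does the job; the name and id columns come straight from the API response.)

/* Quick look at the top-level items in the drive */
proc print data=paths;
 var name id;
run;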

List the files in a particular folder

If I'm interested in exploring a particular folder, I'll need to find the folder identifier as it's known to OneDrive. Using PROC SQL and SELECT INTO, I can find the folder by its name and store its ID in another macro variable. Then, I use the /children verb again, but this time with the folder ID instead of the "root" constant.

/*
 At this point, if you want to act on any of the items, you just replace "root" 
 with the ID of the item. So to list the items in the "SASGF" folder I have:
  - find the ID for that folder
  - list the items within by using the "/children" verb
*/
 
/* Find the ID of the folder I want */
proc sql noprint;
 select id into: folderId from paths
  where name="SASGF";
quit;
 
filename resp TEMP;
proc http url="https://graph.microsoft.com/v1.0/me/drives/&driveId./items/&folderId./children"
     oauth_bearer="&access_token"
     out = resp;
run;
 
/* This creates a data set of the items in that folder, 
   which might include other folders.
*/
libname jresp json fileref=resp;
data folderItems;
 set jresp.value;
run;

Here are the items from my SASGF folder. Can you tell that I don't throw anything away?

Download a file from OneDrive and import into SAS

I know that I keep a spreadsheet named "sas_tech_talks_18.xlsx" in this SASGF folder. With the /content verb, I can download the file from OneDrive and store it in the file system that is local to my SAS session. Then, I can use PROC IMPORT to read it into a SAS data set.

/*
 With a list of the items in this folder, we can download
 any item of interest by using the /content verb 
*/
 
/* Find the item with a certain name */
proc sql noprint;
 select id into: fileId from folderItems
  where name="sas_tech_talks_18.xlsx";
quit;
 
filename fileout "&config_root./sas_tech_talks_18.xlsx";
proc http url="https://graph.microsoft.com/v1.0/me/drives/&driveId./items/&fileId./content"
     oauth_bearer="&access_token"
     out = fileout;
run;
 
/* Import the first sheet into a SAS data set */
proc import file=fileout 
 out=sasgf
 dbms=xlsx replace;
run;

Boom! I've just downloaded my data from the cloud and brought it into my SAS session.

Add a new file to OneDrive

We can build wonderful documents from SAS too, and it's important to be able to share those. By using the PUT method with the /content verb, we can copy a file from the local SAS session into a target folder on OneDrive. Most often, this will probably be an Excel spreadsheet or maybe a PDF report.

/*
  We can upload a new file to that same folder with the PUT method and /content verb
  Notice the : after the folderId and the target filename
*/
 
/* Create a simple Excel file to upload */
%let targetFile=iris.xlsx;
filename tosave "%sysfunc(getoption(WORK))/&targetFile.";
ods excel(id=upload) file=tosave;
proc print data=sashelp.iris;
run;
ods excel(id=upload) close;
 
filename details temp;
proc http url="https://graph.microsoft.com/v1.0/me/drives/&driveId./items/&folderId.:/&targetFile.:/content"
  method="PUT"
  in=tosave
  out=details
  oauth_bearer="&access_token";
run;
 
/*
  This returns a json response that describes the item uploaded.
  This step pulls out the main file attributes from that response.
*/
libname attrs json fileref=details;
data newfileDetails (keep=filename createdDate modifiedDate filesize);
 length filename $ 100 createdDate 8 modifiedDate 8 filesize 8;
 set attrs.root;
 filename = name;
 modifiedDate = input(lastModifiedDateTime,anydtdtm.);
 createdDate  = input(createdDateTime,anydtdtm.);
 format createdDate datetime20. modifiedDate datetime20.;
 filesize = size;
run;

Replace/update a file in OneDrive

If you want to replace an existing file, then you'll want to perform the additional step of retrieving the unique ID for that file from OneDrive. When you PUT the new version of the file into place, its history and sharing properties should remain intact. Here is my code for navigating the folder/file structure in my OneDrive and finally replacing an existing file.

/*
  If you want to replace a file instead of making a new file 
  then you need to upload it with the existing file ID.  If you
  don't replace it with the existing ID, some sharing properties
  and history could be lost.
*/
/* Create a simple Excel file to upload */
%let targetFile=iris.xlsx;
filename tosave "%sysfunc(getoption(WORK))/&targetFile.";
ods excel(id=upload) file=tosave;
proc print data=sashelp.iris;
run;
ods excel(id=upload) close;
 
/* Navigate the folder and file IDs from my OneDrive */
proc sql noprint;
 select id into: folderId from paths
  where name="SASGF";
quit;
 
proc http url="https://graph.microsoft.com/v1.0/me/drives/&driveId./items/&folderId./children"
     oauth_bearer="&access_token"
     out = resp;
run;
 
libname jresp json fileref=resp;
data folderItems;
 set jresp.value;
run;
 
/* Find the ID of the existing file */
proc sql noprint;
 select id into: fileId from folderItems
  where name="iris.xlsx";
quit;
 
filename details temp;
proc http url="https://graph.microsoft.com/v1.0/me/drives/&driveId./items/&fileId./content"
 method="PUT"
 in=tosave
 out=details
 oauth_bearer="&access_token";
run;
 
/* Examine the attributes of the replaced file from the response */
libname attrs json fileref=details;

As you can see from my OneDrive history for this file, I've tested this program a few times -- resulting in 23 revisions of this file in its history!

How to get this example code

You can find the source files for these examples on GitHub.

I've organized this code into 4 different files in order to make it easy to reuse:

  • onedrive_config.sas - read the fields from the conf.json and set them as global macro variables. This includes your client_id and tenant_id.
  • onedrive_setup.sas - the SAS statements you will need to run just once to get your authorization code and first access token.
  • onedrive_macros.sas - three utility macros that help you to create, refresh, and manage your access token and refresh token in your token.json file
  • onedrive_example_use.sas - sample SAS steps that I used in this article. They won't quite work for you as-is, since you don't have the same files that I do. (Unless you do have the same files, in which case...creepy.) My hope is that you can read and adapt them for your own content.

I also included a template for the conf.json file, with obvious placeholders for the client_id and tenant_id that you'll substitute with your own values. You'll also need to change the statements that define &CONFIG_ROOT -- the location of your configuration directory where you're storing these files. I developed these examples in SAS University Edition -- yes, this works there! I also ran the code from my full SAS environment via SAS Enterprise Guide.

More about using REST APIs from SAS

This has been a monster article -- in terms of its length. But I hope it's clear enough to follow and has sufficient detail for you to try this on your own. If you have questions, post in the comments.

I've published a number of other articles about using REST APIs from SAS -- it's one of my favorite things to do -- so check those out as well.

The post Using SAS to access and update files on Microsoft OneDrive appeared first on The SAS Dummy.

April 15, 2017
 

Every day before I even wake up, I have little "SAS robots" that do work for me. These are SAS batch jobs that gather data from external services and build data marts, generate reports, and send e-mail. One of those SAS jobs gathers Google Analytics data about our SAS blogs at blogs.sas.com.

With SAS 9.4 Maintenance 4, it's finally easy (well, relatively speaking) to connect to complicated APIs like those supported by Google, and to gather information with an unattended batch process. It's made possible by recent enhancements in PROC HTTP and the new JSON library engine in SAS. The PROC HTTP enhancements make it easier to negotiate multi-step authentication schemes like OAuth2. And of course, the JSON engine makes it easier to parse JSON results into SAS data sets. If you scour the Internet and SAS conference papers, you might find examples that use PROC GROOVY or other tricks to call outside of SAS to drive these OAuth2 APIs and parse JSON. The recent enhancements make such tricks unnecessary, and thus provide a cleaner approach with fewer moving parts.

Joseph Henry, a senior software developer at SAS (and primary developer for PROC HTTP), wrote a useful SAS Global Forum paper about PROC HTTP and OAuth2. Joseph definitely understands this better than I do, so I recommend that you read his paper first. Joseph's example uses Google Drive. Come back to this post when you want to see the instructions I've hacked together -- it's what I've got working for Google Analytics.

How to use the Google APIs

There are four main steps to use the Google APIs. Two of these steps need to be performed just once, and don't need to be automated. The other two steps are necessary each time you access a Google API from a program, and they can be scripted. Here are the steps:

  1. Grant permission and obtain an authorization code for your "app" (a SAS program, in our case). This must be performed in a browser while logged into your Google account.
  2. Obtain an access token and refresh token. You can accomplish this with a SAS program that you run just once. You then save the refresh token (in a safe place!) for subsequent runs.
  3. Convert your saved refresh token into an access token. Do this at the start of every SAS job that needs to get Google Analytics data.
  4. And finally, use the Google Analytics API to get actual data!

There is a lot to learn about how Google APIs work and how to provision your Google account with access to resources like Google Analytics. That's more than I can go into here, so I'm going to assume a few things and skip ahead. I'm going to assume that you already have an API project created, and that you have a "client ID" and "client secret". The Google API documentation is very good on this topic, and there are many Internet resources to help you learn this part of the process.

Also, some Google APIs are free to use, while others cost money. And some are rate-limited -- that is, you can call the APIs only so many times within a given period. Your specific limits might differ depending on the type of accounts you have with Google. If you have a corporate relationship with Google, you might have an admin who has to provision your specific Google account to get access to the APIs. I'm glossing over all of these nuances -- this post is simply about the code.

Step 1. Grant permission and retrieve auth code

Do this step only once per account, per API permission. You perform this step in the browser while logged into your Google account. Enter the following URL, substituting your client-id as indicated. The URL needs to be all on one line, but I've broken it up here for readability.

 https://accounts.google.com/o/oauth2/v2/auth?
         scope=https://www.googleapis.com/auth/analytics.readonly
         &redirect_uri=urn:ietf:wg:oauth:2.0:oob
         &response_type=code
         &client_id=<your-client-id>.apps.googleusercontent.com 

You will be prompted to allow the "app" access to your Google Analytics data (read only). This is similar to allowing another app to access your Facebook or Twitter profile -- you've probably seen similar prompts during your own Internet citizenship. Permissions are an important component of allowing apps to act on your behalf via REST APIs.

Then, you'll be redirected to a web page with an auth code that you should copy and save. We'll use it in the next step. Setting the redirect_uri properly is very important: redirect_uri=urn:ietf:wg:oauth:2.0:oob. Otherwise the API won't generate a code that you can use in a tool-based app like SAS.

Step 2. Exchange the auth code for an access token

Next, run this PROC HTTP step in SAS with the POST method to exchange that auth code from Step 1 for an access token. It will return a JSON response with a valid Bearer access token. That token expires in 3600 seconds (1 hour). It also returns a refresh_token, which you can exchange again for a new access token after the first one expires. The refresh_token never expires (though it can be revoked via the developer console or API). Thus, you usually need to perform this step just once, unless your token is revoked for some reason.

/* file to store your result */
filename token "c:\temp\token.json";
%let code_given =<code-returned-from-step-1> ;
%let oauth2=https://www.googleapis.com/oauth2/v4/token;
%let client_id=<your-client-id>.apps.googleusercontent.com;
%let client_secret=<your-client-secret>;
proc http
/* put this all on one line! */
 url="&oauth2.?client_id=&client_id.%str(&)code=&code_given.
      %str(&)client_secret=&client_secret.%str(&)redirect_uri=urn:ietf:wg:oauth:2.0:oob
      %str(&)grant_type=authorization_code%str(&)response_type=token"
 method="POST"
 out=token
;
run;

Note: the refresh_token and client-id/secret values should be protected! Anyone who has access to these can get to your Google API data as if they were you. Consider storing them in a file that only you have read access to, and programmatically pull them in when running your SAS program under your host account.
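For example, here's a minimal sketch of that approach. It assumes you keep the sensitive values in a small JSON file (the file name, path, and field names here are just placeholders) that only your account can read, and pull them into macro variables at run time:

/* Read the protected values at run time instead of hard-coding them.     */
/* The file name and path here are placeholders -- use your own location. */
filename gasecf "/home/myuser/private/ga_secrets.json";
libname gasec json fileref=gasecf;
 
data _null_;
 set gasec.root;
 call symputx('client_id',client_id);
 call symputx('client_secret',client_secret);
 call symputx('refresh_token',refresh_token);
run;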

Step 3. Exchange the refresh_token for a valid access token

Typically, you'll include this step just once at the beginning of your SAS job. This takes your saved refresh_token value and asks Google to grant an access token for use in the rest of your program. The Google APIs run very fast -- you should not need to renew the token again within the same job.

/* STEP 3. Do this every time you want to use the GA API */
/* Turn in a refresh-token for a valid access-token      */
/* Should be good for 60 minutes                         */
/* So typically run once at beginning of the job.        */
%let oauth2=https://www.googleapis.com/oauth2/v4/token;
%let client_id=<your-client-id>.apps.googleusercontent.com;
%let client_secret=<your-client-secret>;
%let refresh_token=<refresh-token-from-step-2>;
 
filename rtoken temp;
proc http
 method="POST"
 /* Again, put this all on one line */
 /* broken here for readability */
 url="&oauth2.?client_id=&client_id.
     %str(&)client_secret=&client_secret.
     %str(&)grant_type=refresh_token%str(&)refresh_token=&refresh_token."
 out=rtoken;
run;
 
/* Read the access token out of the refresh response  */
/* Relies on the JSON libname engine (9.4m4 or later) */
libname rtok json fileref=rtoken;
data _null_;
 set rtok.root;
 call symputx('access_token',access_token);
run;

Step 4. Use the Google Analytics API to gather data

Finally, we're to the point where we can retrieve data from this service! And this is where those new features in PROC HTTP come into play. In SAS 9.4m3, Joseph added support for an inline HEADERS statement, which is perfect for injecting OAuth2 information like "Bearer" token into the HTTP header.

proc http
  url="<REST-api-url-call>"
  method="GET" out=ga_resp;
  /* Headers statement makes this easy */
  headers 
    "Authorization"="Bearer &access_token."
    "client-id:"="&client_id.";
 run;

I've noticed that a lot of REST API docs use cURL (the command-line URL tool) in their examples. In my experience, PROC HTTP can do just about anything that cURL can do, and these new options make it even easier to make cURL-based examples work in SAS.
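For example, a typical documentation snippet and its PROC HTTP translation line up almost one-to-one. (The endpoint below -- the Analytics Management API's account list -- is just an illustration.)

/* Typical cURL example from API docs:                                  */
/*   curl -H "Authorization: Bearer $TOKEN" \                           */
/*        https://www.googleapis.com/analytics/v3/management/accounts   */
/* The PROC HTTP equivalent:                                             */
filename acct temp;
proc http
 url="https://www.googleapis.com/analytics/v3/management/accounts"
 method="GET"
 out=acct;
 headers "Authorization"="Bearer &access_token.";
run;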

The Google Analytics API is complex on its own, so I suggest that you spend some time in the Google API Explorer to see what types of requests yield what sort of results. Google Analytics allows you to fetch metrics across different dimensions and dates, so the results can come back summarized across the dimensions that you request. Since we're bringing this data into SAS and I probably want to further summarize in my reports, I try to get as much "unsummarized" data as possible, and I avoid aggregations such as "Average views" or "Average time on page" -- as I don't want to risk calculating new summaries based on these.

I've included my program for retrieving the page views, total time on page, entrances and exits at the daily level. This is granular enough for me to create meaningful reports for all of our blog authors. The program is a SAS macro that issues one API call per day in the range specified.

/* Metrics and dimensions are defined in the Google Analytics doc */
/* Experiment in the developer console for the right mix          */
/* Your scenario might be different and would require a different */
/* type of query                                                  */
/* The GA API will "number" the return elements as                */
/* element1, element2, element3 and so on                         */
/* In my example, path and title will be 1 and 2 */
%let dimensions=  %sysfunc(urlencode(%str(ga:pagePath,ga:pageTitle)));
/* then pageviews, uniquepageviews, timeonpage will be 3, 4, 5, etc. */
%let metrics=     %sysfunc(urlencode(%str(ga:pageviews,ga:uniquePageviews,ga:timeOnPage,ga:entrances,ga:exits)));
/* this ID is the "View ID" for the GA data you want to access   */
%let id=          %sysfunc(urlencode(%str(ga:<your-view-ID>)));
 
%macro getGAdata;
%do workdate = &enddate %to &startdate %by -1;
	%let urldate=%sysfunc(putn(&workdate.,yymmdd10.));
	filename ga_resp temp;
	proc http
         /* again, put this url= value ALL on a single line! */
	 url="https://www.googleapis.com/analytics/v3/data/ga
            ?ids=&id.%str(&)start-date=&urldate.
            %str(&)end-date=&urldate.
            %str(&)metrics=&metrics.
            %str(&)dimensions=&dimensions.%str(&)max-results=20000"
	 method="GET" out=ga_resp;
	 headers 
	   "Authorization"="Bearer &access_token."
	   "client-id:"="&client_id.";
	run;
 
        /* Love this JSON libname engine! */
	libname garesp json fileref=ga_resp;
 
	data ga.ga_daily%sysfunc(compress(&urldate.,'-')) (drop=element:);
		set garesp.rows;
		drop ordinal_root ordinal_rows;
		length date 8 url $ 300 title $ 250 
	          views 8 unique_views 8 time_on_page 8 entrances 8 exits 8
	          ;
		format date yymmdd10.;
		date=&workdate.;
		/* Coerce the elements into data variables */
		/* Based on expected sequence               */
		url = element1;
		title = element2;
		views = input(element3, 5.);
		unique_views = input(element4, 6.);
		time_on_page=input(element5, 7.2);
		entrances = input(element6, 6.);
		exits = input(element7, 6.);
	run;
%end;
%mend;
 
%getGAdata;
 
/* Assemble the daily files into one data set */
data alldays_gadata;
  set ga.ga_daily:;
run;

When it's all finally working, SAS pulls in data that looks like this:
Google Analytics data via API

Re-entry and disclaimers

This has been a pretty deep post about Google APIs and how to make them work from SAS programs. If you have experience with the Google APIs already, then I'm optimistic that this article is enough for you to get things working in SAS. If you've never used OAuth2 or any of the Google APIs, then this post might not be quite enough. Don't forget to read Joseph's paper about OAuth2 and PROC HTTP. You might also like his paper about using RESTful APIs.

Please leave a comment if you have questions or if you think that I left something out. I'll do my best to correct and clarify.

The post Using SAS to access Google Analytics APIs appeared first on The SAS Dummy.