Data Visualization

9月 232021
 

In my previous post CAS-Action! Simply Distinct - Part 1 I reviewed using the simple.distinct CAS action to explore distinct and missing values in a distributed CAS table.

Welcome back to my SAS Users blog series CAS Action! - a series on fundamentals. I've broken the series into logical, consumable parts. If you'd like to start by learning a little more about what CAS Actions are, please see CAS Actions and Action Sets - a brief intro. Or if you'd like to see other topics in the series, see the overview page.

And now, back to the distinct action. What if we want to do more? Maybe you want to create a CSV file that documents the percentage of distinct values in each column? Let's explore some possibilities.

To complete the task I'll break it down into four steps.

Step 1 - Find the number of rows in the CAS table

To find the number of rows in a CAS table use the simple.numRows CAS action. Let's execute the numRows action and store the results in a variable. I'll also PRINT and DESCRIBE the results to take a closer look at the output.

proc cas;
    simple.numRows result=n / table={name="cars",caslib="casuser"};
    describe n;
    print n;
...

And the results:
Output of the simple.numRows CAS action

The results of the DESCRIBE statement show the output of the action is a dictionary with a key named numRows and an integer as the value. The PRINT statement shows the value of the dictionary, numRows=428.

Now that we have the total number of rows, we can use that number in our calculation.

Step 2 - Find the number of distinct values in each column

Next, let's execute the distinct action and store the results in a variable named d. Then execute the PRINT statement to confirm the results.

...
    simple.distinct result=d /
            table={name="cars",caslib="casuser"};
    print d;
...

And the resluts:
Results of the simple.distinct CAS action
The results of the distinct action are as expected. Each column with the number of distinct values.

Step 3 - Create a calculated column that computes the percentage of distinct values

Now that we have the number of distinct values in each column, and the total number of rows in the CARS CAS table, we can calculate the total percent of distinct values in each column.

Consider the code:

...
    pctDistinct=d.Distinct.compute({"PctDistinct","Percent Distinct",percent7.2}, nDistinct/n.numRows)
                          [ , {"Column","NDistinct","PctDistinct"} ];
    print pctDistinct;
...

To add a calculated column to a result table use the compute operator. In the first argument, specify column metadata inside an array (column name, label and format). In the second argument, specify the expression. My expression nDistinct/n.numRows divides the distinct values in each column by the total number of rows in the CARS table.

After the compute operator, select specific rows and columns from the result table using bracket notation. Here I'll select all rows, and only the columns Column, nDistinct and PctDistinct.

Lastly, I used the PRINT statement to confirm the results.

New calculated result table

In the output we can see the new result table with the computed column.

Step 4 - Save the results table as a CSV file

Lastly, let's put it all together!

I'll add the code from the pervious steps, then save the table as a CSV file using the SAVERESULTS statement with the CSV= option.

%let outpath=/*specify output file location*/;
 
proc cas;
* Specify the CAS table *;
    casTbl={name="cars", caslib="casuser"};
 
* Store the number of rows in the CAS table *;
    simple.numRows result=n / table=casTbl;
 
* Store the number of distinct values in each column *;
    simple.distinct result=d / table=casTbl;
 
* Calculate the percentage of distinct values in each column *;
    pctDistinct=d.Distinct.compute({"PctDistinct","Percent Distinct",percent7.2}, nDistinct/n.numRows)
                          [ , {"Column","NDistinct","PctDistinct"} ];
 
* Save the result table as a CSV file *;
    saveresult pctDistinct csv="&outpath/pctDistinctCars.csv";
quit;

In the CSV= option I specified the outpath macro variable that contains the location of output folder, and add the name of the CSV file.

After executing the code the log indicates everything ran successfully, and a CSV file was created in the specified location. Next I'll find and open the CSV file.

My CSV file is located in my outfiles folder:

Then double click on the file to open it in SAS Studio:

Summary

The distinct action is a flexible and easy way to explore your data. It allows you to quickly explore your distributed CAS tables, then process and save the results in a variety of formats to fit your needs.

Additional Resources

distinct CAS action
CAS-Action! Simply Distinct - Part 1
CASL Result Tables
SAS® Cloud Analytic Services: Fundamentals
Code

CAS-Action! Simply Distinct - Part 2 was published on SAS Users.

9月 152021
 

Welcome back to my SAS Users blog series CAS Action! - a series on fundamentals. I've broken the series into logical, consumable parts. If you'd like to start by learning a little more about what CAS Actions are, please see CAS Actions and Action Sets - a brief intro. Or if you'd like to see other topics in the series, see the overview page. Otherwise, let's dive into exploring your data by viewing the number of distinct and missing values that exist in each column using the simple.distinct CAS action.

In this example, I will use the CAS procedure to execute the distinct action. Be aware, instead of using the CAS procedure, I could execute the same action with Python, R and more with some slight changes to the syntax for the specific language. Refer to the documentation for syntax from other languages.

Determine the Number of Distinct and Missing Values in a CAS Table

To begin, let's use the simple.distinct CAS action on the CARS in-memory table to view the action's default behavior.

proc cas;
    simple.distinct /
        table={name="cars", caslib="casuser"};
quit;

In the preceeding code, I specify the CAS procedure, the action, then reference the in-memory table. The results of the call are displayed below.

The results allow us to quickly explore the CAS table and see the number of distinct and missing values. That's great, but what if you only want to see specific columns?

Specify the Columns in the Distinct Action

Sometimes your CAS tables contain hundreds of columns, but you are only interested in a select few. With the distinct action, you can specify a subset of columns using the inputs parameter. Here I'll specify the Make, Origin and Type columns.

proc cas;
    simple.distinct /
        table={name="cars", caslib="casuser"},
        inputs={"Make","Origin","Type"};
quit;

After executing the code the results return the information for only the Make, Origin and Type columns.

Next, let's explore what we can do with the results.

Create a CAS Table with the Results

Some actions allow you to create a CAS table with the results. You might want to do this for a variety of reasons like use the new CAS table in a SAS Visual Analytics dashboard or in a data visualization procedure like SGPLOT.

To create a CAS table with the distinct action result, add the casOut parameter and specify new CAS table information, like name and caslib.

proc cas;
    simple.distinct /
        table={name="cars", caslib="casuser"},
        casOut={name="distinctCars", caslib="casuser"};
quit;

After executing the code, the action returns information about the name and caslib of the new CAS table, and the number of rows and columns.

Visualize the Number of Distinct Values in Every Column

Lastly, what if you want to create a data visualization to better explore the table? Maybe you want to visualize the number of distinct values for each column? This task can be accomplished with variety of methods. However, since I know my newly created distinctCars CAS table has only 15 rows, I'll reference the CAS table directly using SGPLOT procedure.

This method works as long as the LIBNAME statement references your caslib correctly. I recommend this method when you know the CAS table is a manageable size. This is important because the CAS server does not execute the SGPLOT procedure on a distributed CAS table. The CAS server instead transfers the entire CAS table back to the client for processing.

To begin, the following LIBNAME statement will reference the casuser caslib.

libname casuser cas caslib="casuser";

Once the LIBNAME statement is correct, all you need to do is specify the CAS table in the DATA option of the SGPLOT procedure.

title justify=left height=14pt "Number of Distinct Values for Each Column in the CARS Table";
proc sgplot data=casuser.distinctCars
            noborder nowall;
    vbar Column / 
        response=NDistinct
        categoryorder=respdesc
        nooutline
        fillattrs=(color=cx0379cd);
    yaxis display=(NOLABEL);
    xaxis display=(NOLABEL);
quit;

The results show a bar chart with the number of distinct values for each column.

Summary

The simple.distinct CAS action is an easy way to explore a distributed CAS table. With one simple action, you can easily see how many distinct values are in each column, and the number of missing rows!

In Part 2 of this post, I'll further explore the simple.distinct CAS action and offer more ideas on how to interpret and use the results.

Additional Resources

distinct CAS action
SAS® Cloud Analytic Services: Fundamentals
Plotting a Cloud Analytic Services (CAS) In-Memory Table
Getting started with SGPLOT - Index
Code

CAS-Action! Simply Distinct - Part 1 was published on SAS Users.

8月 182021
 

When people think about sports, many things may come to mind: Screaming fans, the intensity of the game and maybe even the food. Data doesn’t usually make the list. But what people may not realize is that data is behind everything people love about sports. It can help determine how [...]

4 ways analytics are enhancing sports was published on SAS Voices by Olivia Ojeda

5月 112021
 

It’s safe to say that SAS Global Forum is a conference designed for users, by users. As your conference chair, I am excited by this year’s top-notch user sessions. More than 150 sessions are available, many by SAS users just like you. Wherever you work or whatever you do, you’ll find sessions relevant to your industry or job role. New to SAS? Been using SAS forever and want to learn something new? Managing SAS users? We have you covered. Search for sessions by industry or topic, then add those sessions to your agenda and personal calendar.

Creating a customizable agenda and experience

Besides two full days of amazing sessions, networking opportunities and more, many user sessions will be available on the SAS Users YouTube channel on May 20, 2021 at 10:00am ET. After you register, build your agenda and attend the sessions that most interest you when the conference begins. Once you’ve viewed a session, you can chat with the presenter. Don’t know where to start? Sample agendas are available in the Help Desk.

For the first time, proceedings will live on SAS Support Communities. Presenters have been busy adding their papers to the community. Everything is there, including full paper content, video presentations, and code on GitHub. It all premiers on “Day 3” of the conference, May 20. Have a question about the paper or code? You’ll be able to post a question on the community and ask the presenter.

Want training or help with your code?

Code Doctors are back this year. Check out the agenda for the specific times they’re available and make your appointment, so you’ll be sure to catch them and get their diagnosis of code errors. If you’re looking for training, you’ll be quite happy. Training is also back this year and it’s free! SAS instructor-led demos will be available on May 20, along with the user presentations on the SAS Users YouTube channel.

Chat with attendees and SAS

It is hard to replicate the buzz of a live conference, but we’ve tried our best to make you feel like you’re walking the conference floor. And we know networking is always an important component to any conference. We’ve made it possible for you to network with colleagues and SAS employees. Simply make your profile visible (by clicking on your photo) to connect with others, and you can schedule a meeting right from the attendee page. That’s almost easier than tracking down someone during the in-person event.

We know the exhibit hall is also a big draw for many attendees. This year’s Innovation Hub (formerly known as The Quad) has industry-focused booths and technology booths, where you can interact in real-time with SAS experts. There will also be a SAS Lounge where you can learn more about various SAS services and platforms such as SAS Support Communities and SAS Analytics Explorers.

Get started now

I’ve highlighted a lot in this blog post, but I encourage you to view this 7-minute Innovation Hub video. It goes in depth on the Hub and all its features.

This year there is no reason not to register for SAS Global Forum…and attend as few or as many sessions as you want. Why? Because the conference is FREE!

Where else can you get such quality SAS content and learning opportunities? Nowhere, which is why I encourage you to register today. See you soon!

SAS Global Forum: Your experience, your way was published on SAS Users.

4月 202021
 

I can’t believe it’s true, but SAS Global Forum is just over a month away. I have some exciting news to share with you, so let’s start with the theme for this year:

New Day. New Answers. Inspired by Curiosity.

What a fitting theme for this year! Technology continues to evolve, so each new day is a chance to seek new answers to what can sometimes feel like impossible challenges. Our curiosity as humans drives us to seek out better ways to do things. And I hope your curiosity will drive you to register for this year’s SAS Global Forum.

We are excited to offer a global event across three regions. If you’re in the Americas, the conference is May 18-20. In Asia Pacific? Then we’ll see you May 19-20. And we didn’t forget about Europe. Your dates are May 25-26. We hope these region-specific dates and the virtual nature of the conference means more SAS users than ever will join us for an inspiring event. Curious about the exciting agenda? It’s all on the website, so check it out.

Keynotes speakers that you’ll talk about for months to come

Want to be inspired to chase your “impossible” dreams? Or hear more about the future of AI? How about learning about work-life balance and your mental health? We have you covered. SAS executives are gearing up to host an exciting lineup of extremely smart, engaging and thought-provoking keynote speakers like Adam Grant, Ayesha Khanna and Hakeem Oluseyi.

And who knows, we might have a few more surprises up our sleeve. You’ll just have to register and attend to find out.

Papers and proceedings: simplified and easy to find

Have you joined the SAS Global Forum online community? You should, because that’s where you’ll find all the discussion around the conference…before, during and after. It’s also where you’ll find a link to the 2021 proceedings, when they become available. Authors are busy preparing their presentations now and they are hard at work staging their proceedings in the community. Join the community so you can connect with other attendees and know when the proceedings become available.

Stay tuned for even more details

SAS Global Forum is the place where creativity meets curiosity, and amazing analytics happens! I encourage you to regularly check the conference website, as we’re continually adding new sessions and events. You don’t want to miss this year’s conference, so don’t forget to register for SAS Global Forum. See you soon!

Registration is open for a truly inspiring SAS Global Forum 2021 was published on SAS Users.

12月 172020
 

There’s nothing worse than being in the middle of a task and getting stuck. Being able to find quick tips and tricks to help you solve the task at hand, or simply entertain your curiosity, is key to maintaining your efficiency and building everyday skills. But how do you get quick information that’s ALSO engaging? By adding some personality to traditionally routine tutorials, you can learn and may even have fun at the same time. Cue the SAS Users YouTube channel.

With more than 50 videos that show personality published to-date and over 10,000 hours watched, there’s no shortage of learning going on. Our team of experts love to share their knowledge and passion (with personal flavor!) to give you solutions to those everyday tasks.

What better way to round out the year than provide a roundup of our most popular videos from 2020? Check out these crowd favorites:

Most viewed

  1. How to convert character to numeric in SAS
  2. How to import data from Excel to SAS
  3. How to export SAS data to Excel

Most hours watched

  1. How to import data from Excel to SAS
  2. How to convert character to numeric in SAS
  3. Simple Linear Regression in SAS
  4. How to export SAS data to Excel
  5. How to Create Macro Variables and Use Macro Functions
  6. The SAS Exam Experience | See a Performance-Based Question in Action
  7. How it Import CSV files into SAS
  8. SAS Certification Exam: 4 tips for success
  9. SAS Date Functions FAQs
  10. Merging Data Sets in SAS Using SQL

Latest hits

  1. Combining Data in SAS: DATA Step vs SQL
  2. How to Concatenate Values in SAS
  3. How to Market to Customers Based on Online Behavior
  4. How to Plan an Optimal Tour of London Using Network Optimization
  5. Multiple Linear Regression in SAS
  6. How to Build Customized Object Detection Models

Looking forward to 2021

We’ve got you covered! SAS will continue to publish videos throughout 2021. Subscribe now to the SAS Users YouTube channel, so you can be notified when we’re publishing new videos. Be on the lookout for some of the following topics:

  • Transforming variables in SAS
  • Tips for working with SAS Technical Support
  • How to use Git with SAS

2020 roundup: SAS Users YouTube channel how to tutorials was published on SAS Users.

11月 202020
 

If you’re like me and the rest of the conference team, you’ve probably attended more virtual events this year than you ever thought possible. You can see the general evolution of virtual events by watching the early ones from April or May and compare them to the recent ones. We at SAS Global Forum are studying the virtual event world, and we’re learning what works and what needs to be tweaked. We’re using that knowledge to plan the best possible virtual SAS Global Forum 2021.

Everything is virtual these days, so what do we mean by virtual?

Planning a good virtual event takes time, and we’re working through the process now. One thing is certain -- we know the importance of providing quality content and an engaging experience for our attendees. We want to provide attendees with the opportunity as always, but virtually, to continue to learn from other SAS users, hear about new and exciting developments from SAS, and connect and network with experts, peers, partners and SAS. Yes, I said network. We realize it won’t be the same as a live event, but we are hopeful we can provide attendees with an incredible experience where you connect, learn and share with others.

Call for content is open

One of the differences between SAS Global Forum and other conferences is that SAS users are front and center, and the soul of the conference. We can’t have an event without user content. And that’s where you come in! The call for content opened November 17 and lasts through December 21, 2020. Selected presenters will be notified in January 2021. Presentations will be different in 2021; they will be 30 minutes in length, including time for Q&A when able. And since everything is virtual, video is a key component to your content submission. We ask for a 3-minute video along with your title and abstract.

The Student Symposium is back

Calling all postsecondary students -- there’s still time to build a team for the Student Symposium. If you are interested in data science and want to showcase your skills, grab a teammate or two and a faculty advisor and put your thinking caps on. Applications are due by December 21, 2020.

Learn more

I encourage you to visit the SAS Global Forum website for up-to-date information, follow #SASGF on social channels and join the SAS communities group to engage with the conference team and other attendees.

Connect, learn and share during virtual SAS Global Forum 2021 was published on SAS Users.

4月 232020
 

At the end of March, the German government sponsored a hackathon called #WirVsVirus. The aim was to bring Germany’s collective coding expertise to bear on some of the many problems surrounding COVID-19. In total, more than 27,000 coders joined the challenge, working from home, and programming for 48 hours from [...]

27,000 coders vs. the coronavirus was published on SAS Voices by Tom Sabo

4月 032020
 

These are incredibly tough times, and fortunately, everyone I know is still healthy, safe and doing everything they can to stay that way. I know I'm lucky -- the worst I've suffered so far under coronavirus stay-at-home orders is sports withdrawal. I'm trying to fill that void in my life [...]

Visualizing shots, goals, players and more with SAS was published on SAS Voices by Frank Silva